PRGs Wiki - User contributions [en-gb]

Text Analysis:Bibliometrix

2019-12-23T11:50:36Z

Phil:

This page collects various tools for processing Bibliometrix data. The tools were originally developed for a project looking into academic communities by exploring search results from Web of Knowledge, at the time specifically for Complex Thinking. Most of the tools are built in Neo4J and R. The programs build a network of papers and the works they cite.

The first useful bit of code takes data loaded from Web of Science into the R Bibliometrix package and turns it into a CSV file that can then be loaded into Neo4J.

<syntaxhighlight lang=R style="border:3px dashed blue">
library(bibliometrix)
library(igraph)
library(ggplot2)

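#reads in the exported BibTeX file
#(newer bibliometrix releases may expect the file path to be passed to convert2df directly)
D <- readFiles("bib_export.bib")
#convert it into a data frame
M <- convert2df(D, dbsource = "isi", format = "bibtex")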

#pull out the bits of the data that we want, could add more in here.
cited=M$CR
author=M$AU
key = M$SR
title = M$TI
data = cbind(key,title,author,cited)

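#build the edge list for the export, one row per (paper, cited reference) pair
#there must be a better way of doing this.
E_List <- NULL
for(i in seq_len(nrow(data))){
#fixed = TRUE splits on the literal ". ", otherwise the "." is treated as a regex wildcard
#(adjust the separator if the CR field in your data is delimited differently, e.g. by ";")
cit_list = unlist(strsplit(data[i,4], split=". ", fixed = TRUE))
for(s in seq_along(cit_list))
{
E_List <- rbind(E_List, c(data[i,1],cit_list[s]))
}
}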

#save the data
write.csv2(E_List, file = "NetExport.txt", row.names = FALSE)
</syntaxhighlight>

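That produces a semicolon-separated CSV file that Neo4J can load using the following command. Note that the command only matches Paper and Reference nodes that already exist, so these need to have been created beforehand.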

<syntaxhighlight lang="text" style="border:3px dashed blue">
LOAD CSV FROM "file:/NetExport.txt" AS line FIELDTERMINATOR ';'
MATCH (a:Paper),(b:Reference)
WHERE a.name = line[0] AND b.name = line[1]
MERGE (a)-[r:Cites]->(b)
RETURN r
</syntaxhighlight>

That loads the data into a Neo4J database. This could probably be done in one step, and I will probably update this later. Next we can do things with the network in R using igraph.
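As a rough sketch of how the export and the load might be brought closer together, each row of the edge list could be turned into one Cypher command per line, which is the input format the [[CypherCode:LineReader]] program expects. The class name `EdgeListToCypher`, its method names, the quoting rules and the sample rows are all assumptions made for illustration and are not part of the original tools. The sketch reuses the `Paper`, `Reference`, `name` and `Cites` names from the command above, and uses MERGE for the nodes so that missing nodes are created rather than only matched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: turns "paper;reference" rows into Cypher MERGE commands. */
final class EdgeListToCypher {

    /** Strip the surrounding double quotes that write.csv2 puts around text fields. */
    static String unquote(String field) {
        String f = field.trim();
        if (f.length() >= 2 && f.startsWith("\"") && f.endsWith("\"")) {
            return f.substring(1, f.length() - 1);
        }
        return f;
    }

    /** Escape backslashes and single quotes for use inside a Cypher string literal. */
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("'", "\\'");
    }

    /** Build one command from one semicolon-separated row (split at the first ';'). */
    static String toCypher(String row) {
        String[] parts = row.split(";", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected 'paper;reference' but got: " + row);
        }
        String paper = escape(unquote(parts[0]));
        String reference = escape(unquote(parts[1]));
        return "MERGE (a:Paper {name: '" + paper + "'}) "
             + "MERGE (b:Reference {name: '" + reference + "'}) "
             + "MERGE (a)-[:Cites]->(b)";
    }

    /** Convert all rows in order, skipping blank lines. */
    static List<String> convert(List<String> rows) {
        List<String> commands = new ArrayList<>();
        for (String row : rows) {
            if (!row.trim().isEmpty()) {
                commands.add(toCypher(row));
            }
        }
        return commands;
    }

    public static void main(String[] args) {
        // Made-up sample rows in the shape written by write.csv2
        List<String> rows = Arrays.asList(
            "\"PAPER A, 2001, J ONE\";\"REF X, 1990, J TWO\"",
            "\"PAPER A, 2001, J ONE\";\"REF Y, 1995, J THREE\"");
        for (String command : convert(rows)) {
            System.out.println(command);
        }
    }
}
```

Writing the printed lines to a text file would give something that can be run line by line. The header row that write.csv2 adds would need to be dropped first.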

<syntaxhighlight lang=R style="border:3px dashed blue">
library(igraph);
library(RNeo4j); #This needs to be installed with devtools to get a new enough version to interface with Neo4J

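#the password is read from an environment variable rather than being hard-coded
#(NEO4J_PASSWORD is an assumed variable name, set it to suit your own setup)
graph = startGraph("http://localhost:7474/db/data/", username = "neo4j", password = Sys.getenv("NEO4J_PASSWORD"))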

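#head(labels(p)) returns the first label as a single value, so that the result stays tabular
papersQuery = "MATCH (p:Paper) RETURN id(p) AS id, p.name AS name, head(labels(p)) AS type";
refsQuery = "MATCH (r:Reference) RETURN id(r) AS id, r.name AS name, head(labels(r)) AS type";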

papers = cypher(graph, papersQuery)
colnames(papers) = c("ID","Name","Type")
references = cypher(graph, refsQuery)
colnames(references) = c("ID","Name","Type")
nodes = rbind(papers,references)

#Extract the whole graph
wholeGraphQ = "MATCH (p:Paper)-[r:Cites]->(s:Reference) RETURN id(p) AS pID, id(s) AS sID"
relations = cypher(graph, wholeGraphQ)
wG = graph_from_data_frame(relations, directed=TRUE, vertices=nodes)
#V(wG)$label.cex <- 0.5
V(wG)$color <- ifelse(V(wG)$Type == "Paper", "lightblue", "orange")
V(wG)$shape <- ifelse(V(wG)$Type == "Reference", "square", "circle")

#Fruchterman-Reingold layout without the grid approximation
co <- layout_with_fr(wG, grid="nogrid")

#save the whole graph
pdf("~/Documents/wGraph.pdf",10,10)
plot(wG, layout=co, vertex.size=1, edge.arrow.size=0.3, vertex.label="")
dev.off()

#Papers by the cited works, weighted network
#the count is returned as lower-case "weight" so that igraph stores it as the edge attribute E(prG)$weight used below
papersByRefs = "MATCH (n:Paper)-->(d:Reference)<--(m:Paper) WHERE id(n) < id(m) RETURN n.name AS Paper1, m.name AS Paper2, count(d) AS weight"

pByRefRels = cypher(graph, papersByRefs)
prG = graph_from_data_frame(pByRefRels,directed = FALSE)

co <- layout_with_fr(prG, grid="nogrid")

papByRClust = cluster_fast_greedy(prG, merges = TRUE, modularity = TRUE,
membership = TRUE, weights = E(prG)$weight)
V(prG)$color <- papByRClust$membership + 1

PapCl_out = cbind(V(prG)$name,papByRClust$membership)

write_graph(prG, file ="~/Documents/PapersNet.graphml", format = c("graphml"))
write.csv2(PapCl_out, file ="~/Documents/PapersClusters.txt")

#Save paper-paper graph as pdf
pdf("~/Documents/ppGraph.pdf",10,10)
plot(papByRClust, prG, layout=co, vertex.size=2, edge.arrow.size=0.3, vertex.label="")
dev.off()

#build references by papers, weighted
#the count is returned as lower-case "weight" so that igraph stores it as the edge attribute E(refG)$weight used below
refsByPapers="MATCH (r1:Reference)<--(p:Paper)-->(r2:Reference) WHERE id(r1) < id(r2) RETURN r1.name AS Ref1, r2.name AS Ref2, count(p) AS weight"

rByPapRels = cypher(graph, refsByPapers)
refG = graph_from_data_frame(rByPapRels,directed = FALSE)

co <- layout_with_fr(refG, grid="nogrid")

refByPClust = cluster_fast_greedy(refG, merges = TRUE, modularity = TRUE,
membership = TRUE, weights = E(refG)$weight)
V(refG)$color <- refByPClust$membership + 1

RefCl_out = cbind(V(refG)$name,refByPClust$membership)

write_graph(refG, file ="~/Documents/RefsNet.graphml", format = c("graphml"))
write.csv2(RefCl_out, file ="~/Documents/RefClusters.txt")

#Save Ref-Ref graph as pdf
pdf("~/Documents/rrGraph.pdf",10,10)
plot(refG, layout=co, vertex.size=2, edge.arrow.size=0.3, vertex.label="")
dev.off()
</syntaxhighlight>
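
To make it clearer what the "papers by the cited works" query computes, the sketch below reproduces the same counting in plain Java on an in-memory edge list: the weight of a pair of papers is the number of references that both of them cite. The class `CouplingSketch`, its method name, the `" | "` key format and the sample data are hypothetical and not part of the original tools.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: counts shared references between pairs of papers. */
final class CouplingSketch {

    /**
     * @param edges each entry is {paper, reference}
     * @return map from "paper1 | paper2" (paper1 sorted before paper2) to the
     *         number of references cited by both papers
     */
    static Map<String, Integer> sharedReferenceCounts(List<String[]> edges) {
        // reference -> sorted set of papers citing it (a set, so repeated edges count once)
        Map<String, SortedSet<String>> citers = new TreeMap<>();
        for (String[] edge : edges) {
            citers.computeIfAbsent(edge[1], k -> new TreeSet<>()).add(edge[0]);
        }

        // every pair of papers citing the same reference gains one unit of weight
        Map<String, Integer> weights = new TreeMap<>();
        for (SortedSet<String> papers : citers.values()) {
            List<String> list = new ArrayList<>(papers);
            for (int i = 0; i < list.size(); i++) {
                for (int j = i + 1; j < list.size(); j++) {
                    weights.merge(list.get(i) + " | " + list.get(j), 1, Integer::sum);
                }
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        // Made-up data: P1 cites R1, R2, R3; P2 cites R2, R3; P3 cites R3
        List<String[]> edges = Arrays.asList(
            new String[] {"P1", "R1"}, new String[] {"P1", "R2"}, new String[] {"P1", "R3"},
            new String[] {"P2", "R2"}, new String[] {"P2", "R3"},
            new String[] {"P3", "R3"});
        sharedReferenceCounts(edges).forEach((pair, w) -> System.out.println(pair + " : " + w));
    }
}
```

The "references by papers" network is the mirror image: swapping the two columns of the edge list gives, for each pair of references, the number of papers that cite both.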

CypherCode:LineReader

2019-12-23T11:49:26Z

Phil:

This is a really simple project that loads data from a file and puts it into a Neo4J database. It assumes that each line of the file is an individual Cypher command. It's great for building a network offline in a text file: you can then simply run the program and it will put the data into Neo4J.

Obviously you need to provide it with the server address, and a username/password if needed. It just runs, so make sure you don't send all the data to the wrong place!

<syntaxhighlight lang=java style="border:3px dashed blue">
package net.prgarnett.cypherlinereader;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import org.neo4j.driver.v1.AuthTokens;
import org.neo4j.driver.v1.Driver;
import org.neo4j.driver.v1.GraphDatabase;
import org.neo4j.driver.v1.Session;
import org.neo4j.driver.v1.Transaction;

/**
 * Reads a text file of Cypher commands and executes them against Neo4J,
 * one line at a time.
 *
 * @author philip
 */
public class LineReader implements AutoCloseable {

    private final Driver driver;

    /**
     * Load up the driver, using the address, password, and username.
     *
     * @param serveradd address of the Neo4J server
     * @param password password for the database user
     * @param username name of the database user
     */
    public LineReader(String serveradd, String password, String username)
    {
        this.driver = GraphDatabase.driver(serveradd, AuthTokens.basic(username, password));
    }

    /**
     * Give the method the path to the file, it will execute any line that is
     * not blank and does not start with a '#'.
     *
     * @param path path to the file of Cypher commands
     */
    public void ExecuteLines(String path)
    {
        File targetFile = new File(path);

        // try-with-resources closes both the session and the scanner
        try (Session session = driver.session();
             Scanner lineScanner = new Scanner(targetFile))
        {
            while (lineScanner.hasNextLine())
            {
                String line = lineScanner.nextLine();
                if (!line.trim().isEmpty() && !line.startsWith("#"))
                {
                    this.processLine(line, session);
                }
            }
        }
        catch (FileNotFoundException e)
        {
            System.err.println(e.getMessage());
        }
    }

    /**
     * Private method that processes the line in a transaction, requires the
     * session to be passed.
     *
     * @param line a single Cypher command
     * @param session an open session to run the command in
     */
    private void processLine(String line, Session session)
    {
        try (Transaction tx = session.beginTransaction())
        {
            System.out.println(line);
            tx.run(line);
            tx.success();
        }
    }

    /**
     * Close the driver once all the files have been processed.
     */
    @Override
    public void close()
    {
        driver.close();
    }
}
</syntaxhighlight>
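
To make the expected input concrete, here is a small self-contained sketch of the line selection on its own, without the Neo4J driver. The class `CommandFileSketch`, its method name and the sample commands are made up for illustration. Only the rule that lines starting with '#' are skipped comes from the description above, and skipping blank lines is an extra assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: picks out the lines that would be sent to the database. */
final class CommandFileSketch {

    /** Return the executable lines in file order, skipping comments and blank lines. */
    static List<String> executableLines(List<String> lines) {
        List<String> commands = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue; // comment or blank line, nothing to run
            }
            commands.add(line);
        }
        return commands;
    }

    public static void main(String[] args) {
        // Made-up contents of a command file, one Cypher command per line
        List<String> file = Arrays.asList(
            "# people and companies (comment, skipped)",
            "MERGE (p:Person {name: 'Alice'})",
            "",
            "MERGE (c:Company {name: 'Example Ltd'})",
            "MATCH (p:Person {name: 'Alice'}), (c:Company {name: 'Example Ltd'}) MERGE (p)-[:DIRECTOR_OF]->(c)");
        for (String command : executableLines(file)) {
            System.out.println(command);
        }
    }
}
```

Because every line runs in its own transaction, a command cannot span several lines, and a later line can only refer to nodes created earlier by matching them again, as the last sample line does.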

Database Tools:Neo4J

2019-12-23T11:49:03Z

Phil:

Neo4J[http://www.neo4j.com] is a great graph database tool. It is really flexible and well suited to storing relationship data, where objects are nodes and the relationships between the objects are the edges of the network. This gives a scalable and flexible way of storing both data and the relationships between the data.

I use Neo4J for numerous projects, storing the relationships between companies and people, and also mapping inquiries and the documents released during an inquiry; see the [[Projects]] pages. The tools will either be links to or descriptions of GitHub projects or, if it really is just a little bit of code, the code will simply be available here.

===Cypher Code===
Some bits of Cypher-related code:
*[[CypherCode:LineReader]] - Java program that reads and executes lines of Cypher code.

Main Page

2019-12-23T11:47:54Z

Phil: /* Philip Garnett's Wiki Page */

==Philip Garnett's Wiki Page==

This is a dumping ground for bits of projects, bits of code, etc. that go with my other sites, prgarnett.net[http://prgarnett.net], obscurus.org[http://obscurus.org], and algorithmicindexing.net[http://algorithmicindexing.net].

Some project details:
*Algorithmic Indexing[http://algorithmicindexing.net] is a meta-project looking at the analysis of text using algorithms and AI to allow us to ''read'' the text. There are a few sub-projects looking at public inquiries and investigations.
*The Hillards Archive[http://thehillardsarchive.net] is a project digitising surviving documents of the Hillards supermarket chain.

===Text Analysis===
A lot of the programming that I do now is around the development of text analysis tools, and tools to populate and manipulate databases, including network (e.g. Neo4J[http://www.neo4j.com]) and document databases.

====Databases====
*[[Database Tools:Neo4J]] - a few tools for working with Neo4J in different contexts or for particular use cases.
*[[Database Tools:MongoDB]] - a few tools for working with MongoDB.

====Text Analysis Tools====
*[[Text Analysis:Ngrams]] - tools for doing ngrams analysis.
*[[Text Analysis:Text Processing]] - tools for things like topic modelling, entity detection, other text analysis.
*[[Text Analysis:Bibliometrix]] - tools for processing Bibliometrix data.

Text Analysis:Bibliometrix

2018-09-20T17:40:27Z

Phil:


Text Analysis:Bibliometrix

2018-09-20T17:32:16Z

Phil:


Text Analysis:Bibliometrix

2018-09-20T17:12:29Z

Phil: Created page with "Page for various tools for the processing of bibliometrix data, these tools were originally developed for a project that was looking into academic communities by exploring sea..."

Main Page

2018-09-20T17:05:29Z

Phil: /* Philip Garnett's Wiki Page */


CypherCode:LineReader

2017-12-07T21:06:17Z

Phil:


CypherCode:LineReader

2017-12-07T14:03:51Z

Phil: Created page with "This is a really simple project that loads data from a file and put it into a Neo4J data. It assumes that each line of the file is an individual cypher command. Its great for..."


Database Tools:Neo4J

2017-12-07T11:44:13Z

Phil: Created page with "Neo4J[http://www.neo4j.com] is a great graph database tool, it is really flexible and perfect for storing relationship data where objects are nodes and the relationships betwe..."

Main Page

2017-12-07T11:28:14Z

Phil:


Main Page

2017-12-07T11:23:22Z

Phil:


Main Page

2017-12-07T11:15:33Z

Phil: