Neo4j Knowledge Graphs: The Whole Process from Installation to Practice

Foreword: Hello everyone, my name is Dream. In this hands-on walkthrough, we will go through the whole Neo4j knowledge graph workflow, from installation to practice, and explore relationships and attributes along the way. A knowledge graph stores knowledge as triples built from entities, relationships, and attributes, which helps us understand and analyze complex relationships between pieces of knowledge. Let's take a look~

1. Writing a semantic network into a graph database

Purpose of the experiment
(1) Understand how a semantic network is written into a database.
(2) Use Neo4j to present a simple semantic network.
Experimental requirements
After this experiment, you will understand how the Nodes and Relationships of a semantic network are represented in the database.
Experimental Principle
To represent a fact as a semantic network, first identify its nodes, then describe each node's relationships with other nodes, and finally write both into the database using Python.

Experimental preparation

1. Install JDK

Neo4j runs on Java, so install the JDK before installing neo4j.

1.1 Download

Official website download link: https://www.oracle.com/java/technologies/javase-downloads.html
Choose the JDK version to match your neo4j version, since a JDK that is too old or too new can prevent neo4j from starting. JDK 1.8 is stable and works with neo4j 3.x, but neo4j 4.x needs JDK 11 and neo4j 5.x (including the 5.8.0 used below) needs JDK 17.
The following is the installation path:

1.2 Configuring environment variables

After installing the JDK, configure the environment variables as follows:
Search for "environment variables" in the Start menu. In the window that opens, click Environment Variables in the lower right corner:

In the system variables area at the bottom, create a new variable named JAVA_HOME and set its value to the JDK installation path from the previous step. In my case, it is D:\Big Airport\java\java1.

Edit the Path in the system variable area, click New, and then enter %JAVA_HOME%\bin

Open a command prompt (Win + R, then enter cmd) and run java -version. If the Java version information is printed, the environment variables are configured correctly.
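If java -version fails, the usual cause is a wrong JAVA_HOME or a missing Path entry. The following is a minimal sketch for checking both from Python; java_bin_dir and java_bin_on_path are hypothetical helper names, and the sample environment simply reuses the path from this article.

```python
from pathlib import PureWindowsPath


def java_bin_dir(env):
    """Return the bin folder implied by JAVA_HOME, or None if it is unset."""
    home = env.get("JAVA_HOME")
    if not home:
        return None
    return str(PureWindowsPath(home) / "bin")


def java_bin_on_path(env):
    """True if Path lists the JAVA_HOME bin folder, literally or expanded."""
    bin_dir = java_bin_dir(env)
    if bin_dir is None:
        return False
    # Look the Path variable up case-insensitively (Windows may report it as PATH).
    path_value = next((v for k, v in env.items() if k.upper() == "PATH"), "")
    entries = [e.strip().rstrip("\\").lower()
               for e in path_value.split(";") if e.strip()]
    return bin_dir.lower() in entries or r"%java_home%\bin" in entries


# Hand-written sample environment; pass dict(os.environ) to inspect the real one.
sample_env = {
    "JAVA_HOME": r"D:\Big Airport\java\java1",
    "Path": r"C:\Windows\system32;%JAVA_HOME%\bin",
}
print(java_bin_dir(sample_env))
print(java_bin_on_path(sample_env))
```

The same two functions can be pointed at NEO4J_HOME later by changing the variable name.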

2. Install neo4j

After installing the JDK, you can install neo4j.

2.1 Download

Official download link: https://neo4j.com/download-center/#community
Here, I downloaded neo4j Community Edition 5.8.0.

After downloading, just unzip it to a suitable path. No installer is needed:

2.2 Configuring environment variables

Next, we need to configure the environment variables. The steps are much the same as for the Java environment variables, so we will only describe them briefly here.
In the system variables area, create a new variable named NEO4J_HOME and set its value to the path where neo4j was unzipped. In my case, it is D:\Big Airport\neo4j\neo4j1.


Edit the Path in the system variable area, click New, then enter %NEO4J_HOME%\bin, and finally click OK to save.

3. Start neo4j

Run cmd as administrator.

Then enter neo4j.bat console at the command line.

If this interface appears, it proves that neo4j was started successfully.
Enter the URL http://localhost:7474/ given in the interface into the browser, and the following interface will be displayed.

The default username and password are both neo4j. Neo4j asks you to set a new password the first time you log in.
At this point, neo4j is installed~

Experimental steps

1. Import classes from the neomodel package.

from neomodel import StructuredNode, StringProperty, RelationshipTo, RelationshipFrom, config

2. Connect to Neo4j graph database.

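# URL format: bolt://<username>:<password>@<host>:<port>
# Use your own password here if you changed the default one.
config.DATABASE_URL = 'bolt://neo4j:neo4j@localhost:7687'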
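A typo in this URL (for example a missing @ between the password and the host) only shows up later as a connection error. As a rough sketch, the standard library can split the URL into its parts so they can be checked before the URL is handed to neomodel; describe_bolt_url is a hypothetical helper, not part of neomodel.

```python
from urllib.parse import urlsplit


def describe_bolt_url(url):
    """Split a bolt URL into its scheme, credentials, host and port."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
    }


print(describe_bolt_url("bolt://neo4j:neo4j@localhost:7687"))
```

If user or password comes back as None, the credentials part of the URL is malformed.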

The fact we are about to model is: "Trees and grass are both plants. Trees and grass both have leaves and roots. Waterweeds are a kind of grass and live in water. Fruit trees are trees and bear fruit. Pear trees are a kind of fruit tree and bear pears."
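Before writing any classes, it can help to list the fact as plain (start node, relationship, end node) tuples. The sketch below is only an illustration in plain Python; the node and relationship names follow the node classes in step 3, where AKO stands for "a kind of" and the "have" facts are attached to Plant so that trees and grass reach them through their AKO link.

```python
# Each tuple is (start node, relationship type, end node).
TRIPLES = [
    ("Tree", "AKO", "Plant"),
    ("Grass", "AKO", "Plant"),
    ("Plant", "Have", "Leaf"),
    ("Plant", "Have", "Root"),
    ("Waterweeds", "AKO", "Grass"),
    ("Waterweeds", "Live", "Water"),
    ("Fruiter", "AKO", "Tree"),
    ("Fruiter", "Can", "Bear"),
    ("Pear", "AKO", "Fruiter"),
    ("Pear", "Can", "BearPear"),
]

# Every name that appears at either end of a tuple becomes a node.
nodes = sorted({name for start, _, end in TRIPLES for name in (start, end)})
relation_types = sorted({relation for _, relation, _ in TRIPLES})

print(len(nodes), nodes)
print(relation_types)
```

This gives the eleven node types and four relationship types that the classes need to cover.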

3. Write node class.

Plant, Tree, Grass, Leaf, Root, Waterweeds, Water, Fruiter (fruit tree), Bear (bearing fruit), Pear (pear tree), and BearPear (bearing pears) are node classes that inherit from StructuredNode. Each class declares the node's properties and its relationships.

# AKO means "a kind of". Each relationship is declared on both ends:
# RelationshipTo on the start node, RelationshipFrom on the end node.

class Plant(StructuredNode):
    name = StringProperty(unique_index=True)
    has1 = RelationshipFrom('Tree', 'AKO')       # Tree -AKO-> Plant
    has2 = RelationshipFrom('Grass', 'AKO')      # Grass -AKO-> Plant
    have1 = RelationshipTo('Leaf', 'Have')       # Plant -Have-> Leaf
    have2 = RelationshipTo('Root', 'Have')       # Plant -Have-> Root

class Tree(StructuredNode):
    name = StringProperty(unique_index=True)
    ako = RelationshipTo('Plant', 'AKO')
    have = RelationshipFrom('Fruiter', 'AKO')    # Fruiter -AKO-> Tree

class Grass(StructuredNode):
    name = StringProperty(unique_index=True)
    ako = RelationshipTo('Plant', 'AKO')
    has = RelationshipFrom('Waterweeds', 'AKO')  # Waterweeds -AKO-> Grass

class Leaf(StructuredNode):
    name = StringProperty(unique_index=True)
    have = RelationshipFrom('Plant', 'Have')

class Root(StructuredNode):
    name = StringProperty(unique_index=True)
    have = RelationshipFrom('Plant', 'Have')

class Waterweeds(StructuredNode):
    name = StringProperty(unique_index=True)
    ako = RelationshipTo('Grass', 'AKO')
    live = RelationshipTo('Water', 'Live')       # Waterweeds -Live-> Water

class Water(StructuredNode):
    name = StringProperty(unique_index=True)
    have = RelationshipFrom('Waterweeds', 'Live')

class Fruiter(StructuredNode):
    name = StringProperty(unique_index=True)
    ako = RelationshipTo('Tree', 'AKO')
    can = RelationshipTo('Bear', 'Can')          # Fruiter -Can-> Bear
    have = RelationshipFrom('Pear', 'AKO')       # Pear -AKO-> Fruiter

class Bear(StructuredNode):
    name = StringProperty(unique_index=True)
    have = RelationshipFrom('Fruiter', 'Can')

class Pear(StructuredNode):
    name = StringProperty(unique_index=True)
    ako = RelationshipTo('Fruiter', 'AKO')
    can = RelationshipTo('BearPear', 'Can')      # Pear -Can-> BearPear

class BearPear(StructuredNode):
    name = StringProperty(unique_index=True)
    have = RelationshipFrom('Pear', 'Can')

4. Generate instances based on classes.

# save() writes each node to the database and returns the saved instance.
plant = Plant(name="Plant").save()
tree = Tree(name="tree").save()
grass = Grass(name="grass").save()
leaf = Leaf(name="leaf").save()
root = Root(name="root").save()
waterweeds = Waterweeds(name="Waterweed").save()
water = Water(name="water").save()
fruiter = Fruiter(name="fruit tree").save()
bear = Bear(name="bear fruit").save()
pear = Pear(name="Pear Tree").save()
bearpear = BearPear(name="bear pears").save()

5. Create connection relationships between instances.

pear.ako.connect(fruiter)
pear.can.connect(bearpear)
fruiter.ako.connect(tree)
fruiter.can.connect(bear)
waterweeds.ako.connect(grass)
waterweeds.live.connect(water)
plant.have1.connect(leaf)
plant.have2.connect(root)
tree.ako.connect(plant)
grass.ako.connect(plant)
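The connections above form AKO chains such as pear tree → fruit tree → tree → plant, which is what lets a semantic network say that a pear tree also has leaves and roots without storing that fact on the pear tree itself. The following plain-Python sketch walks such a chain; it does not touch Neo4j, and facts_with_inheritance is a hypothetical helper written only for this illustration.

```python
# AKO ("a kind of") links, child -> parent, matching the connections above.
AKO = {
    "Pear": "Fruiter",
    "Fruiter": "Tree",
    "Tree": "Plant",
    "Waterweeds": "Grass",
    "Grass": "Plant",
}

# Facts stated directly on a node, as (relationship type, target) pairs.
DIRECT = {
    "Plant": {("Have", "Leaf"), ("Have", "Root")},
    "Waterweeds": {("Live", "Water")},
    "Fruiter": {("Can", "Bear")},
    "Pear": {("Can", "BearPear")},
}


def facts_with_inheritance(node):
    """Collect a node's own facts plus those of every AKO ancestor."""
    facts = set()
    while node is not None:
        facts |= DIRECT.get(node, set())
        node = AKO.get(node)  # move one step up the AKO chain
    return facts


print(sorted(facts_with_inheritance("Pear")))
print(sorted(facts_with_inheritance("Waterweeds")))
```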

Experimental results

2. Building the Water Margin knowledge graph

Start neo4j

Run cmd as administrator.
Enter neo4j.bat console at the command line.

If this interface appears, neo4j starts successfully.
Enter the URL http://localhost:7474/ given in the interface into the browser to open the neo4j visual interface.

Open jupyter notebook

1. Enter jupyter notebook:

Enter the "jupyter notebook" command on the command line and press Enter. The browser opens the jupyter notebook workspace page automatically, and we can start building the knowledge graph.

2. Data set download

Dataset: triples.csv
Link: https://pan.baidu.com/s/19vrJ1vkEf2lgchBkF8OALQ?pwd=gn8o
Extraction code: gn8o
Data description: Building a knowledge graph requires processing the data into triples of the form (entity, relation, entity) or (entity, attribute, value). Each triple consists of a subject, a predicate, and an object. There are two main types of triples in a knowledge graph: relationship triples and attribute triples. In a relationship triple, the subject and object are both entities, and the predicate is usually called a relation. In an attribute triple, the subject is an entity and the object is a value, usually a number or a piece of text, and the predicate is usually called an attribute.
Data is divided into unstructured data (such as text, documents, pictures, etc.), semi-structured data (such as log files, XML documents, JSON documents, etc.) and structured data. The data set used in this experiment is structured triple data, so it can be used directly without additional processing.
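The difference between the two kinds of triples can be sketched in a few lines of Python: a triple is a relationship triple when its object is itself an entity, and an attribute triple otherwise. The names and values below are made up for illustration and are not rows from triples.csv; split_triples is a hypothetical helper.

```python
def split_triples(triples, entities):
    """Separate relationship triples (object is an entity) from attribute triples."""
    relation_triples, attribute_triples = [], []
    for subject, predicate, obj in triples:
        if obj in entities:
            relation_triples.append((subject, predicate, obj))
        else:
            attribute_triples.append((subject, predicate, obj))
    return relation_triples, attribute_triples


# Made-up illustrative data.
entities = {"Alice", "Bob"}
triples = [
    ("Alice", "friend_of", "Bob"),   # object is an entity -> relationship triple
    ("Alice", "age", 30),            # object is a value   -> attribute triple
    ("Bob", "nickname", "Bobby"),    # object is a value   -> attribute triple
]

relations, attributes = split_triples(triples, entities)
print(relations)
print(attributes)
```

In a graph database, relationship triples become edges between nodes, while attribute triples become properties stored on a node.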

3. Install third-party libraries

In order to build the knowledge graph smoothly, we need to install some necessary third-party libraries:

  • Install the py2neo library: used to interact with the neo4j graph database.
  • Install pyahocorasick, numpy and pandas libraries: used to process data sets and perform related operations.

!pip install py2neo pyahocorasick numpy pandas

!pip install pytest-cov==2.0

!pip install pytest-filter-subpackage==0.1

!pip install typed-ast==1.4.0

4. Import packages

In jupyter notebook, we need to import the libraries used to build the knowledge graph. Add the following code at the top of the notebook:

import py2neo
from py2neo import Graph, Node, Relationship, NodeMatcher

5. Connect to the graph database

To interact with the neo4j graph database, we first need to connect to it, using the code below.
The auth value holds the username and password you use to log in to neo4j. Use the password you set when you first logged in ("mima" below stands for your own password), otherwise the connection will fail.

g = Graph("neo4j://localhost:7687", auth=("neo4j", "mima"))

6. Build the graph

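import csv

# Read the triples file and write each row into neo4j as two Person nodes and one relationship.
with open(r"D:\PycharmProjects\Knowledge Representation\triples.csv", 'r', encoding='utf-8') as f:
    reader = csv.reader(f)
    for item in reader:
        if reader.line_num == 1:
            continue  # skip the header row
        print('Current line number:', reader.line_num, "Current content:", item)
        start_node = Node("Person", name=item[0])
        end_node = Node("Person", name=item[1])
        # The fourth column is used as the relationship type.
        relation = Relationship(start_node, item[3], end_node)
        # merge() matches on the Person label and the name property,
        # so rerunning the cell does not create duplicate nodes.
        g.merge(start_node, "Person", "name")
        g.merge(end_node, "Person", "name")
        g.merge(relation, "Person", "name")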

This code reads the data set file line by line and converts each row into a triple that is merged into the graph.
While the code runs, the current line number and its content are printed to the output.

The results are shown below:
After running successfully, we can view the constructed knowledge graph in the neo4j visual interface.
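If the import fails partway or the graph looks wrong, it can be useful to dry-run the CSV parsing from step 6 without touching the database. The sketch below applies the same column choices as that code (first column as the start node, second as the end node, fourth as the relationship type) to a few made-up rows; the header names and sample rows are assumptions, and rows_to_triples is a hypothetical helper.

```python
import csv
import io


def rows_to_triples(lines):
    """Yield (start name, relationship type, end name) for each data row."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 4:
            continue  # ignore blank or short rows
        yield row[0], row[3], row[1]


# Made-up rows standing in for triples.csv; the column names are assumed.
sample = io.StringIO(
    "head,tail,relation,label\n"
    "A,B,rel_ab,friend\n"
    "B,C,rel_bc,teacher\n"
)
for triple in rows_to_triples(sample):
    print(triple)
```

To check the real file, pass the opened file object from step 6 instead of the sample.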

Graph visualization results

We can view the constructed Water Margin knowledge graph by querying the relevant nodes and relationships in the neo4j visual interface.
Just enter the corresponding Cypher query in the query box at the top of the interface and run it.
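As a starting point, the sketch below builds two such queries as Python strings: one that can be pasted into the neo4j interface as-is, and a parameterized one meant to be run from the notebook. It assumes the Person label and name property used in step 6; person_relations_query is a hypothetical helper.

```python
# Paste this into the neo4j interface to draw up to 25 relationships.
BROWSER_QUERY = "MATCH (a:Person)-[r]->(b:Person) RETURN a, r, b LIMIT 25"


def person_relations_query(name=None, limit=25):
    """Build a Cypher query and its parameters for Person-to-Person relationships."""
    where = "WHERE a.name = $name " if name is not None else ""
    query = (
        "MATCH (a:Person)-[r]->(b:Person) "
        + where
        + "RETURN a.name AS source, type(r) AS relation, b.name AS target "
        + "LIMIT $limit"
    )
    params = {"limit": int(limit)}
    if name is not None:
        params["name"] = name
    return query, params


query, params = person_relations_query(limit=10)
print(query)
print(params)

# With the py2neo Graph object g from step 5, the query could then be run as:
#     rows = g.run(query, **params).data()
```

Passing the name as a parameter rather than formatting it into the string avoids quoting problems with names that contain special characters.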