Initializing large amounts of data into Neo4j with Java (Part 1)

Background: Our project is deploying a graph database for the first time, so the existing business data and relationships must be loaded into Neo4j when the system goes live. The data in the development environment already runs into the millions of records, and the production environment holds even more.

When I started development I did not know much about Neo4j, so my first idea was to assemble CREATE statements in code, in a generic way, to create the nodes and relationships.

Business description: The system has many entity tables, each holding its own data, and the relationships between different entities are maintained in a relationship table.

My development idea was: 1. Read all the data from each table and use it as nodes. 2. Look up the relationships for this data in the relationship table, then assemble the statements and add the data to Neo4j.

The specific code is as follows (Spring Boot project, version 2.2.5.RELEASE):
pom.xml

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-neo4j</artifactId>
</dependency>

The configuration file is set up as follows:

spring:
   data:
      neo4j:
         uri: bolt://localhost:7687
         username: neo4j
         password: neo4j

Assemble the Cypher (CQL) statements in Java code and run them through a native session.
Neo4jConfig.java

import org.neo4j.ogm.session.SessionFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.neo4j.transaction.Neo4jTransactionManager;

@Configuration
public class Neo4jConfig {

    @Value("${spring.data.neo4j.uri}")
    private String uri;
    @Value("${spring.data.neo4j.username}")
    private String userName;
    @Value("${spring.data.neo4j.password}")
    private String password;

    // The OGM Configuration is written fully qualified because its simple name
    // clashes with Spring's @Configuration annotation.
    @Bean
    public org.neo4j.ogm.config.Configuration getConfiguration() {
        return new org.neo4j.ogm.config.Configuration.Builder()
                .uri(uri)
                .connectionPoolSize(100)
                .credentials(userName, password)
                .withBasePackages("com.troy.keeper.desc.repository")
                .build();
    }

    @Bean
    public SessionFactory sessionFactory() {
        return new SessionFactory(getConfiguration());
    }

    @Bean("neo4jTransaction")
    public Neo4jTransactionManager neo4jTransactionManager(SessionFactory sessionFactory) {
        return new Neo4jTransactionManager(sessionFactory);
    }
}

Interface entry point: Controller.java

@GetMapping("initDataToNeo4j")
public void initDataToNeo4j() {
    service.initDataToNeo4j();
}

Service.java

// Node data, assembled according to your own business. Here it holds the data of
// every table: all tables in my business have basically the same structure, so
// every node has the same properties. The map key is the table name, which
// becomes the node label, and the value is the list of rows from that table.
Map<String, List<NodeData>> nodeDataMap;
// Relationship data: each relationship between table rows becomes one RelationData entity
List<RelationData> relationDatas;

// Once the data has been assembled, create the nodes
neo4jUtil.creatNode(nodeDataMap);

// Then bind the relationships
neo4jUtil.bindRelation(relationDatas);

NodeData.java

private String id;    // id property
private String name;  // name property
private String table; // source table, used as the node label
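
The map of table name to node rows described in the Service comment can be built by grouping rows on their table field. The following is only a minimal sketch of that grouping step: `NodeRow` and `NodeGrouping` are hypothetical names, with `NodeRow` a simplified stand-in for NodeData, and how the rows are read from the business tables is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for NodeData (same three fields).
class NodeRow {
    final String id;
    final String name;
    final String table;

    NodeRow(String id, String name, String table) {
        this.id = id;
        this.name = name;
        this.table = table;
    }
}

class NodeGrouping {
    // Groups rows by table name; each key later becomes a node label.
    // LinkedHashMap keeps the tables in the order they were first seen.
    static Map<String, List<NodeRow>> groupByTable(List<NodeRow> rows) {
        Map<String, List<NodeRow>> byTable = new LinkedHashMap<>();
        for (NodeRow row : rows) {
            byTable.computeIfAbsent(row.table, k -> new ArrayList<>()).add(row);
        }
        return byTable;
    }
}
```
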

RelationData.java

// Relationship id
private String id;
// Relationship name
private String relationName;
// The relationships here span entities, so the label of the end node must be given
private String endLableName;

// The relationships here span entities, so the label of the start node must be given
private String startLableName;

// id value of the start node
private String startValue;

// id value of the end node
private String endWhereValue;

Neo4jUtil.java

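import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.annotation.Resource;
import org.neo4j.ogm.model.Result;
import org.neo4j.ogm.session.Session;
import org.springframework.stereotype.Component;
// Assumption: StringUtils is Spring's; the original does not show its import.
import org.springframework.util.StringUtils;

@Component
public class Neo4jUtil {

    @Resource
    private Session session;

    /**
     * Deletes all nodes with the given label, together with their relationships.
     *
     * @param lableName the node label
     * @return the number of nodes deleted
     */
    public Integer deleteByLable(String lableName) {
        if (StringUtils.isEmpty(lableName)) {
            return 0;
        }

        String cypherSql = String.format("MATCH (r:`%s`) DETACH DELETE r ", lableName);
        Result query = session.query(cypherSql, new HashMap<>(16));
        session.clear();
        return query.queryStatistics().getNodesDeleted();
    }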

    // Create nodes
    public void creatNode(Map<String, List<NodeData>> nodeDataMap) {
        if (nodeDataMap == null) {
            return;
        }

        for (Map.Entry<String, List<NodeData>> entry : nodeDataMap.entrySet()) {
            String key = entry.getKey();
            List<NodeData> data = entry.getValue();
            if (StringUtils.isEmpty(key)) {
                continue;
            }

            // If the table has no data, create a single node with no properties.
            if (data == null || data.isEmpty()) {
                String sql = String.format("create (:`%s`)", key);
                session.query(sql, new HashMap<>(16));
                continue;
            }
            // This is a full import, so first delete all nodes and relationships
            // under this label; adjust this to your own business requirements.
            deleteByLable(key);
            // Backticks around the label keep Chinese characters and special symbols valid.
            String label = ":`" + key + "`";
            for (NodeData nodeData : data) {
                String id = nodeData.getId();
                String name = nodeData.getName();
                // Values are inlined into the statement text, so an id or name
                // containing a single quote will break the statement.
                String property = String.format("{id:'%s',name:'%s'} ", id, name);

                String sql = String.format("create (%s%s)", label, property);
                session.query(sql, new HashMap<>(16));
            }
        }
    }

    // Bind relationships
    public void bindRelation(List<RelationData> relations) {
        if (relations == null) {
            return;
        }
        for (RelationData relation : relations) {
            String id = relation.getId();
            String relationName = relation.getRelationName();
            String startLableName = relation.getStartLableName();
            String endLableName = relation.getEndLableName();
            String startValue = relation.getStartValue();
            // RelationData declares this field as endWhereValue
            String endValue = relation.getEndWhereValue();

            String property = String.format("{id:'%s',name:'%s'} ", id, relationName);
            // One MATCH ... CREATE statement is sent per relationship
            String cypherSql = String.format("MATCH (n:`%s`),(m:`%s`) where n.id ='%s' and m.id= '%s' CREATE (n)-[r:`%s` %s]->(m)",
                    startLableName, endLableName, startValue, endValue, relationName, property);
            session.query(cypherSql, new HashMap<>(16));
        }
    }
}
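
Because the code above formats values directly into the statement text, an id or name containing a single quote produces invalid Cypher. One possible variation, shown only as a sketch and not as what the project actually ran, is to keep the label inlined (Cypher does not accept labels as parameters) and pass the values through the parameter map that session.query already takes. `ParamCypher` and its methods are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the statement text and the parameter map separately.
class ParamCypher {
    // Labels cannot be parameters, so the label is still inlined;
    // doubling any backtick keeps it inside the quoted identifier.
    static String quoteLabel(String label) {
        return "`" + label.replace("`", "``") + "`";
    }

    // Statement text with $id and $name placeholders instead of inlined values.
    static String createNodeStatement(String label) {
        return "CREATE (n:" + quoteLabel(label) + " {id: $id, name: $name})";
    }

    // Parameter map intended as the second argument of session.query(...).
    static Map<String, Object> nodeParams(String id, String name) {
        Map<String, Object> params = new HashMap<>();
        params.put("id", id);
        params.put("name", name);
        return params;
    }
}
```

Inside creatNode this would be called as session.query(ParamCypher.createNodeStatement(key), ParamCypher.nodeParams(id, name)). It still sends one statement per node, so it addresses quoting rather than the import time.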

Then call the controller endpoint to extract the data and import it into Neo4j. The environment I developed against has about 70,000 nodes and 1.2 million relationships. The import took more than two hours against a local Neo4j, and 8 hours against a Neo4j deployed on a server in another region.

That is far too slow.
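
To put the timings above in perspective, the small calculation below converts them into an average cost per statement, assuming one statement per node and per relationship as in the code above and using the rounded figures from this article (2 hours is a lower bound for the local run). It gives roughly 5.7 ms per statement locally and roughly 22.7 ms across regions. `ImportThroughput` is a hypothetical name used only for this illustration.

```java
class ImportThroughput {
    // Average milliseconds spent per statement for a run of the given length.
    static double millisPerStatement(long statements, double hours) {
        return hours * 3600.0 * 1000.0 / statements;
    }

    public static void main(String[] args) {
        // One CREATE per node plus one MATCH ... CREATE per relationship
        long statements = 70_000L + 1_200_000L;
        System.out.printf("local:  %.1f ms per statement%n", millisPerStatement(statements, 2));
        System.out.printf("remote: %.1f ms per statement%n", millisPerStatement(statements, 8));
    }
}
```

The per-statement cost is about four times higher when the database is in another region, which suggests that the round trip for each individual statement accounts for much of the time. That is an inference from these two figures, not something that was measured separately.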

Later I read that CREATE is only suitable when the amount of data is small, and that neo4j-admin import can be used for importing large amounts of data. The next article switches to neo4j-admin import; see Part 2 of this series.