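Copyright notice: this is an original article by the author and may not be reproduced without the author's permission. Corrections are welcome. https://blog.csdn.net/sun7545526/article/details/90757800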
Contents
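- 1. Importing JSON into a local TinkerGraph
- 2. Importing CSV into a local TinkerGraph
- 3. Importing JSON into distributed storage (berkeleyje-es)
The code in this article is demonstrated with JanusGraph 0.3.1. All data files are the sample files shipped with the JanusGraph distribution.
1. Importing JSON into a local TinkerGraph
1.1 Configuration
conf/hadoop-graph/hadoop-load-json.properties is configured as follows: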
#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.json
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true

#
# SparkGraphComputer Configuration
#
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
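Before submitting the Spark job, it can help to confirm that the properties file contains the keys the load depends on. The following is a minimal sketch in Python (rather than Groovy, so it runs without the JanusGraph distribution). The parser only handles `key=value` lines and `#` comments, and the `REQUIRED_KEYS` list is an assumption chosen for illustration, not a list defined by TinkerPop or JanusGraph.

```python
# Hypothetical pre-flight check for a Hadoop-Gremlin properties file.
# REQUIRED_KEYS is an illustrative assumption, not an official list.
REQUIRED_KEYS = [
    "gremlin.graph",
    "gremlin.hadoop.graphReader",
    "gremlin.hadoop.graphWriter",
    "gremlin.hadoop.inputLocation",
    "spark.master",
]

# Abbreviated copy of the configuration shown above.
SAMPLE_PROPERTIES = """
#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.json
gremlin.hadoop.outputLocation=output
#
# SparkGraphComputer Configuration
#
spark.master=local[*]
spark.executor.memory=1g
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_keys(props):
    """Return the required keys that are absent, in REQUIRED_KEYS order."""
    return [key for key in REQUIRED_KEYS if key not in props]


props = parse_properties(SAMPLE_PROPERTIES)
print("missing keys:", missing_keys(props))
```

To check a real file, read it with `open(path).read()` and pass the text to `parse_properties`.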
1.2 Sample JSON
{"id":1,"label":"song","inE":{"followedBy":[{"id":3059,"outV":153,"properties":{"weight":1}},{"id":276,"outV":5,"properties":{"weight":2}},{"id":3704,"outV":3,"properties":{"weight":2}},{"id":4383,"outV":62,"properties":{"weight":1}}]},"outE":{"followedBy":[{"id":0,"inV":2,"properties":{"weight":1}},{"id":1,"inV":3,"properties":{"weight":2}},{"id":2,"inV":4,"properties":{"weight":1}},{"id":3,"inV":5,"properties":{"weight":1}},{"id":4,"inV":6,"properties":{"weight":1}}],"sungBy":[{"id":7612,"inV":340}],"writtenBy":[{"id":7611,"inV":527}]},"properties":{"name":[{"id":0,"value":"HEY BO DIDDLEY"}],"songType":[{"id":2,"value":"cover"}],"performances":[{"id":1,"value":5}]}}
{"id":2,"label":"song","inE":{"followedBy":[{"id":0,"outV":1,"properties":{"weight":1}},{"id":323,"outV":34,"properties":{"weight":1}}]},"outE":{"followedBy":[{"id":6190,"inV":123,"properties":{"weight":1}},{"id":6191,"inV":50,"properties":{"weight":1}}],"sungBy":[{"id":7666,"inV":525}],"writtenBy":[{"id":7665,"inV":525}]},"properties":{"name":[{"id":3,"value":"IM A MAN"}],"songType":[{"id":5,"value":"cover"}],"performances":[{"id":4,"value":1}]}}
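Each line of the sample is one JSON object describing a vertex together with its incident edges: `inE` and `outE` group edges by label, and `properties` maps each key to a list of `{"id", "value"}` entries. The sketch below reads one such line in Python (rather than Groovy, so it runs without the JanusGraph distribution). The record is a shortened version of the first sample line, and `summarize_vertex` is a hypothetical helper name, not part of TinkerPop.

```python
import json

# One record in the layout shown above, shortened for illustration
# (fewer edges than the real first line of the data file).
SAMPLE_LINE = (
    '{"id":1,"label":"song",'
    '"inE":{"followedBy":[{"id":3059,"outV":153,"properties":{"weight":1}},'
    '{"id":276,"outV":5,"properties":{"weight":2}}]},'
    '"outE":{"followedBy":[{"id":0,"inV":2,"properties":{"weight":1}}],'
    '"sungBy":[{"id":7612,"inV":340}],'
    '"writtenBy":[{"id":7611,"inV":527}]},'
    '"properties":{"name":[{"id":0,"value":"HEY BO DIDDLEY"}],'
    '"songType":[{"id":2,"value":"cover"}],'
    '"performances":[{"id":1,"value":5}]}}'
)


def summarize_vertex(line):
    """Parse one line (one vertex plus its incident edges) into a summary dict."""
    record = json.loads(line)
    # Each property key maps to a list of {"id", "value"} entries.
    props = {
        key: [entry["value"] for entry in entries]
        for key, entries in record.get("properties", {}).items()
    }
    # Edges are grouped by label under "outE" and "inE"; count them per label.
    out_counts = {label: len(edges) for label, edges in record.get("outE", {}).items()}
    in_counts = {label: len(edges) for label, edges in record.get("inE", {}).items()}
    return {
        "id": record["id"],
        "label": record["label"],
        "properties": props,
        "out_edges": out_counts,
        "in_edges": in_counts,
    }


summary = summarize_vertex(SAMPLE_LINE)
print(summary)
```

Running the same function over every line of the data file gives a quick way to spot malformed records before starting the bulk load.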
1.3 Code
// Source graph: a HadoopGraph that reads the GraphSON file
readGraph = GraphFactory.open('conf/hadoop-graph/hadoop-load-json.properties')

// Target graph: a TinkerGraph stored in Gryo format at graphLocation
writeGraphConf = new BaseConfiguration()
writeGraphConf.setProperty("gremlin.graph", "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph")
writeGraphConf.setProperty("gremlin.tinkergraph.graphFormat", "gryo")
writeGraphConf.setProperty("gremlin.tinkergraph.graphLocation", "/tmp/csv-graph.kryo")

// Build the bulk loader program and run it on Spark
blvp = BulkLoaderVertexProgram.build().bulkLoader(OneTimeBulkLoader).writeGraph(writeGraphConf).create(readGraph)
readGraph.compute(SparkGraphComputer).workers(1).program(blvp).submit().get()
1.4 Verifying the output file
The newly generated file is shown below:
[root@vm03 data]# ls -l /tmp/csv-graph.kryo
-rw-r--r--. 1 root root 726353 May 29 04:09 /tmp/csv-graph.kryo
2. Importing CSV into a local TinkerGraph
2.1 Configuration
conf/hadoop-graph/hadoop-load-csv.properties is configured as follows:
#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.script.ScriptInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.txt
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.scriptInputFormat.script=./data/script-input-grateful-dead.groovy

#
# SparkGraphComputer Configuration
#
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
2.2 Sample CSV
In the data file, each line holds three tab-separated fields: the vertex (id, label, properties), its outgoing edges, and its incoming edges. Edges within a field are separated by |. The third record below is cut off in the source.
1,song,HEY BO DIDDLEY,cover,5	followedBy,2,1|followedBy,3,2|followedBy,4,1|followedBy,5,1|followedBy,6,1|sungBy,340|writtenBy,527	followedBy,3,2|followedBy,5,2|followedBy,62,1|followedBy,153,1
2,song,IM A MAN,cover,1	followedBy,50,1|followedBy,123,1|sungBy,525|writtenBy,525	followedBy,1,1|followedBy,34,1
3,song,NOT FADE AWAY,cover,531	followedBy,81,1|followedBy,86,5|followedBy,127,10|followedBy,59,1|followedBy,83,3|followedBy,103,2|followedBy,68,1|followedBy,134,2|followedBy,131,1|followedBy,151,1|followedBy,3
2.3 Code
The code of script-input-grateful-dead.groovy is as follows:
def parse(line) {
    def (vertex, outEdges, inEdges) = line.split(/\t/, 3)
    def (v1id, v1label, v1props) = vertex.split(/,/, 3)
    def v1 = graph.addVertex(T.id, v1id.toInteger(), T.label, v1label)
    switch (v1label) {
        case "song":
            def (name, songType, performances) = v1props.split(/,/)
            v1.property("name", name)
            v1.property("songType", songType)
            v1.property("performances", performances.toInteger())
            break
        case "artist":
            v1.property("name", v1props)
            break
        default:
            throw new Exception("Unexpected vertex label: ${v1label}")
    }
    [[outEdges, true], [inEdges, false]].each { def edges, def out ->
        edges.split(/\|/).grep().each { def edge ->
            def parts = edge.split(/,/)
            def otherV, eLabel, weight = null
            if (parts.size() == 2) {
                (eLabel, otherV) = parts
            } else {
                (eLabel, otherV, weight) = parts
            }
            def v2 = graph.addVertex(T.id, otherV.toInteger())
            def e = out ? v1.addOutEdge(eLabel, v2) : v1.addInEdge(eLabel, v2)
            if (weight != null) e.property("weight", weight.toInteger())
        }
    }
    return v1
}
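To see what the Groovy parse script extracts from one line without starting a Spark job, the same splitting logic can be mirrored in plain Python. This is only an illustrative sketch: it returns dictionaries and tuples instead of building vertices and edges, and `parse_line` is a hypothetical name. The sample line is the second record of the sample data, with fields assumed to be tab-separated as the Groovy script expects.

```python
# Second record of the sample data: vertex, out-edges, in-edges (tab-separated).
SAMPLE_LINE = (
    "2,song,IM A MAN,cover,1"
    "\tfollowedBy,50,1|followedBy,123,1|sungBy,525|writtenBy,525"
    "\tfollowedBy,1,1|followedBy,34,1"
)


def parse_line(line):
    """Mirror the Groovy script's splitting logic, returning plain data."""
    vertex, out_edges, in_edges = line.rstrip("\n").split("\t", 2)
    v_id, v_label, v_props = vertex.split(",", 2)

    if v_label == "song":
        name, song_type, performances = v_props.split(",")
        properties = {
            "name": name,
            "songType": song_type,
            "performances": int(performances),
        }
    elif v_label == "artist":
        properties = {"name": v_props}
    else:
        raise ValueError(f"Unexpected vertex label: {v_label}")

    edges = []
    for field, direction in ((out_edges, "out"), (in_edges, "in")):
        for edge in field.split("|"):
            if not edge:  # skip empty entries, like grep() in the Groovy script
                continue
            parts = edge.split(",")
            if len(parts) == 2:
                label, other = parts
                weight = None
            else:
                label, other, weight = parts
            edges.append(
                (direction, label, int(other), None if weight is None else int(weight))
            )

    return {"id": int(v_id), "label": v_label, "properties": properties, "edges": edges}


parsed = parse_line(SAMPLE_LINE)
print(parsed)
```

As in the Groovy script, a song name that itself contains a comma would break the property split, so the sketch inherits that limitation.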
JanusGraph code:
readGraph = GraphFactory.open('conf/hadoop-graph/hadoop-load-csv.properties')
writeGraphConf = new BaseConfiguration()
writeGraphConf.setProperty("gremlin.graph", "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph")
writeGraphConf.setProperty("gremlin.tinkergraph.graphFormat", "gryo")
writeGraphConf.setProperty("gremlin.tinkergraph.graphLocation", "/tmp/csv-graph2.kryo")
blvp = BulkLoaderVertexProgram.build().bulkLoader(OneTimeBulkLoader).writeGraph(writeGraphConf).create(readGraph)
readGraph.compute(SparkGraphComputer).workers(1).program(blvp).submit().get()
g = GraphFactory.open(writeGraphConf).traversal()
g.V().valueMap(true)
2.4 Verifying the output file
The newly generated file is shown below:
[root@vm03 data]# ls -l /tmp/csv-graph2.kryo
-rw-r--r--. 1 root root 339939 May 29 04:56 /tmp/csv-graph2.kryo
3. Importing JSON into distributed storage (berkeleyje-es)
3.1 Configuration
conf/hadoop-graph/hadoop-load-json-ber-es.properties is configured as follows:
#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.json
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true

#
# SparkGraphComputer Configuration
#
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
./conf/janusgraph-berkeleyje-es-bulkload.properties is configured as follows:
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=berkeleyje
storage.directory=../db/berkeley
index.search.backend=elasticsearch
3.2 Sample JSON
{"id":1,"label":"song","inE":{"followedBy":[{"id":3059,"outV":153,"properties":{"weight":1}},{"id":276,"outV":5,"properties":{"weight":2}},{"id":3704,"outV":3,"properties":{"weight":2}},{"id":4383,"outV":62,"properties":{"weight":1}}]},"outE":{"followedBy":[{"id":0,"inV":2,"properties":{"weight":1}},{"id":1,"inV":3,"properties":{"weight":2}},{"id":2,"inV":4,"properties":{"weight":1}},{"id":3,"inV":5,"properties":{"weight":1}},{"id":4,"inV":6,"properties":{"weight":1}}],"sungBy":[{"id":7612,"inV":340}],"writtenBy":[{"id":7611,"inV":527}]},"properties":{"name":[{"id":0,"value":"HEY BO DIDDLEY"}],"songType":[{"id":2,"value":"cover"}],"performances":[{"id":1,"value":5}]}}
{"id":2,"label":"song","inE":{"followedBy":[{"id":0,"outV":1,"properties":{"weight":1}},{"id":323,"outV":34,"properties":{"weight":1}}]},"outE":{"followedBy":[{"id":6190,"inV":123,"properties":{"weight":1}},{"id":6191,"inV":50,"properties":{"weight":1}}],"sungBy":[{"id":7666,"inV":525}],"writtenBy":[{"id":7665,"inV":525}]},"properties":{"name":[{"id":3,"value":"IM A MAN"}],"songType":[{"id":5,"value":"cover"}],"performances":[{"id":4,"value":1}]}}
3.3 Code
outputGraphConfig = './conf/janusgraph-berkeleyje-es-bulkload.properties'
readGraph = GraphFactory.open('conf/hadoop-graph/hadoop-load-json-ber-es.properties')
blvp = BulkLoaderVertexProgram.build().writeGraph(outputGraphConfig).create(readGraph)
readGraph.compute(SparkGraphComputer).workers(1).program(blvp).submit().get()
g = GraphFactory.open(outputGraphConfig).traversal()
g.V().valueMap(true)
3.4 Verification
Verify the import by serving the graph through gremlin-server:
- Create the gremlin-server configuration file gremlin-server-berkeleyje-bulkload.yaml. It is similar to gremlin-server-berkeleyje.yaml; adjust the following entry:
graph: conf/janusgraph-berkeleyje-es-bulkload.properties
- Start the service with ./gremlin-server.sh conf/gremlin-server/gremlin-server-berkeleyje-bulkload.yaml
- Query the graph with graphexp