Copyright notice: this is an original post by the author; please do not repost without permission. Corrections are welcome. https://blog.csdn.net/sun7545526/article/details/90757800
Contents
- 1. Importing JSON into a local TinkerGraph
- 2. Importing CSV into a local TinkerGraph
- 3. Importing JSON into distributed storage (berkeleyje-es)
The code in this article is demonstrated on JanusGraph 0.3.1. All data files used are the ones bundled with the JanusGraph distribution.
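All of the snippets below are intended to be run in the Gremlin console that ships with the JanusGraph distribution. If the Hadoop and Spark console plugins are not already active in your installation, they can be enabled first; a minimal sketch of the session setup (paths relative to the JanusGraph install directory):

```text
$ ./bin/gremlin.sh
gremlin> :plugin use tinkerpop.hadoop
gremlin> :plugin use tinkerpop.spark
```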
1. Importing JSON into a local TinkerGraph
1.1 Configuration
conf/hadoop-graph/hadoop-load-json.properties is configured as follows:
```properties
#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONInputFormat
gremlin.hadoop.graphWriter=org.apache.hadoop.mapreduce.lib.output.NullOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.json
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true

#
# SparkGraphComputer Configuration
#
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
```
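Before running the bulk load itself, the read side can be checked in isolation: the properties file above already describes a complete HadoopGraph, so a plain OLAP count confirms that Spark can read the GraphSON input. This is an optional sanity check, not part of the original walkthrough:

```groovy
// Open the HadoopGraph described by the properties file and count vertices with Spark.
graph = GraphFactory.open('conf/hadoop-graph/hadoop-load-json.properties')
g = graph.traversal().withComputer(SparkGraphComputer)
g.V().count()   // the bundled grateful-dead data should report 808 vertices
```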
1.2 Sample JSON
{"id":1,"label":"song","inE":{"followedBy":[{"id":3059,"outV":153,"properties":{"weight":1}},{"id":276,"outV":5,"properties":{"weight":2}},{"id":3704,"outV":3,"properties":{"weight":2}},{"id":4383,"outV":62,"pr operties":{"weight":1}}]},"outE":{"followedBy":[{"id":0,"inV":2,"properties":{"weight":1}},{"id":1,"inV":3,"properties":{"weight":2}},{"id":2,"inV":4,"properties":{"weight":1}},{"id":3,"inV":5,"properties":{"we ight":1}},{"id":4,"inV":6,"properties":{"weight":1}}],"sungBy":[{"id":7612,"inV":340}],"writtenBy":[{"id":7611,"inV":527}]},"properties":{"name":[{"id":0,"value":"HEY BO DIDDLEY"}],"songType":[{"id":2,"value":" cover"}],"performances":[{"id":1,"value":5}]}} {"id":2,"label":"song","inE":{"followedBy":[{"id":0,"outV":1,"properties":{"weight":1}},{"id":323,"outV":34,"properties":{"weight":1}}]},"outE":{"followedBy":[{"id":6190,"inV":123,"properties":{"weight":1}},{"i d":6191,"inV":50,"properties":{"weight":1}}],"sungBy":[{"id":7666,"inV":525}],"writtenBy":[{"id":7665,"inV":525}]},"properties":{"name":[{"id":3,"value":"IM A MAN"}],"songType":[{"id":5,"value":"cover"}],"perfo rmances":[{"id":4,"value":1}]}} s
1.3 Code
```groovy
readGraph = GraphFactory.open('conf/hadoop-graph/hadoop-load-json.properties')

writeGraphConf = new BaseConfiguration()
writeGraphConf.setProperty("gremlin.graph", "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph")
writeGraphConf.setProperty("gremlin.tinkergraph.graphFormat", "gryo")
writeGraphConf.setProperty("gremlin.tinkergraph.graphLocation", "/tmp/csv-graph.kryo")

blvp = BulkLoaderVertexProgram.build().bulkLoader(OneTimeBulkLoader).writeGraph(writeGraphConf).create(readGraph)
readGraph.compute(SparkGraphComputer).workers(1).program(blvp).submit().get()
```
1.4 Verifying the output file
The newly generated file:
```shell
[root@vm03 data]# ls -l /tmp/csv-graph.kryo
-rw-r--r--. 1 root root 726353 May 29 04:09 /tmp/csv-graph.kryo
```
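Beyond checking that the file exists, the generated TinkerGraph can be reopened with the same writeGraphConf used above and inspected directly (a quick spot check, assuming the same console session):

```groovy
// Reopen the kryo-backed TinkerGraph and look at a few vertices.
g = GraphFactory.open(writeGraphConf).traversal()
g.V().count()
g.V().limit(3).valueMap(true)
```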
2. Importing CSV into a local TinkerGraph
2.1 Configuration
conf/hadoop-graph/hadoop-load-csv.properties is configured as follows:
```properties
#
# Hadoop Graph Configuration
#
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.script.ScriptInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.graphson.GraphSONOutputFormat
gremlin.hadoop.inputLocation=./data/grateful-dead.txt
gremlin.hadoop.outputLocation=output
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.scriptInputFormat.script=./data/script-input-grateful-dead.groovy

#
# SparkGraphComputer Configuration
#
spark.master=local[*]
spark.executor.memory=1g
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
```
2.2 Sample CSV
Each line of grateful-dead.txt describes one vertex: the vertex's own CSV fields, then its outgoing edges, then its incoming edges. In the file the three parts are tab-separated, individual edges are separated by |, and edge fields by commas; the parse script in 2.3 splits on exactly these delimiters.
```text
1,song,HEY BO DIDDLEY,cover,5 followedBy,2,1|followedBy,3,2|followedBy,4,1|followedBy,5,1|followedBy,6,1|sungBy,340|writtenBy,527 followedBy,3,2|followedBy,5,2|followedBy,62,1|followedBy,153,1
2,song,IM A MAN,cover,1 followedBy,50,1|followedBy,123,1|sungBy,525|writtenBy,525 followedBy,1,1|followedBy,34,1
3,song,NOT FADE AWAY,cover,531 followedBy,81,1|followedBy,86,5|followedBy,127,10|followedBy,59,1|followedBy,83,3|followedBy,103,2|followedBy,68,1|followedBy,134,2|followedBy,131,1|followedBy,151,1|followedBy,3
```
2.3 Code
script-input-grateful-dead.groovy is as follows:
```groovy
def parse(line) {
    // each line: <vertex CSV> TAB <outgoing edges> TAB <incoming edges>
    def (vertex, outEdges, inEdges) = line.split(/\t/, 3)
    def (v1id, v1label, v1props) = vertex.split(/,/, 3)
    def v1 = graph.addVertex(T.id, v1id.toInteger(), T.label, v1label)
    switch (v1label) {
        case "song":
            def (name, songType, performances) = v1props.split(/,/)
            v1.property("name", name)
            v1.property("songType", songType)
            v1.property("performances", performances.toInteger())
            break
        case "artist":
            v1.property("name", v1props)
            break
        default:
            throw new Exception("Unexpected vertex label: ${v1label}")
    }
    // edges are '|'-separated; each edge is "label,otherVertexId[,weight]"
    [[outEdges, true], [inEdges, false]].each { def edges, def out ->
        edges.split(/\|/).grep().each { def edge ->
            def parts = edge.split(/,/)
            def otherV, eLabel, weight = null
            if (parts.size() == 2) {
                (eLabel, otherV) = parts
            } else {
                (eLabel, otherV, weight) = parts
            }
            def v2 = graph.addVertex(T.id, otherV.toInteger())
            def e = out ? v1.addOutEdge(eLabel, v2) : v1.addInEdge(eLabel, v2)
            if (weight != null) e.property("weight", weight.toInteger())
        }
    }
    return v1
}
```
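To get a feel for the field layout that parse() expects, the individual splits can be tried on a trimmed-down line first; the line below is a hypothetical, shortened version of the first sample vertex, and the snippet is plain Groovy with no graph involved:

```groovy
// Tab-separated parts: vertex CSV, outgoing edges, incoming edges;
// edges are '|'-separated, edge fields ','-separated.
line = "1,song,HEY BO DIDDLEY,cover,5\tfollowedBy,2,1|sungBy,340\tfollowedBy,3,2"
def (vertex, outEdges, inEdges) = line.split(/\t/, 3)
assert vertex == '1,song,HEY BO DIDDLEY,cover,5'
assert outEdges.split(/\|/).toList() == ['followedBy,2,1', 'sungBy,340']
assert inEdges == 'followedBy,3,2'
```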
The JanusGraph (Gremlin console) code:
```groovy
readGraph = GraphFactory.open('conf/hadoop-graph/hadoop-load-csv.properties')

writeGraphConf = new BaseConfiguration()
writeGraphConf.setProperty("gremlin.graph", "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph")
writeGraphConf.setProperty("gremlin.tinkergraph.graphFormat", "gryo")
writeGraphConf.setProperty("gremlin.tinkergraph.graphLocation", "/tmp/csv-graph2.kryo")

blvp = BulkLoaderVertexProgram.build().bulkLoader(OneTimeBulkLoader).writeGraph(writeGraphConf).create(readGraph)
readGraph.compute(SparkGraphComputer).workers(1).program(blvp).submit().get()

g = GraphFactory.open(writeGraphConf).traversal()
g.V().valueMap(true)
```
2.4 Verifying the output file
The newly generated file:
```shell
[root@vm03 data]# ls -l /tmp/csv-graph2.kryo
-rw-r--r--. 1 root root 339939 May 29 04:56 /tmp/csv-graph2.kryo
```
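As in section 1, the resulting graph can also be queried directly; the traversals below are a small spot check based on the sample data shown earlier (run in the same console session, so writeGraphConf still points at /tmp/csv-graph2.kryo):

```groovy
// Reopen the loaded TinkerGraph and follow a few edges from a known vertex.
g = GraphFactory.open(writeGraphConf).traversal()
g.V().count()
g.V().has('song', 'name', 'HEY BO DIDDLEY').out('followedBy').values('name')
```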
3. Importing JSON into distributed storage (berkeleyje-es)
3.1 Configuration
conf/hadoop-graph/hadoop-load-json-ber-es.properties is configured as follows:
The contents are identical to conf/hadoop-graph/hadoop-load-json.properties shown in section 1.1: the same GraphSONInputFormat reader, NullOutputFormat writer, ./data/grateful-dead.json input, and local Spark settings.
./conf/janusgraph-berkeleyje-es-bulkload.properties is configured as follows:
```properties
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=berkeleyje
storage.directory=../db/berkeley
index.search.backend=elasticsearch
```
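The index.search.backend entry only comes into play if the graph actually defines mixed indexes; the walkthrough below does not, so Elasticsearch is configured but essentially idle. As a hedged sketch (the index name vertexByName is made up for illustration), a mixed index could be registered before loading so that the search backend is exercised:

```groovy
// Hypothetical schema step, run before the bulk load; 'search' matches index.search.backend.
graph = JanusGraphFactory.open('./conf/janusgraph-berkeleyje-es-bulkload.properties')
mgmt  = graph.openManagement()
name  = mgmt.makePropertyKey('name').dataType(String.class).make()
mgmt.buildIndex('vertexByName', Vertex.class).addKey(name).buildMixedIndex('search')
mgmt.commit()
graph.close()
```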
3.2 Sample JSON
{"id":1,"label":"song","inE":{"followedBy":[{"id":3059,"outV":153,"properties":{"weight":1}},{"id":276,"outV":5,"properties":{"weight":2}},{"id":3704,"outV":3,"properties":{"weight":2}},{"id":4383,"outV":62,"pr operties":{"weight":1}}]},"outE":{"followedBy":[{"id":0,"inV":2,"properties":{"weight":1}},{"id":1,"inV":3,"properties":{"weight":2}},{"id":2,"inV":4,"properties":{"weight":1}},{"id":3,"inV":5,"properties":{"we ight":1}},{"id":4,"inV":6,"properties":{"weight":1}}],"sungBy":[{"id":7612,"inV":340}],"writtenBy":[{"id":7611,"inV":527}]},"properties":{"name":[{"id":0,"value":"HEY BO DIDDLEY"}],"songType":[{"id":2,"value":" cover"}],"performances":[{"id":1,"value":5}]}} {"id":2,"label":"song","inE":{"followedBy":[{"id":0,"outV":1,"properties":{"weight":1}},{"id":323,"outV":34,"properties":{"weight":1}}]},"outE":{"followedBy":[{"id":6190,"inV":123,"properties":{"weight":1}},{"i d":6191,"inV":50,"properties":{"weight":1}}],"sungBy":[{"id":7666,"inV":525}],"writtenBy":[{"id":7665,"inV":525}]},"properties":{"name":[{"id":3,"value":"IM A MAN"}],"songType":[{"id":5,"value":"cover"}],"perfo rmances":[{"id":4,"value":1}]}} s
3.3 Code
```groovy
outputGraphConfig = './conf/janusgraph-berkeleyje-es-bulkload.properties'

readGraph = GraphFactory.open('conf/hadoop-graph/hadoop-load-json-ber-es.properties')

// no .bulkLoader(...) here, so BulkLoaderVertexProgram uses its default IncrementalBulkLoader
// (sections 1 and 2 used OneTimeBulkLoader instead)
blvp = BulkLoaderVertexProgram.build().writeGraph(outputGraphConfig).create(readGraph)
readGraph.compute(SparkGraphComputer).workers(1).program(blvp).submit().get()

g = GraphFactory.open(outputGraphConfig).traversal()
g.V().valueMap(true)
```
3.4 Verification
The load is verified by standing up a Gremlin Server in front of the newly created graph:
- The Gremlin Server configuration file (gremlin-server-berkeleyje-bulkload.yaml) is essentially the same as gremlin-server-berkeleyje.yaml, with the graph entry pointed at the bulk-load properties file:
```yaml
graph: conf/janusgraph-berkeleyje-es-bulkload.properties
```
- Start the server with ./gremlin-server.sh conf/gremlin-server/gremlin-server-berkeleyje-bulkload.yaml
- Query the loaded graph through graphexp (or from a Gremlin console, as sketched below)
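For a quick check without graphexp, a second Gremlin console can also talk to the running server over its remote API. This assumes the server yaml binds a traversal source g and that conf/remote.yaml points at this server; adjust both if your setup differs:

```text
gremlin> :remote connect tinkerpop.server conf/remote.yaml
gremlin> :> g.V().has('name', 'HEY BO DIDDLEY').valueMap(true)
```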