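I am currently writing Spark code that uses newAPIHadoopRDD to fetch data from JanusGraph. When fetching the data, I use VertexWritable as the input class for newAPIHadoopRDD. VertexWritable is a class that implements Hadoop's Writable interface. I also came across JanusGraphKryoRegistrator, which is also used for serialization in Spark. My question is whether my code needs to use both VertexWritable and JanusGraphKryoRegistrator, and what the difference between the two is.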
val rdd: RDD[(NullWritable, VertexWritable)] =
  spark.sparkContext.newAPIHadoopRDD(
    hadoopConfiguration,
    hadoopConfiguration
      .getClass(Constants.GREMLIN_HADOOP_GRAPH_READER, classOf[InputFormat[NullWritable, VertexWritable]])
      .asInstanceOf[Class[InputFormat[NullWritable, VertexWritable]]],
    classOf[NullWritable],
    classOf[VertexWritable])
conf.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.setProperty("spark.kryo.registrator", "org.janusgraph.hadoop.serialize.JanusGraphKryoRegistrator")