GraphFrames: py4j.protocol.Py4JJavaError: An error occurred while calling o100.createGraph

Posted by vawmfj5a on 2021-05-27 in Spark

I am running a simple EMR cluster with Spark 2.4.4, and I want to run the following code using GraphFrames v0.7:

from pyspark import SparkContext
from pyspark.sql import SparkSession
from graphframes import GraphFrame

sc = SparkContext.getOrCreate()
sc.setLogLevel("ERROR")
spark = SparkSession.builder.appName('graphFrames').getOrCreate()
spark.sparkContext.addPyFile("/home/hadoop/jars/graphframes.zip")

vertices = spark.createDataFrame([('1', 'Carter', 'Derrick', 50),
                                  ('2', 'May', 'Derrick', 26),
                                  ('3', 'Mills', 'Jeff', 80),
                                  ('4', 'Hood', 'Robert', 65),
                                  ('5', 'Banks', 'Mike', 93),
                                  ('98', 'Berg', 'Tim', 28),
                                  ('99', 'Page', 'Allan', 16)],
                                 ['id', 'name', 'firstname', 'age'])
edges = spark.createDataFrame([('1', '2', 'friend'),
                               ('2', '1', 'friend'),
                               ('3', '1', 'friend'),
                               ('1', '3', 'friend'),
                               ('2', '3', 'follows'),
                               ('3', '4', 'friend'),
                               ('4', '3', 'friend'),
                               ('5', '3', 'friend'),
                               ('3', '5', 'friend'),
                               ('4', '5', 'follows'),
                               ('98', '99', 'friend'),
                               ('99', '98', 'friend')],
                              ['src', 'dst', 'type'])
g = GraphFrame(vertices, edges)

## Take a look at the DataFrames

g.vertices.show()
g.edges.show()

## Check the number of edges of each vertex

g.degrees.show()
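
For reference, here is a plain-Python sketch of what `g.degrees` would be expected to report for the edge list above once the graph can be created. It assumes that a vertex's degree counts every edge in which it appears as either `src` or `dst`; the names `edge_list` and `degrees` are illustrative and not part of GraphFrames.

```python
from collections import Counter

# Same (src, dst) pairs as the edges DataFrame in the question; the edge type is ignored.
edge_list = [
    ('1', '2'), ('2', '1'), ('3', '1'), ('1', '3'),
    ('2', '3'), ('3', '4'), ('4', '3'), ('5', '3'),
    ('3', '5'), ('4', '5'), ('98', '99'), ('99', '98'),
]

# Each edge contributes one to the degree of its source and one to its destination.
degrees = Counter()
for src, dst in edge_list:
    degrees[src] += 1
    degrees[dst] += 1

for vertex_id in sorted(degrees, key=int):
    print(vertex_id, degrees[vertex_id])
```

Comparing these counts with the `degree` column is a quick sanity check that the graph was built from the intended edges.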

The package is found and resolved as follows:

[root@ip-172-31-13-149 scripts]# $SPARK_HOME/bin/spark-submit --packages graphframes:graphframes:0.7.0-spark2.4-s_2.11 tst.py
Ivy Default Cache set to: /root/.ivy2/cache
The jars for the packages stored in: /root/.ivy2/jars
:: loading settings :: url = jar:file:/usr/lib/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
graphframes#graphframes added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-835b0432-a6e7-4b5c-afd6-44e7f6ab2c26;1.0
        confs: [default]
        found graphframes#graphframes;0.7.0-spark2.4-s_2.11 in spark-packages
        found org.slf4j#slf4j-api;1.7.16 in central
:: resolution report :: resolve 116ms :: artifacts dl 3ms
        :: modules in use:
        graphframes#graphframes;0.7.0-spark2.4-s_2.11 from spark-packages in [default]
        org.slf4j#slf4j-api;1.7.16 from central in [default]

When I run this simple GraphFrames example, I get the following error:

Traceback (most recent call last):
  File "/home/hadoop/scripts/tst.py", line 32, in <module>
    g = GraphFrame(vertices, edges)
  File "/root/.ivy2/jars/graphframes_graphframes-0.7.0-spark2.4-s_2.11.jar/graphframes/graphframe.py", line 89, in __init__
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o100.createGraph.
: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
        at org.graphframes.GraphFrame$.apply(GraphFrame.scala:676)
        at org.graphframes.GraphFramePythonAPI.createGraph(GraphFramePythonAPI.scala:10)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:748)
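
A `java.lang.NoSuchMethodError` on `scala.Predef$.refArrayOps` like the one above often indicates that a JAR was compiled against a different Scala binary version than the one Spark is running on. This is not confirmed as the cause here. The following is a minimal sketch for checking it, assuming the package coordinate follows the `<graphframes>-spark<X.Y>-s_<A.B>` pattern used in this question. The helper names are hypothetical and not part of Spark or GraphFrames.

```python
import re


def scala_binary_version(version_string):
    """Return 'major.minor' from a string such as 'version 2.11.12'."""
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"no Scala version found in {version_string!r}")
    return f"{match.group(1)}.{match.group(2)}"


def parse_graphframes_coordinate(coordinate):
    """Split 'group:artifact:<graphframes>-spark<X.Y>-s_<A.B>' into its version parts."""
    _group, _artifact, version = coordinate.split(":")
    match = re.fullmatch(
        r"(?P<graphframes>[\d.]+)-spark(?P<spark>[\d.]+)-s_(?P<scala>[\d.]+)",
        version,
    )
    if match is None:
        raise ValueError(f"unexpected version format: {version!r}")
    return match.groupdict()


def is_scala_compatible(coordinate, runtime_version_string):
    """True when the package's Scala suffix matches the runtime Scala binary version."""
    package_scala = parse_graphframes_coordinate(coordinate)["scala"]
    return package_scala == scala_binary_version(runtime_version_string)


coordinate = "graphframes:graphframes:0.7.0-spark2.4-s_2.11"
print(parse_graphframes_coordinate(coordinate))
```

In a running PySpark session, the runtime string can usually be read with `spark.sparkContext._jvm.scala.util.Properties.versionString()`. Note that `_jvm` is a private attribute, so treat that call as an assumption. Passing its result to `is_scala_compatible` shows whether the `s_2.11` suffix matches the cluster's Scala version.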

I also added the package in spark-default.sh:

spark.jars.packages              graphframes:graphframes:0.7.0-spark2.4-s_2.11

I also tried the steps suggested by hughcristensen here: https://github.com/graphframes/graphframes/issues/172
I would really appreciate any help, as I don't know what else I can do.
