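I am running a cluster of 15 n1-standard-v4 workers on Dataproc. My input and output data are in Avro format.
The last stage of the Spark job, which saves the data, fails with a StackOverflowError. The DAG is:
Worker statistics:
Log containing the error: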
java.lang.StackOverflowError
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:3076)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1618)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2093)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1655)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
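I tried:
Checking the VM instance monitoring for the failed workers (everything looked fine)
Balancing the partition sizes (reducing them produces the same stack overflow error)
Tuning parameters such as spark.executor.memoryOverhead and spark.serializer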
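For reference, a minimal sketch of how the two parameters named above are typically set through SparkConf. The application name and both values are hypothetical, since the post does not say which values were tried, and the fragment assumes spark-core is on the classpath:

```scala
import org.apache.spark.SparkConf

// Hypothetical values for illustration only; the post does not state
// which settings were actually used.
val conf = new SparkConf()
  .setAppName("avro-save-job") // hypothetical application name
  // Extra off-heap memory reserved per executor container
  .set("spark.executor.memoryOverhead", "2g")
  // Serializer used for shuffled and cached data
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
```

Note that spark.serializer applies to shuffled and cached data, while Spark serializes tasks and closures with Java serialization regardless of that setting, so frames from java.io.ObjectInputStream can still appear in a trace after switching to Kryo.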
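So far nothing seems to help. My questions are:
Why is the error logged at java.io.ObjectInputStream$BlockDataInputStream.readByte as the cause of the failure?
How can I improve my approach to investigating problems like this StackOverflowError?
Thanks!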