How to capture the job status of spark-submit in a shell script

Posted by 0yycz8jy on 2021-05-29 in Spark

I am using the bash shell and spark-sql 2.4.1. I submit my Spark job with spark-submit from a shell script.

I need to capture the status of the job. How can this be achieved?

Any help or suggestions?

apache-spark apache-spark-sql airflow sh

Source: https://stackoverflow.com/questions/62208315/how-to-capture-the-job-status-in-shell-script-for-spark-submit

2 Answers

Answer 1 (v1uwarro)

Check the following code.

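# app_name and the spark-submit variables are assumed to be set earlier in the script.
process_start_datetime=$(date +%Y%m%d%H%M%S)
log_path="<log_dir>"
log_file="${log_path}/${app_name}_${process_start_datetime}.log"

# Submit in YARN cluster mode; tee keeps a copy of the output in the log file.
spark-submit \
    --verbose \
    --deploy-mode cluster \
    --executor-cores "$executor_cores" \
    --num-executors "$num_executors" \
    --driver-memory "$driver_memory" \
    --executor-memory "$executor_memory" \
    --master yarn \
    --class main.App "$appJar" 2>&1 | tee -a "$log_file"

# Take the last "final status:" line from the log and trim surrounding whitespace.
status=$(grep "final status:" "$log_file" | tail -n 1 | cut -d ":" -f2 | awk '{$1=$1; print}')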
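Besides parsing the log, the script can also look at the exit status of spark-submit itself. One thing to watch: when the command is piped into tee, `$?` reports the exit status of tee, not of spark-submit (bash exposes the per-command statuses in `PIPESTATUS`). The sketch below sidesteps the pipeline by redirecting output straight into the log file. `run_logged` is a hypothetical helper name, not something from the answer, and whether spark-submit exits non-zero for a failed application depends on the deploy mode and configuration, so treat the exit code as a hint rather than the final word.

```shell
# Hypothetical helper: run a command, append its stdout and stderr to a log
# file, and return the command's own exit status (no pipeline involved).
run_logged() {
    log_file=$1
    shift
    "$@" >>"$log_file" 2>&1
}

# Usage sketch (assumes the variables from the script above):
#   rc=0
#   run_logged "$log_file" spark-submit --master yarn --deploy-mode cluster \
#       --class main.App "$appJar" || rc=$?
#   echo "spark-submit exited with status $rc"
```

The trade-off is that nothing is echoed to the terminal while the job runs; the output can be followed from another terminal with `tail -f` on the log file if needed.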

Get the application ID:

applicationId=$(grep "tracking URL" < "$log_file" | head -n 1 | cut -d "/" -f5)
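The two extractions above can be wrapped in small functions so the script can branch on the result. This is only a sketch: the function names are made up for illustration, and it assumes the log contains the `final status:` and `tracking URL` lines that spark-submit prints in YARN cluster mode, as in the answer.

```shell
# Print the last "final status:" value found in a spark-submit log,
# e.g. SUCCEEDED, FAILED, KILLED or UNDEFINED.
final_status() {
    sed -n 's/.*final status:[[:space:]]*\([A-Z_]*\).*/\1/p' "$1" | tail -n 1
}

# Print the YARN application ID taken from the first "tracking URL" line.
application_id() {
    grep "tracking URL" "$1" | grep -o 'application_[0-9]*_[0-9]*' | head -n 1
}

# Map the final status to an exit code:
# 0 = succeeded, 1 = failed or killed, 2 = anything else (including no status found).
check_job() {
    case "$(final_status "$1")" in
        SUCCEEDED)     return 0 ;;
        FAILED|KILLED) return 1 ;;
        *)             return 2 ;;
    esac
}

# Usage sketch (log_file as defined in the script above):
#   if check_job "$log_file"; then
#       echo "Job $(application_id "$log_file") succeeded"
#   else
#       echo "Job $(application_id "$log_file") did not succeed" >&2
#       exit 1
#   fi
```

Returning a distinct code for the "unknown" case makes it easier to tell a genuinely failed job apart from a log that simply did not contain a status line.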

Answered 2021-05-29

Answer 2 (gj3fmq9x)

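spark-submit runs as an asynchronous job, so after submitting the command you can get the application ID by calling SparkContext.applicationId and then check the status.
Reference: https://issues.apache.org/jira/browse/spark-5439
If Spark is deployed on YARN, you can check the status with: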

# To get the application ID, use: yarn application -list
yarn application -status application_1459542433815_0002
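To use this from a script, the status command can be polled until the application reaches a terminal state. The sketch below is one way to do that. It assumes the report printed by `yarn application -status` contains lines of the form `State : ...` and `Final-State : ...`, which may differ between Hadoop versions. `report_field` and `wait_for_app` are hypothetical helper names, and the polling interval and poll limit are arbitrary defaults.

```shell
# Print the value of one field (e.g. "State" or "Final-State") from a
# "yarn application -status" report read on stdin.
report_field() {
    sed -n "s/^[[:space:]]*${1}[[:space:]]*:[[:space:]]*//p" | head -n 1
}

# Poll YARN until the application finishes, then print its final state.
# Arguments: application ID, seconds between polls (default 30),
# maximum number of polls (default 120).
# Returns 0 if the final state is SUCCEEDED, 1 otherwise, 2 on timeout.
wait_for_app() {
    app_id=$1
    interval=${2:-30}
    max_polls=${3:-120}
    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        report=$(yarn application -status "$app_id" 2>/dev/null) || report=""
        state=$(printf '%s\n' "$report" | report_field State)
        case $state in
            FINISHED|FAILED|KILLED)
                final=$(printf '%s\n' "$report" | report_field Final-State)
                echo "$final"
                if [ "$final" = "SUCCEEDED" ]; then return 0; else return 1; fi
                ;;
        esac
        polls=$((polls + 1))
        sleep "$interval"
    done
    echo "TIMEOUT"
    return 2
}

# Usage sketch:
#   wait_for_app application_1459542433815_0002 60 || exit 1
```

The poll limit keeps the script from hanging forever if the application ID is wrong or the ResourceManager cannot be reached.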

They mention another approach in their answer.

Answered 2021-05-29
