Hive: flush errors of multi-stage jobs to stderr in Python

Posted by fkvaft9z on 2021-06-04 in Hadoop


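I would like to know whether messages from the Hive CLI can be flushed to stderr as they occur. At the moment I am trying to run a multi-stage query (just an example, not the real one):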

SELECT COUNT(*) FROM (
SELECT user FROM users
WHERE datetime = '05-10-2013'
UNION ALL
SELECT user FROM users
WHERE datetime = '05-10-2013'
) a

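This launches 3 jobs, but if job 1 fails because it was killed, I do not want job 2 to run. My code currently looks like the following, but Hive does not write anything to stderr until all of the subqueries have finished and returned an error.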

import subprocess
import time

def execute_hive_query(query):
    return_code = None
    cmd = ["hive", "-e", query]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    while return_code is None:
        # read() with no size argument blocks until the pipe reaches EOF
        out = proc.stdout.read()
        error = proc.stderr.read()
        handle_hive_exception(out, error)
        time.sleep(10)
        return_code = proc.poll()

def handle_hive_exception(stdout, stderr):
    # Raise if anything at all was written to stderr
    if stderr:
        raise Exception(stderr)

Thanks!
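One possible direction, as a minimal sketch rather than a confirmed fix: `read()` without a size only returns once the pipe is closed, so the loop above cannot see an error before Hive exits. Reading stderr line by line lets the caller react to each message as it arrives. The function name `run_fail_fast` and the `FAILURE_MARKERS` tuple are hypothetical, and the sketch assumes that Hive prints a line containing `FAILED` to stderr when a stage fails, which should be checked against the output of your Hive version. Stdout is sent to a temporary file so that only one pipe has to be drained.

```python
import subprocess
import sys
import tempfile

# Assumption: text that appears in a stderr line when a stage fails.
FAILURE_MARKERS = ("FAILED",)

def run_fail_fast(cmd, failure_markers=FAILURE_MARKERS):
    """Run cmd, echo its stderr line by line, and stop it on the first failure line.

    Returns the command's stdout as a string on success.
    """
    with tempfile.TemporaryFile(mode="w+") as out_file:
        proc = subprocess.Popen(
            cmd,
            stdout=out_file,         # results go to a file, so this pipe cannot fill up
            stderr=subprocess.PIPE,  # messages are read incrementally below
            text=True,
            bufsize=1,               # line-buffered on the parent side
        )
        try:
            # Iterating yields each line as soon as the child writes it.
            for line in proc.stderr:
                sys.stderr.write(line)
                if any(marker in line for marker in failure_markers):
                    proc.kill()
                    raise RuntimeError(line.strip())
        finally:
            proc.stderr.close()
            proc.wait()
        if proc.returncode != 0:
            raise RuntimeError("command exited with code %d" % proc.returncode)
        out_file.seek(0)
        return out_file.read()
```

It would be called as `run_fail_fast(["hive", "-e", query])`. Killing the local `hive` process may not stop jobs that have already been submitted to the cluster, so those might still need to be killed separately.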

hadoop Hive python subprocess

Source: https://stackoverflow.com/questions/16825066/hive-flush-errors-of-multi-stage-jobs-to-stderr-in-python