Shell script not exiting if there is error in a nested function while using wait

Posted by wh6knrhe on 2023-05-01 in Shell

In my shell script, if several nested functions run concurrently and fail, the script does not fail.

status() {
    exit_st=$1
    error_1=$2
    if ! [[ $exit_st -eq 0 ]]; then
        echo "[ERROR] -  ${error_1}"
        exit 1
    else 
        echo "[INFO] -  ${error_1}"
    fi
}

abc(){
   val1=$1
   val2=$2
   #Some SQL command here
   status $? "SQL command step"
}

abc cmd1 cmd2 &

abc cmd3 cmd4 &

wait 

echo 'hi'

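In the code above, if the command at #Some SQL command here fails, the script does not exit and goes on to print hi. I tried changing exit to return, but it still does not error out. I want the whole script to exit with a non-zero code if any abc job fails.
My bash version is GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu), so I cannot use the wait -n option.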

shell

Source: https://stackoverflow.com/questions/76104362/shell-script-not-exiting-if-there-is-error-in-a-nested-function-while-using-wait

1 Answer

gmxoilav #1

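Bash does not exit automatically when a background process exits with non-zero status, even if set -e (set -o errexit) is in effect. In the question's code, the exit 1 in status only ends the subshell that runs the background abc call, and wait with no arguments always returns zero. If you want your program to exit when a background process fails, you need to detect the failure explicitly.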
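A minimal sketch of that behaviour, using throwaway subshells in place of the question's abc function (the variable names are illustrative only). It shows that a bare wait reports success even with set -e active, while waiting on a specific PID returns that process's status:

```shell
#!/bin/bash
set -e

# First background job fails, but a bare `wait` still returns 0,
# so errexit never triggers.
( exit 3 ) &
wait
plain_rc=$?

# Waiting on a specific PID returns that process's exit status.
# The `|| pid_rc=$?` keeps errexit from ending the script here.
( exit 3 ) &
pid=$!
pid_rc=0
wait "$pid" || pid_rc=$?

printf 'bare wait: %d, wait on PID: %d\n' "$plain_rc" "$pid_rc"
```

This is why any fix has to collect PIDs (or use wait -n) and check the returned status itself.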
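If you have Bash 4.3 (released in 2014) or later, you can stop waiting as soon as a background process fails in the code in the question by replacing

wait

with

while wait -n; do
    :
done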

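The -n option of wait was introduced in Bash 4.3. It makes wait wait for the next background process to exit and return that process's status.
The loop runs until wait -n returns non-zero status, either because a background process has exited with non-zero status or because there are no background processes left to wait for (all of them having exited with zero status).
See ProcessManagement - Greg's Wiki for more information about wait -n and for general information about handling background processes in Bash. The page says to run set -m in programs that use wait -n. I haven't found that to be necessary, but that may be because I am using a later version of Bash. YMMV.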
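The loop by itself does not say why it ended. A rough sketch of one way to tell the two cases apart, assuming Bash 4.3 or later, a sleep that accepts fractional seconds, and that no job legitimately exits with 127 (the status wait -n returns when there is nothing left to wait for). The worker function is a hypothetical stand-in for abc:

```shell
#!/bin/bash
# Sketch only; requires Bash 4.3 or later for `wait -n`.

# Hypothetical stand-in for the question's abc function:
# sleeps for $1 seconds, then finishes with exit status $2.
worker() {
    sleep "$1"
    exit "$2"
}

worker 0.3 0 &
worker 0.1 3 &
worker 1 0 &

first_failure=0
while true; do
    rc=0
    wait -n || rc=$?
    if (( rc == 127 )); then
        break                # no background jobs left to wait for
    elif (( rc != 0 )); then
        first_failure=$rc    # a job failed; stop waiting for the rest
        break
    fi
done

if (( first_failure != 0 )); then
    printf '[ERROR] - a background job exited with status %d\n' "$first_failure" >&2
    # A real script would call: exit "$first_failure"
fi
```

Here the second worker fails first, so the loop ends with first_failure set to 3 while the third worker is still running.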

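The while wait ... loop lets the program carry on as soon as a background process fails (for example, it could call exit), but it may leave other background processes running, even after the main program has exited. That can cause unwanted processing and/or unexpected output to the terminal. You may want to kill all remaining background processes after the while wait ... loop terminates. One way to do it is: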

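jobs_output=$(jobs)
while IFS= read -r line; do
    # Skip lines without a [N] job number (e.g. the empty line that is
    # read when no jobs remain)
    [[ $line == *\[*\]* ]] || continue
    jobnum=${line#*\[}
    jobnum=${jobnum%%\]*}
    kill "%$jobnum"
done <<<"$jobs_output"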

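The jobs command outputs one line for each active background process. Each line begins with the job number in square brackets. See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for an explanation of ${line#*\[} and ${jobnum%%\]*} (used to extract the job number from the line).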
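To make the two expansions concrete, here is a small sketch applied to a hard-coded sample line in the format that jobs prints (the sample text and the intermediate variable rest are made up for illustration):

```shell
#!/bin/bash
# Sample line in the format printed by `jobs` (illustrative only).
line='[2]+  Running                 sleep 100 &'

# Remove the shortest prefix that ends with '[':
# leaves "2]+  Running                 sleep 100 &"
rest=${line#*\[}

# Remove the longest suffix that starts with ']': leaves "2"
jobnum=${rest%%\]*}

printf 'job number: %s\n' "$jobnum"
```

The result can then be passed to kill as a job specification, e.g. "%$jobnum".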

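For versions of Bash older than 4.3, my preferred option for managing background processes is to use the jobs command in a polling loop.
This is a Shellcheck-clean, modified version of your program that demonstrates the technique: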

#! /bin/bash -p

function status
{
    local -r exit_st=$1
    local -r error_1=$2

    if (( exit_st == 0 )); then
        printf '[INFO] - %s\n' "$error_1" >&2
    else 
        printf '[ERROR] - %s\n' "$error_1" >&2
        exit 1
    fi
}

function abc
{
   local -r val1=$1
   local -r val2=$2

   run_sql_command "$val1" "$val2"
   status "$?" 'SQL command step'
}

# Wait for background processes (specified by PIDs given as function
# arguments) to complete.
# If any background process completes with non-zero exit status, return
# immediately (without waiting for any other background processes) using the
# failed process's exit status as the return status.
function wait_for_pids
{
    local -r bgpids=( "$@" )

    # Use a sparse array ('is_active_pid') indexed by PID values to maintain
    # a set of background processes that are still active
    local pid is_active_pid=()
    for pid in "${bgpids[@]}"; do
        is_active_pid[pid]=1
    done

    local jobs_output old_active_pids=()
    while (( ${#is_active_pid[*]} > 0 )); do
        # Get a list of PIDs of background processes that are still active
        jobs_output=$(jobs -pr)
        IFS=$'\n' read -r -d '' -a active_pids <<<"$jobs_output"

        old_active_pids=( "${!is_active_pid[@]}" )

        # Update the set of still active background PIDs
        is_active_pid=()
        for pid in ${active_pids[@]+"${active_pids[@]}"}; do
            is_active_pid[pid]=1
        done

        # Find processes that are no longer active (i.e. they have exited)
        # and check their exit statuses
        for pid in "${old_active_pids[@]}"; do
            if (( ! ${is_active_pid[pid]-0} )); then
                wait "$pid" || return "$?"
            fi
        done

        sleep 1
    done
}

# Kill all background processes that are running, and exit the program
# with the exit status provided as an argument
function kill_running_jobs_and_exit
{
    local -r exit_status=$1

    local jobs_output line jobnum
    jobs_output=$(jobs -r)
    while IFS= read -r line; do
        [[ $line == *\[*\]* ]] || continue
        jobnum=${line#*\[}
        jobnum=${jobnum%%\]*}
        # Kill by job number instead of PID because killing by PID is
        # subject to race conditions that may cause the wrong process to be
        # killed
        kill "%$jobnum"
        printf '[INFO] - Killed: %s\n' "$line" >&2
    done <<<"$jobs_output"

    exit "$exit_status"
}

bgpids=()

abc cmd1 cmd2 &
bgpids+=( "$!" )

abc cmd3 cmd4 &
bgpids+=( "$!" )

wait_for_pids "${bgpids[@]}" || kill_running_jobs_and_exit "$?"

echo 'hi'
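
The script above calls run_sql_command, which is not defined in it. To try the script without a database, a stub along these lines could be placed before the abc calls. It is a hypothetical test helper, not the answer author's function, and its choice to fail on cmd3 is arbitrary:

```shell
# Hypothetical test stub, not the answer author's function.
# Pretends to run a SQL command: pauses briefly, then fails only when
# its first argument is 'cmd3'.
function run_sql_command
{
    local -r first=$1 second=$2

    printf '[INFO] - pretending to run: %s %s\n' "$first" "$second" >&2
    sleep 1
    [[ $first != cmd3 ]]
}
```

With this stub in place, the second abc job fails, so the script should report the error and exit with a non-zero status without printing hi.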

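Several of the changes are minor, made to fix Shellcheck warnings or to follow standard or best practice (e.g. sending diagnostic output to standard error and using printf instead of echo).
One significant change is the use of the bgpids array to hold the list of PIDs of the background processes.
Another significant change is the addition of two new functions: wait_for_pids and kill_running_jobs_and_exit.
The final significant change is the replacement of wait with wait_for_pids "${bgpids[@]}" || kill_running_jobs_and_exit "$?".
Replace run_sql_command "$val1" "$val2" with whatever is appropriate for you. I wrote and used a function called run_sql_command for testing.
I used a polling interval of 1 second (sleep 1). Something else may be better for you (e.g. sleep 10 or, if your sleep supports floating-point arguments, sleep 0.1).
See the sparse arrays section of BashGuide/Arrays - Greg's Wiki for information about how the is_active_pid array is used.
${active_pids[@]+"${active_pids[@]}"} is used instead of "${active_pids[@]}" to work around a bug in old versions of Bash that causes them to mishandle empty arrays when set -o nounset (set -u) is in effect. See bash empty array expansion with 'set -u'.
I tested the code with Bash version 3.2. It should also work with all later versions of Bash.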
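
The two array idioms used in wait_for_pids can be tried in isolation. A small sketch, with made-up PID values and helper variable names, showing a sparse array used as a set and the empty-array-safe expansion under nounset:

```shell
#!/bin/bash
set -o nounset

# Hypothetical PID values, used only for illustration.
is_active_pid=()
for pid in 4021 17 30555; do
    is_active_pid[pid]=1
done

count=${#is_active_pid[@]}        # number of members, not the highest index
members="${!is_active_pid[*]}"    # the indices (PIDs), in ascending order

# Membership test with a default of 0 for unset elements,
# in the same style as wait_for_pids
if (( ${is_active_pid[17]-0} )); then has_17=yes; else has_17=no; fi
if (( ${is_active_pid[99]-0} )); then has_99=yes; else has_99=no; fi

# Expansion that is safe for an empty array under nounset,
# including on old versions of Bash
empty=()
n=0
for pid in ${empty[@]+"${empty[@]}"}; do
    n=$(( n + 1 ))
done

printf 'count=%s members=%s has_17=%s has_99=%s n=%s\n' \
    "$count" "$members" "$has_17" "$has_99" "$n"
```

Only three elements are stored even though the largest index is 30555, which is what makes the array usable as a set of PIDs.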

Posted 2023-05-01
