shell Bash：等待超时

omjgkv6w 于 2023-03-19 发布在 Shell

关注(0)|答案(9)|浏览(566)

在Bash脚本中，我希望执行以下操作：

app1 &
pidApp1=$!
app2 &
pidApp2=$1

timeout 60 wait $pidApp1 $pidApp2
kill -9 $pidApp1 $pidApp2

也就是说，在后台启动两个应用程序，给予它们60秒的时间完成工作，如果它们没有在60秒内完成工作，就杀死它们。
不幸的是，上面的命令不起作用，因为timeout是一个可执行文件，而wait是一个shell命令。

timeout 60 bash -c wait $pidApp1 $pidApp2

但这仍然不起作用，因为wait只能在同一shell中启动的PID上调用。
有什么想法吗？

shell

来源：https://stackoverflow.com/questions/10028820/bash-wait-with-timeout

9条答案

按热度按时间

uoifb46i1#

你的例子和公认的答案都过于复杂，既然这正是timeout的用例，为什么不 * 仅 * 使用timeout呢？timeout命令甚至有一个内置选项（-k）在发送终止命令的初始信号之后发送SIGKILL（默认为SIGTERM），如果在发送初始信号后命令仍在运行（请参见man timeout）。
如果脚本不一定需要wait并在等待后恢复控制流，则只需

timeout -k 60s 60s app1 &
timeout -k 60s 60s app2 &
# [...]

但是，如果是这样，也可以通过保存**timeout**PID来实现：

pids=()
timeout -k 60s 60s app1 &
pids+=($!)
timeout -k 60s 60s app2 &
pids+=($!)
wait "${pids[@]}"
# [...]

例如

$ cat t.sh
#!/bin/bash

echo "$(date +%H:%M:%S): start"
pids=()
timeout 10 bash -c 'sleep 5; echo "$(date +%H:%M:%S): job 1 terminated successfully"' &
pids+=($!)
timeout 2 bash -c 'sleep 5; echo "$(date +%H:%M:%S): job 2 terminated successfully"' &
pids+=($!)
wait "${pids[@]}"
echo "$(date +%H:%M:%S): done waiting. both jobs terminated on their own or via timeout; resuming script"

。

$ ./t.sh
08:59:42: start
08:59:47: job 1 terminated successfully
08:59:47: done waiting. both jobs terminated on their own or via timeout; resuming script

赞(0）回复(0）举报 2023-03-19

ocebsuys2#

将PID写入文件并启动应用程序，如下所示：

pidFile=...
( app ; rm $pidFile ; ) &
pid=$!
echo $pid > $pidFile
( sleep 60 ; if [[ -e $pidFile ]]; then killChildrenOf $pid ; fi ; ) &
killerPid=$!

wait $pid
kill $killerPid

这将创建另一个进程，该进程在超时期间休眠，如果到目前为止还没有完成，则会终止该进程。
如果进程完成得更快，则删除PID文件并终止杀手进程。
killChildrenOf是一个脚本，用于获取所有进程并终止某个PID的所有子进程。有关实现此功能的不同方法，请参阅此问题的答案：Best way to kill all child processes
如果你想跳出BASH，你可以把PID和超时写入一个目录，然后监视这个目录，每隔一分钟左右，读取条目，检查哪些进程还在运行，它们是否超时了。

EDIT如果要了解进程是否已成功终止，可以使用kill -0 $pid
EDIT2或者您可以尝试进程组。kevinarpe表示：要获取PID（146322）的PGID：

ps -fjww -p 146322 | tail -n 1 | awk '{ print $4 }'

在我的例子中：145974。然后PGID可以与一个特殊的kill选项一起使用，以终止组中的所有进程：kill -- -145974

赞(0）回复(0）举报 2023-03-19

ru9i0ody3#

下面是亚伦·迪古拉答案的简化版本，它使用了亚伦·迪古拉在评论中留下的kill -0技巧：

app &
pidApp=$!
( sleep 60 ; echo 'timeout'; kill $pidApp ) &
killerPid=$!

wait $pidApp
kill -0 $killerPid && kill $killerPid

在我的例子中，我希望set -e -x安全并返回状态码，所以我使用了：

set -e -x
app &
pidApp=$!
( sleep 45 ; echo 'timeout'; kill $pidApp ) &
killerPid=$!

wait $pidApp
status=$?
(kill -0 $killerPid && kill $killerPid) || true

exit $status

退出状态143表示SIGTERM，几乎可以肯定是由于超时。

赞(0）回复(0）举报 2023-03-19

vq8itlhq4#

我编写了一个bash函数，它将等待PID完成或超时，如果超时则返回非零值，并打印所有未完成的PID。

function wait_timeout {
  local limit=${@:1:1}
  local pids=${@:2}
  local count=0
  while true
  do
    local have_to_wait=false
    for pid in ${pids}; do
      if kill -0 ${pid} &>/dev/null; then
        have_to_wait=true
      else
        pids=`echo ${pids} | sed -e "s/${pid}//g"`
      fi
    done
    if ${have_to_wait} && (( $count < $limit )); then
      count=$(( count + 1 ))
      sleep 1
    else
      echo ${pids}
      return 1
    fi
  done   
  return 0
}

要使用它，只需wait_timeout $timeout $PID1 $PID2 ...

赞(0）回复(0）举报 2023-03-19

lhcgjxsq5#

把我的2c代入，我们可以把特谢拉的解简化为：

try_wait() {
    # Usage: [PID]...
    for ((i = 0; i < $#; i += 1)); do
        kill -0 $@ && sleep 0.001 || return 0
    done
    return 1 # timeout or no PIDs
} &>/dev/null

Bash的sleep接受小数秒，0.001s = 1 ms = 1 KHz =充足的时间，然而，UNIX在文件和进程方面没有漏洞。

$ cat &
[1] 16574
$ try_wait %1 && echo 'exited' || echo 'timeout'
timeout
$ kill %1
$ try_wait %1 && echo 'exited' || echo 'timeout'
exited

我们必须回答一些棘手的问题才能取得进展。
为什么wait没有超时参数？可能是因为timeout，kill -0，wait和wait -n命令可以更精确地告诉机器我们想要什么。
为什么wait一开始就内置在Bash中，这样timeout wait PID就不工作了？也许只是为了让Bash能够实现正确的信号处理。
考虑：

$ timeout 30s cat &
[1] 6680
$ jobs
[1]+    Running   timeout 30s cat &
$ kill -0 %1 && echo 'running'
running
$ # now meditate a bit and then...
$ kill -0 %1 && echo 'running' || echo 'vanished'
bash: kill: (NNN) - No such process
vanished

无论是在物质世界还是在机器中，我们都需要一些可以奔跑的地面，我们也需要一些可以等待的地面。

当kill失败时，你几乎不知道为什么。除非你写了这个过程，或者它的手册命名了这种情况，否则没有办法确定一个合理的超时值。
当你编写了这个过程之后，你可以实现一个合适的TERM处理程序，甚至可以通过一个命名管道响应发送给它的“Auf Wiedersehen！”，这样你就有了一些基础，即使是像try_wait：-）

赞(0）回复(0）举报 2023-03-19

lx0bsm1f6#

您可以使用“read”内部命令的超时。
以下命令将终止未终止的作业，并在最多60秒后显示已完成作业的名称：

( (job1; echo -n "job1 ")& (job2; echo -n "job2 ")&) | (read -t 60 -a jobarr; echo ${jobarr[*]} ${#jobarr[*]} )

它的工作原理是创建一个包含所有后台作业的子shell，这个子shell的输出被读入一个bash数组变量，可以根据需要使用该变量（在本例中是打印数组+元素计数）。
确保在读取命令所在的子shell中引用${jobarr}（因此使用括号），否则${jobarr}将为空。
读命令结束后，所有的子shell将自动静音（不被杀死）。你必须自己杀死它们。

赞(0）回复(0）举报 2023-03-19

9rnv2umw7#

app1 &
app2 &
sleep 60 &

wait -n

赞(0）回复(0）举报 2023-03-19

7fhtutme8#

又一个 * 超时 * bash的脚本

运行许多子进程，总超时。使用bash的最新特性，我编写了以下代码：

#!/bin/bash
maxTime=5.0 jobs=() pids=() cnt=1 Started=${EPOCHREALTIME/.}
if [[ $1 == -m ]] ;then maxTime=$2; shift 2; fi

for cmd ;do  # $cmd is unquoted in order to use strings as command + args
    $cmd &
    jobs[$!]=$cnt pids[cnt++]=$!
done

printf -v endTime %.6f $maxTime
endTime=$(( Started + 10#${endTime/.} ))
exec {pio}<> <(:) # Pseudo FD for "builtin sleep" by using "read -t" 
while ((${#jobs[@]})) && (( ${EPOCHREALTIME/.} < endTime ));do
    for cnt in ${jobs[@]};do
        if ! jobs $cnt &>/dev/null;then
            Elap=00000$(( ${EPOCHREALTIME/.} - Started ))
            printf 'Job %d (%d) ended after %.4f secs.\n' \
                   $cnt ${pids[cnt]} ${Elap::-6}.${Elap: -6}
            unset jobs[${pids[cnt]}] pids[cnt]
        fi
    done
    read -ru $pio -t .02 _
done
if ((${#jobs[@]})) ;then
    Elap=00000$(( ${EPOCHREALTIME/.} - Started ))
    for cnt in ${jobs[@]};do
        printf 'Job %d (%d) killed after %.4f secs.\n' \
               $cnt ${pids[cnt]} ${Elap::-6}.${Elap: -6}
    done
    kill ${pids[@]}
fi

样品运行：

带参数的命令可作为字符串提交
-m开关允许您选择一个浮动为 * 最大时间 *（以秒为单位）。

$ ./execTimeout.sh -m 2.3 "sleep 1" 'sleep 2' sleep\ {3,4}  'cat /dev/tty'
Job 1 (460668) ended after 1.0223 secs.
Job 2 (460669) ended after 2.0424 secs.
Job 3 (460670) killed after 2.3100 secs.
Job 4 (460671) killed after 2.3100 secs.
Job 5 (460672) killed after 2.3100 secs.

为了进行测试，我编写了以下脚本

在 1.0000 和 9.9999 秒之间选择随机持续时间n
用于输出 0 和 8 之间的随机行数。（他们无法输出任何内容）。
行输出包含进程ID（$$）、要打印的剩余行数和总持续时间（秒）。

#!/bin/bash

tslp=$RANDOM lnes=${RANDOM: -1}
printf -v tslp %.6f ${tslp::1}.${tslp:1}
slp=00$((${tslp/.}/($lnes?$lnes:1)))
printf -v slp %.6f ${slp::-6}.${slp: -6}
# echo >&2 Slp $lnes x $slp == $tslp
exec {dummy}<> <(: -O)
while read -rt $slp -u $dummy; ((--lnes>0)); do
    echo $$ $lnes $tslp
done

一次运行此脚本5次，超时时间为5.0秒：

$ ./execTimeout.sh -m 5.0 ./tstscript.sh{,,,,}
2869814 6 2.416700
2869815 5 3.645000
2869814 5 2.416700
2869814 4 2.416700
2869815 4 3.645000
2869814 3 2.416700
2869813 5 8.414000
2869812 1 3.408000
2869814 2 2.416700
2869815 3 3.645000
2869814 1 2.416700
2869815 2 3.645000
Job 3 (2869814) ended after 2.4511 secs.
2869813 4 8.414000
2869815 1 3.645000
Job 1 (2869812) ended after 3.4518 secs.
Job 4 (2869815) ended after 3.6757 secs.
2869813 3 8.414000
Job 2 (2869813) killed after 5.0159 secs.
Job 5 (2869816) killed after 5.0159 secs.

赞(0）回复(0）举报 2023-03-19

rpppsulh9#

有一些进程在从timeout调用时不能很好地工作。我遇到了一个问题，需要在qemu示例周围放置一个timeout catch，如果您调用

timeout 900 qemu

它会一直挂着。
我的解决方案

./qemu_cmd &
qemuPid=$!
timeout 900 tail --pid=$qemuPid -f /dev/null
ret=$?
if [ "$ret" != "0" ]; then
   allpids=()
   descendent_pids $tracePid
   for pids in ${allpids[@]};do
      kill -9 $pids
   done
fi

descendent_pids(){
   allpids=("${allpids[@]}" $1)
   pids=$(pgrep -P $1)
   for pid in $pids; do
      descendent_pids $pid
   done
}

还要注意的是，超时并不总是会杀死后代进程，这取决于您从超时中派生的cmd的复杂程度。

赞(0）回复(0）举报 2023-03-19

我来回答

shell Bash：等待超时

9条答案

又一个 * 超时 * bash的脚本

相关问题

热门标签

最新问答