如何在bash脚本中等待从该脚本派生的几个子进程完成,然后在任何子进程以code !=0结束时返回退出代码!=0?

简单的脚本:

#!/bin/bash
for i in `seq 0 9`; do
  doCalculations $i &
done
wait

上面的脚本将等待所有10个子进程,但它总是给出退出状态0(参见help wait)。我如何修改这个脚本,以便它将发现衍生子进程的退出状态,并在任何子进程以code !=0结束时返回退出代码1 ?

有没有比收集子进程的pid、按顺序等待它们并求和退出状态更好的解决方案呢?


当前回答

在等待流程之前,流程可能已经完成。如果我们触发等待一个已经完成的进程,它将触发一个错误,比如pid不是这个shell的子进程。为了避免这种情况,可以使用以下函数来查找过程是否完成:

isProcessComplete(){
PID=$1
while [ -e /proc/$PID ]
do
    echo "Process: $PID is still running"
    sleep 5
done
echo "Process $PID has finished"
}

其他回答

我最近用过这个(感谢Alnitak):

#!/bin/bash
# activate child monitoring
set -o monitor

# locking subprocess
(while true; do sleep 0.001; done) &
pid=$!

# count, and kill when all done
c=0
function kill_on_count() {
    # you could kill on whatever criterion you wish for
    # I just counted to simulate bash's wait with no args
    [ $c -eq 9 ] && kill $pid
    c=$((c+1))
    echo -n '.' # async feedback (but you don't know which one)
}
trap "kill_on_count" CHLD

function save_status() {
    local i=$1;
    local rc=$2;
    # do whatever, and here you know which one stopped
    # but remember, you're called from a subshell
    # so vars have their values at fork time
}

# care must be taken not to spawn more than one child per loop
# e.g don't use `seq 0 9` here!
for i in {0..9}; do
    (doCalculations $i; save_status $i $?) &
done

# wait for locking subprocess to be killed
wait $pid
echo

从这里,我们可以很容易地推断,并拥有一个触发器(触摸文件,发送信号)并改变计数标准(计数触摸的文件,或其他)以响应该触发器。或者如果你只是想要'any'非零rc,只需从save_status中杀死锁。

只需将结果存储在shell之外,例如在一个文件中。

#!/bin/bash
tmp=/tmp/results

: > $tmp  #clean the file

for i in `seq 0 9`; do
  (doCalculations $i; echo $i:$?>>$tmp)&
done      #iterate

wait      #wait until all ready

sort $tmp | grep -v ':0'  #... handle as required

在等待流程之前,流程可能已经完成。如果我们触发等待一个已经完成的进程,它将触发一个错误,比如pid不是这个shell的子进程。为了避免这种情况,可以使用以下函数来查找过程是否完成:

isProcessComplete(){
PID=$1
while [ -e /proc/$PID ]
do
    echo "Process: $PID is still running"
    sleep 5
done
echo "Process $PID has finished"
}

这里已经有很多答案了,但我很惊讶似乎没有人建议使用数组……这就是我所做的——这可能在将来对一些人有用。

n=10 # run 10 jobs
c=0
PIDS=()

while true

    my_function_or_command &
    PID=$!
    echo "Launched job as PID=$PID"
    PIDS+=($PID)

    (( c+=1 ))

    # required to prevent any exit due to error
    # caused by additional commands run which you
    # may add when modifying this example
    true

do

    if (( c < n ))
    then
        continue
    else
        break
    fi
done 


# collect launched jobs

for pid in "${PIDS[@]}"
do
    wait $pid || echo "failed job PID=$pid"
done

wait还(可选地)接受要等待的进程的PID,并且使用$!你会得到最后一个命令的PID在后台启动。 修改循环,将每个衍生子进程的PID存储到一个数组中,然后再次循环等待每个PID。

# run processes and store pids in array
for i in $n_procs; do
    ./procs[${i}] &
    pids[${i}]=$!
done

# wait for all pids
for pid in ${pids[*]}; do
    wait $pid
done