Adding retries to a shell script

rqdpfwrv  posted on 2023-10-23  in  Shell

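My goal is to add retries to this fastq validation. The validation sometimes fails because of network problems, even though the given NCBI SRA ID does have data. So I need to retry at least 5 times before aborting.
How can I do that?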

set -x
PS4='[\\d \\t] '

# Check parameter for error
check=0
# Print fastq-dump executable path
echo \$(which fastq-dump)

# Loop through all parameters to check validity
for file in \$@;
do
    cp \${file} .
    # Extract filename for sampleID
    file_basename=\$(basename \${file})
    id=\${file_basename%".id"}
    # Start validation
    echo "Checking \${id}..."
    # Download start of fastq
    fastq-dump $(get_ngc()) -X 1 -Z --split-spot \${id} > \${id}.test.fastq 2> \${id}.test.log
    # Get number of lines downloaded to validate for error
    numLines=\$(cat \${id}.test.fastq | wc -l)
    if [ \$numLines -gt 0 ]; then
        echo "\${id} has data... OK"
    else
        echo "\${id} does not have data... ERROR"
        check=1
    fi
done
# Exit with error if some fastqs not accessible
if [ \$check -gt 0 ]; then
    echo "ERROR: One or more samples have inaccessible fastqs.. exiting"
    exit 1
fi

zsbz8rwp1#

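This is a tentative answer. Hopefully it at least helps you ask a better question.
You mention in a comment that the backslashes work for you in a "cwl file", but you don't explain what a "cwl file" is. If you mean Common Workflow Language, we would need to know more about the YAML structure around the script; there are certainly YAML scalar formatting options where the backslashes would not be necessary.
Without more details, let's just say that for a "shell script" those backslashes break the functionality, so I have removed them.
It is also unclear exactly which error condition you want to catch. Presumably fastq-dump is the source of these errors. If it is well written, you should be able to simply say,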

if fastq-dump --options arguments; then ...

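but I am sticking with your clumsy approach of counting output lines, because I am not familiar with this tool and you did not provide a link to its documentation.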

#!/bin/sh
# ^ explicitly name which shell you are using
# See also https://en.wikipedia.org/wiki/Shebang_(Unix)

set -x
PS4='[\d \t] '

check=0
# Avoid useless use of echo
# https://www.iki.fi/era/unix/award.html#echo
# Prefer POSIX "command -v" over nonstandard "which"
command -v fastq-dump

# Quote all file names
# https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable
for file in "$@"; do
    cp "$file" .
    # Basename knows how to trim extension
    id=$(basename "$file" .id)
    # Write diagnostics to stderr
    echo "$0: Checking $id..." >&2
    # Truncate log so we can append in a loop; see below
    : > "$id".test.log
    # Loop until success, or retries exhausted
    for retry in 1 2 3 4 5; do
        # Append rather than overwrite error log, in case we retry
        fastq-dump $(get_ngc()) -X 1 -Z --split-spot "$id" > "$id".test.fastq 2>> "$id".test.log
        # Avoid useless cat
        # https://stackoverflow.com/questions/11710552/useless-use-of-cat 
        numLines=$(wc -l < "$id".test.fastq)
        if [ $numLines -gt 0 ]; then
            echo "$0: $id has data... OK" >&2
            break
        else
            echo "$0: $id does not have data... ERROR" >&2
            case $retry in
             5) echo "$0: $id: aborting, after 5 attempts" >&2
                check=1;;
             *) # Sleep before retry
                sleep 5;;
            esac
        fi
    done
done
# Exit with error if some fastqs not accessible
if [ $check -gt 0 ]; then
    echo "$0: ERROR: One or more samples have inaccessible fastqs.. exiting" >&2
    exit 1
fi
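
Two of the idioms used in the script above can be tried in isolation. This is only an illustrative sketch: the function names `sample_id` and `count_lines` and the example path are made up here and are not part of the original script.

```sh
#!/bin/sh
# sample_id PATH
# Print the file name without its directory and without a trailing ".id",
# using basename's optional suffix argument instead of ${var%".id"}.
sample_id() {
    basename "$1" .id
}

# count_lines FILE
# Print the number of lines in FILE without a useless cat.
# The arithmetic expansion strips the padding some wc implementations add.
count_lines() {
    echo $(( $(wc -l < "$1") ))
}

# Hypothetical path, for illustration only
sample_id /data/ids/SRR000001.id
```

Reading the file through a redirection also keeps `wc` from printing the file name next to the count.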

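Many of the beginner errors here would be detected, and often even fixed, by https://shellcheck.net/; consider running your script through that tool before asking here, so that answerers are not distracted by them. Obviously, the fixes will sometimes also resolve the very problem you wanted to ask about.
$(get_ngc()) still looks like a syntax error, but I guess it is part of the CWL?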
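
If fastq-dump does turn out to report failure through its exit status (an assumption, not something verified here), the retry logic could be factored into a small reusable function instead of being written inline. The following is a minimal sketch: `retry` and `check_id` are hypothetical names introduced for illustration, and the `$(get_ngc())` placeholder is left out because it is not valid shell.

```sh
#!/bin/sh
# retry MAX DELAY COMMAND [ARGS...]
# Run COMMAND until it exits 0, at most MAX times, sleeping DELAY seconds
# between attempts. Returns 0 on success, 1 once all attempts have failed.
# POSIX sh has no "local", so max, delay and attempt are global variables;
# the wrapped command must not use those names.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "$0: giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "$0: attempt $attempt failed, retrying in ${delay}s: $*" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# check_id ID
# Hypothetical wrapper around the download from the question: succeed only
# if fastq-dump exits 0 AND the output file is non-empty ([ -s FILE ]).
check_id() {
    fastq-dump -X 1 -Z --split-spot "$1" > "$1.test.fastq" 2>> "$1.test.log" &&
        [ -s "$1.test.fastq" ]
}
```

Inside the `for file in "$@"` loop, the inner retry loop would then shrink to something like `retry 5 5 check_id "$id" || check=1`. Note that `[ -s ... ]` tests for a non-empty file rather than counting lines, which is a slightly different check from the one in the script above.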
