shell: Update a specific row and column with values from another reference file

Posted by 8yparm6h on 2023-06-30 in Shell

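This is a follow-up to my previous thread (updating the second line after a matched line with a reference value), with more advanced requirements. I have a main file, main, that I want to modify in two ways: (1) find the phrase MATCH LINE in main, jump down 2 lines, and replace the 3rd column with the 2nd column of the ref file; (2) if a line contains the phrase write output, replace its 4th column with the same value. So ref has two columns: the first is the output file name and the second is the replacement value. See the example and the desired output below.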

main file

one line here
This is the 'MATCH LINE'
# this is just a comment
Now this *** to be updated
write output label ***
another line here

ref file

Out1 ONE
Out2 TWO
Out3 THREE

Desired output file 1 (Out1)

one line here
This is the 'MATCH LINE'
# this is just a comment
Now this ONE to be updated
write output label ONE
another line here

Desired output file 2 (Out2)

one line here
This is the 'MATCH LINE'
# this is just a comment
Now this TWO to be updated
write output label TWO
another line here

Desired output file 3 (Out3)

one line here
This is the 'MATCH LINE'
# this is just a comment
Now this THREE to be updated
write output label THREE
another line here

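My script comes from Ed Morton (@ed-morton), who kindly helped me in the previous thread. I modified it to fit the new requirements, but it gives me an error. Thanks for your help.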

#!/bin/awk -f

NR == FNR {
    lines[++numLines] = $0
    a[NR]=$2
    if ( /\047MATCH LINE\047/ ) {
        tgt1 = NR + 2
    }
    if ( /write output/ ) {
        tgt2 = NR
    }
    next
}
{
    for ( lineNr=1; lineNr<=numLines; lineNr++ ) {
        line = lines[lineNr]
        if ( lineNr == tgt1 ) {
            #sub(/NUMBER/,$2,line)
            line[$3]=a[FNR]
        }
        if ( lineNr == tgt2 ) {
            line[$4]=a[FNR]
        }
        print line > $1
    }
    close($1)
}
./tst.awk main ref

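Error:
scalar "line" cannot be used as an array
Ed suggested splitting the line into an array, replacing the right index, and stitching the fields back together, but the output looks odd. Here are the updated script and its output.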

#!/bin/awk -f

NR == FNR {
    lines[++numLines] = $0
    a[NR]=$2
    if ( /\047MATCH LINE\047/ ) {
        tgt1 = NR + 2
    }
    if ( /write output/ ) {
        tgt2 = NR
    }
    next
}
{
    for ( lineNr=1; lineNr<=numLines; lineNr++ ) {
        line = lines[lineNr]
        if ( lineNr == tgt1 ) {
            #sub(/NUMBER/,$2,line)
            numFlds = split(line,flds)
            flds[3] = a[FNR]
            for ( fldNr=1; fldNr<=numFlds; fldNr++ ) {
                line = (fldNr==1 ? "" : line " ") flds[fldNr]
            }
        }
        if ( lineNr == tgt2 ) {
            numFlds = split(line,flds)
            flds[4] = a[FNR]
            for ( fldNr=1; fldNr<=numFlds; fldNr++ ) {
                line = (fldNr==1 ? "" : line " ") flds[fldNr]
            }
        }
        print line > $1
    }
    close($1)
}

Output

$ head Out*
==> Out1 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this line to be updated
write output label line
another line here

==> Out2 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this is to be updated
write output label is
another line here

Answer #1 (c7rzv4ha)

In any POSIX awk, change:

line[$3]=a[FNR]

to:

match(line,/^[[:space:]]*([^[:space:]]+[[:space:]]+){2}/)
tail = substr(line,RSTART+RLENGTH)
sub(/[^[:space:]]+/,"",tail)
line = substr(line,RSTART,RLENGTH) a[FNR] tail

Likewise for line[$4]=a[FNR], just change the {2} in the match() above to {3}.
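As a rough, self-contained illustration of the same match()/substr() idea, the sketch below wraps it in a hypothetical shell function, replace_third_field (not part of the original script). It spells out the two leading fields instead of using {2}, and uses [ \t] instead of [[:space:]], on the assumption that some older awks lack interval expressions or POSIX character classes. Note that awk -v interprets backslash escapes in the value, so this wrapper is only meant for simple sample strings.

```shell
# Hypothetical helper: replace the 3rd whitespace-separated field of a line
# while keeping the original spacing around it.
replace_third_field() {
    # $1 = input line, $2 = replacement value
    awk -v line="$1" -v val="$2" 'BEGIN {
        # Match optional leading blanks plus the first two fields and their separators.
        if ( match(line, /^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]+/) ) {
            head = substr(line, RSTART, RLENGTH)     # everything before field 3
            tail = substr(line, RSTART + RLENGTH)    # field 3 and everything after it
            sub(/^[^ \t]+/, "", tail)                # drop the old field 3
            line = head val tail                     # plain concatenation, no regexp replacement
        }
        print line
    }'
}

replace_third_field 'Now   this  ***  to be updated' 'ONE'
# prints: Now   this  ONE  to be updated
```

If the line has fewer than three fields, match() fails and the line is printed unchanged.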
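As already mentioned in the comments, your error message occurs because line is a scalar (holding a string in this case) and you are trying to treat it as an array. If you want to treat line as an array, you must first run split() on it to create a new array from its contents, then assign to an element of that new array, then recombine the array into a string stored in line.
For example, if you don't care about preserving white space (which can be handled with GNU awk's 4th arg to split()), you could replace the 3rd field of line with: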

numFlds = split(line,flds)
flds[3] = a[FNR]
line = flds[1]
for ( fldNr=2; fldNr<=numFlds; fldNr++ ) {
    line = line " " flds[fldNr]
}

I used literal string replacement above rather than *sub() so that it works even if a[FNR] contains the backreference metacharacter &.
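To make the point about & concrete, here is a small sketch comparing the two approaches. The wrapper name amp_demo and the sample value "A&B" are invented for illustration; in the sub() call, the & in the replacement expands to the matched text, while plain assignment to an array element leaves it alone.

```shell
# Hypothetical demo: sub() treats "&" in the replacement as "the matched text",
# whereas split() + assignment + rejoin keeps the value literally.
amp_demo() {
    awk 'BEGIN {
        val = "A&B"                          # sample replacement value containing "&"

        viaSub = "Now this *** to be updated"
        sub(/\*\*\*/, val, viaSub)           # "&" expands to the matched "***"
        print viaSub

        literal = "Now this *** to be updated"
        n = split(literal, f)
        f[3] = val                           # plain assignment: "&" stays as-is
        literal = f[1]
        for ( i = 2; i <= n; i++ ) {
            literal = literal " " f[i]
        }
        print literal
    }'
}

amp_demo
# prints:
# Now this A***B to be updated
# Now this A&B to be updated
```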
Also, when you modified my previous answer to solve your current problem, you introduced a logic error by changing:

NR==FNR {...; next }
{ ...sub(/NUMBER/,$2,line) }

to:

NR==FNR { a[FNR]=$2; next }
{ ...line[$3]=a[FNR]... }

instead of:

NR==FNR {...; next }
{ ...line[$3]=$2... }

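What you did is completely different logic: it replaces part of line with a string from main rather than a string from ref. After fleshing out some common code and moving it into a function, here is the full script for your current problem: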

$ cat tst.awk
NR == FNR {
    lines[++numLines] = $0
    if ( /\047MATCH LINE\047/ ) {
        tgt1 = NR + 2
    }
    if ( /write output/ ) {
        tgt2 = NR
    }
    next
}
{
    for ( lineNr=1; lineNr<=numLines; lineNr++ ) {
        line = lines[lineNr]
        if ( lineNr == tgt1 ) {
            line = rplc(line,3,$2)
        }
        if ( lineNr == tgt2 ) {
            line = rplc(line,4,$2)
        }
        print line > $1
    }
    close($1)
}

function rplc(str,tgt,val,      numFlds,flds,fldNr) {
    numFlds = split(str,flds)    # split the function's own argument, not the global "line"
    if ( tgt > numFlds ) {
        numFlds = tgt
    }
    flds[tgt] = val
    str = flds[1]
    for ( fldNr=2; fldNr<=numFlds; fldNr++ ) {
        str = str " " flds[fldNr]
    }
    return str
}
$ awk -f tst.awk main ref
$ head Out*
==> Out1 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this ONE to be updated
write output label ONE
another line here

==> Out2 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this TWO to be updated
write output label TWO
another line here

==> Out3 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this THREE to be updated
write output label THREE
another line here
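
To see what rplc() does when the target field lies beyond the end of the line, here is a small sketch that runs the function on its own. The wrapper name rplc_demo and the sample inputs are made up for illustration, and the function body splits str, its own parameter, rather than a global variable.

```shell
# Hypothetical wrapper that exercises rplc() in isolation.
rplc_demo() {
    # $1 = input line, $2 = 1-based field number, $3 = replacement value
    awk -v line="$1" -v tgt="$2" -v val="$3" '
    function rplc(str, tgt, val,    numFlds, flds, fldNr) {
        numFlds = split(str, flds)
        if ( tgt > numFlds ) {
            numFlds = tgt                  # pad with empty fields up to the target
        }
        flds[tgt] = val
        str = flds[1]
        for ( fldNr = 2; fldNr <= numFlds; fldNr++ ) {
            str = str " " flds[fldNr]
        }
        return str
    }
    BEGIN { print rplc(line, tgt + 0, val) }   # "+ 0" forces a numeric field number
    '
}

rplc_demo 'write output label ***' 4 'ONE'
# prints: write output label ONE
rplc_demo 'a b' 4 'X'
# prints: a b  X   (field 3 is empty, so two spaces appear before X)
```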

Answer #2 (g6ll5ycj)

#!/bin/awk -f

function join(array, start, end, sep,    result, i)
{
    if (sep == "")
       sep = " "
    else if (sep == SUBSEP) # magic value
       sep = ""
    result = array[start]
    for (i = start + 1; i <= end; i++)
        result = result sep array[i]
    return result
}
/\047MATCH LINE\047/{
    mline = NR+2
}
FNR==NR{
    main[NR] = $0
    next 
}
{
    out = $1
    for (i=1; i<=length(main); i++){
        if(i == mline){
           n=split(main[i], a, " ") 
           a[3]=$2
           print join(a, 1, n) > out
        }else if (i == mline+1 && main[i] ~ /write output label .*/) {
            n=split(main[i], a, " ") 
            a[4]=$2
            print join(a, 1, n) > out
        }else{
            print main[i] > out
        }
    }
    close(out)
}
./tst.awk main ref

$ head Out*
==> Out1 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this ONE to be updated
write output label ONE
another line here

==> Out2 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this TWO to be updated
write output label TWO
another line here
    
==> Out3 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this THREE to be updated
write output label THREE
another line here

Answer #3 (fnx2tebb)

One awk idea:

awk '
NR == FNR {
    if ( /MATCH LINE/              ) tgt = FNR + 2
    if ( FNR == tgt                ) $3  = "REPLACE_ME"       # replace 3rd field with a string that you know does not exist in main
    if ( tgt > 0 && /write output/ ) $4  = "REPLACE_ME"       # replace 4th field with the same dummy replacement string

    template = template (template != "" ? ORS : "") $0        # add current line to our template block of text
    next
}
{ if ( tgt > 0 ) {                                            # if "MATCH LINE" exists then ...
     template_copy = template                                 # copy template
     gsub(/REPLACE_ME/,$2,template_copy)                      # perform replacements against "template_copy"
     print template_copy > $1                                 # print "template_copy" to output file "$1"
     close($1)                                                # close file descriptor
  }
}
' main ref

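NOTES:

  • if MATCH LINE does not exist in main then no output files are generated
  • if main could contain the string REPLACE_ME then modify the code to use a string that you know does not exist in main
  • if the fields (in main) are separated by something other than a single space (e.g., tabs, multiple spaces) then this solution will *not* maintain the original spacing (i.e., tabs and multiple spaces will be replaced with a single space); maintaining the original spacing is doable but requires a bit more code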

This generates:

$ head Out*
==> Out1 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this ONE to be updated
write output label ONE
another line here

==> Out2 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this TWO to be updated
write output label TWO
another line here

==> Out3 <==
one line here
This is the 'MATCH LINE'
# this is just a comment
Now this THREE to be updated
write output label THREE
another line here
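
The spacing caveat in the notes above comes from how awk handles field assignment: assigning to any field makes awk rebuild $0 with the fields joined by OFS, which is a single space by default. A minimal sketch, using a made-up wrapper name collapse_demo and a sample line with extra spaces:

```shell
# Hypothetical demo: assigning to $3 rebuilds the record with single-space separators.
collapse_demo() {
    # $1 = input line
    printf '%s\n' "$1" | awk '{ $3 = "ONE"; print }'
}

collapse_demo 'Now   this    ***   to be updated'
# prints: Now this ONE to be updated
```

Lines whose fields are never assigned are printed with their original spacing, so only the two modified lines of main are affected.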
