Linux: changing the previous duplicate line in awk

Posted by xvw2m8pv on 2022-11-02 in Linux

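I want to make all duplicate names in a .csv unique, but once I find a duplicate I can no longer reach the earlier line, because it has already been printed. I tried saving all lines in an array and printing them in the END section, but that does not work, and I don't know how to access a specific field inside that array (doesn't awk support 2-dimensional arrays?).
Sample input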

...,9,phone,...
...,43,book,...
...,27,apple,...
...,85,hook,...
...,43,phone,...

Expected output

...,9,phone9,...
...,43,book,...
...,27,apple,...
...,85,hook,...
...,43,phone43,...

My attempt ($2 is the id field, $3 is the name field)

BEGIN{
       FS=","
       OFS=","
       marker=777
     } 
     {
       if (names[$3] == marker) {
       $3 = $3 $2
       #Attempt to change previous duplicate
       results[nameLines[$3]]=$3 id[$3]
       }
       names[$3] = marker
       id[$3] = $2
       nameLines[$3] = NR
       results[NR] = $0
     }
END{
     #it prints some numbers, not saved lines
     for(result in results)
     print result
   }

lndjwyie1#

Here is a single-pass awk that stores all records in a buffer and rewrites them in the END block:

awk -F, '
{
   rec[NR] = $0
   ++fq[$3]
}
END {
   for (i=1; i<=NR; ++i) {
      n = split(rec[i], a, /,/)
      if (fq[a[3]] > 1)
         a[3] = a[3] a[2]
      for (k=1; k<=n; ++k)
         printf "%s", a[k] (k < n ? FS : ORS)
    }
}' file

...,9,phone9,...
...,43,book,...
...,27,apple,...
...,85,hook,...
...,43,phone43,...
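
For comparison, the approach from the question (patching the earlier line once a duplicate shows up) can also be made to work. The END loop in the question printed numbers because `for (x in arr)` iterates over the array's indices, not its values, and in no guaranteed order. The following is only a sketch under the same assumptions ($2 is the id, $3 is the name); the function name `fix_dups` and the array `seenAt` are illustrative and not part of the original attempt.

```shell
# Hypothetical helper: reads CSV from stdin or from file arguments.
fix_dups() {
  awk 'BEGIN { FS = OFS = "," }
  {
     name = $3
     if (name in seenAt) {
        p = seenAt[name]
        if (p > 0) {                     # first occurrence not patched yet
           n = split(results[p], f, ",") # re-split the stored line
           f[3] = f[3] f[2]              # append its id to its name
           line = f[1]
           for (k = 2; k <= n; ++k)
              line = line "," f[k]
           results[p] = line
           seenAt[name] = 0              # mark first occurrence as patched
        }
        $3 = name $2                     # rename the current duplicate
     } else {
        seenAt[name] = NR                # remember where the name first appeared
     }
     results[NR] = $0
  }
  END {
     for (i = 1; i <= NR; ++i)           # index by line number to keep input order
        print results[i]
  }' "$@"
}

# Example call with the sample rows from the question
printf '%s\n' '...,9,phone,...' '...,43,book,...' '...,27,apple,...' \
              '...,85,hook,...' '...,43,phone,...' | fix_dups
```

The key differences from the original attempt are that the stored line is split again to reach its individual fields, and that the END loop prints `results[i]` by numeric index instead of printing the loop variable of a `for ... in`.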

qni6mghb2#

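This can be done easily in awk by reading Input_file twice (the first pass counts each name, the second pass rewrites the duplicates); there is no need to build a 2-dimensional array.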

awk '
BEGIN{FS=OFS=","}
FNR==NR{
  arr1[$3]++
  next
}
{
  $3=(arr1[$3]>1?$3 $2:$3)
}
1
' Input_file  Input_file

The output is as follows:

...,9,phone9,...
...,43,book,...
...,27,apple,...
...,85,hook,...
...,43,phone43,...
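
On the question's side remark about 2-dimensional arrays: POSIX awk has no true multi-dimensional arrays, but it simulates them by joining the subscripts with `SUBSEP`, so `cell[NR, k]` can address field `k` of line `NR`. If such a structure is wanted anyway, a rough sketch could look like this; the names `dedupe_cells`, `cell`, `nf` and `fq` are illustrative assumptions, and the column layout ($2 id, $3 name) is taken from the question.

```shell
# Hypothetical helper: reads CSV from stdin or from file arguments.
dedupe_cells() {
  awk -F, -v OFS=, '
  {
     nf[NR] = NF                     # field count of this line
     for (k = 1; k <= NF; ++k)
        cell[NR, k] = $k             # stored under the key NR SUBSEP k
     ++fq[$3]                        # how often each name occurs
  }
  END {
     for (i = 1; i <= NR; ++i) {
        if (fq[cell[i, 3]] > 1)      # name occurs more than once: append the id
           cell[i, 3] = cell[i, 3] cell[i, 2]
        line = cell[i, 1]
        for (k = 2; k <= nf[i]; ++k)
           line = line OFS cell[i, k]
        print line
     }
  }' "$@"
}

# Example call on a small made-up input
printf '%s\n' 'a,9,phone,x' 'b,43,book,y' 'c,43,phone,z' | dedupe_cells
```

This reads the input only once, so unlike the two-pass command it also works on a pipe, at the cost of holding every field in memory.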
