Linux: completing a tab-delimited file by repeating the first string

llew8vvj, posted on 2023-02-15 in Linux

I'm not sure how to put this into words. I have a list that I'm trying to convert into a tab-delimited file. Here is the list in its original form:

|01BFRUITS|
^banana
^apple
^orange
^pear
|01AELECTRONICS|
^television
^radio
^dishwasher
^computer
|01AANIMAL|
^bear
^cat
^dog
^elephant
|01ASHAPE|
^circle
^square
^diamond
^star

After some headaches, I learned that GNU sed has a -z option (cat test.txt | sed -z 's/|\r\n^/\t/g' | tr '^' '\t' | tr -d '|'), which let me produce the following output:

01BFRUITS       banana
        apple
        orange
        pear
01AELECTRONICS  television
        radio
        dishwasher
        computer
01AANIMAL       bear
        cat
        dog
        elephant
01ASHAPE        circle
        square
        diamond
        star

Now I'm trying to make the output look like this:

01BFRUITS       banana
01BFRUITS       apple
01BFRUITS       orange
01BFRUITS       pear
01AELECTRONICS  television
01AELECTRONICS  radio
01AELECTRONICS  dishwasher
01AELECTRONICS  computer
01AANIMAL       bear
01AANIMAL       cat
01AANIMAL       dog
01AANIMAL       elephant
01ASHAPE        circle
01ASHAPE        square
01ASHAPE        diamond
01ASHAPE        star

What kind of command can handle this?
The following was suggested:

$ awk -v OFS='\t' '/^\|/{ c1=$0; gsub(/\|/,"",c1) } /^\^/{ c2=$0; sub(/^\^/,"",c2); print c1,c2 }'  < test.txt
01BFRUITbanana
01BFRUITapple
01BFRUITorange
01BFRUITpear
01AELECTtelevision
01AELECTradioS
01AELECTdishwasher
01AELECTcomputer
01AANIMAbear
01AANIMAcat
01AANIMAdog
01AANIMAelephant
01ASHAPEcircle
01ASHAPEsquare
01ASHAPEdiamond
01ASHAPEstar

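This cuts off the end of the first string and loses the tab in between. It seems like a good start, though. I'll see whether I can fix it.
I solved this by adding OFS to the print statement: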

$ awk -v OFS='\t' '/^\|/{ c1=$0; gsub(/\|/,"",c1) } /^\^/{ c2=$0; sub(/^\^/,"",c2); print c1,OFS,c2 }' < test.txt

01BFRUITS               banana
01BFRUITS               apple
01BFRUITS               orange
01BFRUITS               pear
01AELECTRONICS          television
01AELECTRONICS          radio
01AELECTRONICS          dishwasher
01AELECTRONICS          computer
01AANIMAL               bear
01AANIMAL               cat
01AANIMAL               dog
01AANIMAL               elephant
01ASHAPE                circle
01ASHAPE                square
01ASHAPE                diamond
01ASHAPE                star

Thanks for getting me there, @jhnc.
Edit:
Added | sed -z s/\\r\\t\\t//g to remove the \r and the two extra tabs that follow c1

cat test.txt | awk -v OFS='\t' '/^\|/{ c1=$0; gsub(/\|/,"",c1) } /^\^/{ c2=$0; sub(/^\^/,"",c2); print c1,OFS,c2 }' | sed -z s/\\r\\t\\t//g
01BFRUITS       banana
01BFRUITS       apple
01BFRUITS       orange
01BFRUITS       pear
01AELECTRONICS  television
01AELECTRONICS  radio
01AELECTRONICS  dishwasher
01AELECTRONICS  computer
01AANIMAL       bear
01AANIMAL       cat
01AANIMAL       dog
01AANIMAL       elephant
01ASHAPE        circle
01ASHAPE        square
01ASHAPE        diamond
01ASHAPE        star
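
A note on the \r: the truncated output and the \r mentioned in the edit suggest that test.txt has Windows-style CRLF line endings. That is an assumption; the question never says so outright. If it holds, a minimal sketch that strips the carriage return inside awk needs neither the extra OFS in print nor the trailing sed. The sample data below is a shortened stand-in for the real file.

```shell
# Assumption: the real input uses \r\n line endings, so the sample does too.
result=$(printf '|01BFRUITS|\r\n^banana\r\n^apple\r\n|01ASHAPE|\r\n^circle\r\n' |
awk -v OFS='\t' '
    { sub(/\r$/, "") }                           # drop the trailing carriage return, if any
    /^\|/ { c1 = $0; gsub(/\|/, "", c1); next }  # header line: keep the text between the pipes
    /^\^/ { print c1, substr($0, 2) }            # item line: header, one tab, item without the caret
')

printf '%s\n' "$result"
```

Because the \r is removed before anything else happens, print c1, c2 with a single comma is enough to get exactly one tab between the two columns, and the output lines no longer end in \r.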
wpx232ag  #1

$ awk -F'|' -v OFS="\t" 'NF==3{h=$2; next}{gsub(/^[\^]/,""); print h,$0}' inputfile
01BFRUITS       banana
01BFRUITS       apple
01BFRUITS       orange
01BFRUITS       pear
01AELECTRONICS  television
01AELECTRONICS  radio
01AELECTRONICS  dishwasher
01AELECTRONICS  computer
01AANIMAL       bear
01AANIMAL       cat
01AANIMAL       dog
01AANIMAL       elephant
01ASHAPE        circle
01ASHAPE        square
01ASHAPE        diamond
01ASHAPE        star

Or:

$ awk -F'[|^]' -v OFS="\t" 'NF==3{h=$2;next}{print h,$2}' inputfile

Or:

$ awk -F'[|^]' 'NF==3{h=$2;next}{$0=h"\t"$2}1' inputfile
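
The variants with -F'[|^]' depend on how awk splits the two kinds of line, which the answer does not spell out. As a small sketch (the two sample lines are taken from the question, the rest is illustration only): a header such as |01BFRUITS| splits into three fields with the name in $2, while an item such as ^banana splits into two fields with the item in $2.

```shell
# Show the field count (NF) and the second field for one header line and one item line.
fields=$(printf '|01BFRUITS|\n^banana\n' | awk -F'[|^]' '{ print NF ": " $2 }')

printf '%s\n' "$fields"
```

This is why NF==3 is a usable test for header lines. It assumes that item text never contains | or ^ itself. If the input has CRLF line endings, the header still has three fields (the third is just the \r), but $2 of each item line keeps its trailing \r.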

2vuwiymt  #2

@jhnc
The print part of the command was missing OFS. I added it, and voilà!
Edit: to account for the \r\t after c1, I added

| sed -z s/\\r\\t\\t//g

which resulted in

cat TESTCOUNT.txt | awk -v OFS='\t' '/^\|/{ c1=$0; gsub(/\|/,"",c1) } /^\^/{ c2=$0; sub(/^\^/,"",c2); print c1,OFS,c2 }' | sed -z s/\\r\\t\\t//g

01BFRUITS       banana
01BFRUITS       apple
01BFRUITS       orange
01BFRUITS       pear
01AELECTRONICS  television
01AELECTRONICS  radio
01AELECTRONICS  dishwasher
01AELECTRONICS  computer
01AANIMAL       bear
01AANIMAL       cat
01AANIMAL       dog
01AANIMAL       elephant
01ASHAPE        circle
01ASHAPE        square
01ASHAPE        diamond
01ASHAPE        star
