Linux: How to keep space-separated names together in AWK [closed]

Posted by hfyxw5xn on 2023-01-29 in Linux

Closed 2 days ago.
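I am trying to use awk to filter data out of a file and write it to a CSV file. I want to create the column headers, but the header names contain spaces, so the script treats each word as a separate name.
The script I am using: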

$ cat tst.sh
#!/usr/bin/env bash

cat file |
awk '
    BEGIN {
        OFS = ","
        numTags = split("Machine Name Type Node Name Agent Name Operating System Agent Release Agent Build",tags)
        for ( tagNr=1; tagNr<=numTags; tagNr++ ) {
            tag = tags[tagNr]
            printf "\"%s\"%s", tag, (tagNr<numTags ? OFS : ORS)
        }
    }

    !NF || /^\/\*/ { next }
    { gsub(/^[[:space:]]+|[[:space:]]+$/,"") }

    match($0,/[[:space:]]job_type:/) {
        if ( jobNr++ ) {
            prt()
            delete tag2val
        }

        # save "insert_job" value
        tag = substr($1,1,length($1)-1)
        val = substr($0,length($1)+1,RSTART-(length($1)+2))
        gsub(/^[[:space:]]+|[[:space:]]+$/,"",val)
        tag2val[tag] = val

        # update $0 to start with "job_type" to look like all other input
        $0 = substr($0,RSTART+1)
    }

    {
        tag = val = $0
        sub(/:.*/,"",tag)
        sub(/[^:]+:[[:space:]]*/,"",val)
        tag2val[tag] = val
    }

    END { prt() }

    function prt(    tagNr,tag,val) {
        for ( tagNr=1; tagNr<=numTags; tagNr++ ) {
            tag = tags[tagNr]
            val = tag2val[tag]
            printf "\"%s\"%s", val, (tagNr<numTags ? OFS : ORS)
        }
    }
'

Contents of the file:

$ cat file

Machine Name:       machine1
Type:               a
Node Name:          machine1.test
Agent Name:         WA_AGENT
Operating System:   Windows Server 2012 
Agent Release:      12.0
Agent Build:        6181, Service Pack 00, Maintenance Level 00

Machine Name:       machine2
Type:               a
Node Name:          machine2.test
Agent Name:         WA_AGENT
Operating System:   Windows Server 2012 for amd64
Agent Release:      12.0
Agent Build:        6181, Service Pack 00, Maintenance Level 00

The output I get:

"Machine","Name","Type","Node","Name","Agent","Name","Operating","System","Agent","Release","Agent","Build"
"","","a","","","","","","","","","",""

Desired output:

"Machine Name","Type","Node Name","Agent Name","Operating System","Agent Release","Agent Build"
"machine1"," a","  machine1.test","  AGENT","  Windows Server 2012","  12.0","  6181, Service Pack 00, Maintenance Level 00"
"machine2"," a","  machine2.test","  AGENT","  Windows Server 2012","  12.0","  6181, Service Pack 00, Maintenance Level 00"

Is there a way to get the output I want?

Answer 1, by 0md85ypi

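I'm ignoring the leading spaces in some of your output fields, since I don't know if or why you would want them; if you really do, you can tweak this to add them. Here is how to modify the code in your question to do what you want: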

$ cat tst.sh
#!/usr/bin/env bash

cat file |
awk '
    BEGIN {
        OFS = ","
        numTags = split("Machine Name:Type:Node Name:Agent Name:Operating System:Agent Release:Agent Build",tags,":")
        for ( tagNr=1; tagNr<=numTags; tagNr++ ) {
            tag = tags[tagNr]
            printf "\"%s\"%s", tag, (tagNr<numTags ? OFS : ORS)
        }
    }

    !NF || /^\/\*/ { next }
    { gsub(/^[[:space:]]+|[[:space:]]+$/,"") }

    /^Machine Name:/ {
        if ( jobNr++ ) {
            prt()
            delete tag2val
        }
    }

    {
        tag = val = $0
        sub(/[[:space:]]*:.*/,"",tag)
        sub(/[^:]+:[[:space:]]*/,"",val)
        tag2val[tag] = val
    }

    END { prt() }

    function prt(    tagNr,tag,val) {
        for ( tagNr=1; tagNr<=numTags; tagNr++ ) {
            tag = tags[tagNr]
            val = tag2val[tag]
            printf "\"%s\"%s", val, (tagNr<numTags ? OFS : ORS)
        }
    }
'
$ ./tst.sh file
"Machine Name","Type","Node Name","Agent Name","Operating System","Agent Release","Agent Build"
"machine1","a","machine1.test","WA_AGENT","Windows Server 2012","12.0","6181, Service Pack 00, Maintenance Level 00"
"machine2","a","machine2.test","WA_AGENT","Windows Server 2012 for amd64","12.0","6181, Service Pack 00, Maintenance Level 00"
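
The change that fixes the headers is the third argument to split(). Without it, split() uses FS, which by default splits on whitespace, so "Machine Name" becomes two tags and neither of them matches the "Machine Name" key stored in tag2val (only the single-word "Type" still matches, which is why only "a" appeared in the question's output). The following is a minimal sketch of that difference, using a shortened tag list that I made up for illustration:

```shell
#!/usr/bin/env bash

# Default separator (FS, whitespace by default): every word becomes its own element.
count_default=$(awk 'BEGIN { print split("Machine Name Type Node Name", tags) }')

# Explicit ":" separator: names containing spaces stay in one element.
count_colon=$(awk 'BEGIN { print split("Machine Name:Type:Node Name", tags, ":") }')
first_colon=$(awk 'BEGIN { split("Machine Name:Type:Node Name", tags, ":"); print tags[1] }')

echo "default separator: $count_default elements"
echo "colon separator:   $count_colon elements, first is \"$first_colon\""
```

With the default separator this reports 5 elements, and with ":" it reports 3, the first being "Machine Name".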

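Actually, if I were starting from scratch on this particular problem, I wouldn't hard-code the tags as the question does; I would just print all of the values each time a blank line is encountered.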

$ cat tst.sh
#!/usr/bin/env bash

cat file |
awk '
    BEGIN {
        OFS = ","
    }

    { gsub(/^[[:space:]]+|[[:space:]]+$/,"") }

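    !NF {
        # only flush when a record has been read, so leading or repeated
        # blank lines cannot mark the header as done before it is printed
        if ( numTags ) {
            prt()
        }
        delete tag2val
        numTags = 0
        next
    }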

    {
        tag = val = $0
        sub(/[[:space:]]*:.*/,"",tag)
        sub(/[^:]+:[[:space:]]*/,"",val)
        if ( !(tag in tag2val) ) {
            tags[++numTags] = tag
        }
        tag2val[tag] = val
    }

    END { prt() }

    function prt(    tagNr,tag,val) {
        if ( !doneHdr++ ) {
            for ( tagNr=1; tagNr<=numTags; tagNr++ ) {
                tag = tags[tagNr]
                printf "\"%s\"%s", tag, (tagNr<numTags ? OFS : ORS)
            }
        }

        for ( tagNr=1; tagNr<=numTags; tagNr++ ) {
            tag = tags[tagNr]
            val = tag2val[tag]
            printf "\"%s\"%s", val, (tagNr<numTags ? OFS : ORS)
        }
    }
'
$ ./tst.sh file
"Machine Name","Type","Node Name","Agent Name","Operating System","Agent Release","Agent Build"
"machine1","a","machine1.test","WA_AGENT","Windows Server 2012","12.0","6181, Service Pack 00, Maintenance Level 00"
"machine2","a","machine2.test","WA_AGENT","Windows Server 2012 for amd64","12.0","6181, Service Pack 00, Maintenance Level 00"
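
One design point in this version is the separate, numerically indexed tags[] array. The order in which awk's `for (tag in tag2val)` visits an associative array is unspecified, so the script records each tag the first time it is seen and later walks tags[1..numTags] to print columns in input order. Here is a reduced sketch of just that bookkeeping; the three-line input and the tag=value output format are made up for illustration:

```shell
#!/usr/bin/env bash

out=$(printf 'b: 2\na: 1\nb: 3\n' | awk '
    {
        tag = val = $0
        sub(/[[:space:]]*:.*/, "", tag)        # keep the text before the first colon
        sub(/[^:]+:[[:space:]]*/, "", val)     # drop the leading "tag:" and spaces
        if ( !(tag in tag2val) ) {
            tags[++numTags] = tag              # remember first-seen order
        }
        tag2val[tag] = val                     # a later value overwrites an earlier one
    }
    END {
        for ( tagNr=1; tagNr<=numTags; tagNr++ ) {
            tag = tags[tagNr]
            printf "%s=%s%s", tag, tag2val[tag], (tagNr<numTags ? "," : "\n")
        }
    }
')

echo "$out"
```

This prints b=3,a=1: "b" comes first because it was seen first, and it carries the last value assigned to it.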

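One thing to note about any script I provide: I don't use fields such as $1 and $2 to hold the tag or the value, because once you do that you will run into problems if your data can contain anything that matches FS.
For example, if your data looks like this: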

tag: value

then don't do something like this in your code:

FS = ": *"
tag = $1
val = $2

because it will fail when the value (or, less likely, the tag) contains a string that matches FS (the : in this example). For instance, given this data:

foo: "the ratio was 2:1"

val would end up set to "the ratio was 2 rather than the full value. Instead, do this:

tag = val = $0
sub(/[[:space:]]*:.*/,"",tag)
sub(/[^:]+:[[:space:]]*/,"",val)

so that val ends up set to "the ratio was 2:1"
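
To see the two approaches side by side, here is a small sketch that runs both on the sample line above. The shell variable names are my own, and the first command sets FS through awk's -F option, which is equivalent to assigning FS before any input is read:

```shell
#!/usr/bin/env bash

line='foo: "the ratio was 2:1"'

# Field-based approach: FS also matches the colon inside the value,
# so $2 stops at the second colon.
val_fs=$(printf '%s\n' "$line" | awk -F': *' '{ print $2 }')

# sub()-based approach: only the leading "tag:" prefix is removed,
# so any later colons in the value survive.
val_sub=$(printf '%s\n' "$line" | awk '{
    val = $0
    sub(/[^:]+:[[:space:]]*/, "", val)
    print val
}')

printf 'FS-based:  %s\n' "$val_fs"
printf 'sub-based: %s\n' "$val_sub"
```

The first line of output shows the truncated value "the ratio was 2 and the second shows the complete "the ratio was 2:1".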
