Read CSV output into an array and process the variables in a loop using bash [duplicate]

kulphzqa, posted 2023-01-22 in Other
This question already has answers here:
How to parse a CSV file in Bash? (6 answers)
Closed 2 days ago.

Suppose I have this output/file:

1,a,info
2,b,inf
3,c,in

I want to run a while loop with read:

while read r ; do 
   echo "$r";
   # extract line to $arr as array separated by ',' 
   # call some program (e.g. md5sum, echo ...) on one item of arr
done <<HEREDOC
1,a,info
2,b,inf
3,c,in   
HEREDOC

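I would like to use readarray inside the while loop, but compelling alternatives are welcome too.
There is a specific way to make readarray (mapfile) work correctly with process substitution, but I keep forgetting it. This is meant as a Q&A, so an explanation would be nice.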

cunj1qz1 #1

Since "compelling alternatives are welcome too", and assuming you are just trying to populate arr one line at a time:

$ cat tst.sh
#!/usr/bin/env bash

while IFS=',' read -r -a arr ; do
    # extract line to $arr as array separated by ','
    # echo the first item of arr
    echo "${arr[0]}"
done <<HEREDOC
1,a,info
2,b,inf
3,c,in
HEREDOC
$ ./tst.sh
1
2
3

Or, if you also need the whole input line in a separate variable r:

$ cat tst.sh
#!/usr/bin/env bash

while IFS= read -r r ; do
    # extract line to $arr as array separated by ','
    # echo the first item of arr
    IFS=',' read -r -a arr <<< "$r"
    echo "${arr[0]}"
done <<HEREDOC
1,a,info
2,b,inf
3,c,in
HEREDOC
$ ./tst.sh
1
2
3
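
The question also asks to call a program on one item of arr. A minimal sketch of that step, using tr as a stand-in for md5sum or any other command; the results array and the choice of the third field are assumptions made for illustration, not part of the original answer:

```shell
#!/usr/bin/env bash

results=()
while IFS=',' read -r -a arr; do
    # Run an external program on the third field only.
    # tr is a placeholder here; swap in md5sum or another command.
    upper=$(printf '%s' "${arr[2]}" | tr '[:lower:]' '[:upper:]')
    results+=("${arr[0]}:${upper}")
done <<'HEREDOC'
1,a,info
2,b,inf
3,c,in
HEREDOC

printf '%s\n' "${results[@]}"
```

The external command reads from its own pipe, so it does not consume the loop's here-document.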

Either way, keep in mind why-is-using-a-shell-loop-to-process-text-considered-bad-practice.

oalqel3c #2

Disambiguating readarray (or mapfile) and read -a

First, readarray == mapfile:

help readarray
readarray: readarray [-d delim] [-n count] [-O origin] [-s count] [-t] [-u fd] [-C callback] [-c quantum] [array]
    Read lines from a file into an array variable.
    
    A synonym for `mapfile'.

Then:

help mapfile
mapfile: mapfile [-d delim] [-n count] [-O origin] [-s count] [-t] [-u fd] [-C callback] [-c quantum] [array]
    Read lines from the standard input into an indexed array variable.
    
    Read lines from the standard input into the indexed array variable ARRAY, or
    from file descriptor FD if the -u option is supplied.  The variable MAPFILE
    is the default ARRAY.
    
    Options:
      -d delim    Use DELIM to terminate lines, instead of newline
      -n count    Copy at most COUNT lines.  If COUNT is 0, all lines are copied
      -O origin   Begin assigning to ARRAY at index ORIGIN.  The default index is 0
      -s count    Discard the first COUNT lines read
      -t  Remove a trailing DELIM from each line read (default newline)
      -u fd       Read lines from file descriptor FD instead of the standard input
      -C callback Evaluate CALLBACK each time QUANTUM lines are read
      -c quantum  Specify the number of lines read between each call to
                          CALLBACK
...

read -a

help read
read: read [-ers] [-a array] [-d delim] [-i text] [-n nchars] [-N nchars] [-p prompt] [-t timeout] [-u fd] [name ...]
    Read a line from the standard input and split it into fields.
    
    Reads a single line from the standard input, or from file descriptor FD
    if the -u option is supplied.  The line is split into fields as with word
    splitting, and the first word is assigned to the first NAME, the second
    word to the second NAME, and so on, with any leftover words assigned to
    the last NAME.  Only the characters found in $IFS are recognized as word
    delimiters.
...
    Options:
      -a array    assign the words read to sequential indices of the array
                  variable ARRAY, starting at zero
...

Note:
Only the characters found in $IFS are recognized as word delimiters. This is useful with the -a flag!
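
A minimal sketch of that note, assuming a comma-separated sample string chosen for illustration: setting IFS for the single read call makes the comma the only field delimiter, so embedded spaces survive.

```shell
#!/usr/bin/env bash

# IFS=, applies to this one read call only; the shell's own IFS is unchanged.
IFS=, read -r -a myArray <<< 'A,1,spaced string,42'

# Show the resulting array.
declare -p myArray
```

declare -p should report four elements, with "spaced string" kept as a single element.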

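Creating an array from a split string

To create an array by splitting a string, you can use read -a with IFS set to the delimiter.
Or use mapfile, but since this command is meant to process whole files, the syntax is somewhat counterintuitive: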

mapfile -td, myArray < <(printf %s 'A,1,spaced string,42')
declare -p myArray
declare -a myArray=([0]="A" [1]="1" [2]="spaced string" [3]="42")

Or, if you want to avoid the fork ( < <(printf ...) ), you have to strip the trailing newline yourself:

mapfile -td, myArray <<<'A,1,spaced string,42'
myArray[-1]=${myArray[-1]%$'\n'}
declare -p myArray
declare -a myArray=([0]="A" [1]="1" [2]="spaced string" [3]="42")

This is a little faster, but not any more readable...
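
The reason for the extra cleanup line is that a here-string always appends a newline, and mapfile -td, only strips the comma delimiter. A small sketch that makes the difference visible (the array names raw and clean are chosen here for illustration):

```shell
#!/usr/bin/env bash

# Here-string: bash appends a newline, which ends up in the last element.
mapfile -td, raw <<< 'A,1,spaced string,42'
printf '%q\n' "${raw[-1]}"

# Process substitution: printf %s emits no trailing newline.
mapfile -td, clean < <(printf %s 'A,1,spaced string,42')
printf '%q\n' "${clean[-1]}"
```

The first printf shows the last element with an embedded newline; the second shows a plain 42.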

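For your sample, the same two techniques apply line by line: read -a with IFS=, in the loop condition, or, if you really want to use readarray, readarray -td, on each line read.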
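
For the sample data from the question, the two approaches might look like the sketch below. This is an illustration under assumptions, not the answer's original code: the out_read and out_readarray arrays and the output format are made up for the example.

```shell
#!/usr/bin/env bash

# Variant 1: split directly in the loop condition with read -a.
out_read=()
while IFS=, read -r -a arr; do
    out_read+=("${arr[0]} | ${arr[1]} :: ${arr[2]}")
done <<'HEREDOC'
1,a,info
2,b,inf
3,c,in
HEREDOC

# Variant 2: read the whole line first, then split it with readarray.
out_readarray=()
while IFS= read -r line; do
    # readarray gets its own stdin from the process substitution,
    # so it does not eat the loop's here-document.
    readarray -td, arr < <(printf %s "$line")
    out_readarray+=("${arr[0]} | ${arr[1]} :: ${arr[2]}")
done <<'HEREDOC'
1,a,info
2,b,inf
3,c,in
HEREDOC

printf '%s\n' "${out_read[@]}" "${out_readarray[@]}"
```

Both loops should produce the same three lines; the readarray variant costs one extra fork per input line.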

Playing with the callback option:

(Some spaces were added on the last line.)

testfunc() { 
    local IFS array cnt line
    read cnt line <<< "$@"
    IFS=,
    read -a array <<< "$line"
    printf ' [%3d]: %3s | %3s :: %s\n' $cnt "${array[@]}"
}
mapfile -t -C testfunc -c 1  <<HEREDOC
1,a,info
2,b,inf
3,c d,in fo   
HEREDOC
 [  0]:   1 |   a :: info
 [  1]:   2 |   b :: inf
 [  2]:   3 | c d :: in fo

The same, with the -u flag:

Open the file descriptor:

exec {mydoc}<<HEREDOC
1,a,info
2,b,inf
3,c d,in fo   
HEREDOC

Then read from it with mapfile -u, and finally close the file descriptor:

exec {mydoc}<&-
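
Putting the three steps together, a sketch of how the -u variant might look. The mapfile line is an assumption built from the -u and -C options described in help mapfile, not necessarily the exact command from the answer; testfunc is repeated so the block runs on its own.

```shell
#!/usr/bin/env bash

# Callback: receives the next array index and the line about to be stored.
testfunc() {
    local IFS array cnt line
    read -r cnt line <<< "$*"
    IFS=,
    read -r -a array <<< "$line"
    printf ' [%3d]: %3s | %3s :: %s\n' "$cnt" "${array[@]}"
}

# Open a new file descriptor on the here-document; bash stores its number in mydoc.
exec {mydoc}<<'HEREDOC'
1,a,info
2,b,inf
3,c d,in fo
HEREDOC

# Read from that descriptor instead of stdin, calling testfunc for every line.
mapfile -t -u "$mydoc" -C testfunc -c 1

# Close the descriptor again.
exec {mydoc}<&-
```

With no array name given, the lines also end up in the default MAPFILE array.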

About the bash csv module

For more about enable -f /path/to/csv csv, the RFC and its limitations, see my previous post about How to parse a CSV file in Bash?

h6my8fg2 #3

If the loadable builtin csv is available/acceptable, something like:

help csv
csv: csv [-a ARRAY] string
    Read comma-separated fields from a string.
    
    Parse STRING, a line of comma-separated values, into individual fields,
    and store them into the indexed array ARRAYNAME starting at index 0.
    If ARRAYNAME is not supplied, "CSV" is the default array name.

The script:

#!/usr/bin/env bash

enable csv || exit

while IFS= read -r line && csv -a arr "$line"; do
  printf '%s\n' "${arr[0]}"
done <<HEREDOC
1,a,info
2,b,inf
3,c,in
HEREDOC
  • See help enable

For bash 5.2+ there is a default path for loadables in config-top.h, which should be configurable at compile time:

BASH_LOADABLES_PATH
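
Since the loadable may not be installed everywhere, a script could probe for it and fall back to plain IFS splitting. The sketch below is an assumption-laden illustration: split_line and firsts are hypothetical names, and the fallback does not handle quoted fields or embedded commas the way a real CSV parser would.

```shell
#!/usr/bin/env bash

# Try the loadable builtin (searched via BASH_LOADABLES_PATH where supported);
# fall back to IFS splitting if it cannot be enabled.
if enable csv 2>/dev/null || enable -f csv csv 2>/dev/null; then
    split_line() { csv -a arr "$1"; }
else
    # Fallback: no support for quoted fields or embedded commas.
    split_line() { IFS=, read -r -a arr <<< "$1"; }
fi

firsts=()
while IFS= read -r line; do
    split_line "$line"
    firsts+=("${arr[0]}")
done <<'HEREDOC'
1,a,info
2,b,inf
3,c,in
HEREDOC

printf '%s\n' "${firsts[@]}"
```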

tcbh2hod #4

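The solution is readarray -t -d, arr < <(printf "%s," "$r")
The special part is < <(...), because readarray ....
I could not find a proper explanation of why the redirection arrow is needed first, followed by the process substitution.
Neither tldp process-sub nor SS64 has one.
My current understanding is that <(...) opens a named pipe and readarray waits for it to close; by placing it in the file position after <, bash treats it as file input and pipes it (anonymously) into stdin.
Example: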

while read r ; do 
   echo "$r";
   readarray -t -d, arr < <(printf "%s," "$r");
   echo "${arr[0]}";
done <<HEREDOC
1,a,info
2,b,inf
3,c,in   
HEREDOC
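
One way to read the < <(...) syntax, sketched under the assumption of a system where bash implements process substitution with /dev/fd or named pipes: <(...) expands to a file name, and the first < is an ordinary stdin redirection from that file name. The variable name path is chosen here for illustration.

```shell
#!/usr/bin/env bash

# <(...) is replaced by a file name connected to the command's output.
path=$(echo <(true))
echo "process substitution expands to: $path"

# The first < redirects stdin from that file name, so readarray,
# which only reads stdin (or an fd given with -u), can consume it.
readarray -t -d, arr < <(printf '%s,' 'a,b,c')
declare -p arr
```

Because readarray runs in the current shell here, arr is still set after the command finishes.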

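Anyway, this is mostly a reminder to myself, because I keep forgetting it and readarray is the only place where I really need it.
The question is also answered mostly here, here (why the pipe isn't working) and to some extent here, but those are hard to find and the reasoning is hard to follow.
For example, the shopt -s lastpipe solution is not obvious at first, but it turns out that in bash the elements of a pipeline are normally not executed in the main shell, so their state changes have no effect on the rest of the program. This option changes that behavior so that the last pipeline element runs in the main shell (it only takes effect when job control is off, so not in a typical interactive shell):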

shopt -s lastpipe;
while read r ; do 
    echo "$r";       
    printf "%s," "$r"  | readarray -t -d, arr;
    echo "${arr[0]}"; 
    done <<HEREDOC
1,a,info
2,b,inf
3,c,in   
HEREDOC
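
The effect is easiest to see by counting array elements before and after enabling the option. A minimal sketch, assuming it runs as a non-interactive script (job control off); the variable names before and after are illustrative:

```shell
#!/usr/bin/env bash

unset arr
# Without lastpipe, readarray runs in a subshell and arr is lost.
printf '%s,' 'a,b,c' | readarray -t -d, arr
before=${#arr[@]}

shopt -s lastpipe
# With lastpipe, the last pipeline element runs in the current shell.
printf '%s,' 'a,b,c' | readarray -t -d, arr
after=${#arr[@]}

echo "without lastpipe: $before element(s); with lastpipe: $after element(s)"
```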

An alternative to lastpipe is to do all the work inside the subshell:

while read r ; do 
       echo "$r";
       printf "%s," "$r"  | { 
            readarray -t -d, arr ; 
            echo "${arr[0]}"; 
       }
    done <<HEREDOC
1,a,info
2,b,inf
3,c,in   
HEREDOC
