Shell: need to combine 2 files having different word list sizes

Posted by 64jmpszr on 2022-11-16 in Shell

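I need a bash script that combines 2 files. Both are word lists, but with different numbers of entries, and I want to pair every line of the first file with every line of the second, as shown below.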

File 1:

word1
word2
word3

File 2:

8.8.8.8
4.4.4.4
4.4.2.2
5.5.5.5

Desired output:

word1,8.8.8.8
word1,4.4.4.4
word1,4.4.2.2
word1,5.5.5.5
word2,8.8.8.8
word2,4.4.4.4
word2,4.4.2.2
word2,5.5.5.5
word3,8.8.8.8
word3,4.4.4.4
word3,4.4.2.2
word3,5.5.5.5

Source: https://stackoverflow.com/questions/71273795/need-to-combine-2-files-having-different-word-list-size

3 Answers

olhwl3o2 #1

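Pick a field number high enough that neither file contains it (100, for example), then (ab)use join to generate the Cartesian product.
Edit: to use a comma as the column separator, specify it with the -t option; to keep the output from starting with that separator (previously a space, now a comma), list the output fields explicitly with the -o option.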
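The commands that accompanied this answer did not survive in the text above, so what follows is only a sketch of what the description implies, not the answerer's exact command. It assumes GNU join; the helper name cartesian_join and the temporary file names are hypothetical. Field 100 is empty on every line, so every line of the first file matches every line of the second.

```shell
#!/bin/sh
# Sketch: Cartesian product of two line-oriented files via join(1).
# cartesian_join is a hypothetical helper name, not taken from the answer.
cartesian_join() {
  # -j 100     : join on field 100, which is empty on every line, so all lines match
  # -t,        : use a comma as the field separator for input and output
  # -o 1.1,2.1 : print only field 1 of each file, so no leading separator appears
  join -j 100 -t, -o 1.1,2.1 "$1" "$2"
}

# Demo with the sample data from the question, written to temporary files
tmp=$(mktemp -d)
printf '%s\n' word1 word2 word3 > "$tmp/file1.txt"
printf '%s\n' 8.8.8.8 4.4.4.4 4.4.2.2 5.5.5.5 > "$tmp/file2.txt"
cartesian_join "$tmp/file1.txt" "$tmp/file2.txt"
rm -rf "$tmp"
```

Because -t, also splits the input on commas, this sketch only behaves as intended when the input lines contain no commas themselves.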

hof1towb #2

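You can use awk to read the values of the two files into separate indexed arrays and then, in the END rule, simply loop over the stored values and output them in the format you want, e.g.: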

awk '
  FNR==NR { f1[++n] = $0; next }        # save file_1 in array f1
  { f2[++m] = $0 }                      # save file_2 in array f2
  END {
    for (i=1; i<=n; i++)                # loop over all f1 values
      for(j=1; j<=m; j++)               # loop over all f2 values
        printf "%s,%s\n", f1[i], f2[j]  # output f1[],f2[]
  }
' file_1 file_2

Example Use/Output

With your data in file_1 and file_2, you would get:

$ awk '
>   FNR==NR { f1[++n] = $0; next }        # save file_1 in array f1
>   { f2[++m] = $0 }                      # save file_2 in array f2
>   END {
>     for (i=1; i<=n; i++)                # loop over all f1 values
>       for(j=1; j<=m; j++)               # loop over all f2 values
>         printf "%s,%s\n", f1[i], f2[j]  # output f1[],f2[]
>   }
> ' file_1 file_2
word1,8.8.8.8
word1,4.4.4.4
word1,4.4.2.2
word1,5.5.5.5
word2,8.8.8.8
word2,4.4.4.4
word2,4.4.2.2
word2,5.5.5.5
word3,8.8.8.8
word3,4.4.4.4
word3,4.4.2.2
word3,5.5.5.5

Using Bash

In a bash script, you can use readarray (a synonym for mapfile) to read both files into arrays, e.g.:

#!/bin/bash

usage() {  ## simple function to output error and usage
  [ -n "$1" ] && printf "error: %s\n" "$1"
  printf "usage: %s file_1 file_2\n" "${0##*/}"
}

## validate filenames provided in first 2 arguments exist and are non-empty
[ -s "$1" ] || { usage "file $1 not found or empty"; exit 1; }
[ -s "$2" ] || { usage "file $2 not found or empty"; exit 1; }

readarray -t f1 < "$1"    # read file_1 into array f1
readarray -t f2 < "$2"    # read file_2 into array f2

for i in "${f1[@]}"; do         ## loop over f1
  for j in "${f2[@]}"; do       ## loop over f2
    printf "%s,%s\n" "$i" "$j"  ## output combined result
  done
done

(Note: awk will likely provide better performance)

Example Use/Output

After saving the script as cmbfiles.sh, you would get:

$ bash cmbfiles.sh file_1 file_2
word1,8.8.8.8
word1,4.4.4.4
word1,4.4.2.2
word1,5.5.5.5
word2,8.8.8.8
word2,4.4.4.4
word2,4.4.2.2
word2,5.5.5.5
word3,8.8.8.8
word3,4.4.4.4
word3,4.4.2.2
word3,5.5.5.5
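If arrays are not available (plain sh rather than bash), the same nested-loop idea can be sketched with two read loops. This is an alternative to the readarray script, not part of it: it re-reads the second file once for every line of the first, so it is likely slower on large inputs. The function name combine_streaming and the temporary file names are hypothetical.

```shell
#!/bin/sh
# Sketch: nested read loops instead of arrays.
# combine_streaming is a hypothetical helper name used for illustration.
combine_streaming() {
  # Outer loop: one line of the first file at a time. The "|| [ -n ... ]"
  # part keeps a final line that lacks a trailing newline.
  while IFS= read -r a || [ -n "$a" ]; do
    # Inner loop: re-read the second file from the start for each outer line.
    while IFS= read -r b || [ -n "$b" ]; do
      printf '%s,%s\n' "$a" "$b"
    done < "$2"
  done < "$1"
}

# Demo with the sample data from the question, written to temporary files
tmp=$(mktemp -d)
printf '%s\n' word1 word2 word3 > "$tmp/file_1"
printf '%s\n' 8.8.8.8 4.4.4.4 4.4.2.2 5.5.5.5 > "$tmp/file_2"
combine_streaming "$tmp/file_1" "$tmp/file_2"
rm -rf "$tmp"
```

Unlike the readarray script, this holds only one line of each file in memory at a time.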

wswtfjt7 #3

Would you please try the following:

awk -v OFS="," -v ORS="\r\n" '                  # comma as output field separator, CRLF as output record separator
    NR==FNR && NF>0 {a[++n]=$0; next}           # read file2.txt, skipping blank lines
    NF>0 {for (i=1; i<=n; i++) print $0, a[i]}  # pair each non-blank line of file1.txt with every stored line of file2.txt
' file2.txt file1.txt

It skips blank lines in the input files.
It writes Windows (CRLF) line endings, with opening the result in Excel in mind.
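Both properties can be checked with a small sketch. It wraps the awk command above in a function and runs it on sample files that contain blank lines; the function name combine_crlf and the temporary file names are hypothetical, and the argument order (word list first, address list second) is an assumption made here for readability.

```shell
#!/bin/sh
# Sketch: verify blank-line skipping and CRLF output.
# combine_crlf is a hypothetical wrapper; $1 = word list, $2 = address list.
combine_crlf() {
  awk -v OFS="," -v ORS="\r\n" '
    NR==FNR && NF>0 { a[++n] = $0; next }            # store non-blank lines of the address list
    NF>0 { for (i = 1; i <= n; i++) print $0, a[i] } # pair each non-blank word with every stored line
  ' "$2" "$1"
}

# Sample inputs, each containing one blank line
tmp=$(mktemp -d)
printf 'word1\n\nword2\n' > "$tmp/words.txt"
printf '8.8.8.8\n\n4.4.4.4\n' > "$tmp/addrs.txt"
combine_crlf "$tmp/words.txt" "$tmp/addrs.txt" > "$tmp/out.csv"

cr=$(printf '\r')                 # a literal carriage return
wc -l < "$tmp/out.csv"            # total number of output lines
grep -c "${cr}\$" "$tmp/out.csv"  # number of lines that end in a carriage return
rm -rf "$tmp"
```

With two non-blank lines in each sample file, both counts should be 4: the blank lines add no output, and every output line ends in CRLF.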
