Bash: sorting files according to file name

Posted by 2vuwiymt on 2023-05-01 in Shell

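I need to sort 12000 files into 1000 groups according to their names, and create a new folder for each group containing that group's files. Each file name is given in a multi-column format (with _ as the separator), where the second column varies from 01 to 12 (the part number) and the last column ranges from 1 to 1000 (the system number), meaning that each of the 1000 original systems (last column) was split into 12 separate parts (second column).
Here is an example for a small subset of 3 systems, each split into 12 parts, 36 files in total.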

7000_01_lig_cne_1.dlg
7000_02_lig_cne_1.dlg
7000_03_lig_cne_1.dlg
...
7000_12_lig_cne_1.dlg

7000_01_lig_cne_2.dlg
7000_02_lig_cne_2.dlg
7000_03_lig_cne_2.dlg
...
7000_12_lig_cne_2.dlg

7000_01_lig_cne_3.dlg
7000_02_lig_cne_3.dlg
7000_03_lig_cne_3.dlg
...
7000_12_lig_cne_3.dlg

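I need to group the files across the second column of their names (01, 02, 03 .. 12), creating 1000 folders, each of which should contain the 12 files of one system, as follows: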

Folder1, name: 7000_lig_cne_1, it contains 12 files: 7000_{this is from 01 to 12}_lig_cne_1.dlg

Folder2, name: 7000_lig_cne_2, it contains 12 files: 7000_{this is from 01 to 12}_lig_cne_2.dlg
...
Folder1000, name: 7000_lig_cne_1000, it contains 12 files: 7000_{this is from 01 to 12}_lig_cne_1000.dlg

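Assuming all *.dlg files are in the same directory, I propose the following bash loop workflow, which only lacks some sorting function (sed, awk?), organized as follows: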

#set the name of folder with all DLG
home=$PWD
FILES=${home}/all_DLG/7000_CNE
# set the name of protein and ligand library to analyse
experiment="7000_CNE"

#name of the output
output=${home}/sub_folders_to_analyse

#now here all magic comes
rm -rf -- "${output}"
mkdir -p "${output}"

# sed solution
for i in "${FILES}"/*.dlg        # define this better to suit your needs
do 
    n=$( <<<"$i" sed 's/.*[^0-9]\([0-9]*\)\.dlg$/\1/' )
    # copy the file to the proper dir
    mkdir -p "${output}/${experiment}_lig$n"
    cp "$i" "${output}/${experiment}_lig$n"
done

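Note: here I set the beginning of each folder name to ${experiment} and append the number from the last column, $n, at the end. Is it possible to set the name of each new folder automatically, based on the names of the files copied into it?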

Manually, this can be done by skipping the second column of the file name in the folder name:

cp ./all_DLG/7000_*_lig_cne_987.dlg ./output/7000_lig_cne_987

Source: https://stackoverflow.com/questions/64259905/bash-file-sorting-according-to-file-name

4 answers

2ic8powd #1

Iterate over the files. Extract the target directory name from the file name. Move the file.

for i in *.dlg; do
    # extract last number with your favorite tool
    n=$( <<<"$i" sed 's/.*[^0-9]\([0-9]*\)\.dlg$/\1/' )
    # move the file to proper dir
    echo mkdir -p "folder$n"
    echo mv "$i" "folder$n"
done

Notes:

Do not use upper-case variables in your scripts. Use lower-case variables.
Remember to quote variable expansions.
Check your scripts with http://shellcheck.net
Tested on repl.
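The echo in front of mkdir and mv in the loop above makes it a dry run: the commands are only printed, and the echo has to be removed to actually move files. A minimal sketch of the same idea with a switch instead of editing the loop is shown below. The names `dry_run`, `run` and `group_by_last_number` are hypothetical, not part of the answer, and the trailing number is taken with parameter expansion rather than sed, which is an assumption about what is acceptable here.

```shell
#!/usr/bin/env bash
# Hypothetical switch: 1 prints the commands, anything else executes them.
dry_run=${dry_run:-1}

# Hypothetical wrapper around each command of the loop.
run() {
    if [ "$dry_run" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Same loop as above, operating on the *.dlg files of the current directory.
group_by_last_number() {
    local i n
    for i in *.dlg; do
        [ -e "$i" ] || continue   # the glob matched nothing
        n=${i%.dlg}               # drop the extension
        n=${n##*_}                # keep what follows the last "_"
        run mkdir -p "folder$n"
        run mv -- "$i" "folder$n"
    done
}

group_by_last_number
```

Setting `dry_run=0` before calling `group_by_last_number` performs the moves once the printed commands look right.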
Update: for the OP's folder naming convention:

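for i in *.dlg; do
    # strip the .dlg extension so it does not end up in the folder name
    base=${i%.dlg}
    # first column, then everything after the second column
    foldername="$HOME/output/${base%%_*}_${base#*_*_}"
    echo mkdir -p "$foldername"
    echo mv "$i" "$foldername"
done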
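The `%%_*` and `#*_*_` expansions used to build the folder name can be checked in isolation on a single file name. The sketch below is only an illustration: `group_dir` is a hypothetical helper, and it removes the .dlg extension first so that the result matches the folder names requested in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the folder name for one .dlg file name.
group_dir() {
    local base=${1%.dlg}       # 7000_01_lig_cne_1  (extension removed)
    local prefix=${base%%_*}   # 7000               (longest suffix starting at a "_" removed)
    local rest=${base#*_*_}    # lig_cne_1          (shortest prefix ending at the second "_" removed)
    printf '%s_%s\n' "$prefix" "$rest"
}

group_dir 7000_01_lig_cne_1.dlg   # prints 7000_lig_cne_1
```

Because only the second column is dropped, all 12 parts of a system map to the same folder name.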

63lcw9qa #2

This might work for you (GNU parallel):

ls *.dlg | 
parallel --dry-run 'd={=s/^(7000_).*(lig.*)\.dlg/$1$2/=};mkdir -p $d;mv {} $d'

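Pipe the output of ls, which lists the files ending in .dlg, into parallel, which creates the directories and moves the files into them.
Run the solution as is and, once you are satisfied with the dry-run output, remove the --dry-run option.
The solution can also be written as a single command: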

parallel 'd={=s/^(7000_).*(lig.*)\.dlg/$1$2/=};mkdir -p $d;mv {} $d' ::: *.dlg
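To see what the Perl substitution inside `{= =}` computes without having GNU parallel installed, the same pattern can be tried with bash's own regex matching. This is a sketch under the assumption that bash is available; `target_dir` is a hypothetical name and the pattern is copied from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the pattern from the parallel command to one name.
target_dir() {
    local re='^(7000_).*(lig.*)\.dlg'
    if [[ $1 =~ $re ]]; then
        # Equivalent of the $1$2 replacement: the prefix plus the "lig..." tail.
        printf '%s%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    else
        return 1   # not a file name of the expected shape
    fi
}

target_dir 7000_01_lig_cne_1.dlg   # prints 7000_lig_cne_1
```

Note that the pattern hard-codes the 7000_ prefix, so file sets with another prefix would need a different first group.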

lxkprmvk #3

Using only POSIX shell built-in syntax and sort:

#!/usr/bin/env sh

curdir=

# Create list of files with newline
# Safe since we know there is no special
# characters in name
printf -- %s\\n *.dlg |

# Sort the list by 5th key with _ as field delimiter
sort -t_ -k5 |

# Iterate reading the _ delimited fields of the sorted list
while IFS=_ read -r _ _ c d e; do

  # Compose the new directory name
  newdir="${c}_${d}_${e%.dlg}"

  # If we enter a new group / directory
  if [ "$curdir" != "$newdir" ]; then

    # Make the new directory current
    curdir="$newdir"

    # Create the new directory
    echo mkdir -p "$curdir"

    # Move all its files into it
    echo mv -- *_"$curdir.dlg" "$curdir/"
  fi
done
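How `IFS=_ read` splits a file name into fields in the loop above can be checked on a single name. The sketch below is an illustration only: `split_name` is a hypothetical helper, and it keeps the first two fields in named variables instead of discarding them into `_`.

```shell
#!/usr/bin/env sh
# Hypothetical helper: show the fields that IFS=_ read extracts from one name.
split_name() {
    printf '%s\n' "$1" | {
        # Five variables for five "_"-separated columns; the last keeps ".dlg".
        IFS=_ read -r prefix part c d e
        printf 'prefix=%s part=%s dir=%s_%s_%s\n' \
            "$prefix" "$part" "$c" "$d" "${e%.dlg}"
    }
}

split_name 7000_01_lig_cne_1.dlg   # prints prefix=7000 part=01 dir=lig_cne_1
```

The directory name built this way does not include the 7000 prefix, which is how the script above names its folders.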

Alternatively, as a sort and xargs argument stream:

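# List the files, keep one name per system (unique on the 5th field),
# and run one small shell per system with that name as $0
printf -- %s\\n *.dlg |
sort -u -t_ -k5 |
xargs -n1 sh -c '
d="lig_cne_${0##*_}"
d="${d%.dlg}"
echo mkdir -p "$d"
echo mv -- *"_$d.dlg" "$d/"
'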

lyfkaqu1 #4

Here is a very simple awk script that does the job in a single sweep.

script.awk

BEGIN{FS="[_.]"} # make field separator "_" or "."
{ # for each filename
  dirName=$1"_"$3"_"$4"_"$5; # compute the target dir name from fields
  sysCmd = "mkdir -p " dirName"; cp "$0 " "dirName; # prepare bash command
  system(sysCmd); # run bash command
}

Running `script.awk`

ls -1 *.dlg | awk -f script.awk

One-liner `awk` script

ls -1 *.dlg | awk 'BEGIN{FS="[_.]"}{d=$1"_"$3"_"$4"_"$5;system("mkdir -p "d"; cp "$0 " "d);}'
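Since the awk script runs its commands immediately through system(), it can help to inspect them first. The sketch below is a variation, not the answer's script: it uses the same field separator and directory name but only prints the commands. `print_commands` is a hypothetical name, and the added double quotes around the arguments are an assumption about how one might guard against unusual file names.

```shell
#!/usr/bin/env sh
# Hypothetical helper: read file names on stdin, print one command line per file.
print_commands() {
    awk 'BEGIN { FS = "[_.]" }   # split on "_" or "."
         {
             d = $1 "_" $3 "_" $4 "_" $5   # same directory name as script.awk
             printf "mkdir -p \"%s\" && cp \"%s\" \"%s\"\n", d, $0, d
         }'
}

printf '%s\n' 7000_01_lig_cne_1.dlg 7000_02_lig_cne_1.dlg | print_commands
```

Once the printed lines look right, the output could be piped to `sh` to execute them.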
