Unix: sorting file names by the length of the file name

e5njpo68  posted on 2023-03-29  in  Unix

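ls displays the files available in a directory. I want the file names to be displayed according to the length of the file name.
Any help would be highly appreciated. Thanks in advance.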

Answer #1 (6pp0gazn)

The simplest way is the following, which prints the longest names first:

$ ls | perl -e 'print sort { length($b) <=> length($a) } <>'

Answer #2 (dzhpxtsq)

You can do it like this:

for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n

Answer #3 (eit6fx6z)

Create some test files:

mkdir -p test; cd test 
touch short-file-name  medium-file-name  loooong-file-name

Script:

ls |awk '{print length($0)"\t"$0}' |sort -n |cut --complement -f1

Output:

short-file-name
medium-file-name
loooong-file-name

Answer #4 (m2xkgtsf)

for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2-
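
The loop above follows a decorate-sort-undecorate pattern: prefix each name with its length, sort numerically on that prefix, then cut the prefix off again. Below is a minimal sketch of the same idea wrapped in a function. The name sort_by_length is hypothetical, and the -s -k1,1 options are an addition that is not in the original answer; they are assumed to be supported by your sort (GNU and BSD sort both have them) and keep names of equal length in their input order.

```shell
#!/bin/sh
# sort_by_length is a hypothetical helper, not part of the original answer.
# It prints its arguments one per line, shortest first.
sort_by_length() {
  for name in "$@"; do
    # Decorate: "<length><TAB><name>"
    printf '%d\t%s\n' "${#name}" "$name"
  done |
    sort -n -s -k1,1 |   # numeric sort on the length only; -s keeps ties in input order
    cut -f2-             # undecorate: drop the length column
}

sort_by_length short-file-name medium-file-name loooong-file-name
```

With the three test names from the previous answer this prints short-file-name, medium-file-name, and loooong-file-name, in that order. Like the loop it is based on, it still writes one line per name, so a name that contains a newline gets split across two lines.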

Answer #5 (q0qdq0h2)

TL;DR

Command:

find . -maxdepth 1 -type f -print0 | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | perl -F'/\0/' -ape '$_=join("\n", sort { length($b) <=> length($a) } @F)' | sed 's#/#/\\n/#g'

A more readable version of the command:

find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g'

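Not Parsing ls Output, with Benchmarks

There are good answers here. However, if you want to follow the advice not to parse the output of ls, here are some ways to get the job done. This pays particular attention to filenames that contain spaces. I intend to benchmark everything here, along with the parsing-ls examples. (Hopefully I'll get to that soon.) I've put together some random filenames that I've downloaded from various places over the past 25 years or so, 73 of them to begin with. All 73 are "normal" filenames, containing only alphanumeric characters, underscores, dots, and hyphens. I'll now add 2 more, to show the problems with some of the sorts.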

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ mkdir ../dir_w_fnames__spaces

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cp ./* ../dir_w_fnames__spaces/

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cd ../dir_w_fnames__spaces/

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ touch "just one file with a really long filename that can throw off some counts bla so there"

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ mkdir ../dir_w_fnames__spaces_and_newlines

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cp ./* ../dir_w_fnames__spaces_and_newlines/

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cd ../dir_w_fnames__spaces_and_newlines/

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ touch $'w\nlf.aa'

This last one, i.e. the filename

w
lf.aa

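stands for "with linefeed"; I named it that way to make the problems easier to see. I don't know why I chose .aa as the file extension, other than that it makes this filename's length easy to spot in the sorted output.
Now I'll go back to the orig_dir_73 directory; please trust me that this directory contains only files. We'll use a surefire way to get the number of files.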

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ du --inodes
74      .

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ # The 74th inode is for the current directory, '.'; we have 73 files

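There is a more reliable way, which doesn't depend on the directory containing only files and doesn't require you to remember the extra '.' inode. I found it by looking through the man pages, doing some research, and experimenting.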

awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'

or, more readably,

awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'
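
A different way to get the same count, offered only as a sketch: find -print0 writes exactly one NUL byte per file, so it is enough to delete every other byte and count what is left. The function name count_files is hypothetical, and -maxdepth and -print0 are assumed to be available, as they are in GNU and BSD find.

```shell
#!/bin/sh
# count_files is a hypothetical helper, not part of the original answer.
# It counts the regular files directly inside a directory (default: .).
count_files() {
  find "${1:-.}" -maxdepth 1 -type f -print0 |
    tr -cd '\0' |   # delete every byte except the NUL terminators
    wc -c           # one NUL per file, so the byte count is the file count
}

count_files .
```

Because nothing in this pipeline splits on newlines, a name such as w(\n)lf.aa still counts as one file.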

Let's see how many files we have.

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'
73

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cd ../dir_w_fnames__spaces

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'
74

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cd ../dir_w_fnames__spaces_and_newlines/

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'
75

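(See [1] for details and for the edge case in a previous solution that led to the current command.)
I'll be switching back and forth between these directories; just keep an eye on the path in the prompt, because I won't point out every switch.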

1a. Perl à la @tchrist with Additions

Using find with a null separator. Hacking around newlines in a filename.

Command:

find . -maxdepth 1 -type f -print0 | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | perl -F'/\0/' -ape '$_=join("\n", sort { length($b) <=> length($a) } @F)' | sed 's#/#/\\n/#g'

A more readable version of the command:

find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g'

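I'll show part of the sorted results to demonstrate that the commands below work. I'll also show how to check that odd filenames don't break anything.
Note that if you want the complete sorted list (hopefully not a sordid list), you would normally not use head or tail.
First, "normal" filenames.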

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | tail -n 5
137.csv
13.csv
o6.dat
3.csv
a.dat

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ # No spaces in fnames, so...

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f | wc -l
73

Works for normal filenames.

Next: spaces

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt

Works for filenames containing spaces.

Next: newlines

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | tail -8
Lk3f.png
LOqU.txt
137.csv
w/\n/lf.aa
13.csv
o6.dat
3.csv
a.dat

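If you prefer, you can change this command slightly so that the newline inside the filename is shown "evaluated", i.e. as an actual line break.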

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#\n#g' | tail -8
LOqU.txt
137.csv
w
lf.aa
13.csv
o6.dat
3.csv
a.dat

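Either way, you'll know from what we've been doing that the list is sorted, even though it doesn't look that way.
(What a list that does not look sorted by filename length looks like:)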

********
********
*******
**********       <-- Visual Problem
*****
*****
****
****

********
*******
*                <-- Visual
****             <-- Problems
*****
*****
****
****

Works for filenames containing newlines.

The bash for loop from another answer on this page, run in the same directory:

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2- | head
lf.aa
3.csv
a.dat
13.csv
o6.dat
137.csv
w
1UG5.txt
1uWj.txt
2Ese.txt

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2- | tail -5
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
just one file with a really long filename that can throw off some counts bla so there
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt

Note that, in the head output, the w of

w(\n)
lf.aa

is in the correct sorted position for a 7-character filename. However, lf.aa is not in a logical position.
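
The reason is that the loop writes one "length, tab, name" record per line, so a name with an embedded newline turns into two lines. The second line, lf.aa, has no numeric prefix, so sort -n treats it as 0 and puts it first, and cut passes it through unchanged because it contains no tab. A small sketch that makes the split visible; record_lines is a hypothetical helper used only for this illustration.

```shell
#!/bin/sh
# record_lines is a hypothetical helper used only for this illustration.
# It reports how many lines the "<length><TAB><name>" record takes up.
record_lines() {
  printf '%d\t%s\n' "${#1}" "$1" | wc -l
}

nl_name=$(printf 'w\nlf.aa')   # the 7-character name containing a linefeed
record_lines 'a.dat'           # 1
record_lines "$nl_name"        # 2: "7<TAB>w" and "lf.aa"
```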

1b. Perl à la @tchrist, using find rather than ls

Command:

find . -maxdepth 1 -type f -print0 | xargs -I'{}' -0 echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | perl -e 'print sort { length($b) <=> length($a) } <>'

A more readable version of the command:

find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
        perl -e 'print sort { length($b) <=> length($a) } <>'

Let's give it a go.

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
      perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
      perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
IKlT.txt
Lk3f.png
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat

Works for normal filenames.

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
      perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt

Works for filenames containing spaces.

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
      perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | 
      perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
lf.aa
w

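Warning

BREAKS for filenames containing newlines.

1c. Good for normal filenames and filenames with spaces, but breakable with filenames containing newlines - à la @tchrist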

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
lf.aa
w

3a. Good for normal filenames and filenames with spaces, but breakable with filenames containing newlines - à la @Peter_O

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | awk '{print length($0)"\t"$0}' | sort -n | cut --complement -f1 | head -n 8
w
3.csv
a.dat
lf.aa
13.csv
o6.dat
137.csv
1UG5.txt

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | awk '{print length($0)"\t"$0}' | sort -n | cut --complement -f1 | tail -5
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
just one file with a really long filename that can throw off some counts bla so there
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt

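4a. Good for normal filenames - à la @Raghuram

This version is breakable with filenames containing spaces or newlines (or both), because the unquoted `ls` output is split into separate words.
I do want to add that I like seeing the actual string length displayed, if only for analysis purposes.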

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n | head -n 20
1 a
1 w
2 so
3 bla
3 can
3 off
3 one
4 file
4 just
4 long
4 some
4 that
4 with
5 3.csv
5 a.dat
5 lf.aa
5 there
5 throw
6 13.csv
6 counts

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n | tail -5
69 17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt
70 83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
76 79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
87 oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
238 68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt

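Explanation of some of the commands

For now, I'll just note that, for the works-for-all find command, I used '/' as the replacement for the newline because it is one of the very few characters (NUL being another) that cannot appear in a filename on either *NIX or Windows.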
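
If GNU coreutils is available, the placeholder character can be avoided by keeping the records NUL-terminated all the way through the pipeline. The following is a sketch under that assumption, not the command used in this answer. The function name names_by_length_z is hypothetical, and the -z options of sort and cut are GNU extensions.

```shell
#!/bin/sh
# names_by_length_z is a hypothetical helper, not part of the original answer.
# Assumes GNU sort and GNU cut, whose -z option makes NUL the record separator.
# Unlike find, the * glob skips dotfiles, and [ -f ] follows symlinks.
names_by_length_z() {
  dir=${1:-.}
  tab=$(printf '\t')
  for path in "$dir"/*; do
    [ -f "$path" ] || continue        # regular files only; also skips an unmatched glob
    name=${path##*/}                  # strip the directory part
    printf '%d\t%s\0' "${#name}" "$name"
  done |
    sort -z -t "$tab" -k1,1nr |       # longest first, comparing the length field only
    cut -z -f2-                       # drop the length field, keep the name intact
}

# For display only: turn the NUL terminators back into newlines.
names_by_length_z . | tr '\0' '\n'
```

The output of the function itself stays NUL-terminated, so a consumer that needs the exact names can read it record by record instead of line by line; the final tr is only there to make the result readable in a terminal.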

Notes

[1] The command used previously,

du --inodes --files0-from=<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=int($1)}END{print sum}'

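awk's int function does the job in this case: when a file has a newline in its name, there is an "extra" line in the output, and int evaluates the text of that extra line to 0. For the file named

w
lf.aa

we would get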

$ awk '{print int($1)}' < <(echo "lf.aa")
0

If you run into a filename like
firstline\n3 and some other\n1\n2\texciting\n86stuff.jpg

firstline
3 and some other
1
2     exciting
86stuff.jpg

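well, the extra lines now start with digits (3, 1, 2, and 86), so int no longer evaluates them to 0 and the sum comes out wrong. I think the computer has beaten me there. If anyone has a solution, I'd be glad to hear it.

Edit: I think I went too deep into this problem. From this SO answer and some experimentation, I got the following command. (I don't understand all of its details, but I've tested it pretty well.)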

awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'

More readable:

awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'

Answer #6 (gcuhipw9)

You can use

ls --color=never --indicator-style=none | awk '{print length, $0}' |
sort -n | cut -d" " -f2-

To see it in action, create some files

% touch a ab abc

and some directories

% mkdir d de def

Output of the normal ls command

% ls
a  ab  abc  d/  de/  def/

Output of the suggested command

% ls --color=never --indicator-style=none | awk '{print length, $0}' |
sort -n | cut -d" " -f2-
a
d
ab
de
abc
def
