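The goal is to collect the names of the files that need to be copied to the target FileSystem (using AWK). A file should be copied if any of the following holds:
1. It is listed in source.csv and does not exist in target.csv
2. The file size is different
3. The timestamp in source is greater than the timestamp in target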
Source.csv
"2023-08-25","test/test2/filename1","10.00 B"
"2023-07-25","test/test2/filename2","15.00 B"
"2023-07-25","test/test2/filename3","5.00 B"
"2023-07-25","test/test2/filename4","5.00 B"
Target.csv
"2023-08-25","test/test2/filename0","10.00 B"
"2023-07-25","test/test2/filename2","10.00 B"
"2023-07-24","test/test2/filename3","5.00 B"
"2023-07-25","test/test2/filename4","5.00 B"
Expected output:
"2023-08-25","test/test2/filename1","10.00 B" ### Because it does not exist in target.csv
"2023-07-25","test/test2/filename2","10.00 B" ### Because the size is different
"2023-07-24","test/test2/filename3","5.00 B" ### Because the timestamp in source.csv is greater than in target.csv (meaning a new version in source, not in target)
For the files that exist only in source.csv I use:
awk -v FS="," 'BEGIN { OFS = FS } FNR == NR { unique[$2]; next } !($2 in unique) { print $2 }' target.csv source.csv | tr -d "\"" > files_to_copy.txt
But I could not work out the code for the other two conditions; I lack the AWK knowledge. Any help? :)
2 answers
nc1teljy1#
Assumptions:
B
One awk idea and the output it generates:
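A minimal sketch of what such a script could look like, not necessarily the answerer's original code. It assumes every field is double-quoted and separated by exactly `","`, that no file name contains that three-character sequence, that dates are in `YYYY-MM-DD` form (so plain string comparison orders them correctly), and that sizes can be compared as literal text. Like the one-liner in the question, it prints only the file names.

```shell
# Recreate the sample files from the question.
cat > source.csv <<'EOF'
"2023-08-25","test/test2/filename1","10.00 B"
"2023-07-25","test/test2/filename2","15.00 B"
"2023-07-25","test/test2/filename3","5.00 B"
"2023-07-25","test/test2/filename4","5.00 B"
EOF

cat > target.csv <<'EOF'
"2023-08-25","test/test2/filename0","10.00 B"
"2023-07-25","test/test2/filename2","10.00 B"
"2023-07-24","test/test2/filename3","5.00 B"
"2023-07-25","test/test2/filename4","5.00 B"
EOF

awk -F'","' '
# Every line: drop the leading quote of the date and the trailing quote of the size.
{ date = $1; sub(/^"/, "", date); size = $3; sub(/"$/, "", size) }

# First file (target.csv): remember date and size per file name.
FNR == NR { tdate[$2] = date; tsize[$2] = size; next }

# Second file (source.csv): print names that are new, resized, or newer.
!($2 in tdate) || size != tsize[$2] || date > tdate[$2] { print $2 }
' target.csv source.csv > files_to_copy.txt

cat files_to_copy.txt
```

With the sample data this lists `test/test2/filename1`, `test/test2/filename2` and `test/test2/filename3`. Splitting on `","` instead of `,` means `$2` is already free of quotes, so the `tr -d "\""` step from the question is not needed.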
rta7y2nd2#
Using any POSIX awk, regardless of which characters appear in the file names in the CSV (other than newlines), and assuming each file name is unique:
The code above assumes there are no commas or double quotes inside the first or last field of the CSV.
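A minimal sketch of one way to do this under those assumptions (an illustration, not necessarily the answerer's original script). Instead of splitting on commas, it peels the first field off the front of the line and the last field off the end, and treats whatever remains in the middle as the file name, so commas and quotes inside the name do not matter. Treating `""` as an escaped quote when printing is an extra assumption based on the usual CSV convention.

```shell
# Recreate the sample files from the question.
cat > source.csv <<'EOF'
"2023-08-25","test/test2/filename1","10.00 B"
"2023-07-25","test/test2/filename2","15.00 B"
"2023-07-25","test/test2/filename3","5.00 B"
"2023-07-25","test/test2/filename4","5.00 B"
EOF

cat > target.csv <<'EOF'
"2023-08-25","test/test2/filename0","10.00 B"
"2023-07-25","test/test2/filename2","10.00 B"
"2023-07-24","test/test2/filename3","5.00 B"
"2023-07-25","test/test2/filename4","5.00 B"
EOF

awk '
{
    # First field: everything before the first comma (quotes kept).
    first = $0; sub(/,.*/, "", first)
    # Last field: everything after the last comma (quotes kept).
    last = $0; sub(/.*,/, "", last)
    # File name: the text between those two fields, minus the two separating commas.
    name = substr($0, length(first) + 2, length($0) - length(first) - length(last) - 2)
}

# First file argument (target.csv): remember date and size per file name.
FILENAME == ARGV[1] { date[name] = first; size[name] = last; next }

# Second file argument (source.csv): report names that are new, resized, or newer.
!(name in date) || last != size[name] || first > date[name] {
    gsub(/^"|"$/, "", name)     # strip the surrounding quotes
    gsub(/""/, "\"", name)      # assumption: "" is an escaped double quote
    print name
}
' target.csv source.csv > files_to_copy.txt

cat files_to_copy.txt
```

Testing `FILENAME == ARGV[1]` rather than `FNR == NR` keeps the script correct when target.csv is empty, a case in which `FNR == NR` would also match every line of source.csv.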