HDFS: alternative approach when folders under .sparkStaging are not deleted

Posted by 6ovsh4lw on 2021-07-14 in Spark

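We have an HDP cluster, version 2.6.5, running a number of Spark jobs, and we see that the .sparkStaging directories in HDFS are still there after the jobs finish.
Under /user/hdfs/.sparkStaging/ we have old folders dating back to 2021-03-30. We want to keep only the folders from the last month and delete everything older under /user/hdfs/.sparkStaging/.
I am looking for an HDFS command that can delete the old folders.
For now I am thinking of using the following command (MIN=43800 is one month expressed in minutes):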

hdfs dfs -ls /user/hdfs/.sparkStaging | tr -s " " | cut -d' ' -f6-8 | grep "^[0-9]" | awk 'BEGIN{ MIN=43800; LAST=60*MIN; "date +%s" | getline NOW } { cmd="date -d'\''"$1" "$2"'\'' +%s"; cmd | getline WHEN; close(cmd); DIFF=NOW-WHEN; if(DIFF > LAST){ print "Deleting: "$3; system("hdfs dfs -rm -r "$3) }}'

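Reference: Delete files older than 10 days on HDFS
But I am not sure whether this manual approach is a good solution when the /user/hdfs/.sparkStaging/* folders are, for some reason, not being deleted.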

hdfs dfs -ls  /user/hdfs/.sparkStaging/ | more
Found 2324 items
drwx------   - hdfs hdfs          0 2021-03-30 06:40 /user/hdfs/.sparkStaging/application_1617025601058_0195
drwx------   - hdfs hdfs          0 2021-03-30 06:45 /user/hdfs/.sparkStaging/application_1617025601058_0224
drwx------   - hdfs hdfs          0 2021-03-30 06:56 /user/hdfs/.sparkStaging/application_1617025601058_0289
drwx------   - hdfs hdfs          0 2021-03-30 06:56 /user/hdfs/.sparkStaging/application_1617025601058_0290
drwx------   - hdfs hdfs          0 2021-03-30 07:01 /user/hdfs/.sparkStaging/application_1617025601058_0320
drwx------   - hdfs hdfs          0 2021-03-30 07:01 /user/hdfs/.sparkStaging/application_1617025601058_0323
drwx------   - hdfs hdfs          0 2021-03-30 07:06 /user/hdfs/.sparkStaging/application_1617025601058_0348
drwx------   - hdfs hdfs          0 2021-03-30 07:06 /user/hdfs/.sparkStaging/application_1617025601058_0352
drwx------   - hdfs hdfs          0 2021-03-30 07:11 /user/hdfs/.sparkStaging/application_1617025601058_0379
drwx------   - hdfs hdfs          0 2021-03-30 07:11 /user/hdfs/.sparkStaging/application_1617025601058_0383
drwx------   - hdfs hdfs          0 2021-03-30 07:12 /user/hdfs/.sparkStaging/application_1617025601058_0388
drwx------   - hdfs hdfs          0 2021-03-30 07:16 /user/hdfs/.sparkStaging/application_1617025601058_0410
drwx------   - hdfs hdfs          0 2021-03-30 07:16 /user/hdfs/.sparkStaging/application_1617025601058_0412
drwx------   - hdfs hdfs          0 2021-03-30 07:17 /user/hdfs/.sparkStaging/application_1617025601058_0416
drwx------   - hdfs hdfs          0 2021-03-30 07:26 /user/hdfs/.sparkStaging/application_1617025601058_0473
drwx------   - hdfs hdfs          0 2021-03-30 07:31 /user/hdfs/.sparkStaging/application_1617025601058_0505
drwx------   - hdfs hdfs          0 2021-03-30 07:31 /user/hdfs/.sparkStaging/application_1617025601058_0506
drwx------   - hdfs hdfs          0 2021-03-30 07:36 /user/hdfs/.sparkStaging/application_1617025601058_0533
drwx------   - hdfs hdfs          0 2021-03-30 07:36 /user/hdfs/.sparkStaging/application_1617025601058_0534
drwx------   - hdfs hdfs          0 2021-03-30 07:36 /user/hdfs/.sparkStaging/application_1617025601058_0536
drwx------   - hdfs hdfs          0 2021-03-30 07:36 /user/hdfs/.sparkStaging/application_1617025601058_0537
drwx------   - hdfs hdfs          0 2021-03-30 07:41 /user/hdfs/.sparkStaging/application_1617025601058_0566
drwx------   - hdfs hdfs          0 2021-03-30 07:41 /user/hdfs/.sparkStaging/application_1617025601058_0567
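
For a listing in the format shown above, the age check can be separated from the delete step, so the selection can be reviewed as a dry run first. The following is only a minimal sketch under some assumptions: `select_old_paths` is a hypothetical helper name (not an HDFS command), GNU `date` is available for `date -d`, and the listing timestamps are in the same time zone as the machine running the script. It reads `hdfs dfs -ls` output on stdin and prints the paths older than the given number of days; it deletes nothing by itself.

```shell
#!/usr/bin/env bash
# select_old_paths DAYS [NOW_EPOCH]
# Hypothetical helper: reads `hdfs dfs -ls` output on stdin and prints the
# paths whose modification time is more than DAYS days before NOW_EPOCH
# (defaults to the current time). Assumes GNU date.
select_old_paths() {
  local max_age_days=$1
  local now=${2:-$(date +%s)}
  local cutoff=$(( now - max_age_days * 86400 ))
  local perms repl owner group size day time path when
  while read -r perms repl owner group size day time path; do
    # Skip the "Found N items" header and any line without a date column.
    [[ $day =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]] || continue
    when=$(date -d "$day $time" +%s) || continue
    if (( when < cutoff )); then
      printf '%s\n' "$path"
    fi
  done
}

# Possible usage (not run here; paths taken from the question):
#   Dry run, list the folders older than 30 days:
#     hdfs dfs -ls /user/hdfs/.sparkStaging | select_old_paths 30
#   Delete them in batches once the list looks right:
#     hdfs dfs -ls /user/hdfs/.sparkStaging | select_old_paths 30 \
#       | xargs -r -n 50 hdfs dfs -rm -r
```

Printing the candidates before deleting makes it easier to confirm that no staging directory of a still-running application is in the list, which is the main risk of any age-based cleanup of this directory.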
