Running a hadoop command in a bash script

Posted by qlvxas9a on 2021-06-03 in Hadoop


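I need to run a hadoop command in a bash script. It goes through a bunch of folders on Amazon S3, writes those folder names to a txt file, and then does further processing. The problem is that when I run the script, no folder names seem to be written to the txt file. I wonder whether the hadoop command takes so long that the bash script moves on to the further processing without waiting for it to finish. If so, how can I make bash wait until the hadoop command has completed before running the other steps?
Here is my code. I tried two approaches and neither works: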

1. 
listCmd="hadoop fs -ls s3n://$AWS_ACCESS_KEY:$AWS_SECRET_KEY@$S3_BUCKET/*/*/$mydate | grep s3n | awk -F' ' '{print $6}' | cut -f 4- -d / > $FILE_NAME"
echo -e "listing... $listCmd\n"
eval $listCmd
...other process ...

2. 
echo -e "list the folders we want to copy into a file"
hadoop fs -ls s3n://$AWS_ACCESS_KEY:$AWS_SECRET_KEY@$S3_BUCKET/*/*/$mydate | grep s3n | awk -F' ' '{print $6}' | cut -f 4- -d / > $FILE_NAME
... other process ....

Does anyone know what might be wrong? And is it better to use eval, or to run the hadoop command directly as in the second approach?
Thanks.

hadoop bash

Source: https://stackoverflow.com/questions/19148745/run-hadoop-command-in-bash-script

1 Answer


ttygqcqt1#

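I would prefer `eval` in this case, since it makes it neater to append the next command to this one. I would also break `listCmd` down into parts, so that you can confirm nothing is going wrong at the `grep`, `awk`, or `cut` stage.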

# Step 1: dump the raw listing to its own file so it can be inspected.
listCmd="hadoop fs -ls s3n://$AWS_ACCESS_KEY:$AWS_SECRET_KEY@$S3_BUCKET/*/*/$mydate > $raw_File"
# Step 2: filter the raw listing. \$6 is escaped so the shell does not expand
# it inside the double quotes before awk sees it.
gcmd="cat $raw_File | grep s3n | awk -F' ' '{print \$6}' | cut -f 4- -d / > $FILE_NAME"
echo "Running $listCmd and other commands after that"
otherCmd="cat $FILE_NAME"
eval "$listCmd"
echo $?  # This will print the exit status of $listCmd
eval "$gcmd" && echo "Finished Listing" && eval "$otherCmd"
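`otherCmd` runs only if `$gcmd` succeeds. If you have many commands to run, this gets a bit ugly. If you know roughly how long the listing takes, you can insert a sleep command: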

eval "$listCmd"
sleep 1800 # This will sleep 1800 seconds
eval "$otherCmd"
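
As a complement to the staged approach above, here is a minimal sketch of the same two steps written as functions instead of `eval` strings. It relies on the fact that bash runs each foreground command to completion before moving to the next line, so no fixed `sleep` should be needed as long as the hadoop call is not sent to the background. Everything named here is an assumption for illustration: `list_folders` is a hypothetical stand-in for the real `hadoop fs -ls` call, and its sample lines are made up so that the path lands in field 6, matching the `awk` call in the answer; real listing output may look different.

```shell
#!/usr/bin/env bash
# Sketch only: runs without a Hadoop cluster by faking the listing step.
set -o pipefail   # a pipeline fails if any stage fails, not just the last one

raw_file=$(mktemp)    # hypothetical temp file for the raw listing
file_name=$(mktemp)   # hypothetical temp file for the extracted folder names
trap 'rm -f "$raw_file" "$file_name"' EXIT

list_folders() {
    # Stand-in for: hadoop fs -ls s3n://.../*/*/$mydate
    # Made-up lines; the path is deliberately placed in field 6.
    printf '%s\n' \
        'drwxr-xr-x - user group 0 s3n://bucket/a/b/2013-10-01' \
        'drwxr-xr-x - user group 0 s3n://bucket/c/d/2013-10-01'
}

extract_names() {
    # Single quotes keep $6 away from the shell, so awk receives it intact.
    # With pipefail, an empty grep result makes this function fail as well.
    grep s3n "$raw_file" | awk '{print $6}' | cut -d / -f 4- > "$file_name"
}

# Each step blocks until it has finished; the next one starts only on success.
if ! list_folders > "$raw_file"; then
    echo "listing failed" >&2
    exit 1
fi

if ! extract_names; then
    echo "extraction failed" >&2
    exit 1
fi

echo "Finished listing"
cat "$file_name"
```

Keeping the raw listing in its own file makes it possible to see whether an empty result comes from the listing step or from the `grep`/`awk`/`cut` stage, which is the same idea as splitting `listCmd` and `gcmd`.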

Answered 2021-06-03
