Sqoop cannot overwrite the target directory

Posted by ippsafx7 on 2021-06-03 in Sqoop

I am using the command below to import data from SQL Server into Azure Blob Storage:

sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true --connect "jdbc:sqlserver://server-IP;database=database_name;username=user;password=password"
--username test --password "password" --query "select top 5 * from employ where \$CONDITIONS" --delete-target-dir --target-dir 'wasb://sample@workingclusterblob.blob.core.windows.net/source/employ'
-m 1

It fails with the error below:

18/01/30 03:35:45 INFO tool.ImportTool: Destination directory wasb://sample@workingclusterblob.blob.core.windows.net/source/employ is not present, hence not deleting.
18/01/30 03:35:45 INFO mapreduce.ImportJobBase: Beginning query import.
18/01/30 03:35:46 INFO client.AHSProxy: Connecting to Application History server at headnodehost/10.0.0.19:10200
18/01/30 03:35:46 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory wasb://sample@workingclusterblob.blob.core.windows.net/source/employ already exists

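The log messages contradict each other: at the delete step the directory is reported as not present, yet at the write step it is reported as already existing.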

Answer #1 (mm9b1k5b)

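From the Apache Sqoop User Guide:
"By default, imports go to a new target location. If the destination directory already exists in HDFS, Sqoop will refuse to import and overwrite that directory's contents. If you use the --append argument, Sqoop will import data to a temporary directory and then rename the files into the normal target directory in a manner that does not conflict with existing filenames in that directory."
I have not reproduced the problem or verified this solution in an Azure environment, but please try adding --append to your sqoop import and let me know the result.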

sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true --connect "jdbc:sqlserver://server-IP;database=database_name;username=user;password=password"
--username test --password "password" --query "select top 5 * from employ where \$CONDITIONS" --delete-target-dir --append --target-dir 'wasb://sample@workingclusterblob.blob.core.windows.net/source/employ'
-m 1
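
If the two log messages still disagree, one thing worth trying is to take the delete step out of Sqoop's hands and clear the path with the Hadoop filesystem shell before importing. The following is only a sketch and has not been verified on Azure: the `DRY_RUN` variable and the `run` helper are hypothetical additions for illustration, and the connection string and target path are copied from the question. By default it only prints the commands; set `DRY_RUN=0` to actually execute them.

```shell
#!/bin/sh
# Sketch: remove the target directory explicitly, then run the import.
# TARGET_DIR comes from the question; DRY_RUN and run() are hypothetical helpers.
TARGET_DIR='wasb://sample@workingclusterblob.blob.core.windows.net/source/employ'
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"    # show what would be executed
  else
    "$@"
  fi
}

# -r removes recursively; -f suppresses the error if the path does not exist.
run hadoop fs -rm -r -f "$TARGET_DIR"

# Same import as in the question, without --delete-target-dir.
# Single quotes around the query mean $CONDITIONS needs no backslash.
run sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true \
  --connect "jdbc:sqlserver://server-IP;database=database_name;username=user;password=password" \
  --username test --password "password" \
  --query 'select top 5 * from employ where $CONDITIONS' \
  --target-dir "$TARGET_DIR" \
  -m 1
```

Unlike --append, this keeps the overwrite semantics the question asks for, because the old contents are removed before the job writes its output.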
