I am trying to export a Parquet file from S3 to SQL Server using Sqoop, and I get the following error:
19/07/09 16:12:57 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI pattern: dataset:s3://mybucket/data-lake/serving-zone/part-00002-b5a1da42.snappy.parquet
Check that JARs for s3 datasets are on the classpath
org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI pattern: dataset:s3://mybucket/data-lake/serving-zone/part-00002-b5a1da42.snappy.parquet
Check that JARs for s3 datasets are on the classpath
    at org.kitesdk.data.spi.Registration.lookupDatasetUri(Registration.java:128)
    at org.kitesdk.data.Datasets.load(Datasets.java:103)
    at org.kitesdk.data.Datasets.load(Datasets.java:140)
    at org.kitesdk.data.mapreduce.DatasetKeyInputFormat$ConfigBuilder.readFrom(DatasetKeyInputFormat.java:92)
    at org.kitesdk.data.mapreduce.DatasetKeyInputFormat$ConfigBuilder.readFrom(DatasetKeyInputFormat.java:139)
    at org.apache.sqoop.mapreduce.JdbcExportJob.configureInputFormat(JdbcExportJob.java:83)
    at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:434)
    at org.apache.sqoop.manager.SQLServerManager.exportTable(SQLServerManager.java:192)
    at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:80)
    at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:99)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
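The dataset exists at the location above, and there is nothing wrong with the path URI. I tried exporting a CSV file from the same path and it succeeded.
Here is my sqoop export command: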
sqoop export --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
  --connection-manager org.apache.sqoop.manager.SQLServerManager \
  --connect "jdbc:sqlserver://localhost:1433;databaseName=salesdb" \
  --table DimEmployee_test --num-mappers 128 \
  --export-dir s3://mybucket/data-lake/serving-zone/part-00002-b5a1da42.snappy.parquet \
  --username db-user --password mypassword
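The error message asks whether the JARs for s3 datasets are on the classpath. One way to see which Kite SDK modules are installed is to list the Kite jars in Sqoop's library directory. This is only a minimal sketch: the helper name list_kite_jars is hypothetical, and the SQOOP_HOME variable and the lib/ layout are assumptions that differ between distributions.

```shell
#!/bin/sh
# Hypothetical helper: print the file names of Kite SDK jars found
# directly inside the given library directory, one per line.
list_kite_jars() {
    lib_dir="$1"
    for jar in "$lib_dir"/kite-*.jar; do
        # If nothing matches, the pattern stays literal and does not exist.
        if [ -e "$jar" ]; then
            basename "$jar"
        fi
    done
}

# Assumption: Sqoop's jars live under $SQOOP_HOME/lib; adjust as needed.
# list_kite_jars "${SQOOP_HOME}/lib"
```

The output only shows which Kite modules are present. Whether any of them handles the s3 scheme for a given Kite version still has to be checked against the Kite documentation.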
1 answer
dbf7pr2w #1
Your --connect URI looks awkward; try using the following format instead: