Error reading date and datetime columns from MariaDB through Spark

Posted by krcsximq on 2022-11-08 in Spark

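I am reading a MariaDB table from Spark. The table has date and datetime fields, and Spark throws an error while reading them.
Below is the schema of the MariaDB table:

Spark code that reads the MariaDB table: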

val df = spark.read
  .format("jdbc")
  .option("driver", "org.mariadb.jdbc.Driver")
  .option("url", "jdbc:mariadb://xxxx:xxxx/db")
  .option("user", "user")
  .option("password", "password")
  // dbtable expects a table name or a parenthesized subquery with an alias
  .option("dbtable", "(select * from test_ankur) as t")
  .load()
df.select("ptime").show()

This throws the following error for the date field:

Caused by: java.sql.SQLTransientConnectionException: Could not get object as Date : Unparseable date: "ptime"
  at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:79)
  at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:183)
  at org.mariadb.jdbc.internal.com.read.resultset.rowprotocol.TextRowProtocol.getInternalDate(TextRowProtocol.java:546)
  at org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.getDate(SelectResultSet.java:1065)

The datetime field fails with the following error:

Caused by: java.sql.SQLException: cannot parse data in timestamp string 'start_date'
  at org.mariadb.jdbc.internal.com.read.resultset.rowprotocol.TextRowProtocol.getInternalTimestamp(TextRowProtocol.java:645)
  at org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.getTimestamp(SelectResultSet.java:1125)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$12.apply(JdbcUtils.scala:452)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$12.apply(JdbcUtils.scala:451)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:356)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:338)

wh6knrhe #1

I ran into a similar problem when trying to fetch date-type values. Adding "nullCatalogMeansCurrent=true" to the URL fixed it for me.

url "jdbc:mariadb://xxxx:3306/datalake_test?useSSL=FALSE&nullCatalogMeansCurrent=true"

llew8vvj #2

I got this working by changing the connection string to the mysql scheme:

jdbc:mysql://xxxx:xxxx/db
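
For reference, here is a minimal sketch of the full read with the scheme swapped. It assumes the MariaDB driver class from the question is kept and that the connector version in use still accepts jdbc:mysql: URLs (the 2.x line seen in the stack trace should; newer versions may need extra configuration). The names MariaDbRead, toMysqlScheme, and readTable are hypothetical and used only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object MariaDbRead {
  // Rewrites a jdbc:mariadb: URL to the jdbc:mysql: scheme; other URLs are returned unchanged.
  def toMysqlScheme(url: String): String =
    if (url.startsWith("jdbc:mariadb:")) "jdbc:mysql:" + url.stripPrefix("jdbc:mariadb:")
    else url

  // Reads a table over JDBC using the rewritten URL.
  // Assumption: the driver class, user, and password are the placeholders from the question.
  def readTable(spark: SparkSession, url: String, table: String): DataFrame =
    spark.read
      .format("jdbc")
      .option("driver", "org.mariadb.jdbc.Driver")
      .option("url", toMysqlScheme(url))
      .option("user", "user")
      .option("password", "password")
      .option("dbtable", table)
      .load()
}
```

With an existing SparkSession named spark, this would be called as MariaDbRead.readTable(spark, "jdbc:mariadb://xxxx:xxxx/db", "test_ankur").select("ptime").show().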

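According to the MariaDB documentation (MariaDB ColumnStore with Spark):
Currently Spark does not correctly recognize MariaDB-specific JDBC connection strings, so the jdbc:mysql syntax must be used.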
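
A likely explanation for the error messages in the question, offered as my reading rather than something the documentation spells out: Spark picks its SQL dialect from the URL prefix, and for a prefix it does not recognize it quotes column names with double quotes. MariaDB in its default SQL mode treats "ptime" as a string literal, so every row comes back as the text ptime, which the driver then cannot parse as a date. The sketch below only mimics that quoting decision in plain Scala; QuotingSketch, quoteIdentifier, and selectSql are hypothetical names and not Spark APIs.

```scala
object QuotingSketch {
  // Stand-in for the dialect lookup: URLs starting with "jdbc:mysql" get
  // backtick quoting, everything else falls back to double quotes.
  def quoteIdentifier(url: String, column: String): String =
    if (url.startsWith("jdbc:mysql")) "`" + column + "`"
    else "\"" + column + "\""

  // Builds the kind of column-projection query that would be sent to the database.
  def selectSql(url: String, table: String, column: String): String =
    s"SELECT ${quoteIdentifier(url, column)} FROM $table"

  def main(args: Array[String]): Unit = {
    // Double-quoted: read as a string literal by MariaDB's default SQL mode.
    println(selectSql("jdbc:mariadb://xxxx:xxxx/db", "test_ankur", "ptime"))
    // Backtick-quoted: read as a column reference.
    println(selectSql("jdbc:mysql://xxxx:xxxx/db", "test_ankur", "ptime"))
  }
}
```

If this reading is right, it also explains why the datetime error mentions the string 'start_date': the column name itself is what the driver received.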
