Spark Cassandra SQLContext and a Unix epoch timestamp column

w8biq8rn asked on 2021-05-27 in Spark

I have a Cassandra table with a Unix epoch timestamp column (e.g. the value 1599613045). I want to use the Spark SQLContext to select rows from this table between a from-date and a to-date based on this epoch timestamp column. My plan is to convert the from-date and to-date inputs into epoch timestamps and compare them (>= and <=) against the table's epoch timestamp column. Is this possible? Any suggestions? Thanks a lot!
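The from/to inputs can be turned into epoch timestamps before querying. A minimal sketch in plain Scala using `java.time` (the `yyyy-MM-dd` input format and the UTC zone are assumptions; adjust the zone if your table stores local-time epochs):

```scala
import java.time.LocalDate
import java.time.ZoneOffset

// Convert a yyyy-MM-dd date string to Unix epoch seconds (midnight, UTC assumed)
def dateToEpochSeconds(date: String): Long =
  LocalDate.parse(date).atStartOfDay(ZoneOffset.UTC).toEpochSecond

val fromTs = dateToEpochSeconds("2020-01-01")
val toTs   = dateToEpochSeconds("2020-12-31")
// fromTs and toTs can now be compared (>= and <=) against the epoch column
```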

4smxwvx51#

Follow the approach below.
Assume Cassandra is running at localhost:9042 with:
keyspace --> mykeyspace
table --> mytable
column name --> timestamp
Spark Scala code:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// create SparkSession
val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Read the table from Cassandra; the spark-cassandra-connector must be on the classpath
spark.conf.set("spark.cassandra.connection.host", "localhost")
spark.conf.set("spark.cassandra.connection.port", "9042")

var cassandraDF = spark.read.format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "mykeyspace", "table" -> "mytable"))
  .load()

// select the timestamp column
cassandraDF = cassandraDF.select('timestamp)
cassandraDF.show(false)

// suppose the following is the output

+----------+
| timestamp|
+----------+
|1576089000|
|1575916200|
|1590258600|
|1591900200|
+----------+

// To convert the above output to Spark's default date format yyyy-MM-dd
val outDF = cassandraDF.withColumn("date", to_date(from_unixtime('timestamp)))
outDF.show(false)

+----------+----------+
| timestamp|      date|
+----------+----------+
|1576089000|2019-12-12|
|1575916200|2019-12-10|
|1590258600|2020-05-24|
|1591900200|2020-06-12|
+----------+----------+

// You can proceed with the next steps from here
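The from/to range filter the question asks about can then be sketched directly on the epoch column. The date literals below are placeholders, and `unix_timestamp(lit(...), format)` is one way to get epoch seconds for midnight of a date (it interprets the date in the session time zone, so make sure that matches how the table's epochs were produced):

```scala
// Filter rows between a from date and a to date on the epoch column.
// Column and DataFrame names follow the example above.
val fromTs = unix_timestamp(lit("2020-01-01"), "yyyy-MM-dd")
val toTs   = unix_timestamp(lit("2020-06-30"), "yyyy-MM-dd")

val filteredDF = cassandraDF.filter('timestamp >= fromTs && 'timestamp <= toTs)
filteredDF.show(false)
```

Note that because `timestamp` is a plain numeric column, this filter is a simple numeric range comparison, so no per-row date parsing is needed at query time.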
