I am trying to convert the Hive SQL expression below into a Spark DataFrame expression, but I get an error.
case when (lower(message_txt) rlike '.*sampletext(\\s?is\\s?)newtext.*' ) then 'P' else 'Y'
Sample data: message_txt = "This is new sampletext, followed by newtext"
Please help me with the equivalent Spark DataFrame statement.
2 answers
#1 mf98qq94
Use
when(lower($"value").rlike(""".*sampletext(\s?is\s?)newtext.*"""), lit("P")).otherwise("Y")
```
scala> df.withColumn("condition", when(lower($"value").rlike(""".*sampletext(\s?is\s?)newtext.*"""), lit("P")).otherwise("Y")).show(false)
+-------------------------------------------+---------+
|value                                      |condition|
+-------------------------------------------+---------+
|This is new sampletext, followed by newtext|Y        |
+-------------------------------------------+---------+
```
#2 8ljdwjyq
Add `end` at the end of the case statement in SQL. Example:
In Spark SQL:
```
// Assumes spark-shell, where spark and spark.implicits._ are already in scope
val df = Seq("This is new sampletext, followed by newtext").toDF("message_txt")
df.createOrReplaceTempView("tmp")
// Triple quotes keep \\s as written, so the SQL parser unescapes it to \s for the regex
spark.sql("""select case when (lower(message_txt) rlike '.*sampletext(\\s?is\\s?)newtext.*') then 'P' else 'Y' end as status from tmp""").show()
//Result
//+------+
//|status|
//+------+
//|     Y|
//+------+
```
In the DataFrame API:
```
import org.apache.spark.sql.functions.{col, lower, when}
// In a regular Scala string the backslashes must be doubled
df.withColumn("status", when(lower(col("message_txt")).rlike(".*sampletext(\\s?is\\s?)newtext.*"), "P").otherwise("Y")).show()
//Result
//+--------------------+------+
//|         message_txt|status|
//+--------------------+------+
//|This is new sampl...|     Y|
//+--------------------+------+
```
UPDATE:
This checks for the strings sampletext and newtext in the message_txt column.
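To see why the sample row comes out as 'Y' rather than 'P', the check can be reproduced in plain Scala without Spark. This is only a minimal sketch: it assumes RLIKE behaves like `java.util.regex` with `find()` semantics (a match anywhere in the string), and the names `RlikeSketch` and `status` are made up for illustration, not part of any Spark API.

```scala
import java.util.Locale
import java.util.regex.Pattern

// Hypothetical helper for illustration only; not a Spark API.
object RlikeSketch {
  // Same pattern as in the question. Triple quotes keep the backslashes
  // literal, so this equals the regular string ".*sampletext(\\s?is\\s?)newtext.*".
  private val pattern: Pattern =
    Pattern.compile(""".*sampletext(\s?is\s?)newtext.*""")

  // Mirrors: case when lower(message_txt) rlike '...' then 'P' else 'Y' end
  def status(messageTxt: String): String =
    if (pattern.matcher(messageTxt.toLowerCase(Locale.ROOT)).find()) "P" else "Y"

  def main(args: Array[String]): Unit = {
    // "sampletext" is followed by ", followed by", not by "is", so no match
    println(status("This is new sampletext, followed by newtext")) // prints Y
    // "sampletext", optional whitespace, "is", optional whitespace, "newtext"
    println(status("sampletext is newtext")) // prints P
  }
}
```

So the pattern only yields 'P' when "is" sits directly between sampletext and newtext, with at most one whitespace character on each side. If the intent is just that both words appear in the column, a looser pattern would be needed.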