python: How to select columns whose names contain a specified word and whose values equal a specified value, in Hive SQL on Databricks

Posted by nbnkbykc on 2021-06-24 in Hive


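I am trying to select columns whose names contain certain words, using Hive SQL on Databricks.
My attempt is based on the question "Hive: select column names using a regular expression?"
My code: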

%py
t = spark.createDataFrame([('50', 'rscds', 'tyhdvs'),], ['id', 'col_pattern_1', 'col_pattern_2'])
t.write.saveAsTable('my_database.my_table')

%sql
set hive.support.quoted.identifiers=none;
select `col_pattern.*`
from my_database.my_table

I got:

Error in SQL statement: AnalysisException: cannot resolve '`col_pattern.*`' given input
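
One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: `%sql` cells on Databricks are executed by Spark SQL, so the Hive property `hive.support.quoted.identifiers` may have no effect there, and the backticked text is treated as a literal column name. Spark has its own setting for regex column names, `spark.sql.parser.quotedRegexColumnNames`. A way to sidestep the setting altogether is to resolve the matching names in Python and generate the query. The sketch below does that; the helper `build_select` is hypothetical, and the column list is hard-coded here where a notebook would read it from `spark.table('my_database.my_table').columns`.

```python
import re


def build_select(columns, pattern, table):
    """Return a SELECT statement listing every column whose full name matches `pattern`.

    `build_select` is a hypothetical helper written for this example.
    """
    matched = [c for c in columns if re.fullmatch(pattern, c)]
    if not matched:
        raise ValueError(f"no column matches {pattern!r}")
    # Backtick-quote each name so unusual characters in column names stay valid.
    quoted = ", ".join(f"`{c}`" for c in matched)
    return f"SELECT {quoted} FROM {table}"


# Hard-coded for the example; in a notebook this list would come from the table itself.
columns = ["id", "col_pattern_1", "col_pattern_2"]
query = build_select(columns, r"col_pattern.*", "my_database.my_table")
print(query)
```

In a notebook, the resulting string could then be passed to `spark.sql(query)`.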

I also tried:

import pyspark.sql.functions as F
selected = [s for s in t.columns if 'col_pattern' in s]
t.filter(t[x]=='rscds' for x in selected)

I got:

TypeError: condition should be string or Column
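
The `TypeError` arises because `t.filter(...)` receives a generator expression rather than a single condition. As the message says, the condition may be a string, so one option is to combine the per-column comparisons into one SQL expression. The sketch below builds such a string; `build_contains_filter` is a hypothetical helper, and joining with `OR` (a row matches if any of the selected columns equals the value) is an assumption about the intended semantics.

```python
def build_contains_filter(columns, word, value):
    """Build a SQL condition: any column whose name contains `word` equals `value`.

    `build_contains_filter` is a hypothetical helper written for this example.
    """
    selected = [c for c in columns if word in c]
    if not selected:
        raise ValueError(f"no column name contains {word!r}")
    # Keep the sketch simple: refuse values that would need escaping in a SQL literal.
    if "'" in value or "\\" in value:
        raise ValueError("value must not contain quotes or backslashes")
    return " OR ".join(f"`{c}` = '{value}'" for c in selected)


# Hard-coded for the example; in a notebook this would be t.columns.
columns = ["id", "col_pattern_1", "col_pattern_2", "col_pattern_3"]
condition = build_contains_filter(columns, "col_pattern", "rscds")
print(condition)
```

In a notebook, `t.filter(condition)` would then keep the rows where at least one matching column holds the value. Replacing `" OR "` with `" AND "` would require every matching column to hold it.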

Input:

The DataFrame may have 20+ columns with the same prefix, so I cannot type them into the query one by one. I need a way to filter the DataFrame by a given value across all the columns that share the prefix.

+---+-------------+-------------+-------------+
| id|col_pattern_1|col_pattern_2|col_pattern_3|
+---+-------------+-------------+-------------+
| 50|        rscds|       tyhdvs|       tyhdvs|
+---+-------------+-------------+-------------+

Output:

For example, I need to find the rows where a column with the given prefix ('col_pattern') has the value 'rscds', and return only the matching column:

+---+-------------+
| id|col_pattern_1|
+---+-------------+
| 50|        rscds|
+---+-------------+

In short: select the columns whose names contain the specified word and whose values == the specified value.
Thanks

sql Hive python databricks hiveql

Source: https://stackoverflow.com/questions/64112749/how-to-select-the-columns-with-specified-words-in-their-names-and-their-values-a
