Is there any way to read data into a PySpark DataFrame from a SQL Server table based on a condition, e.g. read only the rows where the column 'time_stamp' has the current date?
Alternatively, I want to translate:
select * from table_name where time_stamp = cast(getdate() as date)
into a PySpark DataFrame read.
I am using:
remote_table = (
    spark.read.format("sqlserver")
    .option("host", "host_name")
    .option("user", "user_name")
    .option("password", "password")
    .option("database", "database_name")
    .option("dbtable", "dbo.table_name")
    .load()
)
which reads the entire table 'table_name'. I just need to read the rows that satisfy a condition, like a 'where' clause in SQL.
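For reference, a minimal sketch of two common approaches (assuming an active SparkSession bound to spark; host_name, user_name, etc. are placeholders as above). The first loads via dbtable and applies a filter() that Spark can push down to SQL Server for simple predicates; the second uses the generic JDBC source's "query" option so the WHERE clause runs entirely on the server:

from pyspark.sql import functions as F

# Approach 1: read via dbtable, then filter. Spark evaluates
# current_date() as a literal and can push a simple predicate
# like this down to SQL Server as a WHERE clause.
remote_table = (
    spark.read.format("sqlserver")
    .option("host", "host_name")
    .option("user", "user_name")
    .option("password", "password")
    .option("database", "database_name")
    .option("dbtable", "dbo.table_name")
    .load()
    .filter(F.col("time_stamp") == F.current_date())
)

# Approach 2: the generic JDBC source accepts a "query" option,
# so the exact SQL (including the cast of getdate()) is executed
# on the server. The JDBC URL below is an assumed placeholder.
remote_table = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://host_name;databaseName=database_name")
    .option("user", "user_name")
    .option("password", "password")
    .option("query",
            "select * from dbo.table_name "
            "where time_stamp = cast(getdate() as date)")
    .load()
)

Either way the filtering happens on the SQL Server side rather than after pulling the whole table into Spark.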
1 Answer