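I have already tried dropping the rows, but the values are still there. My workaround is to create another DataFrame like this:
df_trans_new = df_transactional.filter("Quantity>=0")
df_trans_new.show()
However, I want to remove the negative entries from that column. Thanks.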
Python:
df_transactional = (
    spark.read.option("sep", ",")
    .option("inferSchema", "true")
    .option("header", "true")
    .csv("dbfs:/FileStore/tables/transactional_dataset.csv")
)
# Keep only the rows whose Quantity is non-negative
df_trans_new = df_transactional.filter("Quantity>=0")
df_trans_new.show()
+---------+---------+--------------------+--------+--------------+---------+----------+--------------+
|InvoiceNo|StockCode| Description|Quantity| InvoiceDate|UnitPrice|CustomerID| Country|
+---------+---------+--------------------+--------+--------------+---------+----------+--------------+
| 536365| 85123A|WHITE HANGING HEA...| 6|12/1/2010 8:26| 2.55| 17850|United Kingdom|
| 536365| 71053| WHITE METAL LANTERN| 6|12/1/2010 8:26| 3.39| 17850|United Kingdom|
| 536365| 84406B|CREAM CUPID HEART...| 8|12/1/2010 8:26| 2.75| 17850|United Kingdom|
| 536365| 84029G|KNITTED UNION FLA...| 6|12/1/2010 8:26| 3.39| 17850|United Kingdom|
| 536365| 84029E|RED WOOLLY HOTTIE...| 6|12/1/2010 8:26| 3.39| 17850|United Kingdom|
| 536365| 22752|SET 7 BABUSHKA NE...| -2|12/1/2010 8:26| 7.65|
I need to remove all the negative numbers in the Quantity column.
2 Answers
1tuwyuhd1#
I suspect you are using Python. I tried it, and it works in both PySpark and Scala:
Python:
Scala:
Both give the same result:
niwlg2el2#
I tried it in Scala with your data (it behaves the same way in Python), and it works fine:
Output: