Filtering for non-equal values in PySpark with where(array_contains())

whhtz7ly  posted on 2021-07-14  in Spark

I have the following PySpark code:

condition_no_hypertension = condition.\
where(array_contains('clinicalStatus.coding.code', 'active')).\
where(array_contains('verificationStatus.coding.code', 'confirmed')).\
where(array_contains('code.coding.code', '38341003')).\
where(condition.onsetDateTime > '1900-01-01').\
withColumn('condition_status', condition['clinicalStatus.coding.code'].getItem(0)).\
withColumn('verification_status', condition['verificationStatus.coding.code'].getItem(0)).\
withColumn('snomed_code', condition['code.coding.code'].getItem(0)).\
withColumn('snomed_name', condition['code.coding.display'].getItem(0)).\
select(\
   (condition['subject.reference'].substr(10, 40).alias('patient_id')),
   'condition_status',\
   'verification_status',\
   'snomed_code', \
   'snomed_name',\
   to_date(condition['onsetDateTime']).alias('first_observation_date'))

How can I change this code to get everything except the code 38341003?
I tried:

where(array_contains('code.coding.code', !='38341003')).\

but it doesn't work.

tct7dpnv  #1

You can negate the condition with ~ (logical NOT):

where(~array_contains('code.coding.code', '38341003'))
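
To see which rows the negated filter keeps, here is a small plain-Python model of it. It is only a sketch and does not use Spark: the sample rows, the patient ids and the helper functions are made up for illustration, and the NULL handling is a simplified imitation of how Spark evaluates array_contains and NOT.

```python
# Plain-Python model of where(~array_contains(...)); no Spark required.
# The rows are hypothetical and mirror the nested field code.coding.code.
rows = [
    {"patient_id": "p1", "codes": ["38341003"]},              # only the excluded code
    {"patient_id": "p2", "codes": ["44054006"]},              # a different code
    {"patient_id": "p3", "codes": ["44054006", "38341003"]},  # excluded code in second position
    {"patient_id": "p4", "codes": None},                      # missing array (NULL in Spark)
]

EXCLUDED = "38341003"


def array_contains(values, target):
    """Imitate array_contains: None (NULL) when the array itself is None."""
    if values is None:
        return None
    return target in values


def negate(flag):
    """Imitate SQL NOT under three-valued logic: NOT NULL stays NULL."""
    return None if flag is None else not flag


def where(rows, predicate):
    """Imitate DataFrame.where: keep a row only when the predicate is exactly True."""
    return [row for row in rows if predicate(row) is True]


kept = where(rows, lambda row: negate(array_contains(row["codes"], EXCLUDED)))
kept_ids = [row["patient_id"] for row in kept]
print(kept_ids)
```

Two things to keep in mind. array_contains matches the code at any position in the array, so a row like p3 is excluded even though getItem(0) would return a different code for it. And a row whose array is NULL is dropped by the negated filter as well; if you want to keep such rows, one option is to wrap the test, for example ~coalesce(array_contains('code.coding.code', '38341003'), lit(False)), with coalesce and lit imported from pyspark.sql.functions.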
