我有列数可变的数据。我希望能够排除所有包含以"9999"结尾的数字的行。
样品输入:
Customer Reference,Profile Name,Score,Band Text,Result1,Result2,Result3,Result4,Result5,Result6,Result7,Result8,Result9,Result10,Result11,Result12,Result13,Result14,Result15,Result16,Result17,Result18,Result19,Result20,Result21,Result22,Result23,Result24,Result25,Result26,Result27,Result28,Result29,Result30,Result31,Result32,Result33,Result34,Result35,Result36,Result37,Result38,Result39,Result40,Result41,Result42,Result43,Result44,Result45,Result46,Result47,Result48,Result49
038ff126-1ed5-4a96-bb34-3f4b595228d3,UK,1200,PASS - 2+2,155261,155101,155151,155161,155271,155251,1551001,1551101,1557031,1559500,1558101,395102,3953500,3952260,3952100,3952101,3952111,3953800,3953760,3953512,3953522,3956611,3959600,3959601,3963505,3963513,3963531,3963507,1062,2452,1332,,,,,,,,,,,,,,,,,,
87529660,Germany,1111,APPROVED,2289528,401126,401102,401423,401424,401426,4012100,4012101,4012111,4012260,4013500,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
37a52968-8093-41e5-8a2e-6bd251d0666d,UK,2200,PASS - 2+2,155261,155101,155151,155161,1551001,1551101,1551111,1551121,1551071,1551072,1553520,1556518,1557031,395102,3953500,3952260,3952100,3952101,3952111,3953512,3953521,3956760,3956600,3956601,3956611,3963504,3963508,3963512,3963526,3963529,3963500,3964510,1062,2452,1332,,,,,,,,,,,,,,
138629690,Germany,1111,APPROVED,2283524,2283525,2282111,2282100,2282101,2282263,2282264,2282266,2282260,4012,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1d45f78b-01c5-4007-8f8c-a9fb845cba1f,UK,1300,PASS - 2+2,155261,155101,155151,155161,155131,1551001,1551011,1551021,1553508,1551101,1551111,1551121,1551071,1551072,1553522,3952,3962,1062,2452,1332,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
a56b590b-b8bd-4e56-987e-f801a37e487d,UK,1300,PASS - 2+2,155261,155101,155151,155161,155131,1551001,1551011,1551021,1553508,1559999,1551111,1551121,1551071,1551072,1553522,1556514,3952,3962,1062,2452,1332,,,,,,,,,,,,,,,,,,,,,,,,,,,,
138888892,Germany,2221,APPROVED,2283525,2282111,2282100,2282101,2284530,2284266,2288263,2288264,2288260,401126,401102,401423,401424,401426,4012100,4012101,4012111,4012260,4013500,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
请注意,不同的文件具有不同数量的"结果"列。
我能找到的唯一有效的方法是这样的:
filter1 = data.Result1.astype(str).str.contains('9999')
filter2 = data.Result2.astype(str).str.contains('9999')
filter3 = data.Result3.astype(str).str.contains('9999')
filter4 = data.Result4.astype(str).str.contains('9999')
然后像这样应用过滤器
data.where(filter1 | filter2 | filter3 | filter4 ... etc
显然,这是痛苦的,我需要改变每个文件的过滤器数量。
我肯定有办法做这个,但我还没找到。有人能帮忙吗?
3条答案
按热度按时间qnzebej01#
连接所有Result* 列并搜索该新列
xwbd5t1u2#
我认为您可以使用
replace
和dropna
的组合。here的部分答案:
1.首先,将以9999结尾的所有值替换为np.nan的
1.你放下了NaN
它应该看起来像这样:
mklgxw1f3#
以下方法仅考虑标记为
Result
的列,并过滤掉那些列值以9999
结尾的行: