pandas: unable to remove index-like data from DataFrame columns

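I have a 104 x 4 DataFrame, but columns 3 and 4 carry what looks like index data, and I need to remove it. Below is a sample from the 104 x 4 DataFrame. The numbers starting at 279978, then 568296, and so on, need to be removed. I tried resetting the index with drop=True, but it did nothing. Thanks, everyone.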

week_number plant_name  normalized_wind_speed   normalized_temperature
0   52  MONTAGUE    "0         0.701286
1         0.767225
2         0.789204
3         0.921082
4         1.074940
            ...   
279978   -1.101045
279979   -0.969167
279980   -0.947187
279981   -1.035106
279982   -1.057085
Name: wind_speed_ms, Length: 5520, dtype: float64"  "0         0.933228
1         0.951533
2         1.043057
3         1.043057
4         0.988143
            ...   
279978   -2.746031
279979   -2.746031
279980   -2.727726
279981   -2.764335
279982   -2.855859
Name: air_temp_c, Length: 5520, dtype: float64"
1   52  STAR POINT  "288264   -0.131748
288265   -0.078411
288266    0.054931
288267   -0.051743
288268    0.454959
            ...   
568296   -1.411837
568297   -1.251826
568298   -1.331832
568299   -1.385169
568300   -1.305163
Name: wind_speed_ms, Length: 5520, dtype: float64"  "288264    0.969358
288265    0.988484
288266    0.988484
288267    0.950232
288268    0.873729
            ...   
568296   -2.836685
568297   -2.817559
568298   -2.741056
568299   -2.721930
568300   -2.721930
Name: air_temp_c, Length: 5520, dtype: float64"

Here is a picture of the data as I see it in my editor:

Here is the output showing more of the data:

print(norm_data.head().to_dict("list"))
{'week_number': [52, 52, 51, 51, 50], 'plant_name': ['MONTAGUE', 'STAR POINT', 'MONTAGUE', 'STAR POINT', 'MONTAGUE'], 'normalized_wind_speed': [0         0.701286
1         0.767225
2         0.789204
3         0.921082
4         1.074940
  
279978   -1.101045
279979   -0.969167
279980   -0.947187
279981   -1.035106
279982   -1.057085
Name: wind_speed_ms, Length: 5520, dtype: float64, 288264   -0.131748
288265   -0.078411
288266    0.054931
288267   -0.051743
288268    0.454959
  
568296   -1.411837
568297   -1.251826
568298   -1.331832
568299   -1.385169
568300   -1.305163
Name: wind_speed_ms, Length: 5520, dtype: float64, 137      -0.437615
138      -0.574035
139      -0.733191
140      -0.801401
141      -0.733191
  
279738    1.040267
279739    0.972058
279740    0.858374
279741    0.790164
279742    0.767428
Name: wind_speed_ms, Length: 5544, dtype: float64, 288376   -1.123354
288377   -1.038684
288378   -1.123354
288379   -1.038684
288380   -0.925791
  
568357   -1.716045
568358   -0.869344
568359   -0.700004
568360   -0.558887
568361   -0.530664
Name: wind_speed_ms, Length: 5544, dtype: float64, 224       1.381725
225       1.559191
226       1.537008
227       1.448275
228       1.359542
  
280304   -0.259837
280305   -0.503853
280306   -0.636953
280307   -0.636953
280308   -0.171104
Name: wind_speed_ms, Length: 5544, dtype: float64], 'normalized_temperature': [0         0.933228
1         0.951533
2         1.043057
3         1.043057
4         0.988143
  
279978   -2.746031
279979   -2.746031
279980   -2.727726
279981   -2.764335
279982   -2.855859
Name: air_temp_c, Length: 5520, dtype: float64, 288264    0.969358
288265    0.988484
288266    0.988484
288267    0.950232
288268    0.873729
  
568296   -2.836685
568297   -2.817559
568298   -2.741056
568299   -2.721930
568300   -2.721930
Name: air_temp_c, Length: 5520, dtype: float64, 137      -0.428008
138      -0.375312
139      -0.498269
140      -0.691487
141      -0.884705
  
279738    0.239473
279739    0.221907
279740    0.186777
279741    0.186777
279742    0.221907
Name: air_temp_c, Length: 5544, dtype: float64, 288376   -2.485504
288377   -2.688291
288378   -2.872642
288379   -3.038558
288380   -3.167603
  
568357   -3.591611
568358   -3.683786
568359   -3.720657
568360   -3.646916
568361   -3.462565
Name: air_temp_c, Length: 5544, dtype: float64, 224       0.445641
225       0.408846
226       0.335257
227       0.224873
228       0.059297
  
280304    1.199931
280305    1.181534
280306    1.181534
280307    1.236726
280308    1.236726
Name: air_temp_c, Length: 5544, dtype: float64]}

This is the loop that produces the DataFrame shown above:

for week in week_numbers:

    # Get the data for the current week number
    current_week_data = ncDatad[ncDatad['week'] == week]

    # Loop over each plant name
    for site in sites:

        # Get the data for the current plant name
        current_plant_data = current_week_data[ncDatad['plant_name'] == site]

        # Calculate the mean and standard deviation for wind speed
        wind_speed_mean = current_plant_data['wind_speed_ms'].mean()
        wind_speed_std = current_plant_data['wind_speed_ms'].std()

        # Calculate the mean and standard deviation for temperature
        temperature_mean = current_plant_data['air_temp_c'].mean()
        temperature_std = current_plant_data['air_temp_c'].std()

        # Normalize the wind speed values
        normalized_wind_speed = (current_plant_data['wind_speed_ms'] - wind_speed_mean) / wind_speed_std

        # Normalize the temperature values
        normalized_temperature = (current_plant_data['air_temp_c'] - temperature_mean) / temperature_std

        # Create a new row for the current plant name
        new_row = {
            'week_number': week,
            'plant_name': site,
            'normalized_wind_speed': normalized_wind_speed,
            'normalized_temperature': normalized_temperature
        }

        # Add the new row to the dataframe
        norm_data = norm_data.append(new_row, ignore_index=True)

# Print the normalized values
norm_data = norm_data.reset_index(drop=True)
print(norm_data)
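
For context, a minimal sketch of why reset_index(drop=True) appears to do nothing here. It assumes each cell of the normalized columns holds a whole Series, as the to_dict output above suggests. The toy values and index labels below are made up for illustration.

```python
import pandas as pd

# Toy stand-in for one group's normalized values (made-up numbers); the index
# labels mimic row labels carried over from the source frame.
values = pd.Series([0.70, 0.77, -1.06], index=[0, 1, 279982], name="wind_speed_ms")

# One row whose cell is the Series itself, mirroring how the loop builds new_row.
row = {"week_number": 52, "plant_name": "MONTAGUE", "normalized_wind_speed": values}
norm_data = pd.DataFrame([row])

cell = norm_data.loc[0, "normalized_wind_speed"]
print(type(cell).__name__)  # Series
print(list(cell.index))     # [0, 1, 279982]

# reset_index only rebuilds norm_data's own row index; the Series stored
# inside the cell keeps its labels.
after = norm_data.reset_index(drop=True)
print(list(after.loc[0, "normalized_wind_speed"].index))  # [0, 1, 279982]
```

In other words, the numbers are the inner Series' own index, not the DataFrame's index, so they have to be dropped when the cell values are created.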

IIUC, you need to modify your code slightly (by using to_list) so that each cell holds a list instead of a Series:

data = [] # <-- added

for week in week_numbers:
    ... # rest of your code
        new_row = {
            'week_number': week,
            'plant_name': site,
            'normalized_wind_speed': normalized_wind_speed.to_list(), # <-- updated
            'normalized_temperature': normalized_temperature.to_list() # <-- updated
        }
        
        data.append(new_row) # <-- updated

norm_data = pd.DataFrame(data) # <-- updated

Output:

print(norm_data)

   week_number  plant_name          normalized_wind_speed         normalized_temperature
0           52    MONTAGUE  [0.701286, 0.767225, 0.789...  [0.933228, 0.951533, 1.043...
1           52  STAR POINT  [-0.131748, -0.078411, 0.0...  [0.969358, 0.988484, 0.988...
2           51    MONTAGUE  [-0.437615, -0.574035, -0....  [-0.428008, -0.375312, -0....
3           51  STAR POINT  [-1.123354, -1.038684, -1....  [-2.485504, -2.688291, -2....
4           50    MONTAGUE  [1.381725, 1.559191, 1.537...  [0.445641, 0.408846, 0.335...
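
To try the idea without the original data, here is a self-contained sketch on synthetic values. The contents of ncDatad, the zscore helper, and the long_data name are assumptions made for illustration; only the column names come from the question. Option 1 follows the list-per-cell approach above. Option 2 is an alternative, not part of the answer: it keeps one row per observation and normalizes within each group using groupby(...).transform.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for ncDatad (made-up values; column names from the question).
rng = np.random.default_rng(0)
ncDatad = pd.DataFrame({
    "week": np.repeat([52, 51], 6),
    "plant_name": np.tile(np.repeat(["MONTAGUE", "STAR POINT"], 3), 2),
    "wind_speed_ms": rng.normal(8.0, 2.0, 12),
    "air_temp_c": rng.normal(10.0, 5.0, 12),
})

def zscore(s):
    # Hypothetical helper: same formula as the loop, (x - mean) / sample std.
    return (s - s.mean()) / s.std()

# Option 1: one row per (week, plant); each cell is a plain Python list,
# so no inner index labels are carried along.
rows = []
for (week, site), grp in ncDatad.groupby(["week", "plant_name"], sort=False):
    rows.append({
        "week_number": week,
        "plant_name": site,
        "normalized_wind_speed": zscore(grp["wind_speed_ms"]).to_list(),
        "normalized_temperature": zscore(grp["air_temp_c"]).to_list(),
    })
norm_data = pd.DataFrame(rows)

# Option 2: keep one row per observation and normalize within each group.
groups = ncDatad.groupby(["week", "plant_name"])
long_data = ncDatad[["week", "plant_name"]].copy()
long_data["normalized_wind_speed"] = groups["wind_speed_ms"].transform(zscore)
long_data["normalized_temperature"] = groups["air_temp_c"].transform(zscore)

print(norm_data.shape)  # (4, 4)
print(long_data.shape)  # (12, 4)
```

Which shape is preferable depends on what happens next: list cells keep one row per week and plant, while the long form is usually easier to filter, plot, or aggregate further.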
