Pandas:将excel列转换为pandas

kupeojn6  于 2023-06-20  发布在  其他
关注(0)|答案(1)|浏览(141)

我正在把我的excel转换成python,下面是excel中的行
符号(B列):列具有不同的符号
日期(C列):列包含日期和时间
价格(D栏):列有价格
下限(E列):列已提及“下限”或空单元格
下限值(F列):列在F2单元格中有此公式-
IF(B2<>B1,D2,IF(E2=“Lower Low”,D2,F1))
我试图使用pandan Dataframe转换F2中的公式,并将其应用到每个Symbol列的组:
为上述列创建mod_df DataFrame:
mod_df['Lower_Value'] = np.where(mod_df['Symbol']!= mod_df['Symbol'].shift(1),mod_df['Price'],np.where(mod_df['Lower Low'] == 'Lower Low',mod_df['Price'],mod_df['Lower_Value'].shift(1)))
我得到这个错误:KeyError:'Lower_Value',有人能建议如何解决这个问题吗?

data= {'Symbol': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
        'Date': ['2023-05-12 16:00:00', '2023-05-15 15:00:00', '2023-05-15 22:00:00', '2023-05-16 07:00:00',
                 '2023-05-16 14:00:00', '2023-05-17 07:00:00', '2023-05-17 20:00:00', '2023-05-18 02:00:00'],
        'Price': [0.89845,0.90065,0.90042,0.89841,0.89462,0.89437,0.89455,0.89248],
        'Lower Low': ['', '', '', '', '', 'Lower Low', '', ''],
        'Lower Value': [0.89845, 0.89845, 0.89845, 0.89845, 0.89845, 0.89437, 0.89437, 0.89437],

        }

mod_df = pd.DataFrame(data)

print(mod_df)

Dataframe:
**********
Symbol          Date        Price    Lower Low  
A   12-05-2023 16:00:00    0.89845          
A   15-05-2023 15:00:00    0.90065          
A   15-05-2023 22:00:00    0.90042          
A   16-05-2023 07:00:00    0.89841          
A   16-05-2023 14:00:00    0.89462          
A   17-05-2023 07:00:00    0.89437   Lower Low  
A   17-05-2023 20:00:00    0.89455          
A   18-05-2023 02:00:00    0.89248  

Required Output:
****************
Symbol          Date        Price   Lower Low   Lower Value
A   12-05-2023 16:00:00     0.89845             0.89845
A   15-05-2023 15:00:00     0.90065             0.89845
A   15-05-2023 22:00:00     0.90042             0.89845
A   16-05-2023 07:00:00     0.89841             0.89845
A   16-05-2023 14:00:00     0.89462             0.89845
A   17-05-2023 07:00:00     0.89437 Lower Low   0.89437
A   17-05-2023 20:00:00     0.89455             0.89437
A   18-05-2023 02:00:00     0.89248             0.89437
p1tboqfb

p1tboqfb1#

假设你已经从你的excel文件中创建了mod_df DataFrame,就像你在问题中写的那样,你根本没有一个名为Lower_Value的键。将其更改为Lower_Values,键应正确

**编辑:**这里是一个更新的示例,它与给定示例数据的所需输出匹配

import pandas as pd
import numpy as np

data= {'Symbol': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
        'Date': ['2023-05-12 16:00:00', '2023-05-15 15:00:00', '2023-05-15 22:00:00', '2023-05-16 07:00:00',
                 '2023-05-16 14:00:00', '2023-05-17 07:00:00', '2023-05-17 20:00:00', '2023-05-18 02:00:00'],
        'Price': [0.89845,0.90065,0.90042,0.89841,0.89462,0.89437,0.89455,0.89248],
        'Lower Low': ['', '', '', '', '', 'Lower Low', '', ''],
        'Lower Value': [0.89845, 0.89845, 0.89845, 0.89845, 0.89845, 0.89437, 0.89437, 0.89437],

        }

mod_df = pd.DataFrame(data)
mod_df.pop('Lower Value')

## This is your starting DataFrame without the "Lower Value" column
print("STARTING DF\n", mod_df)

## Instantiate the "Lower_Value" column
mod_df["Lower Value"] = np.nan
print("\nNOW HAS NEW COLUMN\n", mod_df)

## Now use your logic to populate the "Lower Value" column
mod_df['Lower Value'] = np.where(mod_df['Symbol'] != mod_df['Symbol'].shift(1),
                                 mod_df['Price'],
                                 np.where(mod_df['Lower Low'] == 'Lower Low', 
                                          mod_df['Price'], 
                                          mod_df['Lower Value'].shift(1)))

## Current logic doesn't fill in the entire column, use this to forward-fill the values
mod_df = mod_df.fillna(method='ffill')
print("\nTARGET OUTPUT\n", mod_df)

生产:

STARTING DF
   Symbol                 Date    Price  Lower Low
0      A  2023-05-12 16:00:00  0.89845           
1      A  2023-05-15 15:00:00  0.90065           
2      A  2023-05-15 22:00:00  0.90042           
3      A  2023-05-16 07:00:00  0.89841           
4      A  2023-05-16 14:00:00  0.89462           
5      A  2023-05-17 07:00:00  0.89437  Lower Low
6      A  2023-05-17 20:00:00  0.89455           
7      A  2023-05-18 02:00:00  0.89248           

NOW HAS NEW COLUMN
   Symbol                 Date    Price  Lower Low  Lower Value
0      A  2023-05-12 16:00:00  0.89845                     NaN
1      A  2023-05-15 15:00:00  0.90065                     NaN
2      A  2023-05-15 22:00:00  0.90042                     NaN
3      A  2023-05-16 07:00:00  0.89841                     NaN
4      A  2023-05-16 14:00:00  0.89462                     NaN
5      A  2023-05-17 07:00:00  0.89437  Lower Low          NaN
6      A  2023-05-17 20:00:00  0.89455                     NaN
7      A  2023-05-18 02:00:00  0.89248                     NaN

TARGET OUTPUT
   Symbol                 Date    Price  Lower Low  Lower Value
0      A  2023-05-12 16:00:00  0.89845                 0.89845
1      A  2023-05-15 15:00:00  0.90065                 0.89845
2      A  2023-05-15 22:00:00  0.90042                 0.89845
3      A  2023-05-16 07:00:00  0.89841                 0.89845
4      A  2023-05-16 14:00:00  0.89462                 0.89845
5      A  2023-05-17 07:00:00  0.89437  Lower Low      0.89437
6      A  2023-05-17 20:00:00  0.89455                 0.89437
7      A  2023-05-18 02:00:00  0.89248                 0.89437

相关问题