Pandas：将excel列转换为pandas

kupeojn6 于 2023-06-20 发布在其他

关注(0)|答案(1)|浏览(141)

我正在把我的excel转换成python，下面是excel中的行
符号（B列）：列具有不同的符号
日期（C列）：列包含日期和时间
价格（D栏）：列有价格
下限（E列）：列已提及“下限”或空单元格
下限值（F列）：列在F2单元格中有此公式-
IF（B2<>B1，D2，IF（E2=“Lower Low”，D2，F1））
我试图使用pandan Dataframe转换F2中的公式，并将其应用到每个Symbol列的组：
为上述列创建mod_df DataFrame：
mod_df['Lower_Value'] = np.where（mod_df['Symbol']！= mod_df['Symbol'].shift（1），mod_df['Price']，np.where（mod_df['Lower Low'] == 'Lower Low'，mod_df['Price']，mod_df['Lower_Value'].shift（1）））
我得到这个错误：KeyError：'Lower_Value'，有人能建议如何解决这个问题吗？

data= {'Symbol': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
        'Date': ['2023-05-12 16:00:00', '2023-05-15 15:00:00', '2023-05-15 22:00:00', '2023-05-16 07:00:00',
                 '2023-05-16 14:00:00', '2023-05-17 07:00:00', '2023-05-17 20:00:00', '2023-05-18 02:00:00'],
        'Price': [0.89845,0.90065,0.90042,0.89841,0.89462,0.89437,0.89455,0.89248],
        'Lower Low': ['', '', '', '', '', 'Lower Low', '', ''],
        'Lower Value': [0.89845, 0.89845, 0.89845, 0.89845, 0.89845, 0.89437, 0.89437, 0.89437],

        }

mod_df = pd.DataFrame(data)

print(mod_df)

Dataframe:
**********
Symbol          Date        Price    Lower Low  
A   12-05-2023 16:00:00    0.89845          
A   15-05-2023 15:00:00    0.90065          
A   15-05-2023 22:00:00    0.90042          
A   16-05-2023 07:00:00    0.89841          
A   16-05-2023 14:00:00    0.89462          
A   17-05-2023 07:00:00    0.89437   Lower Low  
A   17-05-2023 20:00:00    0.89455          
A   18-05-2023 02:00:00    0.89248  

Required Output:
****************
Symbol          Date        Price   Lower Low   Lower Value
A   12-05-2023 16:00:00     0.89845             0.89845
A   15-05-2023 15:00:00     0.90065             0.89845
A   15-05-2023 22:00:00     0.90042             0.89845
A   16-05-2023 07:00:00     0.89841             0.89845
A   16-05-2023 14:00:00     0.89462             0.89845
A   17-05-2023 07:00:00     0.89437 Lower Low   0.89437
A   17-05-2023 20:00:00     0.89455             0.89437
A   18-05-2023 02:00:00     0.89248             0.89437

pandas

来源：https://stackoverflow.com/questions/76474643/pandas-converting-excel-column-to-pandas

1条答案

按热度按时间

p1tboqfb1#

假设你已经从你的excel文件中创建了mod_df DataFrame，就像你在问题中写的那样，你根本没有一个名为Lower_Value的键。将其更改为Lower_Values，键应正确

**编辑：**这里是一个更新的示例，它与给定示例数据的所需输出匹配

import pandas as pd
import numpy as np

data= {'Symbol': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
        'Date': ['2023-05-12 16:00:00', '2023-05-15 15:00:00', '2023-05-15 22:00:00', '2023-05-16 07:00:00',
                 '2023-05-16 14:00:00', '2023-05-17 07:00:00', '2023-05-17 20:00:00', '2023-05-18 02:00:00'],
        'Price': [0.89845,0.90065,0.90042,0.89841,0.89462,0.89437,0.89455,0.89248],
        'Lower Low': ['', '', '', '', '', 'Lower Low', '', ''],
        'Lower Value': [0.89845, 0.89845, 0.89845, 0.89845, 0.89845, 0.89437, 0.89437, 0.89437],

        }

mod_df = pd.DataFrame(data)
mod_df.pop('Lower Value')

## This is your starting DataFrame without the "Lower Value" column
print("STARTING DF\n", mod_df)

## Instantiate the "Lower_Value" column
mod_df["Lower Value"] = np.nan
print("\nNOW HAS NEW COLUMN\n", mod_df)

## Now use your logic to populate the "Lower Value" column
mod_df['Lower Value'] = np.where(mod_df['Symbol'] != mod_df['Symbol'].shift(1),
                                 mod_df['Price'],
                                 np.where(mod_df['Lower Low'] == 'Lower Low', 
                                          mod_df['Price'], 
                                          mod_df['Lower Value'].shift(1)))

## Current logic doesn't fill in the entire column, use this to forward-fill the values
mod_df = mod_df.fillna(method='ffill')
print("\nTARGET OUTPUT\n", mod_df)

生产：

STARTING DF
   Symbol                 Date    Price  Lower Low
0      A  2023-05-12 16:00:00  0.89845           
1      A  2023-05-15 15:00:00  0.90065           
2      A  2023-05-15 22:00:00  0.90042           
3      A  2023-05-16 07:00:00  0.89841           
4      A  2023-05-16 14:00:00  0.89462           
5      A  2023-05-17 07:00:00  0.89437  Lower Low
6      A  2023-05-17 20:00:00  0.89455           
7      A  2023-05-18 02:00:00  0.89248           

NOW HAS NEW COLUMN
   Symbol                 Date    Price  Lower Low  Lower Value
0      A  2023-05-12 16:00:00  0.89845                     NaN
1      A  2023-05-15 15:00:00  0.90065                     NaN
2      A  2023-05-15 22:00:00  0.90042                     NaN
3      A  2023-05-16 07:00:00  0.89841                     NaN
4      A  2023-05-16 14:00:00  0.89462                     NaN
5      A  2023-05-17 07:00:00  0.89437  Lower Low          NaN
6      A  2023-05-17 20:00:00  0.89455                     NaN
7      A  2023-05-18 02:00:00  0.89248                     NaN

TARGET OUTPUT
   Symbol                 Date    Price  Lower Low  Lower Value
0      A  2023-05-12 16:00:00  0.89845                 0.89845
1      A  2023-05-15 15:00:00  0.90065                 0.89845
2      A  2023-05-15 22:00:00  0.90042                 0.89845
3      A  2023-05-16 07:00:00  0.89841                 0.89845
4      A  2023-05-16 14:00:00  0.89462                 0.89845
5      A  2023-05-17 07:00:00  0.89437  Lower Low      0.89437
6      A  2023-05-17 20:00:00  0.89455                 0.89437
7      A  2023-05-18 02:00:00  0.89248                 0.89437

赞(0）回复(0）举报 2023-06-20

我来回答

Pandas：将excel列转换为pandas

1条答案

相关问题

热门标签

最新问答