How to apply math to a Pandas DataFrame, comparing 2 specific row and column indexes

v7pvogib  posted on 2022-12-02 in Other

I have this dataframe

import pandas as pd
import numpy as np
np.random.seed(2022)

# make example data
close = np.sin(range(610)) + 10
high = close + np.random.rand(*close.shape)
open = high - np.random.rand(*close.shape)
low = high - 3
close[2] += 100  
dates = pd.date_range(end='2022-06-30', periods=len(close))

# insert into pd.dataframe
df = pd.DataFrame(index=dates, data=np.array([open, high, low, close]).T, columns=['Open', 'High', 'Low', 'Close'])
print(df)

Output

                 Open       High       Low       Close
2020-10-29   9.557631  10.009359  7.009359   10.000000
2020-10-30  10.794789  11.340529  8.340529   10.841471
2020-10-31  10.631242  11.022681  8.022681  110.909297
2020-11-01   9.639562  10.191094  7.191094   10.141120
2020-11-02   9.835697   9.928605  6.928605    9.243198
...               ...        ...       ...         ...
2022-06-26  10.738942  11.167593  8.167593   10.970521
2022-06-27  10.031187  10.868859  7.868859   10.321565
2022-06-28   9.991932  10.271633  7.271633    9.376964
2022-06-29   9.069759   9.684232  6.684232    9.005179
2022-06-30   9.479291  10.300242  7.300242    9.548028

The goal is to compare a specific value in the DataFrame with another value in the same DataFrame.
Edit: I now know several ways to achieve this, but I have rewritten the question so that the original goal is clearer for future readers.
For example: check when the value in the 'Open' column is less than the value in the 'Close' column.
One solution is to use itertuples; I have written an answer below explaining it.


bvjxkvbb1#

The first step can be done with df.loc["A", "High"] > df.loc["C", "Low"]. To apply this to every row, you can do the following:

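for i in range(2, len(df)):
    # .iloc selects by position; plain [] with an integer on a date-indexed Series is deprecated
    print(df["High"].iloc[i - 2] > df["Low"].iloc[i])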

I'm sure there are better ways, but this works.
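
The same comparison can be sketched without an explicit loop by slicing the underlying NumPy arrays so that row i-2 of High lines up with row i of Low. This is only a minimal sketch on a small made-up frame; the values and the names `high`, `low`, `result` and `loop_result` are illustrative assumptions, not part of the question's data.

```python
import numpy as np
import pandas as pd

# Small made-up frame; the values are illustrative only.
df = pd.DataFrame(
    {"High": [5.0, 1.0, 7.0, 2.0, 9.0], "Low": [4.0, 0.5, 3.0, 1.5, 8.0]},
    index=list("ABCDE"),
)

high = df["High"].to_numpy()
low = df["Low"].to_numpy()

# high[:-2] holds rows 0..n-3 and low[2:] holds rows 2..n-1,
# so element k compares High at row k with Low at row k + 2.
result = high[:-2] > low[2:]

# The loop version, collected into a list instead of printed, for comparison.
loop_result = [df["High"].iloc[i - 2] > df["Low"].iloc[i] for i in range(2, len(df))]
print(result.tolist() == loop_result)
```

The result has two fewer elements than the frame, because the last two rows have no row two steps ahead to compare against.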


hwamh0ep2#

You can use shift on a column to move its rows up or down:

`df['High'] > df['Low'].shift(-2)`

To see in detail what is happening, run the following:

df = pd.DataFrame(np.random.randn(5,4), list('ABCDE'), ['Open', 'High', 'Low', 'Close'])
df['Low_shiftup'] = df['Low'].shift(-2)  # Low taken from two rows further down
print(df.head())
print(df['High'] > df['Low_shiftup'])
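
Because the comparison returns a boolean Series aligned to the original index, it can also be used as a mask to pull out the matching rows. The following is a minimal sketch with made-up values; the names `low_ahead`, `mask` and `matches` are illustrative. The last two rows have no row two steps ahead, so `shift(-2)` yields NaN there and the comparison is False.

```python
import pandas as pd

# Small made-up frame; the values are illustrative only.
df = pd.DataFrame(
    {"High": [5.0, 1.0, 7.0, 2.0, 9.0], "Low": [4.0, 0.5, 3.0, 1.5, 8.0]},
    index=list("ABCDE"),
)

# Low from two rows further down, aligned to the current row's label.
low_ahead = df["Low"].shift(-2)

# Comparisons against NaN evaluate to False, so rows D and E never match.
mask = df["High"] > low_ahead

# Keep only the rows where High exceeds the Low two rows later.
matches = df[mask]
print(matches)
```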

kyvafyod3#

As I explained in the question, I have since found several solutions to this problem.
Here is how to solve it with itertuples.
First, create the DataFrame:

import pandas as pd
import numpy as np
np.random.seed(2022)

# make example data
close = np.sin(range(610)) + 10
high = close + np.random.rand(*close.shape)
open = high - np.random.rand(*close.shape)
low = high - 3
close[2] += 100
dates = pd.date_range(end='2022-06-30', periods=len(close))

# insert into pd.dataframe
df = pd.DataFrame(index=dates, data=np.array([open, high, low, close]).T, columns=['Open', 'High', 'Low', 'Close'])
print(df)

Now we iterate over the rows of the DataFrame with itertuples:

for row in df.itertuples():
    o = row.Open
    for r in df.itertuples():
        c = r.Close
        if o < c:
            print('O is less than C')
        else:
            print('O is greater than C')

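This compares the open of each row with the close of every row and reports each case where the open is lower than the close.
By adding more variables and more if statements, and using enumerate to track position, it can be extended to check other conditions in the same loop.
For example: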

for idx, row in enumerate(df.itertuples()):
    o = row.Open
    h = row.High
    for i, r in enumerate(df.itertuples()):
        c = r.Close
        lo = r.Low
        # only look at later rows whose Low is more than 2 below this row's High
        if i > idx and (h - 2) > lo:
            if o < c:
                print('O is less than C')
            else:
                print('O is greater than C')

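The code above uses enumerate to add a counter to each loop. The extra if statement checks "o < c" only for rows where the loop counter of "c" is greater than the loop counter of "o".
As you can see, any value in the DataFrame can be compared with any other value by using the right if statement.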
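
On larger frames the nested loops get slow, because they visit every pair of rows. The same pairwise checks can be sketched with NumPy broadcasting. This is an alternative technique, not the approach used in this answer, and the small frame and the names `later`, `gap_ok`, `open_lt_close`, `hits` and `pairs` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Small made-up frame; the values are illustrative only.
df = pd.DataFrame(
    {
        "Open": [1.0, 5.0, 2.0],
        "High": [6.0, 7.0, 4.0],
        "Low": [0.5, 3.0, 1.0],
        "Close": [3.0, 4.0, 1.5],
    }
)

o = df["Open"].to_numpy()[:, None]   # column vector: outer-loop row (idx)
h = df["High"].to_numpy()[:, None]
c = df["Close"].to_numpy()[None, :]  # row vector: inner-loop row (i)
low = df["Low"].to_numpy()[None, :]

n = len(df)
# later[idx, i] is True where i > idx (strictly above the diagonal).
later = np.triu(np.ones((n, n), dtype=bool), k=1)
gap_ok = (h - 2) > low               # same test as the (h - 2) check in the loop
open_lt_close = o < c

# (idx, i) pairs for which the loop would print 'O is less than C'.
hits = later & gap_ok & open_lt_close
pairs = list(zip(*np.nonzero(hits)))
print(pairs)
```

Each intermediate array has one entry per pair of rows, so memory grows with the square of the frame length; for the 610-row frame in the question that is still small.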
