Python Notes: A Simple Analysis of the CSI 300 and Kweichow Moutai (2021 Data)

x33g5p2x, reposted on 2022-02-28 in Python

Downloading the data

Two datasets are downloaded. First, the 2021 data for the CSI 300 (沪深300) index:

import baostock as bs
import pandas as pd

if __name__ == '__main__':
    #### Log in ####
    lg = bs.login()
    # Show the login response
    print('login respond error_code:' + lg.error_code)
    print('login respond  error_msg:' + lg.error_msg)

    #### Fetch historical K-line data ####
    # For the full field list, see the "Historical market data fields" (历史行情指标参数) section
    rs = bs.query_history_k_data_plus("000300.SH",
                                      "date,code,open,high,low,close,preclose,volume,amount,adjustflag,turn,tradestatus,pctChg,peTTM,pbMRQ,psTTM,pcfNcfTTM,isST",
                                      start_date='2021-01-01', end_date='2021-12-31')  # defaults: frequency="d" (daily bars), adjustflag="3" (unadjusted prices)
    print('query_history_k_data_plus respond error_code:' + rs.error_code)
    print('query_history_k_data_plus respond  error_msg:' + rs.error_msg)

    #### Collect the result set ####
    data_list = []
    while rs.error_code == '0' and rs.next():
        # Fetch one record at a time and append it to the list
        data_list.append(rs.get_row_data())
    result = pd.DataFrame(data_list, columns=rs.fields)
    #### Write the result set to a CSV file ####
    result.to_csv("D:/stock/000300.SH.csv", encoding="gbk", index=False)
    print(result)

    #### Log out ####
    bs.logout()

Then the 2021 data for Kweichow Moutai (茅台):

import baostock as bs
import pandas as pd

if __name__ == '__main__':
    #### Log in ####
    lg = bs.login()
    # Show the login response
    print('login respond error_code:' + lg.error_code)
    print('login respond  error_msg:' + lg.error_msg)

    #### Fetch historical K-line data ####
    # For the full field list, see the "Historical market data fields" (历史行情指标参数) section
    rs = bs.query_history_k_data_plus("600519.SH",
                                      "date,code,open,high,low,close,preclose,volume,amount,adjustflag,turn,tradestatus,pctChg,peTTM,pbMRQ,psTTM,pcfNcfTTM,isST",
                                      start_date='2021-01-01', end_date='2021-12-31')  # defaults: frequency="d" (daily bars), adjustflag="3" (unadjusted prices)
    print('query_history_k_data_plus respond error_code:' + rs.error_code)
    print('query_history_k_data_plus respond  error_msg:' + rs.error_msg)

    #### Collect the result set ####
    data_list = []
    while rs.error_code == '0' and rs.next():
        # Fetch one record at a time and append it to the list
        data_list.append(rs.get_row_data())
    result = pd.DataFrame(data_list, columns=rs.fields)
    #### Write the result set to a CSV file ####
    result.to_csv("D:/stock/600519.SH.csv", encoding="gbk", index=False)
    print(result)

    #### Log out ####
    bs.logout()
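
The two download scripts differ only in the security code and the output file name, so the row-collection loop could be factored out. Below is a minimal sketch of that idea, not the author's code: `result_to_frame` and `FakeResult` are hypothetical names, and the sketch assumes only the result-set members the scripts above already use (`error_code`, `error_msg`, `fields`, `next()`, `get_row_data()`). The stub and its rows are made up so the example runs without logging in.

```python
import pandas as pd


def result_to_frame(rs):
    """Drain a result set shaped like the one above into a DataFrame (hypothetical helper)."""
    if rs.error_code != '0':
        raise RuntimeError('query failed: ' + rs.error_code + ' ' + rs.error_msg)
    rows = []
    while rs.next():
        rows.append(rs.get_row_data())
    return pd.DataFrame(rows, columns=rs.fields)


class FakeResult:
    """Stand-in with made-up rows so the sketch runs without a network login."""

    def __init__(self, fields, rows):
        self.error_code = '0'
        self.error_msg = 'success'
        self.fields = fields
        self._rows = rows
        self._pos = -1

    def next(self):
        # Advance the cursor; report whether a row is available
        self._pos += 1
        return self._pos < len(self._rows)

    def get_row_data(self):
        return self._rows[self._pos]


fake = FakeResult(['date', 'code', 'pctChg'],
                  [['2021-01-04', '600519.SH', '0.5'],
                   ['2021-01-05', '600519.SH', '-0.25']])
demo = result_to_frame(fake)
# The stub hands back strings, so numeric columns are converted explicitly
demo['pctChg'] = pd.to_numeric(demo['pctChg'])
print(demo.dtypes)
```

With a real session, the same helper would be called on the object returned by `bs.query_history_k_data_plus(...)` between `bs.login()` and `bs.logout()`.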

Analyzing the data in Python

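# -*- coding: utf-8 -*-

import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

if __name__ == '__main__':
    # The download scripts above saved these files under D:/stock/;
    # adjust the paths if the CSVs are not in the working directory.
    hs300 = pd.read_csv("000300.SH.csv", index_col="date")
    maoTai = pd.read_csv("600519.SH.csv", index_col="date")

    # Put the two daily-return (pctChg) columns side by side, aligned on date
    stock_list = [maoTai, hs300]
    df = pd.concat([stock.pctChg for stock in stock_list], axis=1)
    df.columns = ['maoTai', 'hs300']
    df = df.sort_index(ascending=True)
    print(df.describe())

    # Fill missing values with 0 (optional: product() skips NaN by default)
    df = df.fillna(0)
    # Compound the daily returns into a cumulative return
    returns = (df + 1).product() - 1
    print('累计收益率\n', returns)  # the label means "cumulative return"

    sns.set()
    # SimHei is a Chinese font, needed to render the plot title below
    mpl.rcParams['font.family'] = 'sans-serif'
    mpl.rcParams['font.sans-serif'] = 'SimHei'

    plt.figure(figsize=(10, 5))
    for col in df.columns:
        plt.plot(df[col], label=col)
    plt.title('日收益率时序图(2021)', fontsize=20)  # "Daily return time series (2021)"
    plt.legend()
    plt.show()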

Console output:

D:\python\content\python.exe D:/PythonProject/demo/demo22.py
           maoTai       hs300
count  243.000000  243.000000
mean     0.000420   -0.000151
std      0.023567    0.011708
min     -0.069911   -0.035325
25%     -0.012650   -0.006741
50%      0.000323    0.000398
75%      0.014569    0.006918
max      0.095041    0.031595
累计收益率
 maoTai    0.035688
hs300    -0.051986
dtype: float64

The daily return time-series plot is shown below:

Explanation:

① df.describe() prints the following:

           maoTai       hs300
count  243.000000  243.000000
mean     0.000420   -0.000151
std      0.023567    0.011708
min     -0.069911   -0.035325
25%     -0.012650   -0.006741
50%      0.000323    0.000398
75%      0.014569    0.006918
max      0.095041    0.031595

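What each row means:

count: the number of non-missing rows in each column;

mean: the arithmetic mean of the column;

std: the standard deviation (the square root of the variance; pandas uses the sample form, dividing by n-1);

min: the smallest value;

25%: the first quartile: 25% of the samples are at or below this value;

50%: the median: half of the samples are at or below this value;

75%: the third quartile: 75% of the samples are at or below this value;

max: the largest value.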
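
To make the percentile rows concrete, here is a small check on made-up data (the integers 1 to 100, not the stock returns). It is only a sketch: it measures what share of the samples lies at or below each reported quartile and confirms that `std` matches the sample standard deviation.

```python
import pandas as pd

# Toy data: the integers 1..100 as floats (illustrative, not market data)
s = pd.Series(range(1, 101), dtype=float)
summary = s.describe()
print(summary)

# Share of samples at or below each reported quartile
share_at_or_below_q25 = (s <= summary['25%']).mean()
share_at_or_below_q75 = (s <= summary['75%']).mean()
print(share_at_or_below_q25, share_at_or_below_q75)

# describe() reports the sample standard deviation (ddof=1)
std_matches_sample_form = abs(summary['std'] - s.std(ddof=1)) < 1e-12
print(std_matches_sample_form)
```

On this toy series a quarter of the values sit at or below the 25% row and three quarters at or below the 75% row, which is the reading to apply to the maoTai and hs300 columns as well.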

② product() computes the product of the values in each column. The code uses it like this:

returns = (df + 1).product() - 1
print('累计收益率\n', returns)

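Here the original df holds one daily return per row for each column. Adding 1 to each return turns it into a growth factor, multiplying the factors down each column compounds them, and subtracting 1 from the product gives the cumulative return.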
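
The compounding step can be checked by hand on a tiny made-up table. The sketch below uses hypothetical column names and returns, and it assumes the returns are fractions (0.10 means 10%), as in the console output above. If a pctChg column is stored in percent units instead, it would need dividing by 100 first.

```python
import pandas as pd

# Three made-up daily returns per column, expressed as fractions
daily = pd.DataFrame(
    {'a': [0.10, -0.05, 0.02], 'b': [0.0, 0.01, -0.01]},
    index=['2021-01-04', '2021-01-05', '2021-01-06'],
)

# Same expression as in the article: compound, then subtract the starting 1
total = (daily + 1).product() - 1
print(total)

# cumprod() gives the running cumulative return; its last row equals total
path = (daily + 1).cumprod() - 1
print(path)

# Summing the returns instead ignores compounding and gives a different number
naive_sum = daily.sum()
print(naive_sum)
```

For column a the factors are 1.10, 0.95 and 1.02, whose product is 1.0659, so the cumulative return is about 6.59%, slightly less than the 7% obtained by simply adding the three returns.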
