Python: breaking sessions with a start datetime and end datetime down to the hour level

oxf4rvwz posted on 2023-03-24 in Python

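For the past year I have been tracking my gaming time, just to collect data I care about and to learn Python. Now I would like to know (and plot, but that is not important yet) which hours of the day (0 to 23) I play the most, across the whole period and all activities, for every day since tracking began.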
Sample:
| session_id | game_id | start_datetime | end_datetime |
| --- | --- | --- | --- |
| 001 | 74 | 2023-02-22 13:15:00 | 2023-02-22 15:30:00 |
| 002 | 127 | 2023-02-23 13:30:00 | 2023-02-23 13:45:00 |
| 003 | 74 | 2023-02-24 14:40:00 | 2023-02-24 15:00:00 |
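In the end I would like to see this information (the calculation column is not needed):
| hour_of_day | sum_hours_played | avg_hours_played_per_day | calculation |
| --- | --- | --- | --- |
| 13 | 1.00 | 0.33 | (0.75 + 0.25) / 3 days |
| 14 | 1.33 | 0.44 | (1.00 + 0.33) / 3 days |
| 15 | 0.50 | 0.17 | (0.5) / 3 days |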
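In short, I don't just want to see in which hours I played (played: 1, not played: 0), but also what fraction of each specific hour I played.
I have seen some approaches online, but almost all of them just count or sum yes/no events per month or per day, rather than computing the proportion of a day or an hour.
So I would be glad for any pointers on where to look.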

yb3bgrhw

Setup:

import pandas as pd

# Load your data into a DataFrame
data = {
    'session_id': [1, 2, 3],
    'game_id': [74, 127, 74],
    'start_datetime': ['2023-02-22 13:15:00', '2023-02-23 13:30:00', '2023-02-24 14:40:00'],
    'end_datetime': ['2023-02-22 15:30:00', '2023-02-23 13:45:00', '2023-02-24 15:00:00']
}

df = pd.DataFrame(data)

# Convert the 'start_datetime' and 'end_datetime' columns to datetime objects
df['start_datetime'] = pd.to_datetime(df['start_datetime'])
df['end_datetime'] = pd.to_datetime(df['end_datetime'])

# Calculate the duration of each gaming session
df['duration'] = df['end_datetime'] - df['start_datetime']

# Initialize an empty dictionary to store the hours played
hours_played = {i: 0 for i in range(24)}

The trick is to split each session into hourly pieces:

# Break down each session into hours and sum the proportion of hours played
for _, row in df.iterrows():
    start = row['start_datetime']
    end = row['end_datetime']
    duration = row['duration']

    # Loop over the hours involved
    while start < end:

        # Calculate the end of the hour currently considered
        hour_start = start.replace(minute=0, second=0, microsecond=0)  # Also zero the microseconds so the hour boundary is exact
        hour_end = hour_start + pd.Timedelta(hours=1)

        played = min(hour_end, end) - start  # Take whichever ends first (the hour or the session) and subtract the start time
        hours_played[start.hour] += played.total_seconds() / 3600  # Here add the time played to the current value in the dictionary
        
        start = hour_end  # For the (possible) next iteration of the while loop, set the start to the end of the hour currently considered

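# Calculate the average hours played per day
# Count calendar days (inclusive) between the first start and the last end;
# normalizing to midnight avoids undercounting when the span is not a whole number of 24-hour periods
total_days = (df['end_datetime'].max().normalize() - df['start_datetime'].min().normalize()).days + 1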
avg_hours_played = {hour: hours / total_days for hour, hours in hours_played.items()}

# Create a DataFrame to display the results
results = pd.DataFrame(list(avg_hours_played.items()), columns=['hour_of_day', 'avg_hours_played_per_day'])
results['sum_hours_played'] = [hours_played[hour] for hour in results['hour_of_day']]
results = results[['hour_of_day', 'sum_hours_played', 'avg_hours_played_per_day']]
print(results)
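
To check the splitting logic on its own, here is a minimal sketch that uses only the standard library instead of pandas. The helper name `split_by_hour` is hypothetical (it is not part of the answer above). The day count assumes that "days" means calendar days from the first to the last session, inclusive. It uses the three sample sessions from the question.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def split_by_hour(start, end):
    """Return {hour_of_day: hours played} for one session (hypothetical helper)."""
    hours = defaultdict(float)
    cursor = start
    while cursor < end:
        # Top of the next clock hour after the cursor
        next_hour = cursor.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
        # The piece ends at the hour boundary or at the session end, whichever is first
        piece_end = min(next_hour, end)
        hours[cursor.hour] += (piece_end - cursor).total_seconds() / 3600
        cursor = piece_end
    return dict(hours)


# The three sample sessions from the question
sessions = [
    (datetime(2023, 2, 22, 13, 15), datetime(2023, 2, 22, 15, 30)),
    (datetime(2023, 2, 23, 13, 30), datetime(2023, 2, 23, 13, 45)),
    (datetime(2023, 2, 24, 14, 40), datetime(2023, 2, 24, 15, 0)),
]

# Sum the per-hour pieces over all sessions
totals = defaultdict(float)
for start, end in sessions:
    for hour, played in split_by_hour(start, end).items():
        totals[hour] += played

# Assumption: "days" = calendar days from the first to the last session, inclusive
first_day = min(start for start, _ in sessions).date()
last_day = max(end for _, end in sessions).date()
total_days = (last_day - first_day).days + 1

for hour in sorted(totals):
    print(hour, round(totals[hour], 2), round(totals[hour] / total_days, 2))
```

For the sample data this prints one row each for hours 13, 14 and 15, with sums of 1.0, 1.33 and 0.5 and daily averages of 0.33, 0.44 and 0.17, matching the table in the question. A session that crosses midnight is handled the same way, because each piece is credited to the clock hour in which it starts.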

I hope my comments are understandable.
