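The user can choose whether or not to include certain columns in the groupby statement. If they choose to include a column, its name is appended to a list, so the list can hold anywhere from 1 to 6 values. Three additional columns are always used in the group by statement, but the user-selected columns vary in number.
So far I have tried the following, all of which result in errors: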
categorical_fields = []
card_name_match=input("Do you want to include card name first name matches passenger first name: y/n")
if card_name_match=="y":
    categorical_fields.append("name_match")
fare_class_1_cat=input("Do you want to include Fare Class 1 Category: y/n")
if fare_class_1_cat=="y":
    categorical_fields.append("FARECLASS1_cat")
fare_class_2_cat=input("Do you want to include Fare Class 2 Category: y/n")
if fare_class_2_cat=="y":
    categorical_fields.append("FARECLASS2_cat")
distance_cat=input("Do you want to include Distance: y/n")
if distance_cat=="y":
    categorical_fields.append("distance_category")
int_or_domestic=input("Do you want to include if flight was international or domestic: y/n")
if int_or_domestic=="y":
    categorical_fields.append("international_or_domestic")
journey_type=input("Do you want to include journey type of one way, round trip, or different 2nd arrival destination: y/n")
if journey_type=="y":
    categorical_fields.append("dep_to_arr")
airline_score = airline.groupby([categorical_fields,'category','score','mop']).agg(count=('fs_sham','count'),dollars=('fs_dollars','sum')).reset_index()
ValueError: Grouper and axis must be same length
categorical_fields.extend(['category','score','mop'])
group_columns = airline.groupby(categorical_fields)
airline_score = airline.groupby(group_columns).agg(count=('fs_sham','count'),dollars=('fs_dollars','sum')).reset_index()
ValueError: Grouper for '<class 'pandas.core.groupby.generic.DataFrameGroupBy'>' not 1-dimensional
categorical_fields.extend(['category','score','mop'])
airline_score = airline.groupby(airline.columns.isin([categorical_fields])).agg(count=('fs_sham','count'),dollars=('fs_dollars','sum')).reset_index()
ValueError: Grouper and axis must be same length
2 Answers
z9smfwbn1#
Take a look at your airline_score grouper:
What shape is it? It looks like this:
That is not a 1D list. Instead, build an actual 1D grouper:
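The code that accompanied this answer did not survive in this copy. The following is a minimal sketch of what the answer appears to describe, not the answerer's original code. The two selected column names are hypothetical example choices; in the question they come from the input() prompts.

```python
# Hypothetical user selections; in the question these come from input() prompts.
categorical_fields = ["name_match", "distance_category"]
fixed_fields = ["category", "score", "mop"]

# What the first attempt passes to groupby: a list nested inside a list.
nested = [categorical_fields, "category", "score", "mop"]
print(nested)

# A flat (1D) list of column names, built by concatenating the two lists.
flat = categorical_fields + fixed_fields
print(flat)
```

With the flat list, the call from the question would become airline.groupby(flat).agg(count=('fs_sham','count'),dollars=('fs_dollars','sum')).reset_index(). In the nested version, pandas most likely treats the inner list as an array of group labels rather than as column names, which would explain the "Grouper and axis must be same length" error.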
aelbi1ox2#
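IIUC, the main problem is that you are passing the list of columns (chosen by the user), whose length, even after being extended, happens to be smaller than the number of columns in the DataFrame airline, as the grouper for groupby. That is why this ValueError is triggered. Here is a possible fix that also tries to make your code cleaner: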
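The answer's code is missing from this copy, so the following is only a sketch of one way to do it, not the answerer's original. It assumes the column names from the question exist in airline. The names PROMPTS, FIXED_FIELDS, choose_fields and score_by are hypothetical helpers introduced here for illustration.

```python
import pandas as pd

# Prompt label -> column name, taken from the input() calls in the question.
PROMPTS = {
    "card name first name matches passenger first name": "name_match",
    "Fare Class 1 Category": "FARECLASS1_cat",
    "Fare Class 2 Category": "FARECLASS2_cat",
    "Distance": "distance_category",
    "if flight was international or domestic": "international_or_domestic",
    "journey type of one way, round trip, or different 2nd arrival destination": "dep_to_arr",
}

# Columns that are always part of the grouping.
FIXED_FIELDS = ["category", "score", "mop"]


def choose_fields(ask=input):
    """Ask one y/n question per optional column; return one flat list of names."""
    chosen = [
        column
        for label, column in PROMPTS.items()
        if ask(f"Do you want to include {label}: y/n ").strip().lower() == "y"
    ]
    return chosen + FIXED_FIELDS


def score_by(airline, group_fields):
    """Group by a flat list of column names and aggregate as in the question."""
    return (
        airline.groupby(group_fields)
        .agg(count=("fs_sham", "count"), dollars=("fs_dollars", "sum"))
        .reset_index()
    )
```

Usage would be airline_score = score_by(airline, choose_fields()). The ask parameter defaults to input and exists only so the prompts can be replaced by a stub when testing.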