Python pandas: unique count on columns that are populated dynamically

Posted by x6h2sr28 on 2022-12-05 in Python


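I am new to Python and pandas. When I populate df2 (a DataFrame) dynamically, I do not get the correct unique count. I have a product file in which Product_ID and VendorID together form the key (a composite key).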

import pandas as pd

df = pd.read_csv("Product_file.csv")
df2 = df['Product_ID'] + df['VendorID']
print(df2)
print('type....', type(df2))

uniq = df2.unique().size

print('unique values ', str(uniq))

>>> The result is 9, which is expected.
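However, when I populate it dynamically by reading from a table (sqlite3), the table stores the PK as one of its columns, which arrives at index 4 of each row.
The values at index 4 look like this: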

PK-KEY(Table-column) 
Product_ID, VendorID
Store_no, Loc_no
Dist_no

result_list = row[4].strip().split(',')
#print('length of list  :', len(result_list))
values = 0
unique_key = ' '
overall_key = ' '
values = range(len(result_list))
index = 0
i = 1

# Build the expression as text, one df['<column>'] term per key column
for index in values:
    #print('index  value >>>  ', i)
    if i == len(result_list):
        # last column: no trailing '+'
        df_concat = df_concat + 'df[' + "'" + result_list[index] + "'" + ']'
        break
    elif (i < len(result_list)):
        df_concat = df_concat + ' df[' + "'" + result_list[index] + "'" + ']' + ' +'
    i = i + 1

print(df_concat)
#print(type(df_concat))
df_strip = df_concat.strip(' ')
print('df_strip ->>>>>'  , df_strip)
df_Series = pd.Series(df_strip)

unique_key =  df_Series.unique().size
overall_key = df_Series.count()

The value I get is 1, and I do not know how to fix it. Any help is much appreciated.
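The count of 1 can be reproduced in isolation. The sketch below assumes a small hypothetical DataFrame in place of Product_file.csv and a hard-coded string in place of the one built by the loop. It suggests that pd.Series(df_strip) wraps the expression text itself as a single element, so both the size and the unique count are 1, whereas the evaluated expression yields one key per row.

```python
import pandas as pd

# Hypothetical stand-in for Product_file.csv (not the asker's data)
df = pd.DataFrame({
    "Product_ID": ["P1", "P1", "P2", "P2"],
    "VendorID": ["V1", "V1", "V1", "V2"],
})

# What the loop produces: the *text* of an expression, not its result
df_concat = "df['Product_ID'] + df['VendorID']"
as_text = pd.Series(df_concat)   # a Series holding one string
print(as_text.size, as_text.unique().size)

# Evaluating the expression directly gives one concatenated key per row
as_keys = df["Product_ID"] + df["VendorID"]
print(as_keys.size, as_keys.unique().size)
```

Selecting the key columns by name, for example df[result_list], avoids building and evaluating code as text altogether.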

python-3.x

Source: https://stackoverflow.com/questions/74628307/python-pandas-unique-count-on-the-columns-which-is-populated-dynamically

1 Answer


aiazj4mn1#

I found the answer by searching Stack Overflow. The PK columns are stored in result_list (a list), and I used the .apply method to get the counts.

df['Combined'] = df[result_list].apply(lambda row: '_'.join(row.values.astype(str)), axis=1)
print('combined values =======  ', df['Combined'])
overall_count = len(df['Combined'])
unique_count = len(df['Combined'].unique())
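As a possible alternative to joining the values row by row, the same two counts can be sketched with drop_duplicates on the key columns. The DataFrame and the pk_text value below are hypothetical stand-ins for the product file and for row[4]. Note that splitting "Product_ID, VendorID" on ',' alone would leave ' VendorID' with a leading space, so each name is stripped before use.

```python
import pandas as pd

# Hypothetical stand-in for Product_file.csv (not the asker's data)
df = pd.DataFrame({
    "Product_ID": ["P1", "P1", "P2", "P2"],
    "VendorID": ["V1", "V1", "V1", "V2"],
})

pk_text = "Product_ID, VendorID"   # hypothetical value of row[4]

# Strip every name so that ' VendorID' becomes 'VendorID'
key_columns = [name.strip() for name in pk_text.split(",")]

overall_count = len(df)
# Rows that differ in at least one key column count as distinct keys
unique_count = len(df.drop_duplicates(subset=key_columns))

print(key_columns, overall_count, unique_count)
```

If unique_count is smaller than overall_count, the file contains duplicate composite keys. This variant compares the columns directly, so it does not depend on the choice of separator in the joined string.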

Answered on 2022-12-05
