pandas: How to iterate through rows which contain text and create bigrams using Python

Posted by rqdpfwrv on 2023-02-07 in Python


I have an Excel file with 5 columns and 20 rows, and one of the columns contains text data. For example, the df['Content'] column contains:

0 this is the final call
1 hello how are you doing 
2 this is me please say hi
..
.. and so on

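I want to create bigrams from this text while keeping them attached to the original table.
I tried applying the function below to iterate through the rows: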

def find_bigrams(input_list):
    bigram_list = []
    for i in range(len(input_list)-1):
        bigram_list.append(input_list[1:])
        return bigram_list

and tried to apply the result back to the table using:

df['Content'] = df['Content'].apply(find_bigrams)

But I get the following result:

0     None
1     None
2     None

I would like the output to look like this:

Company  Code      Content
0  xyz      uh-11     (this,is),(is,the),(the,final),(final,call)
1  abc      yh-21     (hello,how),(how,are),(are,you),(you,doing)

pandas

Source: https://stackoverflow.com/questions/75328361/how-to-iterate-through-rows-which-contains-text-and-create-bigrams-using-python

2 Answers


2g32fytz1#

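input_list is not actually a list; it is a string, because apply passes each cell of the column to the function as-is.
Try the following function, which splits the text into words first: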

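def find_bigrams(input_text):
    # split() with no argument splits on any whitespace and ignores
    # leading/trailing spaces, so no empty "word" ends up in a bigram
    input_list = input_text.split()
    # pair each word with the word that follows it
    bigram_list = list(zip(input_list[:-1], input_list[1:]))
    return bigram_list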
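
To keep the bigrams attached to the original table, one option is to write them to a new column rather than overwrite Content. This is a minimal sketch: the small hand-built DataFrame stands in for the Excel file, the Company and Code values are copied from the desired output in the question, and the Bigrams column name is made up for illustration.

```python
import pandas as pd

# Stand-in for the Excel data; in practice this would come from pd.read_excel(...)
df = pd.DataFrame({
    "Company": ["xyz", "abc"],
    "Code": ["uh-11", "yh-21"],
    "Content": ["this is the final call", "hello how are you doing "],
})

def find_bigrams(input_text):
    # split() with no argument ignores the trailing space in the second row
    words = input_text.split()
    # zip stops at the shorter sequence, so the last word is not paired with anything
    return list(zip(words, words[1:]))

# "Bigrams" is a hypothetical column name; the other columns are left untouched
df["Bigrams"] = df["Content"].apply(find_bigrams)
print(df[["Company", "Code", "Bigrams"]])
```

Each cell of the new column holds a list of tuples, so the other columns stay aligned with their bigrams row by row.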

Answered 2023-02-07

igetnqfo2#

You can use itertools.permutations() (this needs import itertools; s is presumably the text column, e.g. df['Content']):

s.str.split().map(lambda x: list(itertools.permutations(x,2))[::len(x)])
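
The [::len(x)] slice works because permutations() emits pairs in index order, so for n words the adjacent pair (word i, word i+1) lands at position i*n in the list. The sketch below checks that in plain Python against a zip-based version; both helper names are made up for illustration. Note that this approach builds all n*(n-1) pairs before slicing, and an empty word list would raise ValueError because the slice step would be 0.

```python
import itertools

def bigrams_via_permutations(words):
    # All ordered pairs of positions, then every len(words)-th one,
    # which are exactly the adjacent pairs
    return list(itertools.permutations(words, 2))[::len(words)]

def bigrams_via_zip(words):
    # Pair each word directly with its successor
    return list(zip(words, words[1:]))

words = "this is the final call".split()
print(bigrams_via_permutations(words))
print(bigrams_via_permutations(words) == bigrams_via_zip(words))
```

For long texts the zip version is likely the cheaper choice, since it only ever creates the n-1 pairs that are kept.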

Answered 2023-02-07
