I currently have a pandas DataFrame scraped from reddit.com/r/cryptomoonshots, built with the following code:
df = pd.DataFrame([vars(post) for post in reddit.subreddit('cryptomoonshots').hot(limit=100)])
df = df[["title","score","url"]]
df.head()
This produces a readable df:
title score
3 Valor Game Token | Next X100 Gems | Insane Mar... 1135
4 Legends of Aragon token launch | NFT Game is a... 1085
5 TetheRhino Tomorrow Presale 16:00 UTC on DxSal... 833
6 GYM NETWORK The First DeFi Aggregator With Int... 442
7 Puli (PULI) is taking the BSC scene by storm! ... 1482
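Since the first 1-3 words of each post title describe the coin being shilled, I would like to match them against a list and then categorize the posts accordingly. For example, "Beagle Coin" would match as part of a string found in a list named Dogs = ['Beagle', etc].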
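As a rough illustration of that matching step, the sketch below checks the first few words of each title against hand-written keyword lists. The `categorize` helper, its `max_words` parameter, and the sample `categories` dictionary are hypothetical names introduced here; they are not part of the original code.

```python
def categorize(title, categories, max_words=3):
    """Return the first category whose keywords include a leading word of title."""
    # Keep only the first few words, stripped of common punctuation, lowercased.
    leading = [w.strip("|()!,.").lower() for w in title.split()[:max_words]]
    for name, keywords in categories.items():
        lowered = {k.lower() for k in keywords}
        if any(word in lowered for word in leading):
            return name
    return None  # no category matched


# Hypothetical keyword lists, written by hand for this example.
categories = {
    "Dogs": ["Beagle", "Puli", "Shiba"],
    "Games": ["Game", "Legends"],
}

titles = [
    "Beagle Coin | Next X100 Gem",
    "Puli (PULI) is taking the BSC scene by storm!",
    "GYM NETWORK The First DeFi Aggregator",
]

labels = [categorize(t, categories) for t in titles]
print(labels)
```

On the DataFrame above, the same helper could be applied with `df["category"] = df["title"].apply(lambda t: categorize(t, categories))`.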
The iteration part is not hard, but how do we generate a list to match against?
I tried using wordnet and itertools:
from nltk.corpus import wordnet as wn
from itertools import chain
dogs = list(chain(*[i.lemma_names for i in wn.all_synsets() if "dog" in i.definition]))
But it gives me this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-33-7da682828270> in <module>
1 from itertools import chain
----> 2 dogs = list(chain(*[i.lemma_names for i in wn.all_synsets() if "dog" in i.definition]))
<ipython-input-33-7da682828270> in <listcomp>(.0)
1 from itertools import chain
----> 2 dogs = list(chain(*[i.lemma_names for i in wn.all_synsets() if "dog" in i.definition]))
TypeError: argument of type 'method' is not iterable
1 Answer
waxmsbnn #1
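In the failing line, `definition` is a method, not an instance variable of some iterable type. The `if "dog" in i.definition` part of the code fails because the `in` operator needs an iterable (or other container) on its right-hand side, and a bound method is neither. Calling the `definition` method should fix the error. If you want `dogs` to be a list of strings, you also need to ***call*** the `lemma_names` method. So your list comprehension line should be: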
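Both calls get parentheses: `i.lemma_names()` and `i.definition()`. A minimal sketch of that fix follows, wrapped in a helper so it can run without downloading WordNet. The `lemmas_matching` function and the `FakeSynset` class are hypothetical names used only for illustration, and the sketch assumes an NLTK version in which `definition` and `lemma_names` are methods, as the traceback indicates.

```python
from itertools import chain


def lemmas_matching(synsets, keyword):
    """Return lemma names of every synset whose definition contains keyword."""
    # definition() and lemma_names() are called here. Without the parentheses
    # they are bound methods, and `keyword in <method>` raises TypeError.
    return list(chain.from_iterable(
        s.lemma_names() for s in synsets if keyword in s.definition()
    ))


class FakeSynset:
    """Hypothetical stand-in for an NLTK Synset so the sketch runs offline."""

    def __init__(self, definition, lemma_names):
        self._definition = definition
        self._lemma_names = lemma_names

    def definition(self):
        return self._definition

    def lemma_names(self):
        return self._lemma_names


sample = [
    FakeSynset("a small short-legged hound dog", ["beagle"]),
    FakeSynset("a feline mammal", ["cat", "true_cat"]),
    FakeSynset("a large dog breed used for rescue", ["Saint_Bernard", "St_Bernard"]),
]

dogs = lemmas_matching(sample, "dog")
print(dogs)

# With NLTK and the WordNet corpus installed, the equivalent call would be:
#   from nltk.corpus import wordnet as wn
#   dogs = lemmas_matching(wn.all_synsets(), "dog")
```

Note that a plain substring test also matches definitions that merely contain the letters "dog" inside another word, so the resulting list may need manual filtering.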