Return a certain character or word that is followed or preceded by a space - Regex Python

cetgtptt  posted on 2022-11-18  in  Python

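Trying to select only the clothing sizes using a regular expression
I'm new to Python. I'm trying to select the rows that contain these sizes, but the sizes get confused with other words. I'm using a regular expression, but it does not give the result I want.
Code: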

df = pd.DataFrame({"description":["Silver","Red","GOLD","Black Leather","S","L","S","XL","XXL","Noir Matt"," 150x160L","140M"]})
df.description.apply(lambda x : x if re.findall(r"(?!\s+\d+)(S|M|X*L)(?!\s+\d+)",str(x)) else np.nan).unique()

Output:

array(['Silver', nan, 'Black Leather', 'S', 'L', 'XL', 'XXL', 'Noir Matt',
       ' 150x160L', '140M'], dtype=object)

Expected:

array([ 'S', 'L', 'XL', 'XXL',nan], dtype=object)

lhcgjxsq #1

I think you need to use

import pandas as pd
df = pd.DataFrame({"description":["Silver","Red","GOLD","Black Leather","S","L","S","XL","XXL","Noir Matt"," 150x160L","140M"]})
df['description'][df['description'].str.match(r'^(?:S|M|X*L)$')].unique()
# => array(['S', 'L', 'XL', 'XXL'], dtype=object)

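With Series.str.match(r'^(?:S|M|X*L)$'), you can subset the rows of the description column whose value is exactly S, exactly M, or zero or more X characters followed by L. The ^ and $ anchors force the whole string to match, so words such as "Silver" that merely contain an S are excluded.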
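
If you also want the nan entry from the expected output, one possible sketch is below. It is not part of the answer above: it assumes a pandas version that provides Series.str.fullmatch (1.1 or later) and uses Series.where to blank out non-size rows instead of dropping them. The names unanchored, is_size and sizes_or_nan are made up for illustration. The first lines also show why the unanchored pattern in the question keeps "Silver".

```python
import re

import pandas as pd

df = pd.DataFrame({"description": ["Silver", "Red", "GOLD", "Black Leather", "S", "L",
                                   "S", "XL", "XXL", "Noir Matt", " 150x160L", "140M"]})

# Without anchors the pattern may match anywhere in the string,
# so the capital "S" inside "Silver" is enough to count as a hit.
unanchored = re.findall(r"S|M|X*L", "Silver")  # ['S']

# fullmatch requires the entire string to be a size (assumes pandas >= 1.1).
is_size = df["description"].str.fullmatch(r"(?:S|M|X*L)")

# Keep the sizes, turn every other row into NaN, then de-duplicate.
sizes_or_nan = df["description"].where(is_size).unique()
print(sizes_or_nan)
```

The result should hold the four sizes plus a single NaN. unique() keeps values in order of first appearance, so the NaN comes first here rather than last as in the expected array.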
