Closed. This question needs details or clarity. It is not currently accepting answers.
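How can I use a regular expression with re.sub() to remove multiple leading and trailing spaces/tabs inside double quotes?
The regex should be applied only to the first line of the file.
Input: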
" Column 1 ","Column 2 "," Column 3 "," Column 4 "
" Record 11 "," Record 12 "," Record 13 "," Record 14 "
" Record 21 "," Record 22 ","Record 23 "," Record 24 "
" Record 31 "," Record 32 "," Record 33"," Record 34"
" Record 41 "," Record 42 "," Record 43 "," Record 44 "
Expected output:
"Column 1","Column 2","Column 3","Column 4"
" Record 11 "," Record 12 "," Record 13 "," Record 14 "
" Record 21 "," Record 22 ","Record 23 "," Record 24 "
" Record 31 "," Record 32 "," Record 33"," Record 34"
" Record 41 "," Record 42 "," Record 43 "," Record 44 "
I am using the following regex, but it fails to capture a single space:
[^\n(\w)\"]\s+\"|\"\s+[^\n(\w)\"]
Note: the number of columns and rows will vary.
2 Answers
yrwegjxp1#
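To replace multiple leading and trailing spaces/tabs inside double quotes with a regular expression in Python, you can use the re module. For example: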
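A minimal sketch of one way to do this with re.sub(), not necessarily the pattern this answer originally used. It assumes the header fields contain no embedded double quotes, and the helper name strip_header_fields is hypothetical. The first line is split off with str.partition so the substitution never touches the records below it.

```python
import re


def strip_header_fields(text: str) -> str:
    """Trim spaces/tabs just inside the double quotes, on the first line only."""
    # Split off the first line; `sep` is "" when the text has no newline.
    header, sep, rest = text.partition("\n")
    # Match one whole quoted field at a time and keep only its trimmed content.
    # Assumption: header fields never contain a double quote themselves.
    header = re.sub(r'"[ \t]*([^"]*?)[ \t]*"', r'"\1"', header)
    return header + sep + rest


sample = (
    '" Column 1 ","Column 2 "," Column 3 "," Column 4 "\n'
    '" Record 11 "," Record 12 "," Record 13 "," Record 14 "\n'
)
print(strip_header_fields(sample))
```

Matching the whole quoted field, rather than a quote plus adjacent whitespace, is what lets a single space be caught on either side without the pattern pairing a closing quote with the next field's opening quote.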
1cosmwyk2#
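Don't use regex to parse structured data such as CSV, where a double quote can be escaped inside a double-quoted field by doubling it. This makes simple regex patterns prone to failure, and robust ones needlessly complex.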
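To illustrate the failure mode, here is a small sketch. The input line and the naive pattern are made up for this example. The field contains doubled quotes, and a pattern that trims whitespace next to any double quote also eats the spaces around the escaped quotes inside the field.

```python
import csv
import io
import re

# One CSV line whose first field contains escaped (doubled) double quotes.
line = '" say "" hi "" ok ","b"'

# Naive approach: drop whitespace adjacent to any double quote.
naive = re.sub(r'"[ \t]+|[ \t]+"', '"', line)
naive_fields = next(csv.reader(io.StringIO(naive)))

# Parsing first, then stripping each field, keeps the inner spaces intact.
correct_fields = [f.strip() for f in next(csv.reader(io.StringIO(line)))]

print(naive_fields)    # ['say"hi"ok', 'b']
print(correct_fields)  # ['say " hi " ok', 'b']
```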
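Instead, use csv.reader to read the CSV properly as sequences of columns, map the columns to the str.strip method to remove leading and trailing whitespace, and use csv.writer with the quoting=csv.QUOTE_ALL option to produce output in which every column is enclosed in double quotes:
Given the sample input, the code above outputs: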
Demo: https://replit.com/@blhsing/QuestionableAfraidPublisher
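The approach described in this answer can be sketched roughly as follows. This is a reconstruction under assumptions, not the answer's own code: the function name strip_header is hypothetical, stripping is applied only to the first row (as the question requires), and lineterminator="\n" is chosen here in place of the csv module's default "\r\n".

```python
import csv
import io


def strip_header(src, dst):
    """Copy CSV rows from src to dst, stripping whitespace in the first row only."""
    reader = csv.reader(src)
    # QUOTE_ALL wraps every column in double quotes on output.
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL, lineterminator="\n")
    header = next(reader, None)  # None when the input is empty
    if header is not None:
        writer.writerow(map(str.strip, header))
        writer.writerows(reader)  # remaining rows are written unchanged


src = io.StringIO(
    '" Column 1 ","Column 2 "," Column 3 "," Column 4 "\n'
    '" Record 11 "," Record 12 "," Record 13 "," Record 14 "\n'
)
dst = io.StringIO()
strip_header(src, dst)
print(dst.getvalue(), end="")
```

With real files, both handles would typically be opened with newline="" as the csv module documentation recommends, so that the reader and writer control line endings themselves.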