Shell script: remove the first and last double quote (") from all strings in a file

Posted by vtwuwzda on 2023-03-04 in Shell

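I want to remove all the outer double quotes in a file and collapse runs of blank lines into a single blank line.
For example, I want to go from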

"UPDATE TableA 
 SET country_code = "FR",
 WHERE name = "A";


"
"UPDATE TableA 
 SET name = "A's B"
 WHERE country_code = "FR";


"

to

UPDATE TableA 
 SET country_code = "FR",
 WHERE name = "A";

UPDATE TableA 
 SET name = "A's B"
 WHERE country_code = "FR";

How can I do that? Thanks!
Note: I tried the following sed command

sed -e 's/^"\|"$//g' test.sql > test_output.sql

However, it also removes all the trailing double quotes in the WHERE clauses.
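
For illustration, the behaviour described above can be reproduced on two lines of the sample. This is only a sketch: the function name strip_line_quotes is hypothetical, and the substitution is written with -E (extended regex) instead of the \| form used in the question, on the assumption that both behave the same here. Because sed processes one line at a time, "$ matches a quote at the end of every line, not only the quote that closes a whole statement.

```shell
# Hypothetical wrapper around the attempted substitution (ERE spelling).
strip_line_quotes() { sed -E 's/^"|"$//g' "$@"; }

# Two lines from the sample: the first starts with an outer quote,
# the second ends with a quote that belongs to the SQL itself.
printf '%s\n' '"UPDATE TableA' ' SET name = "A'\''s B"' | strip_line_quotes
# prints:
# UPDATE TableA
#  SET name = "A's B
```

The outer quote on the first line is removed as intended, but the closing quote of "A's B" is lost as well.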

Source: https://stackoverflow.com/questions/75584312/shell-script-remove-first-and-last-double-quote-from-all-strings-in-a-file

3 Answers

#1 bxgwgixi

Use this Perl one-liner:

perl -00ne 's{\A[\s"]+|[\s"]+\z}{}g; print "$_\n\n" if $_;' test1.txt

UPDATE TableA 
 SET country_code = "FR",
 WHERE name = "A";

UPDATE TableA 
 SET name = "A's B"
 WHERE country_code = "FR";

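s{\A[\s"]+|[\s"]+\z}{}g; deletes (replaces with an empty string) the following pattern: either the beginning of the string (\A) followed by one or more whitespace characters (\s) or double quotes ("), or the same run of characters at the end of the string (\z).
The regex uses this modifier:
/g: Match the pattern repeatedly.
The Perl one-liner uses these command line flags:
-e: Tells Perl to look for code in-line, instead of in a file.
-n: Loop over the input one record at a time, assigning it to $_ by default. A record is normally a line; with -00 it is a paragraph.
-00: Read the input in paragraph mode, where records are separated by one or more blank lines, rather than one line at a time.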

See also:

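perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlre: Perl regular expressions (regexes)
perldoc perlre: Perl regular expressions (regexes): Quantifiers; Character Classes and other Special Escapes; Assertions; Capture groups
perldoc perlrequick: Perl regular expressions quick start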

Posted 2023-03-04

#2 ndh0cuux

Using GNU awk you can do this:

awk -v RS='(^|\n)"[^;]*;[^"]*"' '{
   print gensub(/(^|\n)"\s*|\s*"$/, "\\1", "g", RT)}' file

UPDATE TableA
 SET country_code = "FR",
 WHERE name = "A";

UPDATE TableA
 SET name = "A's B"
 WHERE country_code = "FR";

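The regex pattern (^|\n)"[^;]*;[^"]*" matches a quoted block that must contain a ;, as shown in the OP's input. Because the pattern is used as the record separator RS, each matched block ends up in RT, and gensub then strips the outer quotes and the whitespace next to them from that block.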

赞(0）回复(0）举报 2023-03-04

#3 bvhaajcl

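From the question:

Note: I tried the following sed command

sed -e 's/^"\|"$//g' test.sql > test_output.sql

However, it also removes all the trailing double quotes in the WHERE clauses.

Yes, the \|"$ part of the pattern in your substitution sees to that: it deletes a double quote at the end of any line. It is unclear to me what you intended it to do instead. Also, that command does not appear to do anything about merging blank lines.

This sed solution, however, does both: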

sed -e 's/^"//; :1; $ { s/^.*\n//; n; }; /^[[:space:]\n]*$/ { N; s/\n"//; b1; }; s/^.*\n/\n/' \
  test.sql > test_output.sql

A more readable form of the sed expression is:

# Remove an "outer" quote if this line contains one
s/^"//

:1
# On the last line, collapse any accumulated blank lines and terminate
$ { s/^.*\n//; n; }

# If the pattern space contains only whitespace and newlines then
#   - append a newline plus the next line of input
#   - remove an outer quote, if any, from the added text
#   - branch to label 1, above
/^[[:space:]\n]*$/ { N; s/\n"//; b1; }

# If there are any newlines in the pattern space then remove everything through
# the last one
s/^.*\n/\n/

# No more commands: print the pattern space, replace it with
# the next line, and go back to the beginning
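
The multi-line form can also be kept in a script file and run with sed -f instead of being squeezed into one -e argument. The following is a minimal sketch, assuming GNU sed (other implementations may treat \n inside a bracket expression, or the n command at end of input, differently). The function name clean_sql and the temporary script file are hypothetical.

```shell
# Write the sed commands from the answer to a temporary script file.
# (Hypothetical location; remove the file when it is no longer needed.)
script=$(mktemp)
cat > "$script" <<'EOF'
s/^"//
:1
$ { s/^.*\n//; n; }
/^[[:space:]\n]*$/ { N; s/\n"//; b1; }
s/^.*\n/\n/
EOF

# Hypothetical wrapper: filters the named files, or stdin if none are given.
clean_sql() { sed -f "$script" "$@"; }

# Small input in the same shape as the question's sample.
printf '%s\n' '"UPDATE TableA' ' WHERE name = "A";' '' '' '"' \
              '"UPDATE TableA' ' WHERE country_code = "FR";' '' '"' | clean_sql
```

With the script in a file, the comments from the readable form can be kept next to the commands they describe, which a one-line -e expression does not allow.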

Posted 2023-03-04
