Extract the mailto value and remove HTML tags (if any) from the string

Posted by vwoqyblh on 2022-09-21 in Ruby


I want to extract the mailto value from a given string and also remove the HTML tags.

EX->"<mailto:demomail@gmail.com|demomail@gmail.com> helo<p> bye </p>"
Output -> demomail@gmail.com helo bye

If I use this -> gsub(/<[^>]*>/,'')
Output -> helo bye

If I use this -> ActionView::Base.full_sanitizer.sanitize(html_string, :tags => %w(img br p), :attributes => %w(src style))
Output -> helo bye

Could you suggest how I can get my expected output?
Expected output -> demomail@gmail.com helo bye

ruby

Source: https://stackoverflow.com/questions/73782840/extract-the-mailto-value-and-remove-html-tag-if-any-in-the-string

1 answer


kninwzqo1#

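The problem is that the mailto value sits inside something that looks like an HTML tag, so when you remove the tags you also remove the mailto value. It is certainly possible to build one complex regular expression that handles this, but I think it is much easier to extract the mailto value separately from the rest of the string. I would use a capture group that extracts the value between "mailto:" and "|". You can then run the gsub call you already have over the full string to get the rest of the output.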

s = "<mailto:demomail@gmail.com|demomail@gmail.com> helo<p> bye </p>"

# Find the "mailto" value

s.match(/mailto:([^|]*)/)
=> #<MatchData "mailto:demomail@gmail.com" 1:"demomail@gmail.com">

# Full result with the matched email and the rest of the string with HTML tags removed

s.match(/mailto:([^|]*)/)[1] + s.gsub(/<[^>]*>/, "")
=> "demomail@gmail.com helo bye "
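One caveat with the snippet above: `match` returns nil when the string contains no mailto value, so calling `[1]` on the result raises NoMethodError. A minimal sketch of a nil-safe variant, assuming you want an empty prefix in that case, uses `String#[]` with a regexp and a capture index. The helper name `first_mailto` and the `>` added to the character class are illustrative choices, not part of the original answer.

```ruby
# Hypothetical helper: returns the first mailto address, or nil when absent.
def first_mailto(str)
  # String#[] with a regexp and a capture index returns nil on no match,
  # so nothing is ever indexed on a nil MatchData.
  str[/mailto:([^|>]*)/, 1]
end

with_mail    = "<mailto:demomail@gmail.com|demomail@gmail.com> helo<p> bye </p>"
without_mail = "helo<p> bye </p>"

first_mailto(with_mail)     # => "demomail@gmail.com"
first_mailto(without_mail)  # => nil

# nil.to_s is "", so the concatenation works whether or not a mailto is present.
first_mailto(without_mail).to_s + without_mail.gsub(/<[^>]*>/, "")
# => "helo bye "
```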

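The concatenation above puts the address at the front, so it only gives the right result when the string starts with the <mailto> tag. If it does not, you can replace the whole tag with just the matched email address and then remove the other tags afterwards: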

s = "this is <mailto:demomail@gmail.com|demomail@gmail.com> helo<p> bye </p>"

# Replace mailto tag with the email, then process the rest

# '\1' is a backreference to the first capture group

s.gsub(/<mailto:([^|]*)[^>]*>/, '\1').gsub(/<[^>]*>/, "")
=> "this is demomail@gmail.com helo bye "

# Alternatively, you can just process the mailto tag differently in the gsub block

s.gsub(/<[^>]*>/) do |tag|
  tag.include?("mailto:") ? tag.match(/mailto:([^|]*)/)[1] : ""
end
=> "this is demomail@gmail.com helo bye "
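The results above still carry a trailing space, while the expected output in the question has none. A minimal sketch that wraps the block form of gsub in a method and normalizes whitespace might look like this; the method name `clean_message` and the decision to collapse all whitespace runs are assumptions rather than anything stated in the question.

```ruby
# Hypothetical helper combining the steps from the answer.
def clean_message(str)
  stripped = str.gsub(/<[^>]*>/) do |tag|
    # Keep the address from <mailto:addr|label>; every other tag becomes "".
    tag[/\A<mailto:([^|>]*)/, 1].to_s
  end
  # Collapse runs of whitespace and trim both ends.
  stripped.split.join(" ")
end

clean_message("<mailto:demomail@gmail.com|demomail@gmail.com> helo<p> bye </p>")
# => "demomail@gmail.com helo bye"

clean_message("this is <mailto:demomail@gmail.com|demomail@gmail.com> helo<p> bye </p>")
# => "this is demomail@gmail.com helo bye"
```

If whitespace inside the message is significant, `stripped.strip` would only trim the ends instead of collapsing inner runs.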

Answered 2022-09-21
