Using regex to parse a string in Python 3

Posted by azpvetkf on 2023-03-09 in Python


I'm trying to extract gSecureToken from the following string:

$("#ejectButton").on("click", function(e) {
            $("#ejectButton").prop("disabled", true);
            $.ajax({
                url : "/apps_home/eject/",
                type : "POST",
                data : { gSecureToken : "7b9854390a079b03cce068b577cd9af6686826b8" },
                dataType : "json",
                success : function(data, textStatus, xhr) {
                    $("#smbStatus").html('');
                    $("#smbEnable").removeClass('greenColor').html('OFF');
                    showPopup("MiFi Share", "<p>Eject completed. It is now safe to remove your USB storage device.</p>");
                },
                error : function(xhr, textStatus, errorThrown) {
                    //undoChange($toggleSwitchElement);
                    // If auth session has ended, force a new login with a fresh GET.
                    if( (xhr.status == 401) || (xhr.status == 403) || (xhr.status == 406) ) window.location.replace(window.location.href);
                }
            });

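How can I parse the value out of this string with a regular expression? I know that once it is parsed, I can load it as JSON.
My current code doesn't use a regex; it only uses BeautifulSoup to parse some HTML.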

from bs4 import BeautifulSoup

class SecureTokenParser:

    @staticmethod
    def parse_secure_token_from_html_response(html_response):
        soup = BeautifulSoup(html_response, 'html.parser')
        for script_tag in soup.find_all("script", type="text/javascript"):
            print(script_tag)

I know this isn't much, but printing the content to the terminal seemed like a good starting point. How can I use a regex to pull out gSecureToken and then load it as JSON?

python-3.x

Source: https://stackoverflow.com/questions/57318195/using-regex-to-parse-string-python3

3 Answers


igetnqfo1#

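There's no need to rely on a large package like BeautifulSoup; you can easily parse out the value of gSecureToken with Python's re module.
Assuming you only want the value of gSecureToken, you can create a regex pattern: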

import re

pattern = r'{\s*gSecureToken\s*:\s*"([a-z0-9]+)"\s*}'

Then we can apply it to, for example, your test string:

test_str = """
$("#ejectButton").on("click", function(e) {
            $("#ejectButton").prop("disabled", true);
            $.ajax({
                url : "/apps_home/eject/",
                type : "POST",
                data : { gSecureToken : "7b9854390a079b03cce068b577cd9af6686826b8" },
                dataType : "json",
                success : function(data, textStatus, xhr) {
                    $("#smbStatus").html('');
                    $("#smbEnable").removeClass('greenColor').html('OFF');
                    showPopup("MiFi Share", "<p>Eject completed. It is now safe to remove your USB storage device.</p>");
                },
                error : function(xhr, textStatus, errorThrown) {
                    //undoChange($toggleSwitchElement);
                    // If auth session has ended, force a new login with a fresh GET.
                    if( (xhr.status == 401) || (xhr.status == 403) || (xhr.status == 406) ) window.location.replace(window.location.href);
                }
            });
"""

Finally, we can search the test string for the pattern:

match = re.search(pattern, test_str)
matching_string = match.groups()[0]
print(matching_string)

This gives the value we want:

7b9854390a079b03cce068b577cd9af6686826b8

You can see how this regex works at: www.regexr.com/4ihpd
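To make the pattern easier to read, here is a minimal sketch of the same regex written with re.VERBOSE so each piece carries a comment. The names TOKEN_PATTERN and extract_token are illustrative, not part of the original answer, and the None check for a missing match is an addition.

```python
import re

# Same pattern as above, written in verbose mode so each part can be commented.
TOKEN_PATTERN = re.compile(
    r"""
    \{ \s*              # opening brace of the object literal
    gSecureToken \s*    # the unquoted JavaScript key
    : \s*               # key/value separator
    "([a-z0-9]+)"       # group 1: the token between double quotes
    \s* \}              # closing brace
    """,
    re.VERBOSE,
)


def extract_token(text):
    """Return the token, or None when the pattern is not found."""
    match = TOKEN_PATTERN.search(text)
    return match.group(1) if match else None


sample = 'data : { gSecureToken : "7b9854390a079b03cce068b577cd9af6686826b8" },'
print(extract_token(sample))
print(extract_token("no token here"))
```

Returning None instead of calling match.groups() directly avoids an AttributeError when the page does not contain the token.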


j8ag8udp2#

A non-regex, non-BS4 option:

html_response = [your string above]

# Split at the start of the "data" object, then cut at the end of that object.
splt = html_response.split(' : { ')
splt[1].split('},\n')[0]

Output:
'gSecureToken : "7b9854390a079b03cce068b577cd9af6686826b8" '
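The split above returns the key together with the value. If only the value is wanted, one possible variation is str.partition, sketched below. The function name token_via_partition is hypothetical, and the sketch assumes the value is the first double-quoted string after the colon that follows the key.

```python
def token_via_partition(text, key="gSecureToken"):
    """Return the quoted value that follows `key`, or None if it is absent."""
    # Keep only what comes after the first occurrence of the key.
    _, found, rest = text.partition(key)
    if not found:
        return None
    # Skip past the key/value separator.
    _, colon, after_colon = rest.partition(":")
    if not colon:
        return None
    # The value is the text between the next pair of double quotes.
    _, _, after_open_quote = after_colon.partition('"')
    value, closing_quote, _ = after_open_quote.partition('"')
    return value if closing_quote else None


sample = 'data : { gSecureToken : "7b9854390a079b03cce068b577cd9af6686826b8" },\n'
print(token_via_partition(sample))
```

Unlike the fixed split strings, this does not depend on the exact spacing around the braces, though it is still a plain text search rather than a JavaScript parser.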


bt1cpqcv3#

You didn't show us what print() displays, but presumably it looks something like s below.
Parse it with the following:

import re

def parse_token(s: str):
    token_re = re.compile(r'"gSecureToken": "(\w{40})"')
    m = token_re.search(s)
    return m.group(1)

s = '{"url": "/apps_home/eject/", "type": "POST", "data": {"gSecureToken": "7b9854390a079b03cce068b577cd9af6686826b8"}, "dataType": "json"}'
print(parse_token(s))
print(dict(data=dict(gSecureToken=parse_token(s))))

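If a fixed length of 40 characters is too strict, use \w+ instead. The documentation is at: https://docs.python.org/3/library/re.html
Your "...and then load it as JSON?" remark doesn't seem relevant here: since you asked us to parse with a regex, there appears to be no parsing left for JSON to do. (I would probably have used json.loads() from the start rather than a regex, since the data appears to be in JSON format.)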
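As a rough sketch of the json.loads() route mentioned above, assuming the input really is valid JSON like the string s in this answer:

```python
import json

# The JSON-shaped string assumed in this answer.
s = '{"url": "/apps_home/eject/", "type": "POST", "data": {"gSecureToken": "7b9854390a079b03cce068b577cd9af6686826b8"}, "dataType": "json"}'

payload = json.loads(s)
token = payload["data"]["gSecureToken"]
print(token)

# The JavaScript object literal from the question has unquoted keys,
# so it is not valid JSON and json.loads() rejects it.
js_fragment = '{ gSecureToken : "7b9854390a079b03cce068b577cd9af6686826b8" }'
try:
    json.loads(js_fragment)
    js_is_json = True
except json.JSONDecodeError:
    js_is_json = False
print(js_is_json)
```

This only helps if the text is already JSON. For the raw JavaScript in the question, a regex or string search is still needed to isolate the token first.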

