Trying to generate a parse tree with the ANTLR4 Python3.g4 grammar file to parse Python 3 code

but5z9lq  posted on 2023-02-01 in Python

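I am using ANTLR 4 and trying to generate a parse tree for my Python file. I used the grammar file Python3.g4 referenced in the ANTLR 4 documentation. I installed antlr4-python3-runtime and ran the following command: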

antlr4 -Dlanguage=Python3 Python3.g4

This generated my parser and lexer files.
In Python3Lexer.py, I got an error on the following line:

from typing.io import TextIO

So I changed it to:

from typing import TextIO

I also created a file named pythonparser.py, in the same folder as the parser and lexer files, to invoke the parser:

import sys
from antlr4 import *
from Python3Lexer import Python3Lexer
from Python3Parser import Python3Parser

def main(argv):
    input_stream = FileStream(argv[1])
    lexer = Python3Lexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = Python3Parser(stream)
    tree = parser.single_input()

if __name__ == '__main__':
    main(sys.argv)

I also created a test.py file in the same folder as the ANTLR grammar, containing:

print("hello world")

I tried to parse this file with the grammar using the following command:

python3 pythonparser.py test.py

It does not work, and I don't know what to do next.
I get this error message:

Traceback (most recent call last):
  File "/Users/Fari/Developer/PRJ/project/antlr/pythonparser.py", line 3, in <module>
    from Python3Lexer import Python3Lexer
  File "/Users/Fari/Developer/PRJ/project/antlr/Python3Lexer.py", line 19, in <module>
    LanguageParser = getattr(importlib.import_module('{}Parser'.format(module_path)), '{}Parser'.format(language_name))
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Fari/Developer/PRJ/project/antlr/Python3Parser.py", line 446, in <module>
    class Python3Parser ( Parser ):
  File "/Users/Fari/Developer/PRJ/project/antlr/Python3Parser.py", line 450, in Python3Parser
    atn = ATNDeserializer().deserialize(serializedATN())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/antlr4/atn/ATNDeserializer.py", line 60, in deserialize
    self.reset(data)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/antlr4/atn/ATNDeserializer.py", line 90, in reset
    temp = [ adjust(c) for c in data ]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/antlr4/atn/ATNDeserializer.py", line 90, in <listcomp>
    temp = [ adjust(c) for c in data ]
             ^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/antlr4/atn/ATNDeserializer.py", line 88, in adjust
    v = ord(c)
        ^^^^^^
TypeError: ord() expected string of length 1, but int found

I don't know what I am doing wrong.

d6kp6zgx  #1

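There are many Python grammars. The ones you need are Python3Lexer.g4 and Python3Parser.g4 from the antlr/grammars-v4 repository (the download script further down lists their URLs).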

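After downloading these two grammars, preprocess them by running transformGrammar.py from the same folder as the two grammar files.
Then download these two classes into the same folder: Python3LexerBase.py and Python3ParserBase.py.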

Once that is done, generate the lexer and parser Python classes:

java -jar antlr-4.11.1-complete.jar *.g4 -Dlanguage=Python3

If you now run the following file:

from antlr4 import *
from Python3Lexer import Python3Lexer
from Python3Parser import Python3Parser

def main():
    input_stream = InputStream('print("hello world")\n')
    lexer = Python3Lexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = Python3Parser(stream)
    tree = parser.single_input()
    print(tree.toStringTree(recog=parser))

if __name__ == '__main__':
    main()

the following output is printed:

(single_input (simple_stmts (simple_stmt (expr_stmt (testlist_star_expr (test (or_test (and_test (not_test (comparison (expr (xor_expr (and_expr (shift_expr (arith_expr (term (factor (power (atom_expr (atom (name print)) (trailer ( (arglist (argument (test (or_test (and_test (not_test (comparison (expr (xor_expr (and_expr (shift_expr (arith_expr (term (factor (power (atom_expr (atom "hello world"))))))))))))))))) ))))))))))))))))))) \n))
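The single-line output above is hard to read for anything longer than one statement. A minimal sketch of an indented dump follows, assuming only the `getChildCount()`, `getChild(i)`, `getText()` and `getRuleIndex()` methods that the ANTLR Python runtime exposes on tree nodes. The helper name `dump_tree` is hypothetical and not part of the runtime or of the answer.

```python
def dump_tree(node, rule_names, depth=0, lines=None):
    """Return a list of indented lines, one per parse-tree node.

    Hypothetical helper: it duck-types on the node methods named in the
    lead-in, so it does not import antlr4 itself.
    """
    if lines is None:
        lines = []
    if hasattr(node, "getRuleIndex"):
        # Rule node (a parser rule context): label it with the rule name.
        label = rule_names[node.getRuleIndex()]
    else:
        # Terminal node: show the matched token text, quoted and escaped.
        label = repr(node.getText())
    lines.append("  " * depth + label)
    # Recurse into the children, one indentation level deeper.
    for i in range(node.getChildCount()):
        dump_tree(node.getChild(i), rule_names, depth + 1, lines)
    return lines
```

In the script above, this could be called as `print("\n".join(dump_tree(tree, parser.ruleNames)))` in place of the `toStringTree` call.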

Note that I did not change anything else (there is no need to change typing.io to typing).

  • Python 3.10.9
  • ANTLR 4.11.1
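The versions listed above are worth comparing against your own setup. The `ord()` TypeError in the question is consistent with generated code and an installed runtime that come from different ANTLR releases, although that is an assumption and the answer does not state it. The sketch below reads the "Generated from ... by ANTLR x.y.z" header comment at the top of a generated file and looks up the installed `antlr4-python3-runtime` version. The function names are hypothetical.

```python
import re
from importlib import metadata


def generated_by(source_text):
    """Extract the ANTLR tool version from a generated file's header comment."""
    match = re.search(r"^# Generated from .* by ANTLR (\S+)", source_text, re.MULTILINE)
    return match.group(1) if match else None


def installed_runtime():
    """Return the installed antlr4-python3-runtime version, or None if absent."""
    try:
        return metadata.version("antlr4-python3-runtime")
    except metadata.PackageNotFoundError:
        return None


def versions_match(tool_version, runtime_version):
    """True only when both versions are known and identical."""
    return tool_version is not None and tool_version == runtime_version
```

To use it, pass the contents of Python3Parser.py to `generated_by` and compare the result with `installed_runtime()`. If `versions_match` returns False, regenerating the files with a tool jar that matches the runtime (or installing the matching runtime) is the first thing to try.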

EDIT

When I put the following into a file:

#!/usr/bin/env bash
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/python/python3/Python3Lexer.g4
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/python/python3/Python3Parser.g4
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/python/python3/Python3/transformGrammar.py
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/python/python3/Python3/Python3LexerBase.py 
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/python/python3/Python3/Python3ParserBase.py
wget https://www.antlr.org/download/antlr-4.11.1-complete.jar

python3 transformGrammar.py

pip install antlr4-python3-runtime

java -jar antlr-4.11.1-complete.jar *.g4 -Dlanguage=Python3

cat << EOF > main.py
from antlr4 import *
from Python3Lexer import Python3Lexer
from Python3Parser import Python3Parser

def main():
    input_stream = InputStream('print("hello world")\n')
    lexer = Python3Lexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = Python3Parser(stream)
    tree = parser.single_input()
    print(tree.toStringTree(recog=parser))

if __name__ == '__main__':
    main()
EOF

python3 --version

python3 main.py

and then run that file, I get the following output:

...

antlr-4.11.1-complete.jar              100%[============================================================================>]   3,38M  9,33MB/s    in 0,4s

2023-01-31 10:51:47 (9,33 MB/s) - ‘antlr-4.11.1-complete.jar’ saved [3547867/3547867]

Altering Python3Lexer.g4
Writing ...
Altering Python3Parser.g4
Writing ...
Requirement already satisfied: antlr4-python3-runtime in /opt/homebrew/lib/python3.10/site-packages (4.11.1)
Python 3.10.9
(single_input (simple_stmts (simple_stmt (expr_stmt (testlist_star_expr (test (or_test (and_test (not_test (comparison (expr (xor_expr (and_expr (shift_expr (arith_expr (term (factor (power (atom_expr (atom (name print)) (trailer ( (arglist (argument (test (or_test (and_test (not_test (comparison (expr (xor_expr (and_expr (shift_expr (arith_expr (term (factor (power (atom_expr (atom "hello world"))))))))))))))))) ))))))))))))))))))) \n))
