How exactly is a DEDENT token generated in Python?

Posted by 8aqjt8rx on 2023-01-22 in Python

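I am reading a document about lexical analysis in Python that describes how INDENT and DEDENT tokens are generated. I am posting the description here.
The indentation levels of consecutive lines are used to generate INDENT and DEDENT tokens, using a stack, as follows.
Before the first line of the file is read, a single zero is pushed on the stack; this will never be popped off again. The numbers pushed on the stack will always be strictly increasing from bottom to top. At the beginning of each logical line, the line's indentation level is compared to the top of the stack. If it is equal, nothing happens. If it is larger, it is pushed on the stack, and one INDENT token is generated. If it is smaller, it must be one of the numbers occurring on the stack; all numbers on the stack that are larger are popped off, and for each number popped off a DEDENT token is generated. At the end of the file, a DEDENT token is generated for each number remaining on the stack that is larger than zero.
I tried to understand the DEDENT part but could not. Could someone give a better explanation than the reference?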

python-3.x

Source: https://stackoverflow.com/questions/40960123/how-exactly-a-dedent-token-is-generated-in-python

3 Answers

s4n0splo1#

Since Python is sometimes easier than English, here is a rough translation of this description into Python. A real-world parser that I wrote myself works the same way.

import re

# Sample input: three loops with deliberately irregular indentation.
# The last line of the third loop dedents to a level that is not on the stack.
code = """
for i in range(10):
   if i % 2 == 0:
     print(i)
   print("Next number")
print("That's all")

for i in range(10):
   if i % 2 == 0:
       print(i)
print("That's all again")

for i in range(10):
   if i % 2 == 0:
      print(i)
  print("That's all")
"""


def get_indent(s: str) -> int:
    """Return the number of leading spaces of a line."""
    m = re.match(r' *', s)
    return len(m.group(0))


def add_token(token):
    print(token)


INDENT = "indent"
DEDENT = "dedent"

# Before the first line of the file is read, a single zero is pushed on the stack
indent_stack = [0]
for line in code.splitlines():
    print("processing line:", line)
    if not line.strip():
        # Blank lines are ignored by the tokenizer and never change the indentation
        continue
    indent = get_indent(line)
    # At the beginning of each logical line, the line's
    # indentation level is compared to the top of the stack.
    if indent > indent_stack[-1]:
        # If it is larger, it is pushed on the stack,
        # and one INDENT token is generated.
        add_token(INDENT)
        indent_stack.append(indent)
    elif indent < indent_stack[-1]:
        while indent < indent_stack[-1]:
            # If it is smaller, ...
            # all numbers on the stack that are larger are popped off,
            # and for each number popped off a DEDENT token is generated.
            add_token(DEDENT)
            indent_stack.pop()
        if indent != indent_stack[-1]:
            # it must be one of the numbers occurring on the stack;
            raise IndentationError
while indent_stack[-1] > 0:
    # At the end of the file, a DEDENT token is generated for each number
    # remaining on the stack that is larger than zero.
    add_token(DEDENT)
    indent_stack.pop()

Here is the output:

processing line: 
processing line: for i in range(10):
processing line:    if i % 2 == 0:
indent
processing line:      print(i)
indent
processing line:    print("Next number")
dedent
processing line: print("That's all")
dedent
processing line: 
processing line: for i in range(10):
processing line:    if i % 2 == 0:
indent
processing line:        print(i)
indent
processing line: print("That's all again")
dedent
dedent
processing line: 
processing line: for i in range(10):
processing line:    if i % 2 == 0:
indent
processing line:       print(i)
indent
processing line:   print("That's all")
dedent
dedent
  File "<string>", line unknown
IndentationError
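
As a cross-check on the hand-written sketch above, the standard library's tokenize module can list the INDENT and DEDENT tokens that Python's own tokenizer emits. This is only a minimal sketch: the sample source and the helper name indent_events are made up for illustration, and the reported positions of DEDENT tokens may differ slightly between Python versions.

```python
import io
import tokenize

# Hypothetical sample, similar to the first loop in the answer above.
SOURCE = (
    "for i in range(10):\n"
    "   if i % 2 == 0:\n"
    "     print(i)\n"
    "   print('Next number')\n"
    "print('done')\n"
)


def indent_events(source: str) -> list:
    """Return (token name, line number) for every INDENT/DEDENT token."""
    events = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.INDENT, tokenize.DEDENT):
            events.append((tokenize.tok_name[tok.type], tok.start[0]))
    return events


events = indent_events(SOURCE)
for name, lineno in events:
    print(name, "reported at line", lineno)
```

For this input the tokenizer should report two INDENT tokens followed by two DEDENT tokens, matching the indent/dedent sequence of the first loop in the output above.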

0kjbasz62#

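Suppose we have a source file that uses 4 spaces per indentation level, and that we are currently at the third level of indentation. The contents of the indentation stack will be [0, 4, 8, 12]: the initial zero, plus each new indentation level as it was first encountered. Now, consider the number of leading spaces on the next line of code...

If it is 12 (matching the current top of the stack), there is no indentation change and nothing special happens.
If it is greater than 12, an INDENT token is generated and the new value is pushed on the stack.
If it is 8, a DEDENT token is generated and 12 is popped off the stack.
If it is 4, you get two DEDENTs, and both 12 and 8 are popped off.
If it is 0, or the source file ends at this point, you get three DEDENTs, and 12, 8 and 4 are popped off.
If it is any other number less than 12, you get an "inconsistent dedent" error, because there is no way to tell which previous level of code you have dedented to.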
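
The cases above can be replayed with a few lines of Python. This is an illustrative sketch, not CPython's implementation: the function name tokens_for_line is hypothetical, and the error text is borrowed from the message CPython prints for this situation.

```python
def tokens_for_line(stack: list, indent: int) -> list:
    """Update `stack` in place for a line with `indent` leading spaces
    and return the INDENT/DEDENT tokens that the line produces."""
    if indent > stack[-1]:
        stack.append(indent)
        return ["INDENT"]
    tokens = []
    while indent < stack[-1]:
        # Every level deeper than the new line is closed by one DEDENT.
        stack.pop()
        tokens.append("DEDENT")
    if indent != stack[-1]:
        # The new indentation is not one of the levels on the stack.
        raise IndentationError(
            "unindent does not match any outer indentation level"
        )
    return tokens


for indent in (12, 16, 8, 4, 0):
    stack = [0, 4, 8, 12]  # third indentation level, as in the answer
    print(indent, tokens_for_line(stack, indent), stack)

try:
    tokens_for_line([0, 4, 8, 12], 6)
except IndentationError as exc:
    print("6 ->", exc)
```

Each iteration starts from a fresh [0, 4, 8, 12] stack, so the printed token lists correspond one-to-one to the cases listed above; an indentation of 6 falls between two levels and raises the error.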

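Note that only lines containing actual code are considered: if a line contains only whitespace or a comment, the amount of its leading whitespace is irrelevant.
The whole point of this process is that exactly one DEDENT is generated for each INDENT, at the point where the indentation level returns to (or drops below) the amount that existed before the corresponding INDENT.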
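
Both points can be checked against Python's own tokenizer through the standard tokenize module. The sketch below is illustrative only: the sample source and the helper name count_indent_tokens are assumptions. The sample contains a comment line with "wrong" indentation and a blank line inside a block.

```python
import io
import tokenize

# Hypothetical sample: the comment and the blank line sit at indentation
# levels that would be errors if they were lines of code.
SOURCE = (
    "def f(x):\n"
    "    if x:\n"
    "  # a comment with odd indentation\n"
    "\n"
    "        return 1\n"
    "    return 0\n"
)


def count_indent_tokens(source: str) -> tuple:
    """Return (number of INDENT tokens, number of DEDENT tokens)."""
    indents = dedents = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.INDENT:
            indents += 1
        elif tok.type == tokenize.DEDENT:
            dedents += 1
    return indents, dedents


indents, dedents = count_indent_tokens(SOURCE)
print("INDENT:", indents, "DEDENT:", dedents)
```

The comment and the blank line do not raise an error or add tokens, and the two counts come out equal, which is the pairing of INDENT and DEDENT described above.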

yqyhoc1h3#

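Ilya V. Schurov has already provided a detailed answer to this question. However, if you would like a simpler answer: for every logical line that is indented further than the previous one, an INDENT token is generated, and for every logical line that is dedented, a DEDENT token is generated for each level it closes. At the end of the file, a DEDENT token is generated for every number remaining on the stack that is larger than zero (like the last print("That's all") in Ilya's answer).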
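
The end-of-file rule is easy to observe with the standard tokenize module. A minimal sketch, assuming a made-up source string that ends while two blocks are still open:

```python
import io
import tokenize

# Hypothetical source: the file ends inside two nested blocks.
SOURCE = "if a:\n    if b:\n        pass\n"

names = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(SOURCE).readline)
]
print(names[-3:])
```

The last tokens are the two DEDENTs that close the open blocks, followed by ENDMARKER, even though no line in the source is actually dedented.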
