Preserve text format on read/write to shape.text in python-pptx

Posted by fae0ux8s on 2023-04-19 in Python


I'm looking to perform text replacement in a shape's text. I'm using code similar to the snippet below:

# keys to search for and the values that replace them
SRKeys, SRVals = ['x', 'y', 'z'], [1, 2, 3]

# read the shape's text (a plain string, without any formatting)
text = shape.text

# substitute each key with its value
for key, val in zip(SRKeys, SRVals):
    text = text.replace(key, str(val))

# write the substituted text back to the shape
shape.text = text

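However, if the initial shape.text contains formatted characters (bold, for example), the formatting is removed when the text is read. Is there a solution for this?
The only thing I can think of is to iterate over the characters, check their formatting, and then re-apply that formatting before writing to shape.text.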

python

Source: https://stackoverflow.com/questions/59994218/preserve-text-format-on-read-write-to-shape-text-python-pptx

2 Answers


Answer 1 (idfiyjo8)

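@usr2564301 is on the right track. Character formatting (a.k.a. "font") is specified at the run level. That is what a run is: a "run" (sequence) of characters that all share the same character formatting.
When you assign to shape.text, you replace all the existing runs with a single new run that has default formatting. If you want to preserve formatting, you need to preserve whatever portion of the runs is not directly involved in the text replacement.
This is not a trivial problem, because there is no guarantee that runs break on word boundaries. Try printing out the runs for a few paragraphs and I think you'll see what I mean.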
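To see where the run breaks fall, something like the following sketch can help. The helper name `describe_runs` is hypothetical, and the stand-in objects only mimic the `.runs`, `.text` and `.font` attributes of a python-pptx paragraph so that the example runs without a .pptx file. With a real presentation you would pass it each paragraph from `shape.text_frame.paragraphs` instead.

```python
from types import SimpleNamespace


def describe_runs(paragraph):
    """Return one (index, text, bold, italic) tuple per run in the paragraph."""
    return [
        (idx, run.text, run.font.bold, run.font.italic)
        for idx, run in enumerate(paragraph.runs)
    ]


def fake_run(text, bold=None, italic=None):
    """Stand-in for a python-pptx run (assumption: only .text and .font are used)."""
    return SimpleNamespace(text=text, font=SimpleNamespace(bold=bold, italic=italic))


# A paragraph whose bold run starts and ends in the middle of a word.
paragraph = SimpleNamespace(
    runs=[fake_run("Hello wo"), fake_run("rld,", bold=True), fake_run(" goodbye!")]
)

for row in describe_runs(paragraph):
    print(row)
```

Here the word "world" straddles two runs, so a replacement that works on one run at a time would never find it.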
In rough pseudocode, I think this is the approach you would need to take:

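1. Search the paragraph for the target text to identify the offset of its first character.
2. Traverse all the runs in the paragraph, keeping a running total of how many characters come before each run, maybe something like (run_idx, prefix_len, length): (0, 0, 8), (1, 8, 4), (2, 12, 9), etc.
3. Identify which runs are the starting, ending, and in-between runs that involve the search string.
4. Split the first run at the start of the search term, split the last run at the end of the search term, and delete all but the first of the "middle" runs.
5. Change the text of the middle run to the replacement text and copy the formatting from the prior (original start) run.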
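The steps above can be sketched roughly as follows. This is a simplified variant, not the exact split-and-delete procedure: instead of splitting runs with lxml, it writes the replacement into the first run the match touches and trims or blanks the other touched runs. It relies only on assigning to each run's `.text`, which leaves that run's own formatting in place, so the replacement ends up with the start run's formatting. The helper names `run_spans` and `replace_first` are hypothetical, the stand-in runs only need a `.text` attribute, and emptied runs are left in the paragraph rather than removed.

```python
from types import SimpleNamespace


def run_spans(runs):
    """Step 2: return (run_idx, prefix_len, length) for every run."""
    spans, prefix_len = [], 0
    for idx, run in enumerate(runs):
        spans.append((idx, prefix_len, len(run.text)))
        prefix_len += len(run.text)
    return spans


def replace_first(runs, search, replacement):
    """Replace the first match of `search`, editing only the runs it touches.

    Returns True if a replacement was made, False otherwise.
    """
    if not search:
        return False
    full_text = "".join(run.text for run in runs)
    start = full_text.find(search)  # step 1: offset of the first character
    if start < 0:
        return False
    end = start + len(search)

    # Step 3: non-empty runs that overlap the half-open range [start, end).
    touched = [
        (idx, prefix_len)
        for idx, prefix_len, length in run_spans(runs)
        if length > 0 and prefix_len < end and prefix_len + length > start
    ]
    first_idx, first_prefix = touched[0]
    last_idx, last_prefix = touched[-1]

    # Text that precedes the match in the first run and follows it in the last.
    head = runs[first_idx].text[: start - first_prefix]
    tail = runs[last_idx].text[end - last_prefix:]

    # Steps 4-5 (simplified): the replacement goes into the first touched run.
    if first_idx == last_idx:
        runs[first_idx].text = head + replacement + tail
    else:
        runs[first_idx].text = head + replacement
        for idx, _ in touched[1:-1]:
            runs[idx].text = ""  # middle runs lie entirely inside the match
        runs[last_idx].text = tail
    return True


# Stand-in runs; with python-pptx you would pass paragraph.runs instead.
runs = [SimpleNamespace(text=t) for t in ("Hello wo", "rld,", " goodbye!")]
print(run_spans(runs))
replace_first(runs, "world", "there")
print([run.text for run in runs])
```

To replace every occurrence, the function could be called in a loop until it returns False, provided the replacement text does not itself contain the search string.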

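This preserves any runs that do not involve the search string, and it carries the formatting of the "matched" word over to the "replaced" word.
It requires a few operations that the current API does not directly support. For those you'd need to use lower-level lxml calls to manipulate the XML directly, although you can get hold of all the existing elements you need from python-pptx objects without ever having to parse the XML yourself.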


Answer 2 (uajslkp6)

Here is an adapted version of the code I'm using (inspired by @scanny's answer). It replaces text in all shapes (with a text frame) on a slide.

from pptx import Presentation

prs = Presentation('../../test.pptx')
slide = prs.slides[1]

# iterate through all shapes on slide
for shape in slide.shapes:
    if not shape.has_text_frame:
        continue
        
    # iterate through paragraphs in shape
    for p in shape.text_frame.paragraphs:
        # store formats and their runs by index (not dict because of duplicate runs)
        formats, newRuns = [], []

        # iterate through runs
        for r in p.runs:
            # get text
            text = r.text

            # replace text
            text = text.replace('s','xyz')

            # store run
            newRuns.append(text)

            # store format
            formats.append({'size':r.font.size,
                            'bold':r.font.bold,
                            'underline':r.font.underline,
                            'italic':r.font.italic})

        # clear paragraph
        p.clear()

        # write each new run back to the paragraph with its stored format
        for new_text, fmt in zip(newRuns, formats):
            # add run with text
            run = p.add_run()
            run.text = new_text

            # format run
            run.font.bold = fmt['bold']
            run.font.italic = fmt['italic']
            run.font.size = fmt['size']
            run.font.underline = fmt['underline']

prs.save('../../test.pptx')
