Python: How to export one or more Confluence Data Center/Server spaces to PDF?

rmbxnbpk, posted on 2023-02-28 in Python

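How can I export one or more Confluence spaces to PDF, based on a search of all available spaces? Information is scarce, so I am posting this as a Q&A to help others.
I have read through numerous API deprecations, replacements, and issue reports, and I understand that Confluence still does not allow PDF export through its modern RESTful API, only through its long-unsupported SOAP API.
Some of the more useful things I have read include:
https://jira.atlassian.com/browse/CONFSERVER-9901
https://community.atlassian.com/t5/Confluence-questions/RPC-Confluence-export-fails-with-TYPE-PDF/qaq-p/269310
https://developer.atlassian.com/server/confluence/remote-api-specification-for-pdf-export/
The SO example below is similar to what is needed, but it does not search for spaces, which, as of sometime before June 2015, requires a different endpoint. It also uses Ruby and PHP, which would mean introducing a new language on my team; we would prefer to stick with C#, Python, and, in a pinch, Java.
How to export a Confluence "Space" to PDF using remote API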

Answer #1 (bmp9r5qi)

The Python script below was tested with Python 3.11 and Confluence Server 7.19. It was written to be short rather than perfect, so feel free to modify it as needed.

Python 3 code

# Saves one or more Confluence spaces to PDF files. On-prem installs only. SOAP API must be enabled/unblocked
# Be sure to: pip install zeep and change the URL and YOUR_KEY_FILTER_HERE below
# Charles Burns (https://stackoverflow.com/users/161816/charles-burns), February 2023

import shutil
import logging
from getpass import getpass
from datetime import datetime, timezone
from requests import Session
from requests.auth import HTTPBasicAuth
from zeep import Client
from zeep.transports import Transport

confluence = "on-prem-confluence.net" # Your company's Confluence host name (no "https://" prefix; it is added below)
user = input("Confluence login name: ")
password = getpass()

print("Authorizing on " + confluence + "...")
session = Session()
session.auth = HTTPBasicAuth(user, password)
getSpacesClient = Client('https://' + confluence + '/rpc/soap-axis/confluenceservice-v2?WSDL', transport=Transport(session=session))
token = getSpacesClient.service.login(user, password)

print("Getting list of spaces to export...")
allSpaces = getSpacesClient.service.getSpaces(token)
spaces = list(filter(lambda s: s.key.startswith("YOUR_KEY_FILTER_HERE"), allSpaces))
print("Found {} spaces (filtered from {} total): {}".format(len(spaces), len(allSpaces), ", ".join([s.name for s in spaces])))
pdfExportClient = Client('https://' + confluence + '/rpc/soap-axis/pdfexport?WSDL', transport=Transport(session=session))

for space in spaces:
    print("Beginning export of '{}' from {}".format(space.name, space.url))
    try:
        # exportSpace returns the download URL of the generated PDF
        siteExportUrl = pdfExportClient.service.exportSpace(token, space.key)
    except Exception:
        logging.exception("ERROR EXPORTING " + space.name)
        continue  # Skip the failed space and carry on with the remaining ones
    print("    Downloading exported PDF from {}".format(siteExportUrl))
    fileName = "{}UTC_{}.pdf".format(datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S"), space.key)
    response = session.get(siteExportUrl, stream=True)
    with open(fileName, 'wb') as f:
        shutil.copyfileobj(response.raw, f)  # Stream to disk instead of loading the whole PDF into memory
    print("    Export complete: {}\n".format(fileName))
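
The space filter in the script accepts a single key prefix. If you need several, the selection step could be pulled into a small helper like the sketch below. The name `select_spaces` and the `SimpleNamespace` stand-ins are hypothetical and only for illustration; the one assumption about the real objects returned by `getSpaces` is that they expose `key` and `name` attributes, as the script already relies on.

```python
from types import SimpleNamespace


def select_spaces(all_spaces, key_prefixes):
    """Return the spaces whose key starts with any of the given prefixes.

    The match is case-sensitive, like the startswith() call in the script.
    An empty prefix list selects nothing.
    """
    prefixes = tuple(key_prefixes)  # str.startswith accepts a tuple of prefixes
    return [s for s in all_spaces if s.key.startswith(prefixes)]


# Stand-in objects (hypothetical data) instead of a live getSpaces() result
demo_spaces = [
    SimpleNamespace(key="DOCS", name="Docs"),
    SimpleNamespace(key="DOCSOLD", name="Docs archive"),
    SimpleNamespace(key="HR", name="Human Resources"),
]

selected = select_spaces(demo_spaces, ["DOCS"])
print("Found {} spaces (filtered from {} total): {}".format(
    len(selected), len(demo_spaces), ", ".join(s.name for s in selected)))
```

In the script, `spaces = select_spaces(allSpaces, ["PREFIX1", "PREFIX2"])` would then take the place of the `filter`/`lambda` line.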

Sample output

Confluence login name: charlesburns
Password: 
Authorizing on on-prem-confluence.net...
Getting list of spaces to export...
Found 31 spaces (filtered from 4601 total): Some Space, Some Other Space, Yet Another Space
Beginning export of 'Some Space' from https://on-prem-confluence.net/display/MYKEY
    Downloading exported PDF from https://on-prem-confluence.net/download/temp/pdfexport-20230224/MYKEY.pdf
    Export complete: 20230225-000215UTC_MYKEY.pdf

Beginning export of 'Some Space' from https://on-prem-confluence.net/display/MYKEY
    Downloading exported PDF from https://on-prem-confluence.net/download/temp/pdfexport-20230224/MYKEY.pdf
    Export complete: 20230225-000215UTC_MYKEY.pdf

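After a successful export, the PDF files will be in the current working directory, which is the script's folder if you run it from there.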
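
If you would rather collect the PDFs in a dedicated folder, the file name could be built with `pathlib`, roughly as sketched here. `build_pdf_path` and its `output_dir` and `now` parameters are hypothetical names, not part of the script; the timestamp format is the one the script already uses.

```python
from datetime import datetime, timezone
from pathlib import Path


def build_pdf_path(space_key, output_dir=".", now=None):
    """Return the target path for a space's PDF, creating output_dir if needed.

    `now` is injectable so the naming can be checked without relying on the clock.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d-%H%M%S")  # same format as in the script
    directory = Path(output_dir)
    directory.mkdir(parents=True, exist_ok=True)
    return directory / "{}UTC_{}.pdf".format(stamp, space_key)


# Example with a fixed timestamp
example = build_pdf_path("MYKEY", now=datetime(2023, 2, 25, 0, 2, 15, tzinfo=timezone.utc))
print(example.name)
```

The returned path can be passed straight to `open(path, 'wb')` in the download loop.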

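Errors encountered and possible causes

| Error | Note |
| ------ | ------ |
| ValueError: Invalid tag name "Object []" | The SOAP API may be disabled; ask your administrator |
| requests.exceptions.HTTPError: 401 Client Error | Wrong password, or no permission to export the space |
| requests.exceptions.ConnectTimeout | The Confluence instance is down, or the URL is incorrect |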