python-3.x 从模板生成多个文档时的格式问题(邮件合并)

egdjgwm8  于 2023-08-08  发布在  Python
关注(0)|答案(1)|浏览(71)

它一直是真正令人沮丧的经验,我试图做邮件合并(如果我可以调用我的要求)。我有一个word模板文档,包含占位符,如{{JointHolder}} {{JointHolderPAN}},{{Holding}},{{Nominee}},₹ {{UndrawnCapital| number:“N 0”}} Excel文件包含名为“Mappings”的工作表,该工作表将这些占位符Map到标题名称。
此占位符格式特定于我的许可工具。现在,我把我的尝试转向Python来保持这些占位符的完整性。VBA是否是我的需求的更好选择?
我已经达到了占位符被Excel文件中的值替换的地步。但占位符的实际格式并不保留。例如,粗体、字体类型、字体大小、上标、左右居中对齐。这些格式对我的要求至关重要。
主要的问题是我需要生成一个表的基础上的关键字“REPEAT”,这是在需要重复的行的第一个单元格。这样不行
预期表输出-
x1c 0d1x的数据
对于最小可重现性示例-Link to the sample word template, excel file and python project .py file
你的指导非常感谢。
编辑:我以前用过这个工具-https://www.pdfmachine.com/pdfmachine-merge/。现在我想完全摆脱这个工具,因此尝试Python。
编辑:我现在已经切换到VBA作为一个实验。VBA宏文件位于共享文件夹中。链接上面提供。
表格生成要求-
我有表在word模板和大部分占位符内的表。这些也可以是嵌套表。

  • 如果在TableDisplayX列中找到“y”,则需要重复某些行
  • 这样的行的第一个单元格强制性地包含单词“REPEAT”和“TableDisplayX”,其中X可以是任何整数,如1、2、3、4、5、6等。
  • TableDisplayX是MailMerge工作表中的一个标头,如果特定行需要出现在为PATRTICULAR INVESTOR生成的表中,则它包含'y'值
  • 同一行中的其他单元格包含1个或多个占位符。
  • 生成的表中的值将从具有这些占位符表示的标头的列中提取
  • 请注意,此生成表的每一行都包含来自MailMerge工作表中对应于该投资者的不同连续行的数据。

为了解决这些生成的表的复杂性,我建议-

  • 当前代码应停止替换第一个单元格包含“REPEAT”字的行中的占位符
  • 在占位符的正常替换之后(目前由程序处理),代码应该开始查找带有“REPEAT”字的表行
  • 分别处理每个“REPEAT”行
  • 找到对应于“TableDisplayX”的头,其中X是整数,例如TableDisplay 1是我的MailMerge工作表中的头
  • 为那个投资者找出这个领域的Y的数目。这是要插入的数据行数
  • 在不同单元格中查找同一行中的所有其他标题
  • 例如,如果“y”的数量是5,则重复行必须是5。这些重复行必须正好在“REPEAT”行的下面;包括这一排所以新插入的行必须是4。
  • 从所有标题中获取特定投资者的数据,并将其放置在这些表行中。
  • 对其他“REPEAT”行重复上述步骤
c2e8gylq

c2e8gylq1#

先试试我的代码。到目前为止,你是否满意?

Sub CreateDocuments()
    
    ' Define the workbook and worksheet
    Dim wb As Workbook
    Set wb = ThisWorkbook
    Dim wsMailMerge As Worksheet:    Set wsMailMerge = wb.Sheets("MailMerge")
    Dim wsMappings As Worksheet:    Set wsMappings = wb.Sheets("Mappings")
    
    Rem test to temply comment out
    Application.ScreenUpdating = False
    Application.Calculation = xlManual
    Application.DisplayAlerts = False
    
    ' Define the last row in the sheet
    Dim LastRow As Long
    LastRow = wsMailMerge.Cells(wsMailMerge.Rows.count, 1).End(xlUp).Row
    
    
    
    Dim WordApp As Object, myDoc As Object, wdRng As Object, TableHeader As String, placeholders As Object, tblREPEAT As Object
    Dim tbl As Object 'Word.Table
    Dim Find1 As Excel.Range, Find2 As Excel.Range, Find3 As Excel.Range, header As String
    Dim Timer2 As Single
    
    ' Open the Word application
    Set WordApp = CreateObject("Word.Application")
    WordApp.Visible = True
    
    Rem test to temply comment out
    WordApp.ScreenUpdating = False
    
    
    ' Define a Dictionary to store which investors have been processed
    Dim ProcessedInvestors As Object
    Set ProcessedInvestors = CreateObject("Scripting.Dictionary")
    
    
    Set placeholders = CreateObject("Scripting.Dictionary")
    
    ' Loop through each row in the MailMerge sheet
    Dim r As Long
    For r = 2 To LastRow ' Assuming your data starts from row 2 (row 1 contains headers)
        
        
        ' Get the name of the current investor
        Dim InvestorName As String
        InvestorName = wsMailMerge.Cells(r, 1).value ' Assuming column A contains the investor names
    
        ' Check if the investor has already been processed
        If Not ProcessedInvestors.Exists(InvestorName) Then
            ' Mark the investor as processed
            ProcessedInvestors.Add InvestorName, True
            
            ' Open the Word document
            Set myDoc = WordApp.Documents.Open(wb.Path & "\sample2 - Template SOA - Copy.docx") ' Use the workbook's path and the name of your Word template
            
            Rem get the Table include `REPEAT`
            For Each tbl In myDoc.Tables
                If InStr(tbl.Cell(2, 1).Range.Text, "REPEAT") > 0 Then
                    Rem  back up Placeholders in a row of the table
                    Rem use the clipboard, whilst the system needs the clipboard, causes a problem
'                    tbl.Rows(2).Range.Copy
'                    myDoc.Range(tbl.Cell(tbl.Rows.count, 1).Range.Start, tbl.Cell(tbl.Rows.count, 1).Range.Start).Paste
                    Rem implement without the clipboard
                    myDoc.Range(tbl.Cell(tbl.Rows.count, 1).Range.Start, tbl.Cell(tbl.Rows.count, 1).Range.Start).FormattedText = tbl.Rows(2).Range.FormattedText
                    Rem clear the `REPEAT"TableDisplay2"` in the backup row
                    Set wdRng = myDoc.Range(tbl.Cell(3, 1).Range.Start, tbl.Cell(3, 1).Range.Start)
                    wdRng.MoveUntil "{"
                    wdRng.Start = tbl.Cell(3, 1).Range.Start
                    wdRng.Text = vbNullString
                    Set tblREPEAT = tbl 'Store tables with variable tblREPEAT
                    'Exit For ' if only one table include `REPEAT`
                End If
            Next tbl
            
            ' Loop through each column in the row
            Dim c As Long
            
            Dim clumCount_wsMailMerge As Long: clumCount_wsMailMerge = wsMailMerge.Cells(r, Columns.count).End(xlToLeft).Column
            
            'For c = 1 To wsMailMerge.Cells(r, Columns.count).End(xlToLeft).Column
            For c = 1 To clumCount_wsMailMerge
                
'                If c = 176 Then Stop ' just for check
                
                
                Rem Your placeholder and the header of the wsMailMerge is NOT case-sensitive!!
                
                ' Define the placeholder and the value to replace it with
                Dim placeholder 'As String
                placeholder = "{{" & Trim(wsMailMerge.Cells(1, c).value) & "}}" ' The placeholder is the header in the Mappings sheet
                'placeholder = "{{" & Trim(wsMailMerge.Cells(1, c).Formula) & "}}" ' The placeholder is the header in the Mappings sheet
                
                
                'If placeholder = "{{Notes47}}" Then Stop 'just for check
'                If placeholder = "{{Address4}}" Then Stop 'just for check
        
                 Dim value As String
                    value = wsMailMerge.Cells(r, c).value ' The value is the cell value in the current row
                    'If Trim(wsMailMerge.Cells(1, c).value) = "SOADate" Then Stop
                    
                    
                    ' Check for special formatting instructions
                    Dim formatting As String
                    GoSub getFormat
                    
                    'If placeholder Like "*{{SOADate}}*" Then Stop
                    
    '                If value = vbNullString Then Stop 'just for test
                    
                    Set wdRng = myDoc.Content
                    
                    ' Replace the placeholders in the Word document with the values from the record
                    'With myDoc.Content.Find
                    With wdRng.Find
                        .Text = placeholder
                        '.Replacement.Text = value
                        
                        .Wrap = wdFindContinue
                        .MatchWholeWord = True
                        
                        '.Execute Replace:=wdReplaceAll
                        Do While .Execute()
                            Rem don't do this in the record rows of the table including `REPEAT"TableDisplay2"` this time
                            If .Found Then
                                If .Parent.InRange(myDoc.Range(tblREPEAT.Cell(2, 1).Range.Start, tblREPEAT.Cell(3, tblREPEAT.Columns.count).Range.End)) Then
                                    Exit Do
                                Else
                                    .Parent.Text = value  'ie. .Replacement.Text = value and preserve the format
                                End If
                            End If
                        Loop
                    End With
NextPlaceholder:
            Next c
            ''''''''''''''''''''''''''''''''''''''''''''''''''''''
            
               Timer2 = Timer()
            ' Loop through each table
            'Dim tbl As Table
            'Dim tbl As Object 'Word.Table
            
            For Each tbl In myDoc.Tables
                If InStr(tbl.Cell(2, 1).Range.Text, "REPEAT") > 0 Then
                'If tbl Is tblREPEAT Then
                'If tbl.Range.Start = tblREPEAT.Range.Start Then
                    
                    ' Initialize row counter
                    Dim rowCounter As Integer
                    rowCounter = 1
                    
                    Rem  must be without merged cells by this way
                    ' Loop through each row
                    'Dim rw As Object 'Word.Row
                    'For Each rw In tbl.Rows
                        ' Loop through each cell
                        Dim cel As Object 'Word.Cell
                        'For Each cel In rw.Cells
                        For Each cel In tbl.Range.Cells
                            ' Check if the cell contains "REPEAT"
                            If InStr(cel.Range.Text, "REPEAT") > 0 Then
                                ' Extract the "TableDisplayX" placeholder
                                Dim startPos As Integer
                                startPos = InStr(cel.Range.Text, "TableDisplay")
                                If startPos > 0 Then
                                    Rem if length of `X` is longer then 1 then this will be wrong
                                    'TableHeader = Mid(cel.Range.Text, startPos, 13) ' There are 13 characters in "TableDisplayX"
                                    
                                    TableHeader = Mid(cel.Range.Text, startPos, InStr(startPos, cel.Range.Text, ChrW(8221)) - startPos)   ' There are 13 characters in "TableDisplayX"
                                    
                                    Rem clear the `REPEAT"TableDisplay2"`
                                    Set wdRng = myDoc.Range(cel.Range.Start + startPos, cel.Range.Start + startPos)
                                    wdRng.MoveUntil ChrW(8221)
                                    wdRng.Start = cel.Range.Start
                                    wdRng.End = wdRng.End + 1
                                    wdRng.Text = vbNullString
                                
                                Else
                                    MsgBox "TableDisplay column not found", vbCritical
                                    
                                    Rem don't forget to restore
                                    'Exit Sub
                                    GoTo ExitSub
                                    
                                End If
                                
                                

                                
                                
                                Set Find2 = wsMailMerge.Range("1:1").Find(TableHeader, lookat:=xlWhole, LookIn:=xlFormulas)
                                ' Determine the number of 'y' values in the "TableDisplayX" column for the current investor
                                'Dim TableDisplayColumn As Range
                                Dim TableDisplayColumn As Excel.Range
                                'Set TableDisplayColumn = wsMailMerge.Range(Cells(2, Find2.Column), Cells(LastRow, Find2.Column)) ' Replace "TableDisplayX" with the actual column letter
                                Set TableDisplayColumn = wsMailMerge.Range(wsMailMerge.Cells(2, Find2.Column), wsMailMerge.Cells(LastRow, Find2.Column)) ' Replace "TableDisplayX" with the actual column letter
                                
                                Dim FolioColumn As Range
                                Set FolioColumn = wsMailMerge.Range("A2:A" & LastRow)
                
                                Dim count As Integer
                                count = Application.WorksheetFunction.CountIfs(TableDisplayColumn, "y", FolioColumn, InvestorName)
                                                                
'                                tbl.Cell(tbl.Rows.count, 1).Split NumColumns:=2
'                                tbl.Cell(tbl.Rows.count, 1).Width = tbl.Cell(tbl.Rows.count - 1, 1).Width
'                                tbl.Cell(tbl.Rows.count, 2).Width = tbl.Cell(tbl.Rows.count - 1, 2).Width
                                
                                If count > 1 Then
                                
                                    'tbl.Rows(tbl.Rows.count - 1).Range.Copy
                                    ' Insert (n-1) rows below the repeating row
                                    Dim i As Integer
                                    For i = 1 To count - 1 - 1
                                        'rw.Range.Rows.Add
                                        
                                        'myDoc.Range(tbl.Cell(3, 1).Range.Start, tbl.Cell(3, 1).Range.Start).Paste
                                        'tbl.Rows.Add tbl.Rows(tbl.Rows.count)
                                        
                                        Rem add new rows like the backup one
                                        myDoc.Range(tbl.Cell(tbl.Rows.count, 1).Range.Start, tbl.Cell(tbl.Rows.count, 1).Range.Start).FormattedText = tbl.Rows(3).Range.FormattedText
    
                                        
                                    Next i
                                    
                                    
                                    'just for test
    '                                tbl.Range.Document.Tables.Add tbl.Range.Document.Range(tbl.Cell(tbl.Rows.count, 1).Range.Start, tbl.Cell(tbl.Rows.count, 1).Range.Start), count - 1, tbl.Columns.count
                                    
                                    
    '                                tbl.Cell(tbl.Rows.count, 1).Merge tbl.Cell(tbl.Rows.count, 2)
                                    
                                    
                                    ' Populate the cells in the new table
                                    'Dim newRow As Row
                                    Dim newCell As Word.Cell
                                    Dim rowIndex As Integer
                                    Dim firstRow As Long
                                    
                                    firstRow = GetFirstRowWithY(TableDisplayColumn, FolioColumn, InvestorName)
                    
                                    rowIndex = 0
                                    'For Each newRow In tbl.Range.Rows
                                        'If rowCounter = count Then
                                            'For Each newCell In newRow.Range.Cells
                                            
                                            rowCounter = 0
                                            
                                            Rem `myDoc.Range(...).Cells` get the range of new cells
                                            For Each newCell In myDoc.Range(tbl.Range.Cells(6 + 1).Range.Start, tbl.Cell(count + 1, tbl.Columns.count).Range.End).Cells
                                                ' Define the placeholder and the value to replace it with
        '                                        Dim placeholder As String ' this already declare before
                                                If rowCounter = 0 Then
                                                    rowCounter = newCell.rowIndex
                                                ElseIf rowCounter < newCell.rowIndex Then
                                                    rowIndex = rowIndex + 1 ' Increment row counter
                                                    rowCounter = newCell.rowIndex
                                                End If
                                                If newCell.ColumnIndex = 1 Then
                                                    
                                                    With newCell.Range.Find
                                                        .ClearAllFuzzyOptions
                                                        .ClearFormatting
                                                        .Wrap = wdFindStop
                                                        .Forward = True
                                                        .MatchWildcards = True 'use Wildcards to find
                                                        .Text = "\{\{*\}\}"
                                                        Do While .Execute
                                                            If .Parent.End > newCell.Range.End Then Exit Do 'if over the range of curent cell then exit
                                                            Rem Take note of the placeholder and its Range
                                                            'placeholders.Add VBA.Mid(.Parent.Text, 3, VBA.Len(.Parent.Text) - 4), .Parent '3=len("{{")+1
                                                            placeholders.Add .Parent.Text, .Parent.Duplicate  '3=len("{{")+1
                                                        Loop
                                                    End With
                                                    
                                                Else
                                                    Rem Take note of the placeholder and its Range
                                                    placeholders.Add VBA.Left(newCell.Range.Text, VBA.Len(newCell.Range.Text) - 2), newCell.Range ' The placeholder is the header in the Mappings sheet
                                                End If
                                                                                                
                                                'Dim value As String ' Already declared!!
                                                For Each placeholder In placeholders.keys
                                                    'Set Find3 = wsMappings.Range("A:A").Find(placeholder, lookat:=xlWhole, LookIn:=xlValues)
                                                    Set Find3 = wsMappings.Range("A:A").Find(placeholder, lookat:=xlWhole, LookIn:=xlValues)
                                                    If Not Find3 Is Nothing Then
                                                        header = Find3.Offset(0, 1).value
                                                        'Set Find3 = wsMailMerge.Rows(1).Find(header, lookat:=xlWhole, LookIn:=xlFormulas)
                                                        Set Find3 = wsMailMerge.Range("1:1").Find(header, lookat:=xlWhole, LookIn:=xlFormulas)
                                                        value = wsMailMerge.Cells(firstRow + rowIndex + 1, Find3.Column).value ' The value is the cell value in the current row
                                                        ' Replace the placeholders in the Word document with the values from the record
                                                        'newCell.Range.Text = value
                                                        GoSub getFormat:
                                                        placeholders(placeholder).Text = value
                                                    End If
                                                Next placeholder
                                                placeholders.RemoveAll
                                            Next newCell
                                        'End If
'                                        rowIndex = rowIndex + 1
                                    'Next newRow
                                    
                                Else
                                    Rem if there is no more row
                                    tbl.Rows(3).Delete 'delete the backup placeholdere copy one
                                End If
'                                GoTo currentDocDone: ' if only one table include `REPEAT`
                                GoTo nextTable
                            End If
                            
                        Next cel
                        ' Increment row counter
                        'rowCounter = rowCounter + 1
                    'Next rw
                
'                Else
'                    GoTo nextTable
                End If
nextTable:
            Next tbl
    

currentDocDone:
    
             MsgBox "time = " & (Timer() - Timer2)
                    
            ''''''''''''''''''''''''''''''''''''''''''''''''''''''
            
            On Error Resume Next
            ' Save the Word document
            myDoc.SaveAs wb.Path & "\" & wsMailMerge.Cells(r, 1).value & ".docx" ' Use the workbook's path and the investor's name from column A
            
            Rem temply comment out for test
            'myDoc.Close SaveChanges:=False
            
            
            On Error GoTo 0
            
            If Err.Number <> 0 Then
                MsgBox "Word doc save error", vbExclamation
            End If
        
        End If
    Next r
    
    Rem You can not do it here, cause WordApp'll be referred to later
'     Close the Word application
'    WordApp.Quit
    

ExitSub:

    WordApp.ScreenUpdating = True
    
    Rem temply comment out for test
'   Close the Word application
'    WordApp.Quit

    Application.ScreenUpdating = True
    Application.Calculation = xlAutomatic
    Application.DisplayAlerts = True
    
Exit Sub

getFormat:
            Set Find1 = Nothing
            formatting = ""
            
            Set Find1 = wsMappings.Range("A:A").Find(placeholder, lookat:=xlWhole, LookIn:=xlValues)
            If Not Find1 Is Nothing Then
                formatting = Find1.Offset(0, 2).value
            Else
                GoTo NextPlaceholder
            End If
            
            'If Not IsError(formatting) Then
            If formatting <> vbNullString And Not IsError(formatting) And formatting <> "SuperScript" Then
            
                Select Case formatting
                    Case "MMMM dd, yyyy"
                        value = Format(value, "MMMM dd, yyyy")
                    Case "N0"
                        value = Format(value, "#,##0")
                    Case "N1"
                        value = Format(value, "#,##0.0")
                    Case "N2"
                        value = Format(value, "#,##0.00")
                    Case "N3"
                        value = Format(value, "#,##0.000")
                    Case "N4"
                        value = Format(value, "#,##0.0000")
                    Case "P0"
                        value = Format(value, "0%")
                    Case "P1"
                        value = Format(value, "0.0%")
                    Case "P2"
                        value = Format(value, "0.00%")
                    Case "P3"
                        value = Format(value, "0.000%")
                    Case "P4"
                        value = Format(value, "0.0000%")
                    Case "NewLine"
                    
                        If value <> vbNullString Then
                            'value = value & Chr(10)
                            
                            value = Chr(10) & value
                        End If
                        
                    Case "SuperScript"
                        ' Handle superscript here
                End Select

    
            End If
            Return
End Sub

字符串
你的代码和示例仍然有很多问题和不清楚的地方。你只能判断这是否是你想要的。我只是尽量配合你写的代码来实现它。
您的代码和示例似乎仍然存在一些问题和不明确的方面。这取决于您来确定这是否正是您正在寻找的。我的目标是与您编写的代码一起工作,尽我所能实现预期的结果。祝你好运!如果我有时间,我会尝试看看如何在Python中实现它。

相关问题