我有多个Word文档文件,每个文件都包含如下文本(未显示段落标记):
Some Title1 ' <--- Some title ending with paragraph mark
(Apr 3, 2023 - 9:00am) ' <--- Date - time ending with paragraph mark
(Interviewee: Harry) ' <--- Interviewee name to pick to only add interviewee lines (name and text) to outrow array.
' <--- blank line ending with paragraph mark
(00:00:00 - 00:00:02) ' <--- timestamp ending with paragraph mark
Harry: Okay, thank you. ' <--- Speaker: Text ending with paragraph mark
(00:00:02 - 00:00:06)
Tom: Hi, Harry, hello. Are you okay?
(00:00:06 - 00:00:09)
Harry: Yeah, I'm good, thank you. How are you doing? Happy Monday to you.
(00:00:09 - 00:00:12)
Tom: It's a nice Monday today, so it's quite bright for a change.
字符串
由于有许多文档文件,我想复制整个内容(所有段落)从每个文档文件到excel工作表Sheet2
,追加每个内容到最后一个非空白行。完成后,我想使用Excel中的TextToColumns功能将文本拆分为单独的列,如图所示:
| 日期时间|时间戳记|发言人|文字档| Text |
| ------------|------------|------------|------------| ------------ |
| (2023年4月3日-上午9:00)|(00:00:00 - 00:00:02)|哈利|好的谢谢||
| | | (00你好吗?|祝你周一快乐。| How are you doing? Happy Monday to you.|
| (2023年4月5日-晚上19:00)|(00:00:00 - 00:00:04)|吉尔|我很好。||
| | | (00:00:06 - 00:00:12) | Jill | I'm busy. |
...
目前我只能循环和复制粘贴的文档内容表。一旦合并到工作表中,我想将此内容转换到如上所示的表格中。另外,如果有一种方法可以将所有的文档内容收集到一个数组或ado记录集中,然后一次性将数组/记录集内容直接传输到工作表,这将加快代码速度并保存一些时间。
Option Explicit
Sub ParseTranscriptToExcelSheet()
Application.ScreenUpdating = False
Application.DisplayAlerts = False
Dim wdApp As Object ' Word application
Dim wdDoc As Object ' Word document
Dim tbl As Object ' Table
Dim para As Object ' Paragraph
Dim row As Integer ' Row index for the table
Dim i As Long
Dim oFileDialog As FileDialog
Dim vSelectedItem As Variant
' declare worksheets
Dim ws1 As Worksheet
Dim ws2 As Worksheet
Set ws1 = ThisWorkbook.Sheets(1) ' contains button to run code
Set ws2 = ThisWorkbook.Sheets(2)
' Add a header row to the worksheet 2
ws2.Range("A1:E1").Value = Array("Title", "DateTime", "Timestamp", "Speaker", "Text")
' Initialize the row index for the table
row = ws2.Cells(ws2.Rows.Count, 1).End(xlUp).row + 1
' Open the Word document containing the transcript
On Error Resume Next
Set wdApp = GetObject(, "Word.Application")
On Error GoTo 0
If wdApp Is Nothing Then
Set wdApp = CreateObject("Word.Application")
End If
With wdApp
.Visible = False
End With
' ReDim sContent(1 To 1)
Set oFileDialog = Application.FileDialog(msoFileDialogFilePicker)
With oFileDialog
.Title = "Select Word Files"
.AllowMultiSelect = True
.Filters.Add "Word files", "*.doc*", 1
If .Show = -1 Then
ws2.Activate
For Each vSelectedItem In .SelectedItems
Set wdDoc = wdApp.Documents.Open(vSelectedItem)
With wdDoc
' sContent(UBound(sContent)) = .Content.formattedtext.text
.Content.Copy
ws2.Cells(row, 1).PasteSpecial (xlPasteValues)
Application.CutCopyMode = False
DoEvents
.Close savechanges:=False
row = ws2.Cells(ws2.Rows.Count, 1).End(xlUp).row + 1
End With
' ReDim Preserve sContent(1 To UBound(sContent) + 1) As String
Next vSelectedItem
' ReDim Preserve sContent(1 To UBound(sContent) - 1) As String
Else
MsgBox "No files selected"
End If
End With
wdApp.Quit
Set wdDoc = Nothing
Set wdApp = Nothing
Set ws1 = Nothing
Set ws2 = Nothing
Application.DisplayAlerts = True
Application.ScreenUpdating = True
End Sub
型
1条答案
按热度按时间wn9m85ua1#
试试这个:
字符串
编辑:这里有一个版本,它填充一个数组,并在最后写入Excel工作表
型