excel openPyXL -在取消合并期间为单元格范围赋值

6jjcrrmo  于 2023-03-20  发布在  其他
关注(0)|答案(6)|浏览(193)

因此,我有Excel文件,每个文件中有几个工作表,我正在编写脚本,该脚本将从选定的工作表中收集数据,如果文件中存在这些数据,并将其合并到一个大工作表中。通常情况下,它正在工作,通过文件迭代,如果所需的工作表存在,它会找到包含数据的单元格范围,并将其附加到数据框中。我现在需要做的事情是添加标题行(列名)到Dataframe,但在工作表中这些是多行标题。
为了使它看起来相同的 Dataframe ,我需要取消合并单元格在顶部标题行和复制值从第一个单元格到其余的范围内,这是合并之前)。
我正在使用OpenPyXL访问excel工作表。我的函数接收工作表作为唯一参数。它看起来像这样:

def checkForMergedCells(sheet):
    merged = ws.merged_cell_ranges
    for mergedCell in merged:
        mc_start, mc_stop = str(mergedCell).split(':')
        cp_value = sheet[mc_start]
        sheet.unmerge_cells(mergedCell)
        cell_range = sheet[mergedCell]
        for cell in cell_range:
            cell.value = cp_value

问题是cell_range返回一个元组,最后得到错误消息:
属性错误:'tuple'对象没有属性'value'在下面您可以看到调试期间的screencap,它显示传入每个变量的值。
Debugger running

oxiaedzo

oxiaedzo1#

按索引访问通常会返回元组的元组,除非您尝试获取单个单元格或行。对于编程访问,应使用iter_rows()iter_cols()
您可能想花一些时间来了解一下utils模块。

from openpyxl.utils import range_boundaries

for group in ws.merged_cell_ranges:
     min_col, min_row, max_col, max_row = range_boundaries(group)
     top_left_cell_value = ws.cell(row=min_row, column=min_col).value
     for row in ws.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
         for cell in row:
             cell.value = top_left_cell_value
1zmg4dgp

1zmg4dgp2#

其他答案的问题

关于@Charlie Clark的选定答案和使用http://thequickblog.com/merge-unmerge-cells-openpyxl-in-python中代码的其他答案,您可以更容易地取消合并单元格,而无需处理range_boundaries和那些转换。
我也得到了问题与选定的答案,其中一些合并的单元格将取消合并,但其他不会,一些未合并的单元格将填补与数据,我想要的,但其他不会。
问题是worksheet.merged_cells.ranges是一个迭代器,这意味着它是延迟求值的,因此当调用worksheet.unmerge_cells()时,对象worksheet.merged_cells会发生变化,并且在再次迭代合并的单元格范围时会出现副作用。

更好的解决方案

在我的例子中,我希望像这样取消合并单元格,同时复制边框、字体和对齐信息:

+-------+------+
+-------+------+    | Date  | Time |
| Date  | Time |    +=======+======+
+=======+======+    | Aug 6 | 1:00 |
|       | 1:00 | -> +-------+------+
| Aug 6 | 3:00 |    | Aug 6 | 3:00 |
|       | 6:00 |    +-------+------+
+-------+------+    | Aug 6 | 6:00 |
                    +-------+------+

对于当前最新版本的openpyxl==3.0.9,我发现以下几点最适合我:

from copy import copy

from openpyxl import load_workbook, Workbook
from openpyxl.cell import Cell
from openpyxl.worksheet.cell_range import CellRange
from openpyxl.worksheet.worksheet import Worksheet

def unmerge_and_fill_cells(worksheet: Worksheet) -> None:
    """
    Unmerges all merged cells in the given ``worksheet`` and copies the content
    and styling of the original cell to the newly unmerged cells.

    :param worksheet: The Excel worksheet containing the merged cells.
    """

    # Must convert iterator to list to eagerly evaluate all merged cell ranges
    # before looping over them - this prevents unintended side-effects of
    # certain cell ranges from being skipped since `worksheet.unmerge_cells()`
    # is destructive.
    all_merged_cell_ranges: list[CellRange] = list(
        worksheet.merged_cells.ranges
    )

    for merged_cell_range in all_merged_cell_ranges:
        merged_cell: Cell = merged_cell_range.start_cell
        worksheet.unmerge_cells(range_string=merged_cell_range.coord)

        # Don't need to convert iterator to list here since `merged_cell_range`
        # is cached
        for row_index, col_index in merged_cell_range.cells:
            cell: Cell = worksheet.cell(row=row_index, column=col_index)
            cell.value = merged_cell.value

            # (Optional) If you want to also copy the original cell styling to
            # the newly unmerged cells, you must use shallow `copy()` since
            # cell style properties are proxy objects which are not hashable.
            #
            # See <https://openpyxl.rtfd.io/en/stable/styles.html#copying-styles>
            cell.alignment = copy(merged_cell.alignment)
            cell.border = copy(merged_cell.border)
            cell.font = copy(merged_cell.font)

# Sample usage
if __name__ == "__main__":
    workbook: Workbook = load_workbook(
        filename="workbook_with_merged_cells.xlsx"
    )
    worksheet: Worksheet = workbook["My Sheet"]

    unmerge_and_fill_cells(worksheet=worksheet)
    workbook.save(filename="workbook_with_unmerged_cells.xlsx")

浓缩溶液

以下是一个较短的版本,没有注解,也没有复制样式:

from openpyxl.worksheet.worksheet import Worksheet

def unmerge_and_fill_cells(worksheet: Worksheet) -> None:
    for merged_cell_range in list(worksheet.merged_cells.ranges):
        worksheet.unmerge_cells(range_string=merged_cell_range.start_cell)

        for row_col_indices in merged_cell_range.cells:
            worksheet.cell(*row_col_indices).value = merged_cell.value
ctrmrzij

ctrmrzij3#

下面的代码来自http://thequickblog.com/merge-unmerge-cells-openpyxl-in-python/是为我工作。

import openpyxl 
from openpyxl.utils import range_boundaries
wbook=openpyxl.load_workbook("openpyxl_merge_unmerge.xlsx")
sheet=wbook["unmerge_sample"]
for cell_group in sheet.merged_cells.ranges:
    min_col, min_row, max_col, max_row = range_boundaries(str(cell_group))
    top_left_cell_value = sheet.cell(row=min_row, column=min_col).value
    sheet.unmerge_cells(str(cell_group))
    for row in sheet.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
        for cell in row:
            cell.value = top_left_cell_value
wbook.save("openpyxl_merge_unmerge.xlsx")
exit()
jv2fixgn

jv2fixgn4#

之前的答案都不管用,所以我精心设计了这个答案,测试了一下,它对我很有效。

from openpyxl.utils import range_boundaries
wb = load_workbook('Example.xlsx')

sheets = wb.sheetnames  ##['Sheet1', 'Sheet2']
for i,sheet in enumerate(sheets):
    ws = wb[sheets[i]]
    
    # you need a separate list to iterate on (see explanation #2 below)
    mergedcells =[]  
    for group in ws.merged_cells.ranges:
        mergedcells.append(group)
    
    for group in mergedcells:
        min_col, min_row, max_col, max_row = group.bounds 
        top_left_cell_value = ws.cell(row=min_row, column=min_col).value
        ws.unmerge_cells(str(group))   # you need to unmerge before writing (see explanation #1 below)
        for irow in range(min_row, max_row+1):
            for jcol in range(min_col, max_col+1): 
                ws.cell(row = irow, column = jcol, value = top_left_cell_value)

@Дмитр о о олександ р ович几乎是正确的,但我不得不修改一些东西来修正他的答案:
1.你会得到一个AttributeError: 'MergedCell' object attribute 'value' is read-only错误,因为你需要在改变它们的值之前取消合并合并单元格。https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1228
1.你不能直接在ws.merged_cells.ranges上迭代,因为在python中迭代一个'ranges'列表对象并改变它(例如用一个unmerge_cells函数或一个pop函数)将导致只改变一半的对象(见这里:https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1085)。您需要创建一个不同的列表并对其进行迭代。

kuhbmx9i

kuhbmx9i5#

在我执行以下操作之前,我一直收到错误和弃用警告:

from openpyxl.utils import range_boundaries

for group in sheet.merged_cells.ranges: # merged_cell_ranges deprecated
    display(range_boundaries(group._get_range_string())) # expects a string instead of an object
    min_col, min_row, max_col, max_row = range_boundaries(group._get_range_string())
    top_left_cell_value = sheet.cell(row=min_row, column=min_col).value
    for row in sheet.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
        for cell in row:
            cell.value = top_left_cell_value
e3bfsja2

e3bfsja26#

所有以前的解决方案都给了我某种错误,可能是由于openpyxl的不同版本,在当前版本(3.0.10)中,我发现这个解决方案对我来说是有效的:

for m_range in list(ws.merged_cells.ranges):
    merged_cell = m_range.start_cell
    ws.unmerge_cells(range_string=str(m_range))

    for row_col_indices in m_range.cells:
        ws.cell(*row_col_indices).value = merged_cell.value

相关问题