How do I make file creation an atomic operation in Python?

Posted by mrphzbgm on 2023-02-15 in Python

I am using Python to write a large chunk of text to a file in a single operation:

open(file, 'w').write(text)

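If the script is interrupted so that the write does not complete, I want to end up with no file at all rather than a partially written one. Can this be done?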

sauutmhj1#

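Write the data to a temporary file and, once the data has been written successfully, rename the file to the correct destination file, e.g.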

import os

with open(tmpFile, 'w') as f:
    f.write(text)
    # make sure that all data is on disk
    # see http://stackoverflow.com/questions/7433057/is-rename-without-fsync-safe
    f.flush()
    os.fsync(f.fileno())
# before Python 3.3 use os.rename, which on Windows fails if myFile already exists
os.replace(tmpFile, myFile)

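According to the documentation (http://docs.python.org/library/os.html#os.replace):
Rename the file or directory src to dst. If dst is a non-empty directory, OSError will be raised. If dst exists and is a file, it will be replaced silently if the user has permission. The operation may fail if src and dst are on different filesystems. If successful, the renaming will be an atomic operation (this is a POSIX requirement).
Notes:

  • The operation may not be atomic if the src and dst locations are not on the same filesystem.
  • The os.fsync step may be skipped if performance/responsiveness matters more than data integrity in cases such as a power failure or a system crash.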
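
The steps in this answer can be collected into a single helper. The following is a minimal sketch rather than a definitive implementation: the name write_text_atomic and its parameters are hypothetical, the temporary file is created with tempfile.mkstemp in the target's own directory so that both paths share a filesystem, and the final directory fsync is a POSIX-only extra that the answer itself does not mention.

```python
import os
import tempfile


def write_text_atomic(path, text, fsync=True):
    """Write text to path so readers see the old file or the new one.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target: os.replace is only
    # atomic when source and destination are on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            if fsync:
                f.flush()
                os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Writing or renaming failed: remove the partial temporary file.
        try:
            os.remove(tmp_path)
        except FileNotFoundError:
            pass
        raise
    if fsync and os.name == "posix":
        # Assumption: also sync the directory so the rename survives a crash.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Note that mkstemp creates the file readable and writable only by the current user, so the destination ends up with those permissions unless they are changed before the rename.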

unhi4e5o2#

This is a simple snippet that implements atomic writing using Python's tempfile.

with open_atomic('test.txt', 'w') as f:
    f.write("huzza")

Or even reading from and writing to the same file:

with open('test.txt', 'r') as src:
    with open_atomic('test.txt', 'w') as dst:
        for line in src:
            dst.write(line)

using two simple context managers:

import os
import tempfile as tmp
from contextlib import contextmanager

@contextmanager
def tempfile(suffix='', dir=None):
    """ Context for temporary file.

    Will find a free temporary filename upon entering
    and will try to delete the file on leaving, even in case of an exception.

    Parameters
    ----------
    suffix : string
        optional file suffix
    dir : string
        optional directory to save temporary file in
    """

    tf = tmp.NamedTemporaryFile(delete=False, suffix=suffix, dir=dir)
    tf.file.close()
    try:
        yield tf.name
    finally:
        try:
            os.remove(tf.name)
        except FileNotFoundError:
            # the file has already been moved to its destination
            pass

@contextmanager
def open_atomic(filepath, *args, **kwargs):
    """ Open temporary file object that atomically moves to destination upon
    exiting.

    Allows reading and writing to and from the same filename.

    The file will not be moved to destination in case of an exception.

    Parameters
    ----------
    filepath : string
        the file path to be opened
    fsync : bool
        whether to force write the file to disk
    *args : mixed
        Any valid arguments for :code:`open`
    **kwargs : mixed
        Any valid keyword arguments for :code:`open`
    """
    fsync = kwargs.pop('fsync', False)

    with tempfile(dir=os.path.dirname(os.path.abspath(filepath))) as tmppath:
        with open(tmppath, *args, **kwargs) as file:
            try:
                yield file
            finally:
                if fsync:
                    file.flush()
                    os.fsync(file.fileno())
        os.replace(tmppath, filepath)  # unlike os.rename, also overwrites on Windows

s3fp2yjn3#

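Since it is very easy to mess up the details, I recommend using a small library for this. The advantage of a library is that it takes care of all these details and is reviewed and improved by a community.
One such library is python-atomicwrites by untitaker, which even has proper Windows support.

This library is currently unmaintained. Comment from the author:
[...], I thought it'd be a good time to deprecate this package. Python 3 has os.replace and os.rename, which are probably good enough for most use cases.

Original recommendation:

From the README: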

from atomicwrites import atomic_write

with atomic_write('foo.txt', overwrite=True) as f:
    f.write('Hello world.')
    # "foo.txt" doesn't exist yet.

# Now it does.

Install via pip:

pip install atomicwrites

pjngdqdw4#

I use this code to atomically replace/write a file:

import os
from contextlib import contextmanager

@contextmanager
def atomic_write(filepath, binary=False, fsync=False):
    """ Writeable file object that atomically updates a file (using a temporary file).

    :param filepath: the file path to be opened
    :param binary: whether to open the file in a binary mode instead of textual
    :param fsync: whether to force write the file to disk
    """

    tmppath = filepath + '~'
    while os.path.isfile(tmppath):
        tmppath += '~'
    try:
        with open(tmppath, 'wb' if binary else 'w') as file:
            yield file
            if fsync:
                file.flush()
                os.fsync(file.fileno())
        os.replace(tmppath, filepath)  # unlike os.rename, also overwrites on Windows
    finally:
        try:
            os.remove(tmppath)
        except (IOError, OSError):
            pass

Usage:

with atomic_write('path/to/file') as f:
    f.write("allons-y!\n")

It is based on this recipe.
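
The loop in the code above checks os.path.isfile and then opens the path in a separate step, so two processes could settle on the same temporary name. Below is a minimal sketch of one way to close that gap, assuming Python 3.3+ where open supports the exclusive-creation mode 'x'. The helper name open_unique_tmp is hypothetical and not part of the recipe.

```python
import os


def open_unique_tmp(filepath, binary=False):
    """Return (file object, path) for a temporary file created by this call.

    Hypothetical helper: mode 'x' makes open() raise FileExistsError if the
    path already exists, so the check and the creation are a single step.
    """
    tmppath = filepath + '~'
    while True:
        try:
            return open(tmppath, 'xb' if binary else 'x'), tmppath
        except FileExistsError:
            # Another writer owns this name; try the next candidate.
            tmppath += '~'
```

The caller is still responsible for closing the file and renaming or removing the returned path.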

wrrgggsh5#

Just link the file once you are done:

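with tempfile.NamedTemporaryFile(mode="w") as f:
    f.write(...)
    f.flush()  # write buffered data out before the file becomes visible
    # os.link raises FileExistsError if final_filename already exists
    os.link(f.name, final_filename)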

If you want to get fancy:

import contextlib
import os
import tempfile

@contextlib.contextmanager
def open_write_atomic(filename: str, **kwargs):
    kwargs['mode'] = 'w'
    with tempfile.NamedTemporaryFile(**kwargs) as f:
        yield f
        f.flush()  # write buffered data out before linking
        os.link(f.name, filename)
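
One property of os.link is worth knowing here: unlike os.replace, it refuses to overwrite an existing destination, so this approach fits "create only if absent" semantics. Below is a small sketch under stated assumptions: publish_new is a hypothetical name, the temporary file is placed in the destination's directory because hard links cannot cross filesystems, and a POSIX platform whose filesystem supports hard links is assumed.

```python
import os
import tempfile


def publish_new(final_filename, text):
    """Create final_filename with text, or raise FileExistsError if it exists.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(final_filename))
    with tempfile.NamedTemporaryFile(mode="w", dir=directory) as f:
        f.write(text)
        f.flush()  # the content must be in the file before it gets its final name
        os.link(f.name, final_filename)
    # Leaving the with block deletes the temporary name; the hard link remains.
```

If the destination already exists, the call raises FileExistsError and the existing file is left untouched.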

ej83mcc06#

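The answers on this page are quite old; there are now libraries that do this for you.
In particular, safer is a library designed to help prevent programmer error from corrupting files, socket connections, or generalized streams. It is quite flexible: among other things, it can use either memory or temporary files, and you can even keep the temporary files in case of failure.
Their example is just what you want: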

# dangerous
with open(filename, 'w') as fp:
    json.dump(data, fp)
    # If an exception is raised, the file is empty or partly written
# safer
with safer.open(filename, 'w') as fp:
    json.dump(data, fp)
    # If an exception is raised, the file is unchanged.

It is on PyPI: just install it with pip install --user safer, or get the latest version from https://github.com/rec/safer
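
To illustrate the in-memory option described in this answer, here is a hypothetical sketch of the general idea: collect writes in a buffer and touch the real file only if the block finishes without an exception. This is not safer's actual implementation. The name buffered_open and the ".tmp" suffix are assumptions made for the example.

```python
import io
import os
from contextlib import contextmanager


@contextmanager
def buffered_open(filename):
    """Collect writes in memory; touch filename only if the block succeeds.

    Hypothetical sketch, not safer's actual implementation.
    """
    buffer = io.StringIO()
    yield buffer  # an exception raised in the with block propagates from here
    # Reached only when the block finished without raising.
    tmp_path = filename + ".tmp"  # assumption: this name is not in use
    with open(tmp_path, "w") as f:
        f.write(buffer.getvalue())
    os.replace(tmp_path, filename)
```

Buffering in memory avoids leftover temporary files when the block fails, at the cost of holding the whole output in RAM.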

tquggr8v7#

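An atomic solution for Windows that loops over a folder and renames the files. It has been tested. To minimize the risk of two files ending up with the same name, you can increase the amount of randomness. The letter is picked with the random library's random.choice method, and the digit string comes from str(random.randrange(50, 9999999, 2)). You can change the number range as needed.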

import os
import random

path = "C:\\Users\\ANTRAS\\Desktop\\NUOTRAUKA\\"

def renamefiles():
    files = os.listdir(path)
    i = 1
    for file in files:
        os.rename(os.path.join(path, file), os.path.join(path, 
                  random.choice('ABCDEFGHIJKL') + str(i) + str(random.randrange(31,9999999,2)) + '.jpg'))
        i = i+1

for x in range(30):
    renamefiles()
