Find broken symlinks with Python

Posted by ewm0tg9j on 2023-01-24 in Python

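If I call os.stat() on a broken symlink, Python throws an OSError exception. This makes it useful for finding them. However, there are a few other reasons why os.stat() might throw a similar exception. Is there a more precise way of detecting broken symlinks with Python under Linux?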

python

Source: https://stackoverflow.com/questions/20794/find-broken-symlinks-with-python

9 answers

uplii1fm1#

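A common Python saying is that it's easier to ask forgiveness than permission. While I'm not a fan of this statement in real life, it does apply in a lot of cases. Usually you want to avoid code that chains two system calls on the same file, because you never know what will happen to the file between your two calls.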

A typical mistake is to write something like:

if os.path.exists(path):
    os.unlink(path)

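The second call (os.unlink) may fail if something else deleted the file after your if test; it then raises an exception and stops the rest of your function from executing. (You might think this doesn't happen in real life, but we fished another bug like that out of our codebase just last week, and it was the kind of bug that left a few programmers scratching their heads and claiming "Heisenbug".)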
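
For comparison, here is a minimal sketch of the "ask forgiveness" version of that cleanup. The helper name remove_if_present is made up for illustration, and it assumes Python 3, where a missing file raises FileNotFoundError (a subclass of OSError).

```python
import os
import sys


def remove_if_present(path):
    """Delete path; return True if it was removed, False if it was already gone."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        # Something else removed it first (or it never existed): nothing to do.
        return False
    return True


if __name__ == "__main__":
    # Hypothetical usage: python remove_if_present.py FILE...
    for name in sys.argv[1:]:
        print(name, "removed" if remove_if_present(name) else "was not there")
```

There is only one system call here, so there is no window between a check and the delete for another process to slip into.
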
So, in your particular case, I would probably do:

import errno
import os

try:
    os.stat(path)
except OSError as e:
    if e.errno == errno.ENOENT:
        print('path %s does not exist or is a broken symlink' % path)
    else:
        # Any other error (permissions, I/O, ...) is not ours to swallow
        raise

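The annoyance here is that stat returns the same error code for a symlink that just isn't there and for a broken symlink.
So, I guess you have no choice but to break the atomicity and do something like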

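# readlink() may return a relative target; resolve it against the link's own directory
target = os.path.join(os.path.dirname(path), os.readlink(path))
if not os.path.exists(target):
    print('path %s is a broken symlink' % path)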

wecizke32#

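This is not atomic, but it works.
os.path.islink(filename) and not os.path.exists(filename)
Indeed, by RTFM (reading the fantastic manual) we see
os.path.exists(path)
Return True if path refers to an existing path. Returns False for broken symbolic links.
It also says:
On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.
So if you are worried about permissions, you should add other clauses.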
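
One possible shape for such an extra clause is to call os.stat() directly and look at errno, rather than trusting the False from os.path.exists(). This is only a sketch: the function name classify_link and the strings it returns are invented here, and treating ENOTDIR like ENOENT is my own assumption.

```python
import errno
import os


def classify_link(path):
    """Return 'not a symlink', 'ok', 'broken' or 'unknown' for path."""
    if not os.path.islink(path):
        return "not a symlink"
    try:
        os.stat(path)  # stat() follows the link to its target
    except OSError as e:
        if e.errno in (errno.ENOENT, errno.ENOTDIR):
            # The target (or a directory on the way to it) is missing.
            return "broken"
        # Anything else, e.g. EACCES: we cannot tell whether the target exists.
        return "unknown"
    return "ok"
```

Like the one-liner above, this is still not atomic; it only separates "the target is missing" from "I was not allowed to look".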

g6ll5ycj3#

lstat() may be of help. If lstat() succeeds and stat() fails, then it's probably a broken link.
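
A rough sketch of that idea in Python, using os.lstat() (which looks at the link itself) and os.stat() (which follows it). The function name is_broken_symlink is made up, and the S_ISLNK check is an extra I added so that ordinary files are never reported.

```python
import os
import stat


def is_broken_symlink(path):
    """True if path is a symlink and stat() on it fails."""
    try:
        link_info = os.lstat(path)  # does not follow the link
    except OSError:
        return False  # nothing at this path at all
    if not stat.S_ISLNK(link_info.st_mode):
        return False  # exists, but is not a symlink
    try:
        os.stat(path)  # follows the link
    except OSError:
        # "Probably" broken: a permission error on the target also lands here.
        return True
    return False
```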

jvlzgdj94#

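Can I mention testing for hard links without Python? /bin/test has the FILE1 -ef FILE2 condition, which is true when the files share an inode.
Therefore, something like find . -type f -exec test \{} -ef /path/to/file \; -print works for testing for hard links to a specific file.
Which brings me to reading man test and the mentions of -L and -h, which both work on one file and return true if that file is a symbolic link. However, that doesn't tell you whether the target is missing.
I did find that head -0 FILE1 returns an exit code of 0 if the file can be opened and 1 if it cannot, which in the case of a symbolic link to a regular file works as a test for whether its target can be read.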
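
For anyone who does want the -ef check from Python, os.path.samefile() compares device and inode numbers. A small sketch, with a made-up wrapper name is_hard_link_to; note that samefile() goes through os.stat(), so it follows symlinks.

```python
import os


def is_hard_link_to(candidate, reference):
    """True if both paths name the same file (same device and inode)."""
    try:
        return os.path.samefile(candidate, reference)
    except OSError:
        # One of the paths could not be stat()ed (missing, broken link, ...).
        return False
```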

50few1ms5#

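os.path
You may try using realpath() to get what the symlink points to, then trying to determine whether it's a valid file using isfile().
(I'm not able to try that out at the moment, so you'll have to play around with it and see what you get)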
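
A quick sketch of what that could look like. The function name is_broken_via_realpath is invented, and I swapped isfile() for os.path.exists() on the assumption that a link to a directory should count as valid too.

```python
import os


def is_broken_via_realpath(path):
    """True if path is a symlink whose fully resolved target does not exist."""
    if not os.path.islink(path):
        return False
    # realpath() resolves every link in the chain; for a dangling link it
    # returns the path where the target would have been.
    resolved = os.path.realpath(path)
    return not os.path.exists(resolved)
```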

mtb9vblg6#

I used this variant. When a symlink is broken, path.exists returns False and path.islink returns True, so combining these two facts we can use the following:

import os
from os import path

def kek(argum):
    target = "/root/" + argum
    # Broken symlink: the link itself is there, but what it points to is not
    if path.islink(target) and not path.exists(target):
        print("The path is a broken link, location: " + os.readlink(target))
    else:
        return "No broken links found"

aamkag617#

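I'm not a Python guy, but it looks like os.readlink()? The logic I would use in Perl is to use readlink() to find the target and then use stat() to test whether the target exists.
Edit: I banged out some Perl that demos readlink. I believe Perl's stat and readlink and Python's os.stat() and os.readlink() are both wrappers for the system calls, so this should translate reasonably well as proof-of-concept code: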

wembley 0 /home/jj33/swap > cat p
my $f = shift;

while (my $l = readlink($f)) {
  print "$f -> $l\n";
  $f = $l;
}

if (!-e $f) {
  print "$f doesn't exist\n";
}
wembley 0 /home/jj33/swap > ls -l | grep ^l
lrwxrwxrwx    1 jj33  users          17 Aug 21 14:30 link -> non-existant-file
lrwxrwxrwx    1 root     users          31 Oct 10  2007 mm -> ../systems/mm/20071009-rewrite//
lrwxrwxrwx    1 jj33  users           2 Aug 21 14:34 mmm -> mm/
wembley 0 /home/jj33/swap > perl p mm
mm -> ../systems/mm/20071009-rewrite/
wembley 0 /home/jj33/swap > perl p mmm
mmm -> mm
mm -> ../systems/mm/20071009-rewrite/
wembley 0 /home/jj33/swap > perl p link
link -> non-existant-file
non-existant-file doesn't exist
wembley 0 /home/jj33/swap >
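
A rough Python rendering of the same loop, for reference. The name trace_symlink and the max_hops limit of 40 are my own choices, not part of the Perl above; unlike the Perl, this version resolves a relative target against the directory that holds the link.

```python
import os
import sys


def trace_symlink(path, max_hops=40):
    """Follow a chain of symlinks; return (hops, final_path, final_exists)."""
    hops = []
    current = path
    # max_hops is an arbitrary guard against links that point at each other.
    while os.path.islink(current) and len(hops) < max_hops:
        target = os.readlink(current)
        hops.append((current, target))
        # A relative target is relative to the directory holding the link.
        current = os.path.join(os.path.dirname(current), target)
    return hops, current, os.path.exists(current)


if __name__ == "__main__":
    for name in sys.argv[1:]:
        hops, final, exists = trace_symlink(name)
        for link, target in hops:
            print(f"{link} -> {target}")
        if not exists:
            print(f"{final} doesn't exist")
```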

pexxcrt28#

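I had a similar problem: how to catch broken symlinks, even when they occur in some parent directory? I also wanted to log all of them (in an application dealing with a fairly large number of files), but without too many repeats.
Here is what I came up with, including unit tests.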

fileutil.py:

import os
from functools import lru_cache
import logging

logger = logging.getLogger(__name__)

@lru_cache(maxsize=2000)
def check_broken_link(filename):
    """
    Check for broken symlinks, either at the file level, or in the
    hierarchy of parent dirs.
    If it finds a broken link, an ERROR message is logged.
    The function is cached, so that the same error messages are not repeated.

    Args:
        filename: file to check

    Returns:
        True if the file (or one of its parents) is a broken symlink.
        False otherwise (i.e. either it exists or not, but no element
        on its path is a broken link).

    """
    if os.path.isfile(filename) or os.path.isdir(filename):
        return False
    if os.path.islink(filename):
        # there is a symlink, but it is dead (pointing nowhere)
        link = os.readlink(filename)
        logger.error('broken symlink: {} -> {}'.format(filename, link))
        return True
    # ok, we have either:
    #   1. a filename that simply doesn't exist (but the containing dir
    #      does exist), or
    #   2. a broken link in some parent dir
    parent = os.path.dirname(filename)
    if parent == filename:
        # reached root
        return False
    return check_broken_link(parent)

Unit tests:

import logging
import shutil
import tempfile
import os

import unittest
from ..util import fileutil

class TestFile(unittest.TestCase):

    def _mkdir(self, path, create=True):
        d = os.path.join(self.test_dir, path)
        if create:
            os.makedirs(d, exist_ok=True)
        return d

    def _mkfile(self, path, create=True):
        f = os.path.join(self.test_dir, path)
        if create:
            d = os.path.dirname(f)
            os.makedirs(d, exist_ok=True)
            with open(f, mode='w') as fp:
                fp.write('hello')
        return f

    def _mklink(self, target, path):
        f = os.path.join(self.test_dir, path)
        d = os.path.dirname(f)
        os.makedirs(d, exist_ok=True)
        os.symlink(target, f)
        return f

    def setUp(self):
        # reset the lru_cache of check_broken_link
        fileutil.check_broken_link.cache_clear()

        # create a temporary directory for our tests
        self.test_dir = tempfile.mkdtemp()

        # create a small tree of dirs, files, and symlinks
        self._mkfile('a/b/c/foo.txt')
        self._mklink('b', 'a/x')
        self._mklink('b/c/foo.txt', 'a/f')
        self._mklink('../..', 'a/b/c/y')
        self._mklink('not_exist.txt', 'a/b/c/bad_link.txt')
        bad_path = self._mkfile('a/XXX/c/foo.txt', create=False)
        self._mklink(bad_path, 'a/b/c/bad_path.txt')
        self._mklink('not_a_dir', 'a/bad_dir')

    def tearDown(self):
        # Remove the directory after the test
        shutil.rmtree(self.test_dir)

    def catch_check_broken_link(self, expected_errors, expected_result, path):
        filename = self._mkfile(path, create=False)
        with self.assertLogs(level='ERROR') as cm:
            result = fileutil.check_broken_link(filename)
            logging.critical('nothing')  # trick: emit one extra message, so the with assertLogs block doesn't fail
        error_logs = [r for r in cm.records if r.levelname == 'ERROR']
        actual_errors = len(error_logs)
        self.assertEqual(expected_result, result, msg=path)
        self.assertEqual(expected_errors, actual_errors, msg=path)

    def test_check_broken_link_exists(self):
        self.catch_check_broken_link(0, False, 'a/b/c/foo.txt')
        self.catch_check_broken_link(0, False, 'a/x/c/foo.txt')
        self.catch_check_broken_link(0, False, 'a/f')
        self.catch_check_broken_link(0, False, 'a/b/c/y/b/c/y/b/c/foo.txt')

    def test_check_broken_link_notfound(self):
        self.catch_check_broken_link(0, False, 'a/b/c/not_found.txt')

    def test_check_broken_link_badlink(self):
        self.catch_check_broken_link(1, True, 'a/b/c/bad_link.txt')
        self.catch_check_broken_link(0, True, 'a/b/c/bad_link.txt')

    def test_check_broken_link_badpath(self):
        self.catch_check_broken_link(1, True, 'a/b/c/bad_path.txt')
        self.catch_check_broken_link(0, True, 'a/b/c/bad_path.txt')

    def test_check_broken_link_badparent(self):
        self.catch_check_broken_link(1, True, 'a/bad_dir/c/foo.txt')
        self.catch_check_broken_link(0, True, 'a/bad_dir/c/foo.txt')
        # bad link, but shouldn't log a new error:
        self.catch_check_broken_link(0, True, 'a/bad_dir/c')
        # bad link, but shouldn't log a new error:
        self.catch_check_broken_link(0, True, 'a/bad_dir')

if __name__ == '__main__':
    unittest.main()

0wi1tuuw9#

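For Python 3, you can use the pathlib module. From its documentation:
If the path points to a symlink, exists() returns whether the symlink points to an existing file or directory.
So this works too: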

import pathlib

path = pathlib.Path("/path/to/somewhere")

if path.is_symlink() and not path.exists():
    print(f"found dangling symlink at {path}")
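
The same check can be applied to a whole directory tree. Below is a sketch that pairs it with os.walk(), which does not descend into symlinked directories by default; the function name find_dangling_symlinks is made up for illustration.

```python
import os
import sys
from pathlib import Path


def find_dangling_symlinks(root):
    """Yield every symlink under root whose target does not exist."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories show up in dirnames, everything else
        # (including broken links) in filenames, so look at both lists.
        for name in dirnames + filenames:
            candidate = Path(dirpath) / name
            if candidate.is_symlink() and not candidate.exists():
                yield candidate


if __name__ == "__main__":
    # Hypothetical usage: python find_dangling.py DIR...
    for root in sys.argv[1:]:
        for link in find_dangling_symlinks(root):
            print(f"found dangling symlink at {link}")
```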
