GitPython and git diff

7tofc5zh  posted on 2023-06-28 in Git

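I just want to get the diff of a single file from a git repository. Right now I use GitPython to get the commit objects and the files changed by each commit, but I want to run a dependency analysis only on the parts of the files that changed. Is there a way to get the git diff from GitPython, or do I have to compare each file by reading it line by line?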

iyfamqjs1#

If you want to access the contents of the diff, try this:

import git

# repo_root is assumed to be a pathlib.Path pointing at the repository
repo = git.Repo(repo_root.as_posix())
commit_dev = repo.commit("dev")
commit_origin_dev = repo.commit("origin/dev")
diff_index = commit_origin_dev.diff(commit_dev)

# 'M' selects modified files, so both the a (old) and b (new) blobs exist
for diff_item in diff_index.iter_change_type('M'):
    print("A blob:\n{}".format(diff_item.a_blob.data_stream.read().decode('utf-8')))
    print("B blob:\n{}".format(diff_item.b_blob.data_stream.read().decode('utf-8')))

This prints the full contents of each modified file, before (A blob) and after (B blob).

rjzwgtxy2#

You can use GitPython with the git "diff" command; just pass the "tree" object of each commit or branch whose differences you want to see, for example:

from git import Repo

repo = Repo('/git/repository')
t = repo.head.commit.tree
print(repo.git.diff(t))  # repo.git.diff returns the diff text as a string

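This gives the differences for "all" files relative to that tree as a single block of text, so if you want each file's diff separately you have to iterate over them.
For the current branch, it is: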

repo.git.diff('HEAD~1')

Hope this helps. Regards.
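
One way to iterate over the individual file diffs is to split the combined text on its `diff --git` header lines. This is only a rough sketch: `split_diff_by_file` is a hypothetical helper, and it assumes plain unquoted paths that do not themselves contain the sequence ` b/`.

```python
def split_diff_by_file(diff_text):
    """Split combined `git diff` output into {path: per-file diff text}."""
    chunks = {}
    current_path = None
    for line in diff_text.splitlines(keepends=True):
        if line.startswith("diff --git "):
            # Header looks like: diff --git a/<path> b/<path>
            current_path = line.rstrip("\n").split(" b/", 1)[-1]
            chunks[current_path] = ""
        if current_path is not None:
            chunks[current_path] += line
    return chunks
```

With the answer's code, `split_diff_by_file(repo.git.diff('HEAD~1'))` would give one entry per changed file.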

2skhul333#

Git does not store diffs, as you have noticed. Given two blobs (before and after the change), you can use Python's difflib module to compare the data.
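
As a minimal sketch of that idea, the function below takes the raw bytes of the two blobs (for GitPython blobs, what `blob.data_stream.read()` returns) and produces a unified diff with `difflib.unified_diff`. The name `blob_diff` and the sample contents are made up for illustration, and the sketch assumes the files are UTF-8 text.

```python
import difflib


def blob_diff(old_bytes, new_bytes, path="file"):
    """Return a unified diff between two versions of a file's contents."""
    old_lines = old_bytes.decode("utf-8", errors="replace").splitlines(keepends=True)
    new_lines = new_bytes.decode("utf-8", errors="replace").splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=f"a/{path}", tofile=f"b/{path}"
        )
    )


if __name__ == "__main__":
    before = b"import os\nprint('hello')\n"
    after = b"import os\nimport sys\nprint('hello')\n"
    print(blob_diff(before, after, "example.py"))
```

Identical inputs produce an empty string, which makes it easy to skip unchanged files.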

zhte4eai4#

I suggest using PyDriller (it uses GitPython internally). It is easier to use:

from pydriller import Repository

for commit in Repository("path_to_repo").traverse_commits():
    for modified_file in commit.modified_files:  # the list of files modified by this commit
        print(modified_file.diff)
        # etc...

You can also analyze a single commit like this:

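# Repository replaces the older RepositoryMining name in current PyDriller releases
for commit in Repository("path_to_repo", single="123213").traverse_commits():
    print(commit.hash)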

kuhbmx9i5#

If you want to recreate something close to what a standard git diff shows, try:

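# cloned_repo = git.Repo.clone_from(
#     url=ssh_url,
#     to_path=repo_dir,
#     env={"GIT_SSH_COMMAND": "ssh -i " + SSH_KEY},
# )
repo_diff = ""
# index.diff(None) compares the index with the working tree
for diff_item in cloned_repo.index.diff(None, create_patch=True):
    # a_path/b_path are used instead of a_blob.name/b_blob.name because a blob
    # can be None (for example when a file was deleted)
    repo_diff += (
        f"--- a/{diff_item.a_path}\n+++ b/{diff_item.b_path}\n"
        f"{diff_item.diff.decode('utf-8')}\n\n"
    )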

0wi1tuuw6#

If you want to run git diff on a single file between two commits, this is one way to do it:

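import git

repo = git.Repo()
path_to_a_file = "diff_this_file_across_commits.txt"

# iter_commits yields the newest commit first
commits_touching_path = list(repo.iter_commits(paths=path_to_a_file))

# the diff runs from [0] (newest) to [1]; swap them for an old-to-new diff
print(repo.git.diff(commits_touching_path[0], commits_touching_path[1], path_to_a_file))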

This shows the differences between the two most recent commits that touched the specified file.
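
Since the goal in the question is to analyze only the changed parts, the diff text can be reduced to line ranges by reading the `@@` hunk headers. The sketch below is one possible way to do that; `changed_line_ranges` is a hypothetical helper, and it reports ranges in the new version of the file.

```python
import re

# Matches hunk headers such as "@@ -10,3 +12,4 @@" and captures the new-file part.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_line_ranges(diff_text):
    """Return (start, end) line ranges in the new file covered by each hunk."""
    ranges = []
    for line in diff_text.splitlines():
        match = HUNK_HEADER.match(line)
        if match:
            start = int(match.group(1))
            # A missing length means the hunk covers exactly one line.
            length = int(match.group(2)) if match.group(2) is not None else 1
            if length > 0:  # length 0 means the hunk only deletes lines
                ranges.append((start, start + length - 1))
    return ranges
```

Hunks normally include a few lines of unchanged context, so the ranges are slightly wider than the real change; asking git for zero context lines (its `-U0` option) tightens them.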

g52tjvyc7#

repo.git.diff("main", "HEAD~5")

vkc1a9a28#

PyDriller +1

pip install pydriller

But use the new API, which is a breaking change:

from pydriller import Repository

for commit in Repository('https://github.com/ishepard/pydriller').traverse_commits():
    print(commit.hash)
    print(commit.msg)
    print(commit.author.name)

    for file in commit.modified_files:
        print(file.filename, ' has changed')

kuarbcqp9#

This is how you do it:

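import git
repo = git.Repo("path/of/repo/")

# the below gives us all commits, newest first
# (iter_commits() and repo.git.diff() stand in for the older repo.commits()
# and repo.diff() calls, which current GitPython releases do not provide)
commits = list(repo.iter_commits())

# take the first two commits in that list

a_commit = commits[0]
b_commit = commits[1]

# now get the diff
print(repo.git.diff(a_commit, b_commit))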
