Convert a Git folder to a submodule?

Posted by dnph8jn4 on 2023-01-19 in Git

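Quite often you are coding a project of some sort, and after a while it becomes clear that some component of the project is actually useful as a standalone component (a library, perhaps). If you had that idea early on, there is a fair chance that most of that code already lives in its own folder.
Is there a way to convert one of the subdirectories in a Git project into a submodule?
Ideally, all of the code in that directory would be removed from the parent project, and the submodule project would be added in its place with all the appropriate history, so that every parent project commit points to the correct submodule commit.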

yjghlzjz #1

To isolate a subdirectory into its own repository, use filter-branch on a clone of the original repository:

git clone <your_project> <your_submodule>
cd <your_submodule>
git filter-branch --subdirectory-filter 'path/to/your/submodule' --prune-empty -- --all

Then simply delete the original directory and add the submodule to your parent project.
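
A minimal sketch of the whole sequence, including that last step, run against throwaway repositories in a temporary directory so that it is self-contained. The names parent, lib-repo and lib are made up for the demo. Two settings are assumptions of this sketch rather than part of the answer: FILTER_BRANCH_SQUELCH_WARNING only skips the deprecation pause, and protocol.file.allow=always is only needed because the submodule URL here is a local path (recent Git versions refuse file-based submodule clones by default).

```shell
#!/bin/bash
set -euo pipefail
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation pause

work=$(mktemp -d)
cd "$work"

# Throwaway parent project with a lib/ subdirectory (hypothetical names).
git init -q parent
cd parent
git config user.email 'demo@example.com'
git config user.name 'Demo'
mkdir lib
echo 'v1' > lib/code.txt
echo 'app' > main.txt
git add .
git commit -q -m 'initial'
echo 'v2' > lib/code.txt
git commit -q -am 'update lib'
cd ..

# Isolate lib/ in a clone, as in the answer.
git clone -q parent lib-repo
cd lib-repo
git filter-branch --subdirectory-filter lib --prune-empty -- --all >/dev/null 2>&1
cd ..

# In the parent: delete the directory, then add the submodule in its place.
cd parent
git rm -r -q lib
git commit -q -m 'remove lib directory'
git -c protocol.file.allow=always submodule --quiet add "$work/lib-repo" lib
git commit -q -m 'add lib as submodule'

# lib is now recorded as a gitlink (mode 160000) instead of a tree.
mode=$(git ls-files -s -- lib | cut -d' ' -f1)
sub_commits=$(git -C lib rev-list --count HEAD)
echo "mode=$mode sub_commits=$sub_commits"
```

Note that this only changes the parent from the current commit onward; older parent commits still contain lib/ as ordinary files.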

sshcrbum #2

First change directory to the folder that will become the submodule. Then:

git init
git remote add origin <repourl>
git add .
git commit -am 'first commit in submodule'
git push -u origin master
cd ..
rm -rf <folder> # the folder which will be a submodule
git commit -am 'deleting folder'
git submodule add <repourl> <folder> # add the submodule
git commit -am 'adding submodule'

km0tfn4u #3

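I know this is an old thread, but the answers here squash any related commits on other branches.
A simple way to clone and keep all of those extra branches and commits:
1 - Make sure you have this git alias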

git config --global alias.clone-branches '! git branch -a | sed -n "/\/HEAD /d; /\/master$/d; /remotes/p;" | xargs -L1 git checkout -t'
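
To see what the sed part of that alias selects, here is a small sketch that feeds it a made-up `git branch -a` listing (the branch names are hypothetical) instead of a real repository:

```shell
#!/bin/bash
set -euo pipefail

# Sample `git branch -a` output (hypothetical branch names).
sample='* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/dev
  remotes/origin/feature-x
  remotes/origin/master'

# Same sed program as in the alias: drop the HEAD pointer and the remote
# master, then print only the remaining remote-tracking branches.
selected=$(printf '%s\n' "$sample" | sed -n '/\/HEAD /d; /\/master$/d; /remotes/p;')
echo "$selected"
```

In the alias, each surviving line is handed to `git checkout -t` by `xargs -L1`, which creates a local tracking branch for it. If the default branch is not called master, the second sed pattern presumably needs adjusting.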

2 - Clone the remote, pull all branches, change the remote, filter your directory, push

git clone git@github.com:user/existing-repo.git new-repo
cd new-repo
git clone-branches
git remote rm origin
git remote add origin git@github.com:user/new-repo.git
git remote -v
git filter-branch --subdirectory-filter my_directory/ -- --all
git push --all
git push --tags

oprakyz7 #4

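Status quo

Let's assume we have a repository called repo-old that contains a subdirectory sub, which we would like to convert into a submodule with its own repository, repo-sub.
In addition, the original repository repo-old should be converted into a modified repository repo-new, in which every commit that touched the former subdirectory sub now points to the corresponding commit of the extracted submodule repository repo-sub.

Let's change

With the help of git filter-branch, this can be achieved in a two-step process:
1. Subdirectory extraction from repo-old to repo-sub (already mentioned in the accepted answer)
2. Subdirectory replacement from repo-old to repo-new (with proper commit mapping)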

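Remark: I know this question is old, and it has been mentioned that git filter-branch is somewhat deprecated and might be dangerous. On the other hand, it might help others with personal repositories that are easy to validate after conversion. So be warned! And please let me know if there is any other tool that does the same thing without being deprecated and that is safe to use!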

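Below I explain how I carried out both steps on Linux with git version 2.26.2. Older versions might work to some extent, but that needs to be tested.
For the sake of simplicity, I restrict myself to the case where the original repository repo-old has just a master branch and an origin remote. Also be warned that I rely on temporary git tags with the prefix temp_, which are removed during the process. If there are already tags named like that, you may want to adjust the prefix below. Finally, please be aware that I have not tested this extensively, and there may be corner cases where the recipe fails. So please back up everything before proceeding!
The following bash snippets can be concatenated into one big script, which should then be executed in the folder where the repository repo-old lives. Copying and pasting everything directly into a command window is not recommended (even though I have tested this successfully)!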

0. Preparation

Variables

# Root directory where repo-old lives
# and a temporary location for git filter-branch
root="$PWD"
temp='/dev/shm/tmp'

# The old repository and the subdirectory we'd like to extract
repo_old="$root/repo-old"
repo_old_directory='sub'

# The new submodule repository, its url
# and a hash map folder which will be populated
# and later used in the filter script below
repo_sub="$root/repo-sub"
repo_sub_url='https://github.com/somewhere/repo-sub.git'
repo_sub_hashmap="$root/repo-sub.map"

# The new modified repository, its url
# and a filter script which is created as heredoc below
repo_new="$root/repo-new"
repo_new_url='https://github.com/somewhere/repo-new.git'
repo_new_filter="$root/repo-new.sh"

Filter script

# The index filter script which converts our subdirectory into a submodule
cat << EOF > "$repo_new_filter"
#!/bin/bash

# Submodule hash map function
sub ()
{
    local old_commit=\$(git rev-list -1 \$1 -- '$repo_old_directory')

    if [ ! -z "\$old_commit" ]
    then
        echo \$(cat "$repo_sub_hashmap/\$old_commit")
    fi
}

# Submodule config
SUB_COMMIT=\$(sub \$GIT_COMMIT)
SUB_DIR='$repo_old_directory'
SUB_URL='$repo_sub_url'

# Submodule replacement
if [ ! -z "\$SUB_COMMIT" ]
then
    touch '.gitmodules'
    git config --file='.gitmodules' "submodule.\$SUB_DIR.path" "\$SUB_DIR"
    git config --file='.gitmodules' "submodule.\$SUB_DIR.url" "\$SUB_URL"
    git config --file='.gitmodules' "submodule.\$SUB_DIR.branch" 'master'
    git add '.gitmodules'

    git rm --cached -qrf "\$SUB_DIR"
    git update-index --add --cacheinfo 160000 \$SUB_COMMIT "\$SUB_DIR"
fi
EOF
chmod +x "$repo_new_filter"
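
The core of the filter script is the last pair of commands, which swap the directory's files for a gitlink entry (mode 160000) in the index. A small sketch of just that step on a throwaway repository; the directory name sub matches the answer, while the stand-in commit id is an assumption made so that the demo needs no second repository:

```shell
#!/bin/bash
set -euo pipefail

demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
git config user.email 'demo@example.com'
git config user.name 'Demo'

# A directory tracked as ordinary files.
mkdir sub
echo 'hello' > sub/file.txt
git add sub
git commit -q -m 'sub as plain directory'

# Stand-in for the mapped submodule commit: any commit id works for the
# index entry, so this demo simply reuses HEAD (an assumption of the sketch).
sub_commit=$(git rev-parse HEAD)

# Same two commands as in the filter script: drop the files from the index,
# then record a gitlink at the same path.
git rm --cached -qrf sub
git update-index --add --cacheinfo 160000 "$sub_commit" sub

# The index now holds a single entry for sub.
entry=$(git ls-files -s -- sub)
echo "$entry"
```

Inside filter-branch the same thing happens once per rewritten commit, with the commit id looked up from the hash map folder rather than taken from HEAD.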

1. Subdirectory extraction

cd "$root"

# Create a new clone for our new submodule repo
git clone "$repo_old" "$repo_sub"

# Enter the new submodule repo
cd "$repo_sub"

# Remove the old origin remote
git remote remove origin

# Loop over all commits and create temporary tags
for commit in $(git rev-list --all)
do
    git tag "temp_$commit" $commit
done

# Extract the subdirectory and slice commits
mkdir -p "$temp"
git filter-branch --subdirectory-filter "$repo_old_directory" \
                  --tag-name-filter 'cat' \
                  --prune-empty --force -d "$temp" -- --all

# Populate hash map folder from our previously created tag names
mkdir -p "$repo_sub_hashmap"
for tag in $(git tag | grep "^temp_")
do
    old_commit=${tag#'temp_'}
    sub_commit=$(git rev-list -1 $tag)

    echo $sub_commit > "$repo_sub_hashmap/$old_commit"
done
# Delete the temporary tags (silence both stdout and stderr)
git tag | grep "^temp_" | xargs -d '\n' git tag -d > /dev/null 2>&1

# Add the new url for this repository (and e.g. push)
git remote add origin "$repo_sub_url"
# git push -u origin master

2. Subdirectory replacement

cd "$root"

# Create a clone for our modified repo
git clone "$repo_old" "$repo_new"

# Enter the new modified repo
cd "$repo_new"

# Remove the old origin remote
git remote remove origin

# Replace the subdirectory and map all sliced submodule commits using
# the filter script from above
mkdir -p "$temp"
git filter-branch --index-filter "$repo_new_filter" \
                  --tag-name-filter 'cat' --force -d "$temp" -- --all

# Add the new url for this repository (and e.g. push)
git remote add origin "$repo_new_url"
# git push -u origin master

# Cleanup (commented for safety reasons)
# rm -rf "$repo_sub_hashmap"
# rm -f "$repo_new_filter"

Remark: If the newly created repository repo-new hangs during git submodule update --init, try re-cloning the repository recursively once:

cd "$root"

# Clone the new modified repo recursively
git clone --recursive "$repo_new" "$repo_new-tmp"

# Now use the newly cloned one
mv "$repo_new" "$repo_new-bak"
mv "$repo_new-tmp" "$repo_new"

# Cleanup (commented for safety reasons)
# rm -rf "$repo_new-bak"

gk7wooem #5

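It can be done, but it is not simple. If you search for git filter-branch, subdirectory, and submodule, there are some decent write-ups on the process. It essentially entails creating two clones of your project, using git filter-branch to remove everything except the one subdirectory in one clone, and removing only that subdirectory in the other. Then you can set up the first repository as a submodule of the second.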
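
The second half of that recipe (dropping only the subdirectory from the other clone) could look roughly like the sketch below. It runs on a throwaway repository, the directory name sub is made up, and this variant only removes the directory from history; it does not by itself record submodule commits in its place.

```shell
#!/bin/bash
set -euo pipefail
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation pause

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.email 'demo@example.com'
git config user.name 'Demo'

# Two commits: one touching both areas, one touching only sub/.
mkdir sub
echo 'lib' > sub/lib.txt
echo 'app' > app.txt
git add .
git commit -q -m 'initial'
echo 'lib v2' > sub/lib.txt
git commit -q -am 'change only sub'

# Remove sub/ from the index of every commit; commits that become empty
# (here, the second one) are pruned.
git filter-branch --prune-empty \
    --index-filter 'git rm -r -q --cached --ignore-unmatch sub' \
    -- --all >/dev/null 2>&1

remaining=$(git rev-list --count HEAD)
paths=$(git ls-tree -r --name-only HEAD)
echo "commits=$remaining paths=$paths"
```

After this, the extracted repository can be added at the old path with git submodule add.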

lmyy7pcs #6

@knittl's current answer using filter-branch gets us very close to the desired result, but when I tried it, Git threw a warning at me:

WARNING: git-filter-branch has a glut of gotchas generating mangled history
         rewrites.  Hit Ctrl-C before proceeding to abort, then use an
         alternative filtering tool such as 'git filter-repo'
         (https://github.com/newren/git-filter-repo/) instead.  See the
         filter-branch manual page for more details; to squelch this warning,
         set FILTER_BRANCH_SQUELCH_WARNING=1.

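Nine years after this question was first asked and answered, filter-branch has been deprecated in favor of git filter-repo. Indeed, when I looked at my git history using git log --all --oneline --graph, it was full of irrelevant commits.
So how do you use git filter-repo? GitHub has a good article outlining the procedure. (Note that you need to install it separately from git; I used the Python version, installed with pip3 install git-filter-repo.)
In case they decide to move or delete that article, I will summarize and generalize their procedure below: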

git clone <your_old_project_remote> <your_submodule>
cd <your_submodule>
git filter-repo --path path/to/your/submodule
# git filter-repo removes the origin remote, so add it again
git remote add origin <your_new_submodule_remote>
git push -u origin <branch_name>

From there, you just need to register the new repository as a submodule wherever you want it:

cd <path/to/your/parent/module>
git submodule add <your_new_submodule_remote>
git submodule update
git commit

pkln4tw6 #7

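This does the conversion in place; you can back it out as you would any filter-branch (I use git fetch . +refs/original/*:*).
I have a project with a utils library that has started to be useful in other projects, and I wanted to split its history off into a submodule. I did not think to look on SO first, so I wrote my own. It builds the history locally, so it is a good bit faster. Afterwards, if you want, you can set up the helper command's .gitmodules file and so on, and push the submodule histories themselves anywhere you like.
The stripped command is here; the documentation is in the comments of the unstripped version that follows it. Run it as its own command with subdir set, for example subdir=utils git split-submodule if you are splitting the utils directory. It is hacky because it is a one-off, but I tested it on the Documentation subdirectory in the Git history.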

#!/bin/bash
# put this or the commented version below in e.g. ~/bin/git-split-submodule
${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}
${debug+set -x}
fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
    | git cat-file --batch-check='%(objectname)' | uniq`)
[[ $pathcheck = *:* ]] || {
    subfam=($( set -- ${fam[@]}; shift;
        for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
            git rev-parse -q --verify $tpar:"$subdir"
        done
    ))
    git rm -rq --cached --ignore-unmatch  "$subdir"
    if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
        git update-index --add --cacheinfo 160000,$subfam,"$subdir"
    else
        subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
            | git commit-tree $GIT_COMMIT:"$subdir" $(
                ${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
            ` &&
        git update-index --add --cacheinfo 160000,$subnew,"$subdir"
    fi
}
${debug+set +x}

#!/bin/bash
# Git filter-branch to split a subdirectory into a submodule history.

# In each commit, the subdirectory tree is replaced in the index with an
# appropriate submodule commit.
# * If the subdirectory tree has changed from any parent, or there are
#   no parents, a new submodule commit is made for the subdirectory (with
#   the current commit's message, which should presumably say something
#   about the change). The new submodule commit's parents are the
#   submodule commits in any rewrites of the current commit's parents.
# * Otherwise, the submodule commit is copied from a parent.

# Since the new history includes references to the new submodule
# history, the new submodule history isn't dangling, it's incorporated.
# Branches for any part of it can be made casually and pushed into any
# other repo as desired, so hooking up the `git submodule` helper
# command's conveniences is easy, e.g.
#     subdir=utils git split-submodule master
#     git branch utils $(git rev-parse master:utils)
#     git clone -sb utils . ../utilsrepo
# and you can then submodule add from there in other repos, but really,
# for small utility libraries and such, just fetching the submodule
# histories into your own repo is easiest. Setup on cloning a
# project using "incorporated" submodules like this is:
#   setup:  utils/.git
#
#   utils/.git:
#       @if _=`git rev-parse -q --verify utils`; then \
#           git config submodule.utils.active true \
#           && git config submodule.utils.url "`pwd -P`" \
#           && git clone -s . utils -nb utils \
#           && git submodule absorbgitdirs utils \
#           && git -C utils checkout $$(git rev-parse :utils); \
#       fi
# with `git config -f .gitmodules submodule.utils.path utils` and
# `git config -f .gitmodules submodule.utils.url ./`; cloners don't
# have to do anything but `make setup`, and `setup` should be a prereq
# on most things anyway.

# You can test that a commit and its rewrite put the same tree in the
# same place with this function:
# testit ()
# {
#     tree=($(git rev-parse `git rev-parse $1`: refs/original/refs/heads/$1));
#     echo $tree `test $tree != ${tree[1]} && echo ${tree[1]}`
# }
# so e.g. `testit make~95^2:t` will print the `t` tree there and if
# the `t` tree at ~95^2 from the original differs it'll print that too.

# To run it, say `subdir=path/to/it git split-submodule` with whatever
# filter-branch args you want.

# $GIT_COMMIT is set if we're already in filter-branch, if not, get there:
${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}

${debug+set -x}
fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
    | git cat-file --batch-check='%(objectname)' | uniq`)

[[ $pathcheck = *:* ]] || {
    subfam=($( set -- ${fam[@]}; shift;
        for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
            git rev-parse -q --verify $tpar:"$subdir"
        done
    ))

    git rm -rq --cached --ignore-unmatch  "$subdir"
    if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
        # one id same for all entries, copy mapped mom's submod commit
        git update-index --add --cacheinfo 160000,$subfam,"$subdir"
    else
        # no mapped parents or something changed somewhere, make new
        # submod commit for current subdir content.  The new submod
        # commit has all mapped parents' submodule commits as parents:
        subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
            | git commit-tree $GIT_COMMIT:"$subdir" $(
                ${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
            ` &&
        git update-index --add --cacheinfo 160000,$subnew,"$subdir"
    fi
}
${debug+set +x}

jgzswidk #8

The official git project now recommends using git-filter-repo:

# install git-filter-repo, see [1] for install via pip, or other OS's.
sudo apt-get install git-filter-repo 

# copy your repo; everything EXCEPT the subdir will be deleted, and the subdir will become root.
# --no-local is required to prevent git from hard linking to files in the original, and is checked by `filter-repo`
git clone working-dir/.git working-dir-copy --no-local
cd working-dir-copy

# extract the desired subdirectory and its history.
git filter-repo --subdirectory-filter foodir

# foodir is now its own directory. Push it to github/gitlab etc
git remote add origin user@hosting/project.git
git push -u origin --all
git push -u origin --tags

Thanks also go to this gist.
Edit: For LFS users (poor souls), git clone does not pull the entire LFS history of an image, which causes git push to fail.

# Original branch needs to get history of all images
git lfs fetch --all

# clone needs to copy the history
git lfs install --skip-smudge
git lfs pull working-dir --all

[1] https://github.com/newren/git-filter-repo/blob/main/INSTALL.md

7eumitmz #9

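If it is acceptable to keep the previous history in the parent folder only, a simple solution is to remove the subfolder from the index and start a new repository or submodule at the same path.

  1. Add subdir to .gitignore
  2. git rm -r --cached subdir
  3. git add .gitignore && git commit
  4. cd subdir && git init && git add .
  5. Commit the initial files in the new subdir repository

From git help rm:
--cached: Use this option to unstage and remove paths only from the index. Working tree files, whether modified or not, will be left alone.

Having used submodules in production code, I can say they are a good solution, especially because they document the project's dependencies.
For a simple project, or if there are no other developers, or no strong dependencies, and the folder structure is simply more convenient, submodules might be a bit too much. However, if you do choose to go that route, skip step 1 and continue accordingly.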
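
The five steps above, sketched end to end on a throwaway repository in a temporary directory. The file names and commit messages are made up for the demo; only the directory name subdir comes from the answer.

```shell
#!/bin/bash
set -euo pipefail

parent=$(mktemp -d)
git init -q "$parent"
cd "$parent"
git config user.email 'demo@example.com'
git config user.name 'Demo'

# Parent project that currently tracks subdir/ as ordinary files.
mkdir subdir
echo 'lib' > subdir/lib.txt
echo 'app' > app.txt
git add .
git commit -q -m 'initial'

# Steps 1-3: ignore subdir, drop it from the index only, commit.
echo 'subdir/' >> .gitignore
git rm -r -q --cached subdir
git add .gitignore
git commit -q -m 'stop tracking subdir'

# Steps 4-5: start a fresh repository inside the untouched directory.
cd subdir
git init -q
git config user.email 'demo@example.com'
git config user.name 'Demo'
git add .
git commit -q -m 'initial commit of subdir'
cd ..

# The parent no longer tracks anything under subdir/, but the files remain.
tracked=$(git ls-files)
echo "$tracked"
```

The earlier history of subdir stays in the parent repository only; the new repository inside subdir starts from a single commit.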
