Git repository layout for a server with multiple projects

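One of the things I like about the way I have Subversion set up is that I can have a single main repository containing multiple projects. When I want to work on a project, I can check out just that project. Like this: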

\main
\ProductA
\ProductB
\Shared

Then

svn checkout http://.../main/ProductA

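As a new user of git, I want to explore the best practices in this area before committing to a specific workflow. From what I've read so far, git stores everything in a single .git folder at the root of the project tree. So I can do one of two things.

  1. Set up a separate project for each Product.
  2. Set up a single massive project and store the products in sub-folders.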

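There are dependencies between the products, so a single massive project seems appropriate. We'll be using a server where all the developers can share their code. I already have this working over SSH & HTTP, and that part I love. However, the repositories in SVN are already many GB in size, so dragging the entire repository around on each machine seems like a bad idea, especially since we are billed for excessive network bandwidth.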

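I'd imagine that the Linux kernel project repositories are equally large, so there must be a proper way of handling this with Git, but I just haven't figured it out yet.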

Are there any guidelines or best practices for working with very large multi-project repositories?


The guideline is simple, given Git's limits:

  • one repo per project
  • a main project with submodules.

The idea is not to store everything in one giant git repo, but to build a small repo as the main project. That main repo references the right commits of the other repos, each of which represents a project or a common component of its own.
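As a rough illustration of that layout, here is a minimal sketch that builds a component repo and a small main repo referencing it as a submodule. The directory names `shared` and `main`, the placeholder identity, and the use of a local scratch path instead of a real server URL are all assumptions made for the demo; the `protocol.file.allow` override is only there because recent Git versions refuse local-path submodule clones by default.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the demo commits work without any Git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)   # scratch directory standing in for the server
cd "$work"

# A component repository (a stand-in for Shared).
git init -q shared
echo "shared code" > shared/lib.txt
git -C shared add lib.txt
git -C shared commit -q -m "Initial shared library"

# The small main project: it stores a reference to one commit of Shared,
# not a copy of its files or history.
git init -q main
cd main
# The -c override is needed only because this demo uses a local path.
git -c protocol.file.allow=always submodule add "$work/shared" Shared
git commit -q -m "Add Shared as a submodule"

# Prints the commit each submodule is pinned to.
git submodule status
```

Someone cloning the main project would then typically run `git clone --recurse-submodules` against its URL to fetch the referenced repos as well.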


The OP Paul Alexander comments:

This sounds similar to the "externals" support provided by subversion.
We tried this and found it extremely cumbersome to constantly update the version references in the externals, since the projects are developed concurrently and depend on each other. Is there another option?

@Paul: yes, instead of updating the version from the main project, you either:

  • develop your subprojects directly from within the main project (as explained in "True Nature of submodules"),
  • or, in the sub-repo, you set up a remote (origin) pointing to the same sub-repo being developed elsewhere: from there, you just pull the changes made elsewhere into that sub-repo.

In both cases, you must remember to commit the main project to record the new configuration. There is no "external" property to update here. The whole process is much more natural.
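The second option, followed by the commit in the main project, might look like the sketch below. The repo names, file contents, and local scratch paths are assumptions for the demo, and it relies on the submodule checkout tracking its origin branch, which is what a fresh `git submodule add` normally sets up.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the demo commits work without any Git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"

# Setup: a Shared repo and a main project that pins it as a submodule.
git init -q shared
echo "v1" > shared/lib.txt
git -C shared add lib.txt
git -C shared commit -q -m "Shared v1"
git init -q main
# The -c override is needed only because this demo uses a local path.
git -C main -c protocol.file.allow=always submodule add "$work/shared" Shared
git -C main commit -q -m "Pin Shared v1"

# Someone develops Shared elsewhere.
echo "v2" > shared/lib.txt
git -C shared commit -q -am "Shared v2"

# Pull that work into the sub-repo checked out inside the main project.
git -C main/Shared pull -q --ff-only

# Record the new configuration: the main project now points at Shared v2.
git -C main add Shared
git -C main commit -q -m "Update Shared to v2"
git -C main submodule status
```

The only step specific to the main project is the final `git add Shared` and commit, which moves the recorded commit forward.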

Honestly, this sounds like a real pain, and anything that requires developers to do something manually each time is just going to be a regular source of bugs and maintenance.
I suppose I'll look into automating this with some scripts in the super project.

I replied:

Honestly, you may have been right... that is, until the latest Git release, 1.7.1.
git diff and git status both learned to take submodule states into account, even when executed from the main project.
You simply cannot miss a submodule modification.
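To see that behaviour, here is a small sketch that leaves an uncommitted edit inside a submodule and then runs both commands from the main project. The repo names and scratch paths are assumptions for the demo, and the exact wording of the output varies between Git versions, so none is shown here.

```shell
#!/bin/sh
set -eu

# Placeholder identity so the demo commits work without any Git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

work=$(mktemp -d)
cd "$work"

# Setup: a Shared repo and a main project that pins it as a submodule.
git init -q shared
echo "v1" > shared/lib.txt
git -C shared add lib.txt
git -C shared commit -q -m "Shared v1"
git init -q main
# The -c override is needed only because this demo uses a local path.
git -C main -c protocol.file.allow=always submodule add "$work/shared" Shared
git -C main commit -q -m "Pin Shared v1"

# Edit a tracked file inside the submodule without committing it.
echo "local edit" >> main/Shared/lib.txt

# Run from the main project: both commands report Shared as modified.
git -C main status --short
git -C main diff --submodule
```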

That being said:

GitSlave allows you to manage several independent repos as one. Each repo can be manipulated by regular git commands, while gitslave allows you to additionally run a command over all repos.

super-repo
+- module-a-repo
+- module-b-repo


gits clone url-super-repo
gits commit -a -m "msg"

A repo per project has advantages for componentization and simplifies builds with tools like Maven. It also adds protection by limiting the scope of what a developer can change, which contains the damage from erroneous commits of garbage.