How to remove a too-large file from a commit when my branch is ahead of master by 5 commits

I have been stuck on this problem all day, looking for an answer here :(

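Background

I am working alone on a project, and until now I have used GitHub to keep my work somewhere other than on my computer. Unfortunately, I added a very large file to my local repository: 300 MB (over GitHub's limit).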

What I did

I will try to lay out what I did as a history:

  1. I (carelessly) added everything to the index:

    git add *
    
  2. I committed the changes:

    git commit -m "Blablabla"
    
  3. I tried to push to origin master

    git push origin master
    

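    This took a while, so I just hit CTRL + C, and repeated steps 2 and 3 four times until I realized that one file was too large to push to GitHub.

  4. I made the terrible mistake of deleting my large file (I don't remember whether I used git rm or a plain rm)

  5. I followed the instructions at (https://help.github.com/articles/remove-sensitive-data)

  6. When I try git filter-branch, I get the following error: "Cannot rewrite branches: You have unstaged changes."

Thanks in advance!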


It seems your only problem is having unstaged changes. You didn't give any detail as to what was actually out of sync, so it's a shot in the dark, but assuming you deleted the file with a plain rm in step 4, you can bring it back from the index with:

    git checkout -- large_file

If not, you're on your own. Your goal is to make sure both your index and your working tree are in the same state. This shows as git status reporting nothing to commit, working directory clean.

The nuclear option to ensure a clean tree would be git reset --hard. If you want to try that, back up your working tree and repository beforehand, since it discards all uncommitted changes.

Once your working copy is clean, you can proceed with your steps 5 and 6.
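The situation described above can be reproduced in a throwaway repository. This is only a sketch: the scratch directory, the user identity, and the tiny stand-in for large_file are assumptions made for illustration.

```shell
# Scratch repository so nothing real is touched (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit a small stand-in for the large file.
echo "pretend this is 300 MB" > large_file
git add large_file
git commit -q -m "Add large_file"

# Step 4 of the question, assuming a plain rm rather than git rm.
rm large_file
git status --short                        # reports the unstaged deletion

# Bring the file back from the index.
git checkout -- large_file
status_after=$(git status --porcelain)    # empty when the tree is clean
```

If the file was removed with git rm instead, the deletion is staged and the file is no longer in the index; in that case git checkout HEAD -- large_file restores both the index and the working tree.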

Deleting the file counts as a change, and that is the unstaged change git is complaining about. If you run git status, you should see the file listed as deleted. To undo this change, run git checkout -- <filename>. The file will be back and your branch should be clean. You can also run git reset --hard, which brings your working tree back to the state of your last commit.

I am assuming that it is the last commit that contains the very large file you want to remove. You can run git reset HEAD~ and then redo the commit (without adding the large file). After that you should be able to git push without a problem.

Since the file is not in the last commit, you can do the final steps without a problem. You just need to get your changes either committed or removed.
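A minimal sketch of the reset-and-recommit approach, run in a scratch repository. It assumes the large file entered history in the most recent commit only; the file names and identity are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "code" > main.txt
git add main.txt
git commit -q -m "Initial commit"

# The problematic commit: real work plus the oversized file.
echo "more code" >> main.txt
echo "pretend this is 300 MB" > large_file
git add main.txt large_file
git commit -q -m "Blablabla"

# Undo the last commit; the default (mixed) reset leaves the files on disk.
git reset -q HEAD~

# Redo the commit, staging only what belongs in it.
git add main.txt
git commit -q -m "Blablabla (without large_file)"

files_in_head=$(git ls-tree -r --name-only HEAD)
echo "$files_in_head"
```

The large file stays in the working directory as an untracked file, so nothing is lost; it is simply no longer part of the commit that gets pushed.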

http://git-scm.com/book/en/Git-Tools-Rewriting-History

The GitHub solution is pretty neat. I had made a few commits before pushing, so it was harder to undo. GitHub's solution is: Removing the file added in an older commit

If the large file was added in an earlier commit, you will need to remove it from your repository history. The quickest way to do this is with The BFG (a faster, simpler alternative to git-filter-branch):

    bfg --strip-blobs-bigger-than 50M
    # Git history will be cleaned - files in your latest commit will *not* be touched

https://help.github.com/articles/working-with-large-files/

https://rtyley.github.io/bfg-repo-cleaner/
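Before rewriting history, it can help to see which blobs are actually over the threshold. The sketch below uses only plain git plumbing, not BFG itself, and runs in a scratch repository; the 1 MiB limit and the file names are assumptions chosen to keep the demo small.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "small" > notes.txt
# 2 MiB stand-in for the oversized file.
dd if=/dev/zero of=big.bin bs=1024 count=2048 2>/dev/null
git add notes.txt big.bin
git commit -q -m "Add files"

# Deleting the file in a later commit does not remove it from history.
git rm -q big.bin
git commit -q -m "Remove big.bin"

limit=1048576   # 1 MiB for this demo; the BFG example above uses 50M

# List every blob reachable from any ref that is larger than the limit.
# Note: paths containing spaces would be cut off by the awk field split.
big_blobs=$(
  git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  awk -v limit="$limit" '$1 == "blob" && $2 > limit { print $2, $3 }'
)
echo "$big_blobs"
```

Anything this prints is still stored in the repository and would still be sent on push, even if it no longer exists in the latest commit.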

A simple solution I used:

  1. Run git reset HEAD^ once for each commit you want to undo. This keeps your changes and the current state of your files; it only removes the commits themselves.

  2. Once the commits are undone, you can decide how to re-commit your files in a better way, e.g. remove or ignore the huge files, then add what you want and commit again. Alternatively, use Git LFS to track those huge files.
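The two steps above can be sketched for the situation in the question (five local commits ahead). The tag pushed-state is a hypothetical stand-in for what the remote already has, and huge.bin is a small placeholder file.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v0" > main.txt
git add main.txt
git commit -q -m "Initial commit"
git tag pushed-state            # hypothetical marker for the remote's state

# Five local commits; the first one sneaks in the huge file.
echo "pretend this is 300 MB" > huge.bin
for i in 1 2 3 4 5; do
  echo "change $i" >> main.txt
  git add -A
  git commit -q -m "Blablabla $i"
done

ahead=$(git rev-list --count pushed-state..HEAD)

# Undo all local commits at once; the files on disk are left alone.
git reset -q "HEAD~$ahead"

# Ignore the huge file, then re-commit everything else.
echo "huge.bin" > .gitignore
git add -A
git commit -q -m "All changes, without huge.bin"

commits_touching_huge=$(git log --oneline -- huge.bin)   # empty if clean
```

Resetting with HEAD~N is equivalent to repeating git reset HEAD^ N times. The five commits collapse into one, which is the price of this approach.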


Edit: this answer also applies if, for instance, your commits needed the right author identity (e.g. username and email) and you have to set it properly after having committed. You can undo things the same way.

Question: does anyone have a way to pick out just the bad commit and change it directly? I'm asking especially for the case of someone who only needs to fix the identity on their commits, as here, where the files themselves need not change. Only the commits need fixing.

This is in reference to the BFG post above, I would comment directly, but I have no idea how to do so as a low reputation new user.

You may want to do a 'git gc' to repack first.

I had issues getting BFG to work until I did so. This appears to be a common issue if you've only been working in a local repo and are preparing to put things up on a remote for the first time.

Relevant Google hit which tipped me off: https://github.com/rtyley/bfg-repo-cleaner/issues/65

Here is what worked for me:

  1. Download and install BFG Repo-Cleaner (BFG), which is available here. My download was bfg-1.13.0.jar.
  2. A potentially helpful place to move the downloaded jar file (in my case bfg-1.13.0.jar) is ${JAVA_HOME}/lib. That is what I did, because I want Java-specific libraries like this in a somewhat sensible location, since they are not like ordinary Windows installations. You may wish to rename the jar file to bfg.jar to keep things simple; below, where I use bfg.jar, I actually mean bfg-1.13.0.jar in my case.
  3. From the repository root, run java -jar ${JAVA_HOME}/lib/bfg.jar --delete-files <file_name> --no-blob-protection . (the trailing dot is the repository path). Replace <file_name> with the name of the file that is causing the issue. Note that the path to the file is not needed, ONLY the file name by itself.
  4. Run git reflog expire --expire=now --all && git gc --prune=now --aggressive to complete the BFG cleaning job.
  5. Finally, run git push origin main --force to push any outstanding local commits.
  6. If everything up to this point succeeded, your problem has been solved.
  7. Going forward, always check that you do not inadvertently add very large files to Git if you wish to avoid this problem recurring.
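For the last point, one possible safeguard is to check the size of staged files before committing. The following is a sketch under assumptions, not part of BFG or of the steps above: the 1 MiB limit and the file names are hypothetical, and it runs in a scratch repository.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "small" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Stage a 2 MiB stand-in for an oversized file.
dd if=/dev/zero of=big.bin bs=1024 count=2048 2>/dev/null
git add big.bin

limit=1048576   # 1 MiB for this demo; pick a limit below the host's cap

# Collect staged (added or modified) files whose blob exceeds the limit.
# Note: this simple loop does not handle file names containing spaces.
too_big=""
for f in $(git diff --cached --name-only --diff-filter=AM); do
  size=$(git cat-file -s ":$f")      # size of the staged version
  if [ "$size" -gt "$limit" ]; then
    too_big="$too_big$f "
  fi
done

if [ -n "$too_big" ]; then
  echo "Staged files over $limit bytes: $too_big"
fi
```

The same loop could be placed in .git/hooks/pre-commit, ending with exit 1 when too_big is non-empty, so that git refuses the commit.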

I keep running into this problem over and over again, and I don't seem to learn not to do it. The solutions offered here have worked for me before, but for some reason not this time. Here is what did work (from https://medium.com/analytics-vidhya/tutorial-removing-large-files-from-git-78dbf4cf83a):

To remove the large file from the index:

    git rm --cached <filename>

Then, to edit the commit:

    git commit --amend -C HEAD

Then you can push your amended commit with:

    git push
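A sketch of the amend approach in a scratch repository, with the push left out because there is no remote. It assumes the large file was added in the most recent commit; if it entered history earlier, the older commits would still contain it. File names and identity are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "code" > main.txt
git add main.txt
git commit -q -m "Initial commit"

# The latest commit accidentally includes the large file.
echo "more code" >> main.txt
echo "pretend this is 300 MB" > large_file
git add main.txt large_file
git commit -q -m "Blablabla"

# Drop the file from the index only; the copy on disk is kept.
git rm -q --cached large_file

# Rewrite the last commit, reusing its message and author information.
git commit -q --amend -C HEAD

files_in_head=$(git ls-tree -r --name-only HEAD)
subject=$(git log -1 --format=%s)
```

Because --amend replaces the last commit with a new one, this is only safe for commits that have not been pushed yet, which is the case here.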

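Copy the newest repo state:

    cp -r original_repo repo_tmp

Reset the original repo to the state before the large file was committed:

    cd original_repo && git reset --hard {commit_before_large_file}

Remove .git from repo_tmp, so that only the contents remain:

    cd .. && rm -rf repo_tmp/.git

Copy the contents of repo_tmp (the newest repo state) over the original_repo folder. Note the trailing /. on the source: a plain cp -r repo_tmp original_repo would create a nested original_repo/repo_tmp directory instead of replacing the files:

    cp -r repo_tmp/. original_repo/

Now add, commit and push, and you are good to go. Make sure the large file itself is deleted or listed in .gitignore first, otherwise git add . will pick it up again:

    cd original_repo && git add . && git commit -m "be gone large file" && git push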