The Git Commit Hash

commit 13c988d4f15e06bcdd0b0af290086a3079cdadb0
Author: Mike Street
Date:   Sun Mar 3 16:04:33 2019 +0000

    Initial commit

This might look familiar to you. It is the summary of a commit, which is built up of a few components: the commit hash (a 40-character string), followed by the author, the date and lastly the commit message.

This blog post will focus on the commit hash, a seemingly random mish-mash of letters and numbers that you sometimes have to copy and paste about. What is it? How is it made? Can it change? All of those questions are answered in this blog post (hopefully).

Reading this, I assume you have basic knowledge of Git and have at least committed a few times and hopefully used branches.

This blog post is intended as a primer to the commit hash. There is a lot more logic and magic behind the scenes that goes into making and using the Git commit hash, which is beyond the scope of this article.

What is a hash?

Before we dive into the git specifics, I thought I would give a very brief overview of what a hash is.

There are many different hashing algorithms - MD5 and SHA-1 are examples of these. What a hash allows you to do is take an arbitrary amount of content (be it one word, 100 words or the whole contents of a JavaScript library) and produce a fixed-length string of characters representing it, one that is (for practical purposes) unique to that content. The length of the string depends on which algorithm you choose.
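As a small illustration of the fixed-length point, here is a sketch using Python's standard hashlib module. The helper name hex_digest and the sample inputs are my own choices for the example, not anything Git-specific.

```python
import hashlib

def hex_digest(algorithm: str, content: bytes) -> str:
    """Return the hexadecimal digest of `content` using the named algorithm."""
    return hashlib.new(algorithm, content).hexdigest()

short_input = b"hello"
long_input = b"hello " * 10_000  # roughly 60 KB of content

for name in ("md5", "sha1"):
    # The digest length depends on the algorithm, not on the size of the input.
    print(name, len(hex_digest(name, short_input)), len(hex_digest(name, long_input)))
```

MD5 digests come out at 32 hexadecimal characters and SHA-1 digests at 40, whatever the input size, which is why a Git commit hash is always 40 characters long.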

The string (in "theory") cannot be reverse engineered (i.e. given the hash, it is difficult to work out the contents), but it does allow you to compare two things to see if they are the same.

Note: I put "theory" in quotes because there are workarounds to hashing, so be wary if you are using a hashing algorithm to store passwords.

One use case of hashes is to compare the contents of two files. You may have seen the integrity attribute appearing on script tags (see the MDN docs for reference). This stores a hash of the file you expect, and the browser compares it with a hash of the file it downloads. If the hashes match, you know the files are the same. This saves computers (or humans) meticulously checking each line of a file.

Different hashing algorithms have different methods of hashing (and produce hashes of different lengths). Of course, collisions (two different inputs producing the same hash) can happen, as mentioned on Stack Overflow, but they are rare and would be extremely unlikely in a Git scenario (especially within the same repository).
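To make the integrity check concrete, here is a rough sketch of the comparison in Python. An integrity value is the algorithm name, a hyphen, and the base64-encoded digest of the file. The function names and the sample script contents are made up for illustration, and a real browser does considerably more than this.

```python
import base64
import hashlib

ALLOWED = ("sha256", "sha384", "sha512")  # algorithms used for integrity values

def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Build an integrity-style value: '<algorithm>-<base64 digest>'."""
    if algorithm not in ALLOWED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"

def matches_integrity(content: bytes, integrity: str) -> bool:
    """Re-hash the downloaded content and compare it with the expected value."""
    algorithm, _, _ = integrity.partition("-")
    return algorithm in ALLOWED and sri_hash(content, algorithm) == integrity

expected = sri_hash(b"console.log('hi');")  # the value published with the script tag
untouched = matches_integrity(b"console.log('hi');", expected)
tampered = matches_integrity(b"console.log('hi!');", expected)
print(untouched, tampered)
```

A single changed character is enough to make the comparison fail, without anyone having to read the file line by line.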

How is it made?

The commit hash is an SHA-1 hash made up of a few properties from the commit itself. As mentioned above, it is a lot more complex than this post can go into, but understanding the fundamentals is a great first step.

The git hash is made up of the following:

  • The commit message
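  • The file contents (stored as a reference to a tree, a snapshot of every file at that commit, rather than as a diff)
  • The commit author (and committer, as they can be different)
  • The author date and the commit date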
  • The parent commit hash

When you take all these into consideration, hopefully you will begin to see how various actions might impact how the commit hash is formed.
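The ingredients above can be sketched in Python. Git hashes an object by taking the SHA-1 of a short header (the object type, the size and a NUL byte) followed by the object's text. The values below are illustrative assumptions: the e-mail address is invented, the tree is Git's well-known empty tree, and extras such as signatures are left out, so this will not reproduce the hash shown at the top of this post.

```python
import hashlib

def git_object_hash(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>', a NUL byte, then the object body."""
    header = f"{kind} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree: str, parents: list[str], author: str,
                committer: str, message: str) -> bytes:
    """Assemble the text of a commit object from its ingredients."""
    lines = [f"tree {tree}"]
    lines += [f"parent {parent}" for parent in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return ("\n".join(lines) + "\n").encode("utf-8")

# Illustrative values only (the e-mail address is made up).
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
WHO = "Mike Street <mike@example.com> 1551629073 +0000"

first = git_object_hash("commit", commit_body(EMPTY_TREE, [], WHO, WHO, "Initial commit"))
reworded = git_object_hash("commit", commit_body(EMPTY_TREE, [], WHO, WHO, "Initial commit!"))
child = git_object_hash("commit", commit_body(EMPTY_TREE, [first], WHO, WHO, "Second commit"))

print(first)
print(reworded)  # same commit with one extra character in the message
print(child)     # a commit whose parent is `first`
```

Changing any single ingredient, even one character of the message, gives a completely different hash, and because the parent hash is itself an ingredient, a change to one commit ripples into every commit built on top of it.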

One other thing to note is that the Git history (the commit tree you see in a log) doesn't really "exist" as such - it is constructed by following the parent hashes. This may seem like the same thing, but there are some slight nuances.

If a commit isn't referenced by another commit as its parent, it can become an orphaned commit. The exceptions to this are branches and your HEAD, which point to a specific commit hash.

Git is entirely a file-based system - in your project you will have a .git folder, which contains all the commits, branches and other information about your repository. Branches and HEAD are plain text files, while commits and other objects are stored as compressed files - it is worth a look around.
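One detail worth knowing when you look around: objects (commits included) are stored zlib-compressed under .git/objects, in a folder named after the first two characters of the hash. The sketch below imitates that layout in a temporary directory rather than touching a real repository. The function names are my own, and real repositories may also bundle objects into pack files, which this does not handle.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path

def write_loose_object(objects_dir: Path, kind: str, body: bytes) -> str:
    """Store an object the way a loose Git object is laid out on disk."""
    store = f"{kind} {len(body)}\0".encode("ascii") + body
    sha = hashlib.sha1(store).hexdigest()
    path = objects_dir / sha[:2] / sha[2:]  # e.g. objects/3b/18e512...
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return sha

def read_loose_object(objects_dir: Path, sha: str) -> tuple[str, bytes]:
    """Decompress a loose object and split its header from its body."""
    raw = zlib.decompress((objects_dir / sha[:2] / sha[2:]).read_bytes())
    header, _, body = raw.partition(b"\0")
    kind, _size = header.decode("ascii").split(" ")
    return kind, body

with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp) / "objects"  # stand-in for .git/objects
    sha = write_loose_object(objects, "blob", b"hello world\n")
    kind, body = read_loose_object(objects, sha)

print(sha, kind, body)
```

The file name is the hash of the content, so Git can find any object directly from its hash.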

How can it change?

So now we know and understand what the hash is made of, how can it change, and what impact would that have on the repository and Git history?

Amending a Commit

If you amend the commit message, or the files in a commit, this will change the Git hash. You can amend the last commit by running git commit --amend. This allows you to edit the message, which files and changes are included in the commit, and even the author.

All of these are things the hash is based on, so amending the commit will change the hash.

Cherry Picking

If you have made a commit on a different branch and wish to have it on your current branch, you may be advised to run git cherry-pick <hash>. This will work; however, the hash for the commit will change. This is because it will have a new parent commit, and so a new hash will need to be calculated.

This may also cause issues when you come to merge the other branch, as Git will see two commits with different hashes that apply the same change - so be careful if you ever cherry-pick.

Rebasing

Rebasing one branch onto another will have a similar effect to cherry picking. When you rebase, it essentially removes all of your commits on your branch, updates it with the source branch and then reapplies your commits afterwards.

This changes the parent of your first commit, so every following commit also needs a new hash, as its parent has changed too.
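Here is a simplified sketch of that ripple effect in Python. It is not how Git implements rebasing. It only rebuilds the same three commit messages on top of two different base commits and compares the resulting hashes. The base hashes, the identity and the messages are all made up, and every commit reuses the empty tree to keep the example short.

```python
import hashlib
from typing import List

# Simplified stand-in for a commit: only the parent and message vary here.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
WHO = "Example Author <author@example.com> 1551629073 +0000"  # made-up identity

def commit_hash(parent: str, message: str) -> str:
    """Hash a minimal commit object that points at `parent`."""
    body = (
        f"tree {EMPTY_TREE}\nparent {parent}\n"
        f"author {WHO}\ncommitter {WHO}\n\n{message}\n"
    ).encode("utf-8")
    header = f"commit {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()

def build_branch(base: str, messages: List[str]) -> List[str]:
    """Stack one commit per message on top of `base`, returning each hash."""
    hashes = []
    parent = base
    for message in messages:
        parent = commit_hash(parent, message)
        hashes.append(parent)
    return hashes

OLD_BASE = "a" * 40  # hypothetical commit the branch started from
NEW_BASE = "b" * 40  # hypothetical tip of the branch being rebased onto
messages = ["Add feature", "Fix typo", "Tidy up"]

before = build_branch(OLD_BASE, messages)
after = build_branch(NEW_BASE, messages)
for old, new in zip(before, after):
    print(old[:7], "->", new[:7])
```

Only the first commit's parent was changed directly, yet every hash in the branch ends up different, because each commit's hash feeds into the next one.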

Rebasing can quickly become a mess! It's a hugely powerful tool, but with great power comes great responsibility.

Squash

If you choose to squash commits while merging, a brand new hash will be generated, as the new commit contains all of your previous changes in one. Because of this, merge/pull requests can get confused and believe they have not been merged in yet, as they compare hashes between branches.

How does it impact the repository?

Changing the commit hash is fine as long as you haven't shared the commit. Git relies on these hashes to navigate its timeline and history between different sources, so changing them can have unwanted side effects. Think of it as a sci-fi movie: you can go back in time, but you shouldn't affect someone else's timeline.

If you make a change, and a hash changes, Git will see this as a new commit. If the original one has been pushed and shared, Git will see two different commits in your timeline and want to merge the two different variants when you later want to push again.

If you are working on your own, or are confident in how it can affect things, you can force push (git push --force) to tell the target repository (generally origin) that your version is the source of truth.

The End

I hope that has helped further your understanding of Git and that next time something happens, you know why and how! Let me know if you have any questions, need something expanding or have any other feedback.

