Inside Git: How It Works and the Role of the .git Folder

Most developers know Git and use it every day, but do you know how it works internally? Git takes snapshots of our project and stores them in its object database, where each object is identified by a hash generated with the SHA-1 algorithm. We’ll dive deep into how it works and how it saves snapshots of our codebase into that object database.

What is the .git folder?

Before understanding how Git works, we need to understand the .git folder. The .git folder is the backbone of Git: all of the repository’s information and data is saved in it. When we initialize a Git repository with the command git init, Git creates a hidden folder named .git inside our working directory.

You can verify this by running ls -a in the current working directory. If you look at the contents of this folder by running ls .git, you’ll see there is a lot cooking in there.

There are the files HEAD, config, and description, and the folders hooks, info, objects, and refs. We’ll look at the most important of these in detail, and this will eventually help us understand how Git works.

How does Git work?

Now that we’ve seen the .git folder, which is what makes Git’s work possible, we’ll dive into it and see how things work internally.

First we’ll look at objects. This article covers three types of Git objects: commit, tree, and blob. (Git also has a fourth type, tag, used for annotated tags, which we won’t cover here.) By the end of this article you’ll understand all three. Every object is identified by a hash.

Every snapshot taken by Git is saved in the object database I mentioned earlier, which lives in the .git/objects directory. We have a fresh project where we initialized Git and have not committed anything yet. If we look inside objects by running ls .git/objects, we see two folders, info and pack, which Git creates by default and uses for optimisations such as packing many objects together.

Now I’ll make a commit into this repository to see what changes inside the objects directory.

If I check again by running ls .git/objects, there are three new folders inside objects. Each folder is named after the first two characters of the hash of the object stored inside it: one for the commit, one for its tree, and one for the file’s blob.

To check the commit we made and its hash, run git log in the terminal.

This commit’s hash starts with 67, so we’ll go into the 67 folder in .git/objects to check what is inside it.

It contains a file whose name is the rest of the commit hash we saw earlier with git log:

67225eab133b6e98c2fe054404554272506f7ba0 = commit hash

67 + 225eab133b6e98c2fe054404554272506f7ba0 = folder name (first 2 characters) + name of the hash file inside the folder (remaining 38 characters).
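The split between folder name and file name can be sketched in a few lines of Python. This is only an illustration of the naming rule described above; the helper name loose_object_path is my own and is not part of Git.

```python
from pathlib import Path


def loose_object_path(git_dir: str, object_hash: str) -> Path:
    """Return where a loose object with this hash would live under git_dir.

    Hypothetical helper for illustration; it only builds the path and does
    not check whether the file exists.
    """
    if len(object_hash) != 40:
        raise ValueError("expected a full 40-character SHA-1 hash")
    # The first two hex characters name the folder,
    # the remaining 38 characters name the file inside it.
    return Path(git_dir) / "objects" / object_hash[:2] / object_hash[2:]


path = loose_object_path(".git", "67225eab133b6e98c2fe054404554272506f7ba0")
print(path.as_posix())
```

For the commit hash from this article, the printed path ends in objects/67/ followed by the remaining 38 characters.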

Now, to check the content of this hash file, Git provides a low-level command, cat-file. The file is stored compressed, so opening it directly won’t show anything readable, which is why we’ll use cat-file to look inside it.

We run git cat-file -t <commit hash> to check the type of the object and git cat-file -p <commit hash> to print its content. This tells us that the object is of type commit. So one type of object is now clear.
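To get a feel for what cat-file has to do, here is a minimal Python sketch of reading a loose object, assuming the usual layout: the file is zlib-compressed, and after decompression it starts with a header of the form "type size", then a NUL byte, then the content. The function name and the sample commit body are made up for illustration; a real commit also has author and committer lines.

```python
import zlib
from typing import Tuple


def read_loose_object(raw: bytes) -> Tuple[str, bytes]:
    """Decompress loose-object bytes and split them into (type, content)."""
    data = zlib.decompress(raw)
    header, _, content = data.partition(b"\x00")
    obj_type, size = header.split(b" ")
    if int(size.decode()) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode(), content


# Made-up stand-in for the bytes of a file under .git/objects.
# The tree hash below is a placeholder, not a real one.
body = b"tree " + b"0" * 40 + b"\n\nhypothetical commit message\n"
raw = zlib.compress(b"commit " + str(len(body)).encode() + b"\x00" + body)

obj_type, content = read_loose_object(raw)
print(obj_type)          # roughly what `git cat-file -t` reports
print(content.decode())  # roughly what `git cat-file -p` shows for a commit
```

To try it on a real repository, you could pass in the bytes of one of the files under .git/objects instead of the made-up raw value.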

Inside the commit object’s content there is a tree with a hash reference. A tree is the object Git uses to store a directory of our project. A tree can contain other trees and blobs, which together make up the complete directory hierarchy. This tree hash starts with fd. As I mentioned earlier, the folders inside objects are named after the first two characters of a hash, so now we’ll see whether there is a folder named fd inside .git/objects.

You can see that one folder inside objects is named fd. Inside it there is a hash file similar to the one we saw earlier. Let’s check the type and content of this object:
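The link from a commit to its tree can be pulled out of the commit content with a little parsing. The sketch below assumes the text form that git cat-file -p prints for a commit: header lines such as tree, author, and committer, then a blank line, then the message. The sample values (the tree hash, the name, the email, and the timestamp) are all hypothetical.

```python
def parse_commit(text: str) -> dict:
    """Split commit text into its header lines and its message."""
    head, _, message = text.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        key, _, value = line.partition(" ")
        # A key can repeat (a merge commit has several parent lines),
        # so values are collected in a list.
        headers.setdefault(key, []).append(value)
    return {"headers": headers, "message": message}


# Placeholder hash that only mimics a tree hash starting with "fd".
hypothetical_tree = "fd" + "0" * 38
sample = (
    f"tree {hypothetical_tree}\n"
    "author Jane Doe <jane@example.com> 1700000000 +0000\n"
    "committer Jane Doe <jane@example.com> 1700000000 +0000\n"
    "\n"
    "Initial commit\n"
)

commit = parse_commit(sample)
tree_hash = commit["headers"]["tree"][0]
print(tree_hash[:2], tree_hash[2:])  # folder name and file name of the tree object
```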

You can see that the type of this object is tree and that it contains only a blob, for which a hash reference is given. The entry also contains the file name. The name is stored here in the tree rather than in the blob because a blob holds only the file’s content. I said earlier that a tree can contain blobs and other trees as well. Here there is only a blob because this commit contains only one file, app.js.
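What cat-file -p prints for a tree is a readable rendering; the stored tree content is binary. As a sketch, assuming a SHA-1 repository, each entry is the file mode, a space, the file name, a NUL byte, and then the 20 raw bytes of the referenced object’s hash. The example below builds a one-entry tree for a made-up app.js and parses it back; the file content and the parse_tree helper are illustrative only.

```python
import hashlib


def parse_tree(content: bytes):
    """Yield (mode, name, hex_hash) for each entry in raw tree content."""
    pos = 0
    while pos < len(content):
        space = content.index(b" ", pos)
        nul = content.index(b"\x00", space)
        mode = content[pos:space].decode()
        name = content[space + 1:nul].decode()
        raw_hash = content[nul + 1:nul + 21]  # 20 raw bytes, not hex text
        yield mode, name, raw_hash.hex()
        pos = nul + 21


# Made-up file content standing in for app.js.
blob_body = b"console.log('hello');\n"
blob_hash = hashlib.sha1(b"blob %d\x00" % len(blob_body) + blob_body).digest()

# 100644 is the mode Git records for a regular, non-executable file.
tree_content = b"100644 app.js\x00" + blob_hash

entries = list(parse_tree(tree_content))
for mode, name, hex_hash in entries:
    print(mode, "blob", hex_hash, name)
```

Note how the blob itself never sees the name app.js; only the tree entry carries it.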

With this, the second type of object is now clear. Now let’s move on to the blob. In the tree content above we got a blob hash reference starting with 69, so we’ll look for a folder named 69 in .git/objects.

When we check the type of this object, it’s blob, and its content is the actual code of the file as it was when we committed it to the repository. Congratulations, you now understand all three object types: commit, tree, and blob.

We only looked at commits and how Git stores data and snapshots internally. There is much more to Git, such as branches. If you enjoyed this article and understood the whole thing, comment on it and I’ll write another article on the remaining parts of Git.
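Where does a blob’s hash come from? As a minimal sketch, assuming a SHA-1 repository like the one in this article, the hash is the SHA-1 of a small header ("blob", a space, the content length in bytes, and a NUL byte) followed by the file content. The function name git_blob_hash is my own.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Compute the SHA-1 object hash Git would assign to this blob content."""
    header = b"blob " + str(len(content)).encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


empty_hash = git_blob_hash(b"")
print(empty_hash)  # the hash of an empty file's blob

# Made-up content: the same bytes always give the same hash,
# and any change to the bytes gives a different one.
print(git_blob_hash(b"console.log('hello');\n"))
```

Because the hash depends only on the content, two files with identical content share one blob, no matter what they are called.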

Conclusion

Git does its work under the hood inside the hidden .git folder in our working directory. Git stores every snapshot in its object database, and every object in it is identified by a hash. We looked at three object types, commit, tree, and blob, and saw how they are linked to each other.

I encourage you to try all of this yourself and see that it’s actually true. I hope you enjoyed reading this article. Thanks for your valuable time.