All data is saved in git objects. All git objects has a 40bits id which generated by SHA-1 hashing the object's content. There're 4 types of objects:
- blob object: File contents is saved in blob object. No filenames/permissions etc, only contents is saved here.
- tree object: Directory structure is saved here. Tree object's content is just a list of its children, either blob object or tree object. A list item will contain either a SHA-1 hash point to a blob object with filename/permissions/etc. or a hash point to a tree object. Here we have got a data structure(tree) which can represent a file system.
- commit object: Now we need history. A commit object simple contains a pointer to tree, one or many pointer to parents(also commits) and some booking data like commiter. Commit objects in fact forms a tree graph on a higher layer of blob/tree.
- tag object: Tag object is just for referencing an object conviniently. A tag object can have a pointer point to any other git object and a tag, then you can use the tag to reference any object(like an important commit) in your git repo.
- Branch is just a file in .git/refs/heads/ dir contains the SHA-1 hash of the most recent commit to that branch. When you create a branch in git, git just create a file contains a 40 bytes hash in .git/refs/heads/, and update .git/HEAD to point to it. With your development moves on, git will find current branch in HEAD and update the branch file in refs/heads correctly.
- Remote is a pointer to branch(so it's also a branch) in other people's copies of the same repo. If you get the code by clone instead of 'git init', git will add a default 'origin/master' remote branch for you automatically. 'origin' point to the remote copy location, and 'master' means which branch on remote you cloned from.
A fetch will merge all updates on a remote branch to your local. By default it will merge in changes on origin/master, but you can fetch updates on other place like origin/cool. After a series of fetch/merge your history graph will looks like a mess, rebase will help. Rebase will leave orphan objects in your repo(you can use 'git gc' to clean it) and should not be used on a repo which can be fetched by others.