I am trying to learn git. Can someone suggest a tutorial? I have followed this one. It seems a possible workflow is:
Create a repo with git init
Create some source files in that directory.
When work should be saved, type git commit to see what files have been changed and should be saved.
Use for example git add 1.f90 2.f90 to specify which files should be saved.
Use git commit -m "some message" to save those files
Use git log to show a history of commits.
Use git checkout commit_name to revert to a previous commit.
Use git checkout master to get to the latest commit.
If you don’t want git commit to show unsaved .o, .mod, and .exe files, since they are created by the compiler, create a .gitignore file with contents
*.exe
*.mod
*.o
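For reference, the workflow above might look like this in a terminal. This is only a sketch: the scratch directory, the placeholder identity, and the file contents are assumptions made up for illustration, and git status is used to list changes, as suggested further down the thread.

```shell
#!/bin/sh
set -e
# Work in a scratch directory so no real project is touched (assumption).
repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity stored only in this repo; name and email are placeholders.
git config user.name "Example User"
git config user.email "user@example.com"

# Two placeholder source files and a fake compiler output.
printf 'program one\nend program one\n' > 1.f90
printf 'program two\nend program two\n' > 2.f90
: > 1.o

# Ignore compiler-generated files.
printf '*.exe\n*.mod\n*.o\n' > .gitignore

git status --short                 # lists the untracked files; 1.o is hidden
git add 1.f90 2.f90 .gitignore
git commit -q -m "add two placeholder programs"

printf '! a comment\n' >> 1.f90
git add 1.f90
git commit -q -m "add a comment to 1.f90"

git --no-pager log --oneline       # history, newest commit first
commit_count=$(git rev-list --count HEAD)
ignored=$(git status --short --ignored | grep -c '^!! 1.o$')
```

Committing .gitignore itself means anyone who clones the repo gets the same ignore rules.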
My habit has been to store commit messages at the top of source files, for example
program xseasonal_tables
! 02/23/2023 10:19 AM optionally detrends the raw data
! 02/22/2023 03:40 PM reads and analyzes seasonal data for multiple fields
I can continue to do this and use the same messages in the git commit -m commands.
I wrote an introduction to concepts and usage 7 years ago for Sci comp classes I occasionally teach. It is brief but covers a good fraction of git concepts and topics used frequently.
Other than your manual approach in the file headers, are you familiar with any other version control tools? What is your motivation for learning git? If you have no other restrictions, I would recommend Mercurial as a fabulous alternative to git. Being the biggest doesn’t make git the best.
Is there a way to keep the local files read-only until you say (with an appropriate command) “from now on I’m working on these files”? The command would also inform the server that the files are being edited, preventing them from being edited at the same time in another local instance. git commit would then make the files read-only again.
This is my memento about Git: git basics · vmagnin/gtk-fortran Wiki · GitHub
And at its top, there is a link to the Pro Git book online: Git - Book
The Chapter 2 “Git Basics” may help you.
Note also that in the left margin you will find translations of that book in many languages.
I’m not aware of such a command, but the normal git workflow suggests that everyone create their own branch for local changes, also known as a feature branch.
Branches in git are very lightweight, you can see them as pointers to a commit. This pointer moves forward when you commit changes so that it always points to the newest commit of the branch. When your branch is ready to get merged into the main branch, you first merge the main branch into your local branch git merge master, resolve any conflicts, and finally, you merge your branch into master git merge feature-xyz. Note that merging always merges another branch into your current branch. You can safely switch branches using git switch branchname.
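The merge sequence described above could be sketched as follows. The scratch repo, the placeholder identity, and the file contents are assumptions for illustration; feature-xyz and master are the names used in the text, and the work on master is simulated in the same clone rather than coming from a teammate.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)                          # scratch repo (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

printf 'program main\nend program main\n' > main.f90
git add main.f90
git commit -q -m "initial commit"

# Start a feature branch and commit some work on it.
git switch -q -c feature-xyz
printf 'module mod\nend module mod\n' > mod.f90
git add mod.f90
git commit -q -m "add module mod"

# Meanwhile master moves on.
git switch -q master
printf '! header comment\n' >> main.f90
git commit -q -am "comment main.f90"

# Step 1: merge master into the feature branch; conflicts would be resolved here.
git switch -q feature-xyz
git merge -q --no-edit master

# Step 2: merge the finished feature into master (a fast-forward in this case).
git switch -q master
git merge -q feature-xyz

git --no-pager log --oneline --graph --all
master_tip=$(git rev-parse master)
feature_tip=$(git rev-parse feature-xyz)
```

After step 2 both branch pointers refer to the same commit, which matches the picture of branches as movable pointers.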
Branching is the most powerful and important feature of git. As soon as you know the basics (init, add, commit, add, commit, etc.) and the advanced basics (clone, add, commit, push, … pull, add, commit, push, … pull, etc.), you should definitely learn how to use branches effectively.
There is a beautiful interactive website to learn exactly this, which includes a sandbox to test things out. I highly recommend completing this tutorial:
You might also wonder what a good git commit message should look like:
If you have further questions about git, I’d be happy to help you.
As @Carltoffel mentioned, better to use git status for this (and lots of things). I practically have that one on repeat all the time, and recommend it to new users to type it after doing practically anything.
This will restore your files to the state at that commit, but if you do want to “throw away” some of your changes, it is better to use git reset --hard commit_hash. Be aware that this also discards any uncommitted changes to tracked files.
Part of the motivation for a version control system (VCS) is to no longer need to clutter the source code with change logs. Once you’re comfortable with git, there is no need to keep copying the git commit -m "message" text into the source code files.
git was designed to avoid this bottleneck. There is no “central server” other than by convention of a particular project (although GitHub, GitLab, etc. have taken on this role for the majority of users). It doesn’t remove the need for some communication and coordination across teams ahead of time, but it does make merging changes to the same files later as easy as possible.
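To make the difference concrete, here is a small sketch. The scratch repo, the identity, and notes.txt are made up for illustration. git checkout of an old commit only changes what you are looking at, while git reset --hard moves the branch itself.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)                          # scratch repo (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Three commits, each overwriting the same hypothetical file.
for n in 1 2 3; do
  echo "version $n" > notes.txt
  git add notes.txt
  git commit -q -m "version $n"
done
first=$(git rev-parse HEAD~2)
second=$(git rev-parse HEAD~1)

# Look at an old commit: the files change, but master still points at version 3.
git checkout -q "$first"
at_first=$(cat notes.txt)
git checkout -q master
after_checkout=$(cat notes.txt)

# Throw away the last commit: master itself is moved back to version 2.
git reset -q --hard "$second"
after_reset=$(cat notes.txt)
commits_left=$(git rev-list --count HEAD)
```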
I’ve had good luck with GitLab’s tutorials:
And as referenced already, The “Pro Git” book:
My Fortran programs have parameter files (typically less than 100 lines) which have things such as model specifications and the names of data files to be processed. Am I correct that parameter files should be part of the git repo? I think make files and the like should also be committed. The question git - How to manage large data files with GitHub? - Stack Overflow discusses whether raw data files, which could be very large, should be in the repo.
A bit off-topic, but much of the modeling in the financial industry is done in Excel spreadsheets. People back up foo.xls as foo1.xls, foo2.xls etc. I think the use of git with informative commit messages would be beneficial, because it is harder to diff two spreadsheets than two ASCII source files.
Generally, every file needed by somebody else to run your project should be committed. Ideally they should just have to make a git clone, a cd and then use the build system.
It should probably be considered carefully, case by case.
Some additional commands that I often use and that could be useful:
git diff --cached - show the diff of the staged changes before committing them.
git show - show the changes made by the last commit.
git commit --amend - edit the last commit message and commit again (new changes staged with git add file_list are added to the last commit as well).
git rebase -i - interactively edit the commit history, i.e. delete commits, reorder them, squash, or edit them (it is better to read the manual first and to try it on a separate test branch).
git restore . - discard unstaged changes to tracked files.
git commit - open a text editor to write a short commit message (on the first line) and a longer description starting after an empty second line.
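A few of these commands in action, as a rough sketch. The scratch repo, the identity, and notes.txt are invented for the example, and git rebase -i is left out because it needs an interactive editor.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)                        # scratch repo (assumption)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "frist commit"          # typo made on purpose

# Rewrite the message of the last commit.
git commit -q --amend -m "first commit"
amended=$(git log -1 --format=%s)

echo "line 2" >> notes.txt
git add notes.txt
git --no-pager diff --cached             # review what is staged before committing
staged=$(git diff --cached --name-only)
git commit -q -m "add line 2"
git --no-pager show --stat               # summary of the last commit

# An unstaged edit that we decide to discard.
echo "scratch" >> notes.txt
git restore .
lines=$(grep -c '' notes.txt)
```

Note that git commit --amend replaces the last commit with a new one, so it is best kept for commits that have not been pushed yet.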
For some actions I prefer lazygit, mentioned above, but it requires remembering some shortcuts.
Spend some time looking at what is in the .git directory and what it means.
I found it quite important to understand the low-level construction of git: not all the details, just the main ones. It is quite instructive and helps you understand all the high-level commands better.
For example, look at what is stored in the .git/objects directory. With the git cat-file command you can inspect the objects stored there. Each one is stored under a hexadecimal name, which is its hash key. There you find blobs, which hold the contents of files; trees, which are basically directories; and commits.
For example, if you change the content of a file, its hash key changes, so the new content goes into another blob.
The name of a file is stored in a tree together with the hash key of its blob. So you cannot change the name or the content of a file without changing the content of a tree, and therefore that tree’s hash key.
So whenever you change anything, you can only add things to the .git/objects directory.
A commit holds a reference to a previous commit and to a tree (both always by hash key).
Then you realize that branches are just text files containing the hexadecimal hash key of a particular commit, and that each new git commit updates this number to point to the new commit.
And the present state of the working directory is just the name of the current branch, which is stored in the .git/HEAD file.
Another useful command is git log --all --graph, which draws the commit graph for all branches.
Learn the low-level behaviour and everything (well, almost) will become crystal clear.
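A quick way to poke at these internals is sketched below. The scratch repo, the identity, and hello.txt are assumptions for illustration, and the last two commands assume a freshly created repo in which branch references are still stored as plain files.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)                          # scratch repo (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

printf 'hello\n' > hello.txt
blob=$(git hash-object hello.txt)          # hash key computed from the content
git add hello.txt
git commit -q -m "add hello.txt"

kind=$(git cat-file -t "$blob")            # type of the object
git cat-file -p "$blob"                    # content of the blob
git cat-file -p HEAD                       # the commit: tree, author, message
git cat-file -p 'HEAD^{tree}'              # the tree: mode, type, hash key, file name

# Different content gives a different hash key, hence a different blob.
printf 'hello again\n' > hello.txt
blob2=$(git hash-object hello.txt)

head_ref=$(cat .git/HEAD)                  # name of the current branch
branch_tip=$(cat .git/refs/heads/master)   # hash key of the latest commit on master
```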
How often should you commit, and how granular should a commit be? If I write a new procedure in mod.f90 and call it in main.f90, I could have a single commit for the changes to the two files. But since git log --name-only shows the commit message and the name(s) of the files changed in the commit, if only one file is changed in a commit, it is obvious which file the commit message refers to. So I think I will:
Write or modify a procedure.
Create a test for the procedure in a new or existing file.
Use separate commits for the file containing the procedure and the file containing the caller when I think the procedure is correct.
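As a sketch of what this plan would produce (the scratch repo, the identity, the procedure name foo, and the file contents are all invented for illustration):

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)                        # scratch repo (assumption)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

# First commit: only the file containing the procedure.
printf 'module mod\ncontains\n  subroutine foo()\n  end subroutine foo\nend module mod\n' > mod.f90
git add mod.f90
git commit -q -m "created subroutine foo"

# Second commit: only the file containing the caller.
printf 'program main\n  use mod\n  call foo()\nend program main\n' > main.f90
git add main.f90
git commit -q -m "wrote program to test subroutine foo"

# Each message is followed by the single file it refers to.
git --no-pager log --name-only --format='%s'
last_files=$(git diff-tree --no-commit-id --name-only -r HEAD)
```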
A drawback of very granular commits is that someone may inadvertently check out a version of the project that is incomplete – code with a new procedure that is never called. But if successive commit messages say “created function foo” and “wrote program to test function foo”, this should not happen often.
Not really a problem. But if you use test-driven development and write the tests before the procedure, it could be more serious (I think a commit should always leave the project in a state where it can be built).
A commit should be something that can be considered as a whole. It could be just one line fixing a bug, or a new procedure in a file, or coherent modifications in several files. It depends.
It also depends on the number of developers working on the project. If you are alone on the project, there is no risk. If you are in a team, you must be more careful.
And if you fear users could be annoyed, the solution is to work on each new feature in a dev branch that you merge into the main branch only when it is completed and tested.