Forkin'

If you're like me, you use git an awful lot. Primarily through the command line. Though sometimes from EMACS. And also there are ponies for some reason.

Even if you're not like me (but you still use git), a fork-based workflow can be confusing sometimes. How do you manage pulling upstream changes? How do you create your PRs?

What follows is an account of how I use git in a fork-based workflow, specifically with GitHub.

Clones and Remotes

First I'll clone the repository that I want to fork. Why? Because having the main repository as origin makes things easy for me.

$ git clone git@github.com:organization/repository.git

I got a lot of feedback on this particular thing. Apparently some people really dislike origin being the original repository. I set it up this way for a few reasons:

  • It's a continual reminder that this is not my house. This is someone else's house and I need to take my shoes off.
  • I don't get my fingers used to typing the command to force-push to origin master, which I would have to do with my fork as origin (see later for more).
  • It's easier and more natural for me to pull upstream changes from origin master.
  • If I ever want to see what upstream is doing, I've already got it loaded and readily available.

Next I'll click the big Fork button in GitHub and fork the repository to my organization or personal account.

I treat this fork as a staging ground for pull requests and nothing more. I don't actively try to keep things synchronized with origin.

Now that there's a fork, I add it as a new remote in my original clone:

$ git remote add personal git@github.com:me/repository.git

Fantastic. One directory, two remotes, zero problems.
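
If you'd like to rehearse this layout before pointing it at a real project, here's a rough sketch that builds the same two-remote setup out of local bare repositories. The directory names (upstream.git, fork.git, work) are made up for illustration, and the bare repositories are only stand-ins for GitHub; nothing here touches the network.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; every path below is hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox"

# Two bare repositories stand in for the upstream project and my fork.
git init --quiet --bare upstream.git
git init --quiet --bare fork.git

# Point HEAD at master so the sketch doesn't depend on the local default branch name.
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git --git-dir=fork.git symbolic-ref HEAD refs/heads/master

# Clone "upstream" first, so it becomes origin...
git clone --quiet upstream.git work
cd work

# ...then add the "fork" as a second remote named personal.
git remote add personal "$sandbox/fork.git"

# Both remotes should now be listed, each with a fetch and a push URL.
git remote -v
```

git will grumble that you cloned an empty repository, which is fine for a dry run.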

Workflow

Very simply, my workflow is as follows:

  1. Grab the latest changes from upstream
  2. Do my work and commit locally
  3. Force push (yes, really!) to the fork
  4. Navigate to the fork and create a pull request against upstream
  5. Once upstream has merged the PR, I reset the state of my local repository

I suppose I should take a minute and state explicitly that I only use this workflow on ancillary repositories that I don't have direct commit access to. These are mostly repositories that I or my teammates touch reasonably infrequently.

Having said that, let's go over the steps individually.

Grab the latest changes from upstream

We want our repository to be as close to in-sync as possible when we start the work so there's a smaller chance of merge conflicts when the resultant PR gets accepted. With that in mind, we issue the command:

$ git pull --rebase origin master

Congrats, the repo is up-to-date with origin. 🎊
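
For the curious, that one command is roughly a fetch followed by a rebase. Here's a sketch of the two-step form wrapped in a function; sync_with_upstream is a name I made up, and it assumes the remote and branch are called origin and master as they are in this post.

```shell
# Hypothetical helper: bring the current branch up to date with upstream.
# Roughly what `git pull --rebase origin master` does, split into two steps.
sync_with_upstream() {
    remote=${1:-origin}
    branch=${2:-master}

    # Step 1: download new commits. This updates origin/master but leaves my files alone.
    git fetch "$remote" || return 1

    # Step 2: replay any local commits on top of what was just fetched.
    git rebase "$remote/$branch"
}
```

Splitting it up is handy when you want to look at what upstream changed (after the fetch) before you let it touch your working copy.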

Work locally and commit

This one I feel is pretty self-explanatory. Modify your files and git add and git commit as you see fit.

Now your local repository is n commits ahead of origin/master.
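
If you want to know what n actually is, you can ask git to count the commits that are on your branch but not on origin/master. A small sketch; commits_ahead is a hypothetical name, and keep in mind that origin/master is git's last-fetched snapshot of upstream rather than a live view.

```shell
# Hypothetical helper: count commits on the current branch that upstream lacks.
commits_ahead() {
    upstream=${1:-origin/master}
    git rev-list --count "$upstream..HEAD"
}
```

Swapping the count for git log --oneline origin/master..HEAD lists those same commits one per line, which makes a decent last look before pushing.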

Force push to the fork

This sounds scary if you're not as familiar with git or have worked on a team full of people who've been burned by force pushes before (i.e. you have been on one of my teams).

Force pushing rewrites history - that's its whole point. A regular push will gently nudge the remote's state to match the state of your local repository by adding your new commits on top of what's already there. When you force push, however, you're aggressively overwriting the remote's state with your local state. When someone else on your team then tries to pull, their repository state can't be cajoled via commits and merges to match the remote - there's no clean path from Point A to Point B, so the pull either refuses or tries to merge the old history back in. Recovering from that is additional work for every other person who has cloned the repo and wants to get their local state to match the remote's.

I hope that was followable.

Now, why is force pushing okay in this scenario? Because, as stated before, our fork is a staging ground for pull requests and nothing more. We don't really care about pulling from it - it's essentially write-only. The sole reader of the fork will be the maintainer of the upstream repository, when she merges your PR.
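
Since that argument only holds as long as the force push goes to the fork and never to upstream, it can be worth letting the shell enforce the habit. This is a sketch of a guard, not something git gives you out of the box; push_to_fork is a hypothetical name, and it assumes the remotes are called origin and personal as they are here.

```shell
# Hypothetical wrapper: force push to the fork, and only the fork.
push_to_fork() {
    remote=${1:-personal}
    branch=${2:-master}

    # Refuse outright if the target is upstream.
    if [ "$remote" = "origin" ]; then
        echo "refusing to force push to origin: not my house" >&2
        return 1
    fi

    git push --force "$remote" "$branch"
}
```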

So here we go, with a (potentially) scary command:

$ git push -f personal master

... and we hold our breath, and ...

Create a PR to upstream

At this point I navigate in my browser to my fork's repo in GitHub. There, just above the file list, you'll see the Pull request link toward the right.

Click the link, fill out the pertinent information, and click Create Pull Request. Boom. PR created.

Changes?

If the maintainer of the upstream repository requests any changes, you make them as normal, commit, and then just push again:

$ git push personal master

No need to force push as long as you're only adding new commits on top, as your local repository is perfectly in sync with your fork. Pushing to personal master will automatically update the PR, too.

Reset the state of your local repository

Once the PR gets merged (or closed without being merged) you'll want to reset the state of your local repository so you don't run into any hairy merging situations during future work.

This step is similar to "fold and put away laundry" in that you're going to want to skip it. LAUNDRY ISN'T DONE UNTIL IT'S FOLDED AND PUT AWAY. Don't be that person that just gets dressed in front of the dryer.

Anyway, don't skip this step. You'll thank me later.

So to reset the state of your local repository, first you want to get things back to where they were when you started:

$ git reset --hard origin/master

Then go ahead and pull to resync with upstream's remote:

$ git pull --rebase origin master

If everything went well (and there's no reason it shouldn't've) you should see the commits that were merged as part of your PR get applied to your local repository.
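
If you'd rather not trust your eyes, here's a sketch that runs both cleanup commands and then asks git how far apart you and upstream are. reset_to_upstream is a hypothetical name, and it assumes origin and master as above. Fair warning: reset --hard throws away uncommitted changes too, so only run it once you're sure there's nothing left to keep.

```shell
# Hypothetical helper: discard local work and match upstream again.
reset_to_upstream() {
    remote=${1:-origin}
    branch=${2:-master}

    # Back to the last-known upstream state, dropping local commits and edits.
    git reset --quiet --hard "$remote/$branch" || return 1

    # Pick up whatever upstream has merged since then.
    git pull --quiet --rebase "$remote" "$branch" || return 1

    # Prints two tab-separated counts: commits behind upstream, commits ahead of it.
    git rev-list --left-right --count "$remote/$branch...HEAD"
}
```

Two zeros means the laundry is folded and put away.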

And now your repository is back in sync with upstream! You can forget about this repository for a while, confident that when you come back to it in a week or month or whatever you'll be ready to start the workflow again at step 1.

Stick a fork in it

So that's the workflow I use currently. Give it a try. Tell me if you improve on it or if it sucks for your context. It's worked for me as an individual, and it's something I encourage my teammates to do with repositories that we don't own in our current project context.

If you have a favorite workflow for forks I'd love to hear it. Regardless, please enjoy this picture of me leading a miniature horse named Baby in from the paddock: