Clever Engineering Blog — Always a Student

Mo Repos, Mo Problems? How We Make Changes Across Many Git Repositories

By Nathan Leiby

tl;dr: Try out microplane! It’s a CLI tool to make changes across many repos.

The Problem

At Clever, we’ve embraced microservices. They promote modularity, which leads to simpler code bases and lets our engineers move quickly and independently. They are easier to deploy, which helps us build towards incremental, frequent deploys and continuous delivery. In our containerized architecture, they let us scale our services independently in response to load and enforce fine-grained access control.

We create a Git repository for each of our services, and we have over 500 repositories at Clever. Working in separate repos vs. one large repo (i.e. a monorepo) is a tradeoff. Our approach of “one repo per service” helps us clarify team ownership and keep source control fast, but it’s more difficult to share code or enforce consistent practices.

One particularly painful aspect of multiple repositories is that it’s sometimes hard to make a conceptually simple change because it requires updating many repos. For example, when we recently updated from Go 1.8 to 1.9, it meant making the same change to nearly 200 repos.

The Dream

We had built tools like gitbot and reposync to help make changes across repos, but the overall process still took many hours or days to complete. We could quickly make 200 commits to 200 different repos, but it was slow to carry out the full development process: debug failures, fix edge cases, open PRs, get PRs accepted, merge them, and deploy the changes.

We asked ourselves – what if deploying simple changes across hundreds of repos were trivial? Our ideal process would allow a change like “update all repos from Go 1.8 to 1.9” to take an hour of manual effort, instead of a few days.

Our Design

During our recent Hackathon I decided to work on this problem. I teamed up with Rafael Garcia, our CTO, and we began to refine the idea. We came up with the following principles:

  • Have an opinionated workflow. We know teams have different workflows around source control, code review, CI systems, and more. We aimed for a tool that would work for Clever. It’s designed around Github, CircleCI, and the Pull Request approval process that we already use.
  • Empower anyone to make changes. Our infrastructure team needs to make sweeping changes, but so do others. For example, our security team might want to apply security fixes or our guilds (subject matter experts on node, golang, resiliency, etc.) might want to update a dependency version across many projects.
  • CLI first. We prefer working with CLI tools. They also allow for a simple permissions model, where the current user is the one making commits and API calls. We drew inspiration from Terraform, which allows users to “plan” changes before applying them. Of course, running locally isn’t perfect: for example, a centralized service might have made it easier to collaborate or to queue updates so we don’t overwhelm our CI system.

With these in mind, we set to work building Microplane.

How Microplane Works

Let’s revisit the cross-repo change from our previous example: “Update all Go 1.8 repos to use Go 1.9”. Here’s the workflow you follow when making changes with Microplane:

Init => Clone => Plan => Push => Merge

Init targets the relevant repos via a Github search. First, use Github’s advanced search to iterate on a query until it targets the correct repos. Then copy your query string and run init.

$ mp init "org:mp-test go1.8"

Now, clone the repos. We ignore existing copies of the code on your machine, and instead clone a fresh, up-to-date copy of each repo into a staging directory.

$ mp clone
$ mp status
REPO                    STATUS          DETAILS
test-service-1          cloned
test-service-2          cloned                
...

Next, we’ll plan our change. Planning lets us run a script and preview the change. Each time you run plan, it copies the repo from the clone step and then executes your script. This enables fast, local iteration on your script without needing to re-download the repo.
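The copy-then-run behavior described above can be sketched in plain shell. This is only an illustration of the idea, not Microplane’s actual implementation: the `plan_repo` function and the `cloned` and `planned` directory names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: start each plan from a fresh copy of the clone,
# then run the user's command inside that copy.
plan_repo() {
  clone_dir=$1
  plan_dir=$2
  shift 2
  rm -rf "$plan_dir"              # discard the result of any previous plan run
  cp -R "$clone_dir" "$plan_dir"  # copy the pristine clone (no re-download)
  (cd "$plan_dir" && "$@")        # execute the remaining arguments as the script
}

# Demo on a throwaway directory with a made-up circle.yml.
work=$(mktemp -d)
mkdir "$work/cloned"
printf 'go1.8\n' > "$work/cloned/circle.yml"

# Running the plan twice gives the same result, because each run
# starts again from the untouched clone.
plan_repo "$work/cloned" "$work/planned" \
  sh -c 'sed "s/1\.8/1.9/g" circle.yml > circle.yml.tmp && mv circle.yml.tmp circle.yml'
plan_repo "$work/cloned" "$work/planned" \
  sh -c 'sed "s/1\.8/1.9/g" circle.yml > circle.yml.tmp && mv circle.yml.tmp circle.yml'

cat "$work/planned/circle.yml"
```

Because the clone is never modified, a buggy script can be fixed and re-run without leaving stale edits behind.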

For our Go version update, a script that uses sed to update a string will suffice (note: on macOS, use sed -i '' instead of sed -i). We also pass arguments to set the Git branch and commit message.

$ mp plan -b go1.9 -m "Golang 1.9 upgrade" -- sed -i 's/1\.8/1.9/g' circle.yml
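A note on the sed pattern: an unescaped `.` matches any character, so a pattern written as `1.8` would also rewrite a string like `128`. The sketch below shows the difference on a made-up file (its contents are invented for illustration, not taken from a real circle.yml). It also writes through a temporary file, which is one way to sidestep the `-i` difference between GNU and BSD sed.

```shell
#!/bin/sh
# Illustrative only: a throwaway file with invented contents.
workdir=$(mktemp -d)
printf 'image: golang:1.8\nparallelism: 128\n' > "$workdir/circle.yml"

# Unescaped dot: "1.8" also matches "128".
loose=$(sed 's/1.8/1.9/g' "$workdir/circle.yml")

# Escaped dot: only a literal "1.8" is rewritten.
strict=$(sed 's/1\.8/1.9/g' "$workdir/circle.yml")

# Portable in-place edit that does not rely on -i.
sed 's/1\.8/1.9/g' "$workdir/circle.yml" > "$workdir/circle.yml.tmp" \
  && mv "$workdir/circle.yml.tmp" "$workdir/circle.yml"

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

Previewing the diff with plan before pushing is a good way to catch over-broad matches like this.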

Aside: Although the above script isn’t too complex, sometimes debugging your script requires more trial-and-error. When running plan (or any other step), it can be simpler to work with a single repo at a time. You can use the --repo (-r) flag to do this.

$ mp plan -r <repo-name> -b go1.9 -m "Golang 1.9 upgrade" -- sed -i 's/1\.8/1.9/g' circle.yml
diff --git a/circle.yml b/circle.yml
index 436edd9..6d1d72f 100644
--- a/circle.yml
+++ b/circle.yml
@@ -1 +1 @@
...
-go1.8
+go1.9
...

Once you’ve worked out the kinks, just remove that flag and you’ll be back to parallelizing your change across all repos.

Once we’re happy with our plan, it’s time to push the change. This pushes our Git commits, opens PRs, and sets PR assignees.

$ mp push -a nathanleiby

After pushing, we can see all the PRs we opened.

Beyond the Github UI, we can also run status to check CI status.

$ mp status
REPO                    STATUS          DETAILS 
test-service-1          pushed          🕐 waiting for CI
test-service-2          pushed          🕐 waiting for CI
...

After a few minutes, once CI completes, the status updates to reflect that.

$ mp status
REPO                    STATUS          DETAILS 
test-service-1          pushed          ✅ ready to merge
test-service-2          pushed          ✅ ready to merge 
...
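With many repos, it can help to summarize the status table rather than read it row by row. The sketch below counts ready rows in a sample adapted from the output shown in this post. It assumes the DETAILS text stays exactly as printed here, which is an assumption about the tool’s output rather than a documented contract. In practice the text would be piped from `mp status` instead of a variable.

```shell
#!/bin/sh
# Sample adapted from the status output in this post (mixed states for illustration).
status_output='REPO                    STATUS          DETAILS
test-service-1          pushed          ✅ ready to merge
test-service-2          pushed          🕐 waiting for CI'

# Skip the header row, then count the non-empty repo rows.
total=$(printf '%s\n' "$status_output" | tail -n +2 | grep -c .)

# Count rows whose DETAILS column reports readiness.
ready=$(printf '%s\n' "$status_output" | grep -c 'ready to merge')

echo "$ready of $total PRs ready to merge"
```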
 

Once we’re ready, merge the open PRs. This will only merge the subset of PRs that are valid to merge, i.e. the PR is mergeable and passes CI. In the future, we may also enforce that the PR has been manually approved.

$ mp merge 
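The eligibility rule (mergeable and passing CI) amounts to a simple filter. Here is a minimal sketch over a made-up, whitespace-separated listing of repo name, mergeable flag, and CI state. The listing format is hypothetical and is not how Microplane stores this information.

```shell
#!/bin/sh
# Hypothetical listing: <repo> <mergeable> <ci-state>
prs='test-service-1 true success
test-service-2 true pending
test-service-3 false success'

# Keep only PRs that are both mergeable and green in CI.
ready=$(printf '%s\n' "$prs" | awk '$2 == "true" && $3 == "success" { print $1 }')

printf '%s\n' "$ready"
```

PRs that fail the filter are left open, so they can be fixed up and picked up by a later merge run.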
 

💥 You just merged PRs for 100s of repos from your terminal.

The Future

We aim for a world where making sweeping commits across Clever is feasible, safe, and fast.

Some of the next steps we see to make Microplane even more powerful are:

  • Determining PR Assignees. How do we determine the right owner for a PR? So far, we’ve hard-coded an assignee but relied on existing user subscriptions to repos or manual assignment to do better. Might we rely on Github’s suggested reviewers or CODEOWNERS? Can we prevent automated commits from skewing git blame data? Ideally, our process for assigning a reviewer should be the same for both manual PRs and machine-generated ones.
  • Combining with Continuous Delivery (CD). Over the past 6 months we have been building CD features like deployment pipelines (canary deploy, monitor, then full deploy) and automated rollbacks. We believe investing in CD will help us to deploy many small changes quickly and safely, with minimal manual work. In the case of our Go upgrade, about 20% of our merges kicked off a fully automated deployment pipeline. We aim to increase the number of deploys that happen without the need for human oversight.

Thanks

Thank you to Raf for building the initial version of Microplane with me, the Clever Infrastructure team for continuing to improve it, and to the other engineers at Clever who’ve tried it out and given us valuable feedback.

Thanks also to the creators of tools like Hub and Terraform for inspiring this work.

… and of course, thank you for reading!

Have comments?

Talk about this post on Hacker News.

p.s. We’re hiring. If you want to work with exciting technology while improving K-12 education, please see our jobs page. In particular, here are the engineering roles: software engineer, infrastructure engineer, and software engineering intern.