Upstream risk: the vanishing

Go is one of those languages with tight coupling to upstream code stored in remote repos. Let's call that “Other People’s Code”, OPC. You write some function, it automatically pulls in the OPC (usually from GitHub), and away you go.

Now, after a while of this, the good people of the Go community got tired of OPC breaking. It was fun when you could just use OPC and it worked, but what if ‘they’ changed the API or something? Then you would have to go fix Your Code (YC). So to prevent breaking YC all the time, the greybeards of Go invented the ‘Vendor’. The ‘Vendor’ technique is effectively a manifest file that pins each dependency to an exact revision, like this:

{
 "version": 0,
 "dependencies": [
  {
   "importpath": "github.com/Shopify/sarama",
   "repository": "https://github.com/Shopify/sarama",
   "vcs": "git",
   "revision": "0fb560e5f7fbcaee2f75e3c34174320709f69944",
   "branch": "master",
   "notests": true
  },
  {
   "importpath": "github.com/Sirupsen/logrus",
   "repository": "https://github.com/Sirupsen/logrus",
   "vcs": "git",
   "revision": "10f801ebc38b33738c9d17d50860f484a0988ff5",
   "branch": "master",
   "notests": true
  },
...

OK, life is good again. I use the ‘OPC of the day’ for a while, then, when I think I’m done with YC, I lock down the dependencies. Mission Accomplished! But is it really? Consider this case that I ran into today:

{
 "importpath": "github.com/go-kit/kit/circuitbreaker",
 "repository": "https://github.com/go-kit/kit",
 "vcs": "git",
 "revision": "b8f878dd8851dd7b724c813f04d469fa2dae881a",
 "branch": "master",
 "path": "/circuitbreaker",
 "notests": true
},

The astute amongst you will recognise that b8f878dd8851dd7b724c813f04d469fa2dae881a has been rebased away. It's gone. It's an ex-commit. It's not just pining for the fjords. So I can’t build.
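If you would rather find out about an ex-commit before build day, here is a minimal sketch in Go that reads a manifest in the format shown above and reports any pinned revision that is missing from a local clone. Everything named here is my own invention for illustration: the `Dependency` and `Manifest` types, the helper functions, and especially the assumption that you keep clones under a mirror directory laid out as `MIRROR_ROOT/github.com/go-kit/kit`. The only external piece is `git cat-file -e`, which exits non-zero when the object is not in the clone's object store.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"regexp"
	"strings"
)

// Dependency mirrors one entry of the manifest shown above (type name is hypothetical).
type Dependency struct {
	ImportPath string `json:"importpath"`
	Repository string `json:"repository"`
	VCS        string `json:"vcs"`
	Revision   string `json:"revision"`
	Branch     string `json:"branch"`
	Path       string `json:"path,omitempty"`
	NoTests    bool   `json:"notests"`
}

// Manifest is the top-level document: a version plus the pinned dependencies.
type Manifest struct {
	Version      int          `json:"version"`
	Dependencies []Dependency `json:"dependencies"`
}

var fullSHA1 = regexp.MustCompile(`^[0-9a-f]{40}$`)

// parseManifest decodes the JSON and rejects any revision that is not a
// full 40-character hex hash, since an abbreviated hash is not a real pin.
func parseManifest(data []byte) (Manifest, error) {
	var m Manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return Manifest{}, err
	}
	for _, d := range m.Dependencies {
		if !fullSHA1.MatchString(d.Revision) {
			return Manifest{}, fmt.Errorf("%s: revision %q is not a full SHA-1", d.ImportPath, d.Revision)
		}
	}
	return m, nil
}

// revisionExists reports whether rev names a commit in the local clone at repoDir.
// It only inspects the local object store; it never talks to the remote.
func revisionExists(repoDir, rev string) bool {
	cmd := exec.Command("git", "-C", repoDir, "cat-file", "-e", rev+"^{commit}")
	return cmd.Run() == nil
}

// missingPins returns the dependencies for which exists reports false.
func missingPins(m Manifest, exists func(Dependency) bool) []Dependency {
	var missing []Dependency
	for _, d := range m.Dependencies {
		if !exists(d) {
			missing = append(missing, d)
		}
	}
	return missing
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: checkpins MANIFEST MIRROR_ROOT")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	m, err := parseManifest(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	missing := missingPins(m, func(d Dependency) bool {
		// Assumed layout: MIRROR_ROOT/<repository URL without the scheme>.
		dir := filepath.Join(os.Args[2], filepath.FromSlash(strings.TrimPrefix(d.Repository, "https://")))
		return revisionExists(dir, d.Revision)
	})
	for _, d := range missing {
		fmt.Printf("%s: pinned revision %s not found\n", d.ImportPath, d.Revision)
	}
	if len(missing) > 0 {
		os.Exit(1)
	}
}
```

Note the catch: this only tells you whether your own clone still has the commit. A clone that fetched the commit before the rebase will usually keep it until garbage collection, which is rather the point of holding your own copy of OPC instead of trusting the remote to keep it for you.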

And that’s probably the best-case scenario. You see, the app I’m building has 790 live upstream dependencies, spread across various services (github, cloud.google.com, gopkg.in, golang.org, …). Trust all of them? Trust all of the people that have private repos on each of them? Hope that no one figures out how to do a SHA-1 hash attack? I mean, it’s not like that hasn’t been demonstrated. So if someone can rewrite history in git (the rebase), why not rewrite it so the hash collides and I get ‘bad’ code?

In the meantime, well, I guess I binary-search my way to a nearby commit hash in this repo and find one that is ‘good enough’.

I’m glad that YC is perfect; if only OPC were. Wait, My Code (MC) is OPC from your standpoint? And OPC from my standpoint might be YC? Confusing!

