The agony of NFS for 25+ years of my life! Then and now. ClearCase and Kubernetes
My first experiences with NFS (Network File System) came in 1989: my first term at university, a set of VAX machines running BSD Unix, some VT220 terminals, and ‘rn’.
My first understanding of NFS came a few years later. ClearCase. I was working at HP, the year was 1992. Most of us on the team shared a 68040-based HP server (an HP/Apollo 9000/380), but were very excited because the new PA-RISC machines (an HP 9000/720) were about ready to use, promising much higher speeds. We were using a derivative of RCS for revision control (as was everyone at the time).
Our HP division was (convinced? decided?) to try some new software that had its genesis in HP buying Apollo in 1989. A new company (Atria) had formed, and, well, ClearCase was born out of something that had been internal (DSEE). And it was based on distributed computing principles. The most novel thing about it was called ‘MVFS’ (multi-version file system). This was really unique: it was a database + a remote file access protocol. You created a ‘config spec’, and, when you did an ‘ls’ in a directory, it would compute what version you should see and send that to you. It was amazing.
This amazement lasted about an hour. And then we learned what a ‘hard mount’ in NFS was. You see, the original framers of Unix had decided that blocking IO would, well, block. If you wrote something to disk, you would wait until the disk was finished writing it. When you made this a network filesystem, it meant that if the network were down, you would wait for it to come back up.
Enter the next ‘feature’ of our network architecture of the time: everything mounted everything. This was highly convenient but it meant that if any one machine in that building went down, or became unavailable, *all* of them would lock up waiting.
And this gave rise to a lot of pacing the halls and discussion over donuts of “when will ClearCase be back?”
Side note: HP was the original tech startup. And one of its original traditions was the donut break. Every day at 10, all work would stop, all staff would congregate in the caf, a donut, a coffee would be had, and you would chat with people from other groups. It was how information moved. Don’t believe me? Check it out.
OK, back to this. Over the years, ClearCase and computing reliability/speed largely stayed in check. Bigger software, bigger machines: it stayed something that was slow and not perfectly reliable, but not a squeaky enough wheel to fix. As we started the next company, we kept ClearCase, and then on into the next. So many years of my life involved with that early decision.
But what does this have to do w/ today you ask? Well, earlier I posted about ReadWriteOnce problems and backup and my solution. But today I ran into another issue. You see, when I deployed gitlab, two of the containers (registry + gitlab) shared a volume for purposes other than backup. And, it bit me. Something squawked in the system, it got rescheduled, and then refused to run.
OK, no problem, this is Unix, I got this. So I decided that, well, the least of all evils would be NFS. You see, in Google Kubernetes Engine (GKE) there is no ReadWriteMany option. So I decided to make a ReadWriteOnce volume, load it into a machine running an NFS server, and then mount it multiple times in the two culprits (gitlab + docker registry). It would be grand.
And then time vanished into a blackhole vortex. You see, when you dig under the covers of Kubernetes, it is a very early and raw system. It has this concept of a PersistentVolumeClaim. On this you can set options such as nfs (hard vs soft). But, you cannot use it with anything other than the built-in provisioners. I looked at the external provisioners but, well, a) incubator, and b) for some custom hardware I don’t have.
Others were clearly worried about NFS mount options since issue 17226 existed. After 2.5 years of work on it etc, mission accomplished was declared. But only for PersistentVolumeClaim, not volume. And this matters.
spec:
  containers:
  - name: nfs-client
    ...
    volumeMounts:
    - name: nfs
      mountPath: /registry
  volumes:
  - name: nfs
    nfs:
      server: nfs-server.default.svc.cluster.local
      path: /
You see, I need something like that (which doesn’t accept options). Because I cannot use:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-client
spec:
  capacity:
    storage: 10Mi
  accessModes:
  - ReadWriteMany
  nfs:
    server: nfs-server.default.svc.cluster.local
    path: "/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-nfs-client
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
  selector:
    matchLabels:
      name: pv-nfs-client
Since this requires a built-in provisioner for some external NFS appliance.
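For comparison, here is a minimal sketch of the variant that ought to carry the mount options, assuming the cluster lets a claim bind to a hand-made PersistentVolume. The `mountOptions` list, the empty `storageClassName`, and the `volumeName` binding are standard PersistentVolume/PersistentVolumeClaim fields, but the specific option values below are illustrative assumptions, and none of this has been verified against the GKE setup described here.

```yaml
# Sketch only: a statically created PV/PVC pair, untested on GKE.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs-client
spec:
  capacity:
    storage: 10Mi
  accessModes:
  - ReadWriteMany
  storageClassName: ""        # assumption: empty class, so no dynamic provisioner is involved
  mountOptions:               # handed to the NFS mount performed on the node
  - soft                      # give up with an I/O error instead of blocking forever
  - timeo=30                  # illustrative values, not tuned
  - retrans=3
  nfs:
    server: nfs-server.default.svc.cluster.local
    path: "/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-nfs-client
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: ""        # must match the PV's (empty) class
  volumeName: pv-nfs-client   # bind directly to the PV above by name
  resources:
    requests:
      storage: 1Mi
```

Note that a soft mount trades the hang for I/O errors, which can mean lost writes if the server disappears mid-operation, so it is a different evil rather than a cure.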
OK, maybe I have panicked too early. I mean, it’s 2018. It’s 26 years since my first bad experiences with NFS, and 29 since my first ones at all.
Let’s try. So I wrote k8s-nfs-test.yaml. Feel free to try it yourself: ‘kubectl create -f k8s-nfs-test.yaml’. Wait a minute, then delete it. And you will find, if you run ‘kubectl get pods’, that you have something stuck in Terminating:
nfs-client-797f96b748-j8ttv 0/1 Terminating 0 14m
Now, you can pretend to delete this:
kubectl delete pod --force --grace-period 0 nfs-client-797f96b748-xgxnb
But, and I stress but, you haven’t. You can log into the Nodes and see the mount:
nfs-server.default.svc.cluster.local:/exports on /home/kubernetes/containerized_mounter/rootfs/var/lib/kubelet/pods/fa6c9ce3-61d0-11e8-9758-42010aa200b4/volumes/kubernetes.io~nfs/nfs type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.19.240.34,mountvers=3,mountport=20048,mountproto=tcp,local_lock=none,addr=10.19.240.34)
in all its ‘hard’ glory. You can try and force unmount it:
umount -f -l /path
But, well, that just makes it vanish from the list of mounts, no real change.
So. Here I sit. Broken-hearted. I’ve looped full circle back to the start of my career.
So, peanut gallery, who wants to tell me what I’m doing wrong, so that I can ‘claim’ an export I’ve made from a container I’ve launched, rather than one in the cloud infrastructure? Or alternatively, how I can set a mount-option on a volumeMount: line rather than a PersistentVolume: line?
Or chip in on 51835.