Using single-sign-on oauth2 across many sites in Kubernetes

You have a set of web resources (a Kibana dashboard, a Grafana dashboard, a few other misc ones). You are setting them all up with 'basic' auth because it's simple, and secretly hoping no one guesses "MyS3cret". You, my friend, are doing it wrong. Let me explain.

It turns out there is a protocol called 'oauth2'. You have probably seen this on many sites (e.g. 'sign in with Google', 'sign in with GitHub', etc). As a consumer, you should always do that when you can. It's much better to have one strong setup (your Google one) than many weak ones. When you 'sign in with Google' the site never sees your password; Google just does the authentication and vouches for who you are.

Now, how can we translate that into the misc set of web pages that we run in our system? This is super simple but it wasn't well documented and took me a bit to figure out how to do it well.

First, let's create a small yaml file, 'oauth2-values.yaml'. Fill it in like so. You will need to get the clientID and clientSecret. I am using Google, but there are other providers (GitHub, GitLab, etc) you can use, and ample instructions online for this. In the Google case, allow a redirect URI of 'https://oauth2.MYDOMAIN/oauth2/callback'.

config:
  clientID: ""
  clientSecret: "yyyyyyy"
  # Create a new cookieSecret with the following command
  # python3 -c 'import os,base64; print(base64.b64encode(os.urandom(16)).decode())'
  cookieSecret: "zzzzz=="
  configFile: |-
    pass_basic_auth = false
    pass_access_token = true
    set_authorization_header = true
    pass_authorization_header = true

image:
  repository: ""
  tag: "v3.1.0"
  pullPolicy: "IfNotPresent"

extraArgs:
  provider: "google"
  email-domain: "MYDOMAIN"
  cookie-domain: "MYDOMAIN"
  upstream: "file:///dev/null"
  http-address: ""

ingress:
  enabled: true
  annotations: 'true' nginx "true" letsencrypt 100m
  path: /
  hosts:
    - oauth2.MYDOMAIN
  tls:
    - secretName: oauth2-noc-tls
      hosts:
        - oauth2.MYDOMAIN
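
If you would rather generate the cookieSecret from a script than a one-liner, here is a minimal Python 3 sketch. The 16-byte length simply mirrors the command in the comment above, and `make_cookie_secret` is a name I made up for illustration.

```python
import base64
import secrets


def make_cookie_secret(num_bytes: int = 16) -> str:
    """Return num_bytes of OS-level randomness, base64-encoded as text."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Paste the printed value into cookieSecret in oauth2-values.yaml.
    print(make_cookie_secret())
```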

Now we are going to install an 'oauth2 proxy'. We will run *1* for our entire domain, and it will let in anyone with an account in our domain.

helm install -f oauth2-values.yaml --name oauth2 --namespace oauth2 stable/oauth2-proxy

OK, now we just need to add 2 annotation lines to every ingress:

  nginx.ingress.kubernetes.io/auth-signin: https://oauth2.MYDOMAIN/oauth2/start?rd=$http_host$request_uri
  nginx.ingress.kubernetes.io/auth-url: https://oauth2.MYDOMAIN/oauth2/auth
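
Roughly, those two annotations make nginx ask oauth2-proxy about every request before passing it upstream, and bounce the browser to the sign-in URL when the answer is no. The sketch below models that decision in Python. It is an illustration, not nginx's code: `decide` and `AUTH_SIGNIN` are hypothetical names, and it assumes the auth endpoint answers 2xx for a valid session and 401 otherwise.

```python
from typing import Tuple
from urllib.parse import quote

# Assumed sign-in endpoint, taken from the annotation value above.
AUTH_SIGNIN = "https://oauth2.MYDOMAIN/oauth2/start"


def decide(auth_status: int, host: str, request_uri: str) -> Tuple[int, str]:
    """Model what the ingress does with the result of the auth subrequest.

    Returns (status, detail): 200 and "upstream" when the request may go
    through, or 302 and the sign-in URL the browser is redirected to.
    """
    if 200 <= auth_status < 300:
        return 200, "upstream"
    if auth_status == 401:
        # rd tells oauth2-proxy where to send the browser after sign-in.
        # Percent-encoding is this sketch's choice; the annotation itself
        # just concatenates the nginx variables.
        return 302, f"{AUTH_SIGNIN}?rd={quote(host + request_uri, safe='')}"
    if auth_status == 403:
        return 403, "forbidden"
    # Anything else from the auth endpoint is treated as a failure.
    return 500, "auth endpoint error"
```

For example, `decide(401, "kibana.MYDOMAIN", "/app")` returns a 302 pointing at the sign-in URL, with the original location carried in `rd`.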

And boom, we are done. I'm assuming you are already using the excellent cert-manager so your site is TLS protected. Now you have strong sign-on, and, more usefully, single-sign-on.

The key is that 'cookie-domain' above. It means we are using a single oauth2-proxy protecting the entire domain, and, once one site is signed in, all sites are signed in. So great!
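
To see why one cookie covers every site, here is a minimal Python sketch of the domain-matching rule browsers apply (RFC 6265, section 5.1.3). The hostnames are made-up examples standing in for MYDOMAIN, `domain_match` is a name chosen for illustration, and the sketch skips the RFC's exception for IP addresses.

```python
def domain_match(request_host: str, cookie_domain: str) -> bool:
    """Return True if a cookie scoped to cookie_domain is sent to request_host."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    # Exact match, or the host is a subdomain of the cookie's domain.
    return host == domain or host.endswith("." + domain)


if __name__ == "__main__":
    for site in ("kibana.example.com", "grafana.example.com", "evil-example.com"):
        print(site, domain_match(site, "example.com"))
```

The first two hosts match and the third does not, which is the whole trick: the session cookie set by the proxy rides along to every subdomain, and to nothing else.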

Even better, if you use Multi-Factor Authentication it fits in with it. Rarely type a password again, avoid needing passwords for new sites, and be more secure. What's not to love!

Project Block Heater: an update

The other day I wrote about adding a 'block heater' to the e-bike charging system. I'm pleased to report it's working great!

If we look at it with a thermal imaging camera, we can see that the 'hot spot' is about 6C, outside the insulation. So it shouldn't be *colder* than that inside I guess.

This is with an outdoor temperature of about -10C.

The 'cold' spot is about 0C.

So I think it's achieving the goal. The efficiency is not that great (charging off the built-in battery instead of an external pack), but I'm not concerned about long-distance driving, and I only have to charge about once a week in the winter, so it's not an additional inconvenience.

IoT (h)army: hacking the smart switch

I purchased a pair of Teckin SP10 smartplugs. They were on sale for $8 each (the price fluctuates up and down), and they are available in round, square, 1-pack, 2-pack, 4-pack, lots of options. I did this on the thesis that:

  1. They would be a disaster for security
  2. They would probably have an esp8266 in them for simple hacking

I'm pleased to report that both turned out to be true! Look at the attached packet capture and you will see for yourself (dump.pcap). In a nutshell, they run a public MQTT server, and all these devices contact it. You can use that bus to upgrade them (imagine me pushing new firmware to a widget in your house, and that widget can be a WiFi AP and client for great man-in-the-middle attacks against your other devices). It's got a bit of access control on it (there is some password which is defined by the MAC of the device... can't imagine that number being guessable!). Hmm.

So let's dremel it open (side note: it seems you can just apply some heat and wiggling to break the ultra-sonic weld, but who has time for that!).

OK, we're in. That little module standing vertically is indeed an ESP8266-01. The serial port is indeed exposed underneath, so programming it is simple.

But it turns out there is an even simpler way. Install this git repo and plug in the device, boom, running Tasmota. And now I can set up the device from a simple web interface, assign it to my private MQTT server, and from there my HomeAssistant. And now we are good to go, no Internet needed, and security is much stronger.
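
Once the plug is on a private broker, switching it is just a publish to a command topic. The sketch below only builds the topic and payload for Tasmota's default `cmnd/<topic>/POWER` layout; it does not talk to a broker. The device topic `teckin1` and the helper name `power_command` are made up for illustration, and a device configured with a custom full-topic pattern would need a different prefix.

```python
from typing import Tuple


def power_command(device_topic: str, state: str) -> Tuple[str, str]:
    """Build (topic, payload) for a Tasmota POWER command."""
    state = state.upper()
    if state not in ("ON", "OFF", "TOGGLE"):
        raise ValueError(f"unsupported state: {state}")
    return f"cmnd/{device_topic}/POWER", state


if __name__ == "__main__":
    topic, payload = power_command("teckin1", "on")
    print(topic, payload)
```

The resulting pair is what you would hand to whatever MQTT client you use to publish.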

This is actually quite a good device for the price. Since it has the esp8266 it's both more and less hackable than the KanKun I did earlier. I kind of wish I had got the rectangular ones, but they were 'much more expensive' at ~$15 each. They also have a power-bar one with 4 outlets. Hmm, so many choices!

So, tl;dr: this device has decent hardware. The software and app worked very well (surprisingly, since they are usually terrible). The security was a 2/5. I mean, it's unlikely your house will be burned down, but a moderately skilled hacker could use it to get access to traffic from other devices in your home. The hackability is high: it's now running securely on my TLS-based MQTT, on my private network, with my own HomeAssistant (meaning I don't worry about it being bricked if they give up like Lowes did).

So, I do recommend. Get your hack on.

Cloud anti-pattern dilemma: shared ‘state’

So I've been working with fluent-bit for cloud log processing. Got it working at some ludicrous scale, etc.

Now the way fluent-bit works, it mounts the host '/var/log' inside itself, and does a 'tail -f' on all the files (it runs as a DaemonSet). To handle the case where you restart fluent-bit (perhaps for an upgrade, perhaps to change config) it can squirrel the state away into a database (SQLite), effectively remembering the offsets in each of the files. Sounds great, right? It means that when fluent-bit is restarted you don't get duplicated messages.

Now, let's assume I place a 'readiness' check on it. I make a config change, and let the 'rollingUpdate' strategy do its work. Let's assume I also use the 'hash of file' strategy I mentioned earlier so that the config is 'hermetic'.

So what happens is the 'old' fluent-bit is running. A new one is started. Until it's 'ready' Kubernetes doesn't kill the old one. They both process the same files for a bit, writing to this shared database file. If the new one doesn't come online (e.g. it has an error in config) this is fine I guess. But when it does come online, they are both processing for some period of time. Hmm. Double logs? Corrupt SQLite? Hmm.
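
The double-logs worry can be shown with a toy model. This is not fluent-bit's code or its database schema. It is a minimal Python sketch, with a made-up `offsets` table and `Tailer` class, of two tailers that each load the saved offset once at startup and then run side by side, as they would during a rolling update. Offsets are counted in lines rather than bytes to keep it short.

```python
import sqlite3


class Tailer:
    """Toy log tailer that remembers its read offset in a shared SQLite db."""

    def __init__(self, db: sqlite3.Connection, path: str):
        self.db, self.path = db, path
        row = db.execute(
            "SELECT pos FROM offsets WHERE path = ?", (path,)
        ).fetchone()
        # The saved offset is only consulted once, at startup.
        self.pos = row[0] if row else 0

    def poll(self, lines: list) -> list:
        """Ship every line past our own offset, then persist the new offset."""
        shipped = lines[self.pos:]
        self.pos = len(lines)
        self.db.execute(
            "INSERT OR REPLACE INTO offsets (path, pos) VALUES (?, ?)",
            (self.path, self.pos),
        )
        return shipped


def rolling_update_demo() -> list:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE offsets (path TEXT PRIMARY KEY, pos INTEGER)")
    log = ["line 1", "line 2"]

    old = Tailer(db, "/var/log/app.log")
    shipped = old.poll(log)               # old pod ships lines 1 and 2

    new = Tailer(db, "/var/log/app.log")  # new pod starts at saved offset 2
    log.append("line 3")                  # written while both pods are alive
    shipped += old.poll(log)              # old pod ships line 3
    shipped += new.poll(log)              # new pod ships line 3 again
    return shipped


if __name__ == "__main__":
    print(rolling_update_demo())
```

In this model "line 3" is shipped twice. Whether the real thing behaves this way depends on how it locks and re-reads the database, which the sketch doesn't attempt to capture.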


DoS’ing the cloud with logs

A few years ago an NTP issue came to light that caused a lot of damage. Cloudflare did a good writeup on this if you want to see the details. But in a nutshell, if there is a small request which can be sent which causes a larger response, you have amplification. In the NTP case, a small packet requesting the peer list would get a much larger response. Couple that with NTP being UDP and a lot of providers not implementing BCP38 (filtering of spoofed source addresses), and I can generate a request that appears to come from your IP; the large response goes to you, and you have a problem.

OK, what does this have to do with cloud, you ask? Well, let's look at logging. A lot of people use 'logging as a service' (stackdriver, elasticsearch, ...). It can be managed by your cloud provider (Google, Microsoft, ...), or by a 3rd party. But, well, you log a lot of stuff in (moderately) abnormal circumstances.

Now let's look at a particular tool, 'Tranquility' (this observation seems true for nearly any Java program as far as I have observed). When something happens, it logs a stack trace. As an end-user? Useless. Let's look at one below. When someone connects to my service without knowing my credentials, they eventually time out and leave this bomb. It's 9747 bytes. From a single connect. This is in fact worse than the above NTP issue.

But it gets worse. Each of these lines is bundled up into JSON, with a timestamp, some fields around originating host, etc. It turns out by the time this lands on the wire on the way out to my log provider it's more like 20KB. That's right, a hundred bytes or so of inbound SYN and ACK cause this. Now I know what you are thinking, what endpoint of Don's can I try this on? Well it's ''. Go ahead.
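
Putting numbers on it: the amplification factor is just bytes out divided by bytes in. Here is a quick Python sketch using the figures above, with the assumptions that 'a hundred bytes or so' means exactly 100 and '20KB' means 20,000 bytes.

```python
def amplification(bytes_in: int, bytes_out: int) -> float:
    """Ratio of traffic generated to the traffic that triggered it."""
    if bytes_in <= 0:
        raise ValueError("bytes_in must be positive")
    return bytes_out / bytes_in


if __name__ == "__main__":
    inbound = 100         # assumed size of the inbound SYN/ACK exchange
    raw_trace = 9747      # size of the stack trace in the log
    on_the_wire = 20_000  # assumed: ~20KB after JSON wrapping
    print(f"raw log:      {amplification(inbound, raw_trace):.0f}x")
    print(f"to log store: {amplification(inbound, on_the_wire):.0f}x")
```

Under those assumptions that is roughly 97x for the raw trace and 200x by the time it heads out to the log provider.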

How long before someone finds and attacks this vector? Hmm.

2019-02-03 22:08:00,959 [qtp1247938090-34] WARN  c.m.t.server.http.TranquilityServlet - Server error serving request to
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos( ~[na:1.8.0_181]
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos( ~[na:1.8.0_181]
	at java.util.concurrent.CountDownLatch.await( ~[na:1.8.0_181]
	at com.twitter.util.Promise.ready(Promise.scala:667) ~[com.twitter.util-core_2.11-6.42.0.jar:6.42.0]
	at com.twitter.util.Promise.result(Promise.scala:673) ~[com.twitter.util-core_2.11-6.42.0.jar:6.42.0]
	at com.twitter.util.Await$$anonfun$result$1.apply(Awaitable.scala:151) ~[com.twitter.util-core_2.11-6.42.0.jar:6.42.0]
	at com.twitter.concurrent.LocalScheduler$Activation.blocking(Scheduler.scala:220) ~[com.twitter.util-core_2.11-6.42.0.jar:6.42.0]
	at com.twitter.concurrent.LocalScheduler.blocking(Scheduler.scala:285) ~[com.twitter.util-core_2.11-6.42.0.jar:6.42.0]
	at com.twitter.concurrent.Scheduler$.blocking(Scheduler.scala:115) ~[com.twitter.util-core_2.11-6.42.0.jar:6.42.0]
	at com.twitter.util.Await$.result(Awaitable.scala:151) ~[com.twitter.util-core_2.11-6.42.0.jar:6.42.0]
	at com.twitter.util.Await$.result(Awaitable.scala:140) ~[com.twitter.util-core_2.11-6.42.0.jar:6.42.0]
	at com.metamx.tranquility.tranquilizer.Tranquilizer.flush(Tranquilizer.scala:243) ~[io.druid.tranquility-core-0.8.3.jar:0.8.3]
	at com.metamx.tranquility.server.http.TranquilityServlet$$anonfun$doSend$3.apply(TranquilityServlet.scala:204) ~[io.druid.tranquility-server-0.8.3.jar:0.8.3]
	at com.metamx.tranquility.server.http.TranquilityServlet$$anonfun$doSend$3.apply(TranquilityServlet.scala:204) ~[io.druid.tranquility-server-0.8.3.jar:0.8.3]
	at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:108) ~[org.scala-lang.scala-library-2.11.8.jar:na]
	at scala.collection.mutable.HashMap$$anon$2$$anonfun$foreach$3.apply(HashMap.scala:108) ~[org.scala-lang.scala-library-2.11.8.jar:na]
	at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230) ~[org.scala-lang.scala-library-2.11.8.jar:na]
	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40) ~[org.scala-lang.scala-library-2.11.8.jar:na]
	at scala.collection.mutable.HashMap$$anon$2.foreach(HashMap.scala:108) ~[org.scala-lang.scala-library-2.11.8.jar:na]
	at com.metamx.tranquility.server.http.TranquilityServlet.doSend(TranquilityServlet.scala:204) ~[io.druid.tranquility-server-0.8.3.jar:0.8.3]
	at$metamx$tranquility$server$http$TranquilityServlet$$doV1Post(TranquilityServlet.scala:141) ~[io.druid.tranquility-server-0.8.3.jar:0.8.3]
	at com.metamx.tranquility.server.http.TranquilityServlet$$anonfun$4.apply(TranquilityServlet.scala:87) ~[io.druid.tranquility-server-0.8.3.jar:0.8.3]
	at com.metamx.tranquility.server.http.TranquilityServlet$$anonfun$4.apply(TranquilityServlet.scala:85) ~[io.druid.tranquility-server-0.8.3.jar:0.8.3]
	at org.scalatra.ScalatraBase$$scalatra$ScalatraBase$$liftAction(ScalatraBase.scala:270) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$invoke$1.apply(ScalatraBase.scala:265) ~[org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$invoke$1.apply(ScalatraBase.scala:265) ~[org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$class.withRouteMultiParams(ScalatraBase.scala:341) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraServlet.withRouteMultiParams(ScalatraServlet.scala:49) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$class.invoke(ScalatraBase.scala:264) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraServlet.invoke(ScalatraServlet.scala:49) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$runRoutes$1$$anonfun$apply$8.apply(ScalatraBase.scala:240) ~[org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$runRoutes$1$$anonfun$apply$8.apply(ScalatraBase.scala:238) ~[org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at scala.Option.flatMap(Option.scala:171) ~[org.scala-lang.scala-library-2.11.8.jar:na]
	at org.scalatra.ScalatraBase$$anonfun$runRoutes$1.apply(ScalatraBase.scala:238) ~[org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$runRoutes$1.apply(ScalatraBase.scala:237) ~[org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at scala.collection.immutable.Stream.flatMap(Stream.scala:489) ~[org.scala-lang.scala-library-2.11.8.jar:na]
	at org.scalatra.ScalatraBase$class.runRoutes(ScalatraBase.scala:237) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraServlet.runRoutes(ScalatraServlet.scala:49) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$class.runActions$1(ScalatraBase.scala:163) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$executeRoutes$1.apply$mcV$sp(ScalatraBase.scala:175) ~[org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$executeRoutes$1.apply(ScalatraBase.scala:175) ~[org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$executeRoutes$1.apply(ScalatraBase.scala:175) ~[org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$scalatra$ScalatraBase$$cradleHalt(ScalatraBase.scala:193) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$class.executeRoutes(ScalatraBase.scala:175) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraServlet.executeRoutes(ScalatraServlet.scala:49) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$handle$1.apply$mcV$sp(ScalatraBase.scala:113) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$handle$1.apply(ScalatraBase.scala:113) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$$anonfun$handle$1.apply(ScalatraBase.scala:113) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) [org.scala-lang.scala-library-2.11.8.jar:na]
	at org.scalatra.DynamicScope$class.withResponse(DynamicScope.scala:80) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraServlet.withResponse(ScalatraServlet.scala:49) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.DynamicScope$$anonfun$withRequestResponse$1.apply(DynamicScope.scala:60) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) [org.scala-lang.scala-library-2.11.8.jar:na]
	at org.scalatra.DynamicScope$class.withRequest(DynamicScope.scala:71) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraServlet.withRequest(ScalatraServlet.scala:49) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.DynamicScope$class.withRequestResponse(DynamicScope.scala:59) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraServlet.withRequestResponse(ScalatraServlet.scala:49) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraBase$class.handle(ScalatraBase.scala:111) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at$scalatra$servlet$ServletBase$$super$handle(ScalatraServlet.scala:49) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.servlet.ServletBase$class.handle(ServletBase.scala:43) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraServlet.handle(ScalatraServlet.scala:49) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at org.scalatra.ScalatraServlet.service(ScalatraServlet.scala:54) [org.scalatra.scalatra_2.11-2.3.1.jar:2.3.1]
	at javax.servlet.http.HttpServlet.service( [javax.servlet.javax.servlet-api-3.1.0.jar:3.1.0]
	at org.eclipse.jetty.servlet.ServletHolder.handle( [org.eclipse.jetty.jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle( [org.eclipse.jetty.jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]
	at org.eclipse.jetty.servlet.ServletHandler.doScope( [org.eclipse.jetty.jetty-servlet-9.2.5.v20141112.jar:9.2.5.v20141112]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle( [org.eclipse.jetty.jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle( [org.eclipse.jetty.jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
	at org.eclipse.jetty.server.Server.handle( [org.eclipse.jetty.jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
	at org.eclipse.jetty.server.HttpChannel.handle( [org.eclipse.jetty.jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
	at org.eclipse.jetty.server.HttpConnection.onFillable( [org.eclipse.jetty.jetty-server-9.2.5.v20141112.jar:9.2.5.v20141112]
	at$ [org.eclipse.jetty.jetty-io-9.2.5.v20141112.jar:9.2.5.v20141112]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob( [org.eclipse.jetty.jetty-util-9.2.5.v20141112.jar:9.2.5.v20141112]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$ [org.eclipse.jetty.jetty-util-9.2.5.v20141112.jar:9.2.5.v20141112]
	at [na:1.8.0_181]