Logging. Its the modern day tower of Babel.

So I'm using an EFK (Elasticsearch, Fluent, Kibana) stack for log management. Its a common combination (don't argue with me that logstash is better, its merely different). And as I've wandered down the rabbit hole of trying to get semantic meaning from log messages, I've learned a lot, and I'll share one of those tips with you.

So first, the problem domain. You have a fleet of well orchestrated containers, grouped into Pods, run from replication controllers, grouped under namespaces in Kubernetes. Its a big fleet, things come and go all the time and everyone is happy with that. But, they have needs. Audit: is someone hacking me? DevOps: is there a bug we haven't found that is causing crashes that are hidden by the restart mechanism? etc.

Now, log messages were usually designed to be read by the person who wrote the code on the hour it was written. They are very hard to assign meaning to generally.

The first problem is timestamps. For some reason not everyone got the memo about ISO8601. Stop with the "abbreviation of month in obscure language"_day-in-roman-numeral/year-of-the-pig type thing. Its YYYY-mm-dd. So you have to normalise a bunch of backwards dates. OK, no problem, we got this.

The next problem is the messages. Some are in JSON, some are raw-text. That raw-text might be 7-bit clean ASCII. It might be UTF-8. Maybe it has single quotes or semi-colons, so make sure you don't get a 'little bobby tables' problem here.

Now we get to the real issue. The developer might have put: "Ooops. This shouldn't occer<sic>. ERR: 5762" Or they might have e.g. an Nginx or apache-style log with the user-agent, url, etc, delimited by a weird sprinkling of spaces, quotes, and square brackets.  And you want to extract the fields to make them searchable (URI, CODE, METHOD, ...). What to do?

Well, enter fluent-bit. It runs pretty much anywhere, including as a DaemonSet in your Kubernetes cluster, watching all the logs go by. If you read the user-guide, the source-code, and use the force, you can add some semblance of meaning to this highly-entropic log world.

One of the tools that fluent-bit has is the 'annotation'. This allows you to say: "fluentbit.io/parser: foobar", and then have a foobar regex. Great. But... what about Istio? Istio uses a mutating web-hook to inject what are called sidecars (a 2nd container in your Pod, bolted to the first). Since in Kubernetes annotations go at the Pod level, not the container level, you can't simply add some magic as you install Istio (side note, my pull-request to Istio was accepted, so the helm chart now has these annotations settable for e.g. the ingressgateway). Istio uses a tool called 'envoy' to act as its dataplane, and logs messages in a custom format, and you can parse it with this mouthful (don't try to pronounce this):

^(?<host>[^ ]*) - \[(?<real_ip>[^ ]*)\] - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<last>[^$]*)

OK, that was easy. But how do we teach fluent-bit to apply this regex to the istio-proxy container (only)?

Well, first I added some code to fluent-bit (its all in the upstream now in various pull-requests and merges) to allow one to specify the container (and stream) in the annotation. But this still didn't help since the side-car injection ends up with no annotation.

So then I dug into the tag mechanism. And here is my solution (note it relies on a Pull Request not yet merged).

        Name             tail
        Path             /var/log/containers/*.log
        Parser           docker_no_time
        Tag              kube.<namespace_name>.<pod_name>.<container_name>
        Tag_Regex        (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-
        Refresh_Interval 5
        Mem_Buf_Limit    5MB
        Skip_Long_Lines  On

        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Merge_JSON_Log      On
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
        Regex_Parser        kube-tag

        Name                parser
        Match               kube.*.istio-proxy
        Key_Name            log
        Parser              envoy

So what this does is first parse the container raw logs (which are in docker format, e.g. { time: x, log: y }. As part of the tail input I apply a regex to create the tag (this is not documented anywhere, this is from code inspection in fluent-bit) such that you end up with a tag of 'kube.namespace.pod.container'. This allows later route matching.

I then run the Kubernetes filter across all kube.*. This uses the Kubernetes API for enrichment, etc, and runs any parsers suggested by the annotation. For the istio-proxy container there is no suggested parser, so it does a Docker 'decode_as' which unescapes strings etc, but otherwise leaves the text in 'log'.

I then run a regex-based parser on kube.*.istio-proxy, e.g. all the istio-proxy named containers. And this unpacks the strings and goes on with life.

Sounds easy in this write-up. It wasn't. Enjoy.

Tagged with: , , ,

I'm doing a (guest) webinar for RootSecure on Wednesday Oct 24th @ 11:00 EDT (Toronto) time. You can register if you want to hear a bit about things that might surprise you as you migrate from a safe secure comfy closet to a big airy cloud.

For the last years I have been working on cloud, OpenStack, Google, Azure, Kubernetes. And I've learned a lot about things that surprised me.

As an example, you might be feeling very secure about your 'IP blacklist' firewall. But did you know in a cloud environment all your connections will come from your own IP space? Huh. Makes your audit simpler right? I mean, I guess there was no breach since the data was sent to!

Cloud comes in many flavours. There is SaaS (hosted applications like Dropbox, GSuite, Office365, Salesforce). There is PaaS (e.g. a managed database). There is IaaS (e.g. virtual machines and networks).

Cloud can be public (all instances are part of the public Internet, and run on shared infrastructure). It can be private (an elastic pool of capacity runs on hardware dedicated to you), and there is 'virtual private cloud' which emulates the latter on the former.

Traditional IT security architectures have emphasised 'perimeter security'. A set of VLAN's, controlled physical access, and firewalls at the door. The outside is evil, the inside is trusted. Suddenly a wild cloud appears and there is no longer an 'outside' or 'inside' or even 'door' to place the firewall. Is this your only concern?

Want to know more? Or challenge my conceptions and wild opinions masquerading as fact? Then join and comment away!

Tagged with: , , ,

Lets say you have a micro-services app. Its got a bunch of containers that you've orchestrated out with Kubernetes. Deployments, Pods, Daemonsets all over the place. Autoscaling. You are happy. Now it comes time to implement that pesky 'security' step. You are a bit nervous, there's no internal firewall, all the services listen on port 80, no encryption. All the passwords are hard-coded and in the global environment. No one would guess your l33t mysql password right? So you google 'how is secur networx'. And you click I'm feeling lucky.

Good thing for you google was watching your previous searches and had the microphone on, so it not only corrected your txt-speak spelling but also selected Istio for you.

But suddenly you need to triple the capacity of your cluster. Lets take a look. Here's kubectl top from my cluster. The lines in red are associated with the securing + auditing. See that last column? Seems we are using 8144MiB for monitoring the thing that is using 2259MiB. And don't get me started on the CPU.

I said it before, the cloud doesn't scale down.

$ kubectl top pod --all-namespaces

NAMESPACE      NAME                                                    CPU(cores)   MEMORY(bytes)   
default        ingress-nginx-ingress-controller-d77f4bd55-xv4h6        4m           146Mi           
default        ingress-nginx-ingress-default-backend-bc76f4cc7-fp6jk   0m           3Mi             
istio-system   istio-citadel-84fb7985bf-bb7cf                          0m           12Mi            
istio-system   istio-egressgateway-bd9fb967d-mxhgr                     2m           35Mi            
istio-system   istio-galley-655c4f9ccd-x9qqw                           13m          39Mi            
istio-system   istio-ingressgateway-7f5fdcf65-nxj9k                    3m           37Mi            
istio-system   istio-pilot-6cd69dc444-t465g                            8m           84Mi            
istio-system   istio-policy-77f6849776-6z8vq                           89m          419Mi           
istio-system   istio-policy-77f6849776-865xv                           97m          521Mi           
istio-system   istio-policy-77f6849776-d6bqn                           99m          492Mi           
istio-system   istio-policy-77f6849776-wc2nj                           62m          345Mi           
istio-system   istio-policy-77f6849776-xhwmw                           63m          351Mi           
istio-system   istio-sidecar-injector-8975849b4-47rx7                  13m          27Mi            
istio-system   istio-statsd-prom-bridge-7f44bb5ddb-zr25c               34m          23Mi            
istio-system   istio-telemetry-77f7949485-24xgl                        76m          440Mi           
istio-system   istio-telemetry-77f7949485-l69tn                        105m         559Mi           
istio-system   istio-telemetry-77f7949485-srjr5                        109m         525Mi           
istio-system   istio-telemetry-77f7949485-vbxdb                        106m         574Mi           
istio-system   istio-telemetry-77f7949485-wg8z5                        79m          437Mi           
istio-system   prometheus-84bd4b9796-fftw8                             51m          689Mi           
kube-system    cert-cert-manager-645675c97c-xxtfh                      2m           22Mi            
kube-system    heapster-6c4947855f-tjnw4                               0m           41Mi            
kube-system    kube-dns-v20-5fd69f859d-bxv6z                           18m          27Mi            
kube-system    kube-dns-v20-5fd69f859d-zmhmr                           18m          28Mi            
kube-system    kube-proxy-5rhch                                        3m           36Mi            
kube-system    kube-proxy-dxk9f                                        3m           42Mi            
kube-system    kube-svc-redirect-dpxvl                                 11m          156Mi           
kube-system    kube-svc-redirect-z5qhw                                 5m           110Mi           
kube-system    kubernetes-dashboard-66bf8db6cf-cwnbz                   0m           15Mi            
kube-system    metrics-server-64f6d6b47-jfc9l                          0m           26Mi            
kube-system    tiller-deploy-895d57dd9-ll66n                           0m           45Mi            
kube-system    tunnelfront-7794f9f645-dnvhx                            21m          16Mi            
logging        elasticsearch-867b4f6769-s4bzp                          567m         1420Mi          
logging        fluent-bit-56d6z                                        21m          11Mi            
logging        fluent-bit-8cbnl                                        17m          12Mi            
logging        logging-fluentd-69fdd5dbd6-mtsmm                        1m           59Mi            
logging        logging-kibana-76849979dd-m2vz9                         1m           152Mi           
logging        sysctl-conf-92l84                                       0m           0Mi             
logging        sysctl-conf-hb2vn                                       0m           0Mi             
monitoring     alertmanager-monitoring-0                               1m           15Mi            
monitoring     monitoring-exporter-kube-state-7c757d986f-75rrq         3m           37Mi            
monitoring     monitoring-exporter-node-57g9n                          1m           14Mi            
monitoring     monitoring-exporter-node-gdczp                          1m           10Mi            
monitoring     monitoring-grafana-59fcbfc89f-5qhzb                     0m           35Mi            
monitoring     monitoring-prometheus-operator-7d99d5f9c9-w82gv         2m           30Mi            
monitoring     prometheus-monitoring-prometheus-0                      7m           176Mi           
socks          carts-6994d7d589-6j5c2                                  5m           340Mi           
socks          carts-db-7dd64bfd7b-cld92                               5m           96Mi            
socks          catalogue-849865789c-k9pqn                              4m           47Mi            
socks          catalogue-db-6d66678d9-tfkzx                            3m           236Mi           
socks          front-end-855684fd8c-fv2qp                              4m           118Mi           
socks          orders-7d9cf5cb46-d2h2z                                 5m           350Mi           
socks          orders-db-6db4678bf6-6fx9j                              5m           93Mi            
socks          payment-6cdc5b656-84fbb                                 4m           48Mi            
socks          queue-master-7b99dbb88c-zrxst                           5m           301Mi           
socks          rabbitmq-7c5fbf778d-f5fbr                               7m           127Mi           
socks          session-db-fdd649d68-cq5sp                              3m           52Mi            
socks          shipping-5b9ffdbdfb-xnsjj                               5m           321Mi           
socks          user-84ccd5fd57-2vpb5                                   4m           47Mi            
socks          user-db-7dcc9649dc-b72rs                                4m           83Mi            


Tagged with: , , ,

So the other day I posted my pride and joy regex. You know, this one?

'^(?<host>[^ ]*) - \[(?<real_ip>)[^ ]*\] - 
(?<user>[^ ]*) \[(?<time>[^\]]*)\] 
"(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" 
(?<code>[^ ]*) (?<size>[^ ]*) 
"(?<referer>[^\"]*)" "(?<agent>[^\"]*)" 
(?<request_length>[^ ]*) 
(?<request_time>[^ ]*) 
\[(?<proxy_upstream_name>[^ ]*)\] 
(?<upstream_addr>[^ ]*) 
(?<upstream_response_length>[^ ]*) 
(?<upstream_response_time>[^ ]*) 
(?<upstream_status>[^ ]*) (?<last>[^$]*)'

Seems simple, right? But, it leads to a set of questions:

  1. If we can get a " in the path, we can do a quoting-style-escape to avoid getting logged
  2. The regex engine used in fluent-bit is onigmo. And it has some CVE. This means its conceivable that a pattern that a user can put on the wire can escape into our trusted most privileged logging container (running privileged, node filesystem mounted, etc)
  3. DDoS. We log a lot. But the logs are often bigger than the thing they are logging.

For #3, consider this. Its a connection log from istio. Yes you read that right, a TCP SYN( ~64 bytes) creates this in JSON of 816 bytes:


Hmm, so you are seeing where I am going. You remember a few years ago where we found that NTP could be asked its upstream list? So a small packet would create a large response? And, being UDP, could be spoofed, so the response could go to someone else? Making it a great DDoS source.

Well, my log. Your SYN costs me a lot more to receive than it costs you to send. Think of all the mechanisms below that (elasticsearch, fluent-bit, kibana, storage, network, cpu, ram, ...).


Now about #2. That is a bit of a trouble point. Who wants to find that the regex that is parsing the field that any user can send you via netcat is itself prone to a crash, or remote escape? Not me.

Tagged with: , , ,

A lot of my background is telecom related. And one of the things telecom is very good at is alarm + fault management. Its an entire industry devoted to making sure that when something goes wrong, the right action happens immediately. That when things trend or fluctuate they get caught.

Its complex and a lot of work to integrate your new system up, you have to model all your alarms and faults, etc. But when its done, its bulletproof. And you get a UI (like on the right) where all vendors have a normalised view. The workflow is managed, trending is there, etc.

Enter the cloud. It substitutes (unformatted) logs for faults. It substitutes performance management (time-series trending) for alarms. And it pretends to be self-healing, except when its not. It substitutes volume of data for information. I want a red-light/green-light, not a sewer pipe.

Have you ever looked in your cloud logs? You'll find such gems as:

"time="2018-09-17T12:13:43Z" level=error msg="ERROR: infiniband collector failed after 0.030957s: strconv.ParseUint: parsing "N/A (no PMA)": invalid syntax" source="collector.go:123""

OK, so what am I to do this with this? It happened a few times, somewhat irregularly. Is it a fault (a system error that clears itself when the underlying condition is removed, e.g. over-temperature) or an alarm (something I must take affirmative action and acknowledge). There's no workflow here, i'm on my own, free-floating.  The log system tells me everything, and yet also nothing.

What about this one:

"2018/09/17 12:19:26 [error] 8#8: *758040 upstream prematurely closed connection while reading upstream, client:, server: _, request: "GET /api/v1/namespaces/kube-system/services?resourceVersion=344761u0026timeoutSeconds=438u0026watch=true HTTP/2.0", upstream: "", host: """

Its all over the logs. It comes from 'azureproxy' which is some sort of method to get from my Kubernetes pods to the (managed) master.

And this gem. Its from heapster, which is installed as part of the managed AKS. What am I to do?

"ERROR: logging before flag.Parse: E0917 12:19:26.569980 1 streamwatcher.go:109] Unable to decode an event from the watch stream: stream error: stream ID 1499; INTERNAL_ERROR"

So now I'm left uncertain, are these 'errors' things I need to address? if not, what errors are? How do I filter them out? Or trend to see if they are one-of events or long term issues? As I hire more operations staff, how do I teach them to read all the various and sundry logs, I want them to look @ red/green light charts like that nagios one above.

Tagged with: , , ,