In 2016 a vulnerability came to light in Apache Struts, a common framework used in Java-based web apps. An attacker could send a malformed header (really any header) and end up taking over your Java VM. Not so good. In other words, you were one curl away from badness.
Well, another Struts issue has arrived. Same sort of thing.
You might be wondering why, in this day and age, a web server can fall over when presented with a wee bit of text to parse. Let's look at an example. HTTP headers are more complex than you think, and the effort spent understanding them is less than it should be.
Take Nginx, one of the most popular forward and reverse proxies out there. There is a header called ‘Forwarded’ (from RFC 7239). The ‘node’ part of it, which identifies a single hop, has this format:
    node        = nodename [ ":" node-port ]
    nodename    = IPv4address / "[" IPv6address "]" / "unknown" / obfnode
    IPv4address = <Defined in [RFC3986], Section 3.2.2>
    IPv6address = <Defined in [RFC3986], Section 3.2.2>
    obfnode     = "_" 1*( ALPHA / DIGIT / "." / "_" / "-")
    node-port   = port / obfport
    port        = 1*5DIGIT
    obfport     = "_" 1*(ALPHA / DIGIT / "." / "_" / "-")
    DIGIT       = <Defined in [RFC5234], Section 3.4>
    ALPHA       = <Defined in [RFC5234], Section B.1>
So there are more cases than you would think, even down to what counts as a digit or a letter. You have to worry about different character sets, some of which have multi-byte encodings. An IP? IPv6 has much more complex parsing than you would expect.
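To make the case count concrete, here is a minimal Python sketch of a parser for just the `node` rule above. The name `parse_node` and the shape of the tuple it returns are made up for illustration, and it leans on the standard `ipaddress` module for the address checks, which is an assumption about what "valid" should mean rather than a line-by-line port of RFC 3986.

```python
import ipaddress
import re

# obfnode / obfport: "_" then one or more of ALPHA / DIGIT / "." / "_" / "-".
# The classes are spelled out as ASCII ranges on purpose: \d and \w would
# also accept non-ASCII digits and letters.
_OBF = re.compile(r"_[A-Za-z0-9._-]+")
_PORT = re.compile(r"[0-9]{1,5}")


def parse_node(node: str):
    """Split a node into (kind, name, port); raise ValueError if it is invalid."""
    if node.startswith("["):
        # Bracketed IPv6: the name ends at "]", so the colons inside are not ports.
        end = node.find("]")
        if end == -1:
            raise ValueError("unterminated IPv6 literal")
        name, rest = node[1:end], node[end + 1:]
        if "%" in name:
            # ipaddress may accept zone ids; the grammar above does not.
            raise ValueError("zone ids are not allowed")
        try:
            ipaddress.IPv6Address(name)
        except ValueError:
            raise ValueError(f"bad IPv6 address: {name!r}") from None
        kind = "ipv6"
    else:
        name, sep, port_part = node.partition(":")
        rest = sep + port_part
        if name == "unknown":
            kind = "unknown"
        elif _OBF.fullmatch(name):
            kind = "obfuscated"
        else:
            try:
                ipaddress.IPv4Address(name)
            except ValueError:
                raise ValueError(f"bad node name: {name!r}") from None
            kind = "ipv4"

    if rest == "":
        return kind, name, None
    if not rest.startswith(":"):
        raise ValueError(f"unexpected text after node name: {rest!r}")
    port = rest[1:]
    if not (_PORT.fullmatch(port) or _OBF.fullmatch(port)):
        raise ValueError(f"bad port: {port!r}")
    return kind, name, port


if __name__ == "__main__":
    for sample in ("192.0.2.43", "[2001:db8:cafe::17]:4711", "_hidden:_p1", "unknown"):
        print(sample, "->", parse_node(sample))
```

That is four kinds of name, two kinds of port, and a bracket rule, for one small field of one header. An unbracketed IPv6 address such as `2001:db8::1` is rejected here, which is exactly the sort of case that is easy to forget.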
This header was more or less designed for nginx. So how do they recommend parsing it? Well, keep in mind that Forwarded can hold multiple elements (Forwarded: 1, 2, …) in a chain. And this is their example, no foolin'. You spot an error in that regex line? Me neither, but it's so complex that it would beggar belief if there were not one. And this is how that Struts framework bug occurs: you make something so complex that everyone’s eyes glaze over. Yes, that is a 627-character regex.
    map $remote_addr $proxy_forwarded_elem {
        # IPv4 addresses can be sent as-is
        ~^[0-9.]+$          "for=$remote_addr";

        # IPv6 addresses need to be bracketed and quoted
        ~^[0-9A-Fa-f:.]+$   "for=\"[$remote_addr]\"";

        # Unix domain socket names cannot be represented in RFC 7239 syntax
        default             "for=unknown";
    }

    map $http_forwarded $proxy_add_forwarded {
        # If the incoming Forwarded header is syntactically valid, append to it
        "~^(,[ \\t]*)*([!#$%&'*+.^_`|~0-9A-Za-z-]+=([!#$%&'*+.^_`|~0-9A-Za-z-]+|\"([\\t \\x21\\x23-\\x5B\\x5D-\\x7E\\x80-\\xFF]|\\\\[\\t \\x21-\\x7E\\x80-\\xFF])*\"))?(;([!#$%&'*+.^_`|~0-9A-Za-z-]+=([!#$%&'*+.^_`|~0-9A-Za-z-]+|\"([\\t \\x21\\x23-\\x5B\\x5D-\\x7E\\x80-\\xFF]|\\\\[\\t \\x21-\\x7E\\x80-\\xFF])*\"))?)*([ \\t]*,([ \\t]*([!#$%&'*+.^_`|~0-9A-Za-z-]+=([!#$%&'*+.^_`|~0-9A-Za-z-]+|\"([\\t \\x21\\x23-\\x5B\\x5D-\\x7E\\x80-\\xFF]|\\\\[\\t \\x21-\\x7E\\x80-\\xFF])*\"))?(;([!#$%&'*+.^_`|~0-9A-Za-z-]+=([!#$%&'*+.^_`|~0-9A-Za-z-]+|\"([\\t \\x21\\x23-\\x5B\\x5D-\\x7E\\x80-\\xFF]|\\\\[\\t \\x21-\\x7E\\x80-\\xFF])*\"))?)*)?)*$" "$http_forwarded, $proxy_forwarded_elem";

        # Otherwise, replace it
        default "$proxy_forwarded_elem";
    }
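For contrast, here is a rough Python sketch of the same job (split a Forwarded value into elements and name=value pairs) done as a small step-by-step scanner instead of one giant pattern. It is not what nginx does, and the name `parse_forwarded` is hypothetical. It is also deliberately a little looser than the regex above: it tolerates whitespace around `;` and does not police every byte inside a quoted string.

```python
import re

# token characters, copied from the character class used in the nginx regex
_TOKEN = re.compile(r"[!#$%&'*+.^_`|~0-9A-Za-z-]+")
# quoted-string: anything except an unescaped quote, with backslash escapes
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def _skip_ws(header: str, pos: int) -> int:
    while pos < len(header) and header[pos] in " \t":
        pos += 1
    return pos


def parse_forwarded(header: str):
    """Return one dict per forwarded-element; raise ValueError on bad input."""
    elements, current, pos = [], {}, 0
    while True:
        pos = _skip_ws(header, pos)
        name_match = _TOKEN.match(header, pos)
        if name_match:
            name = name_match.group().lower()  # parameter names are case-insensitive
            pos = name_match.end()
            if pos >= len(header) or header[pos] != "=":
                raise ValueError(f"expected '=' at offset {pos}")
            pos += 1
            quoted = _QUOTED.match(header, pos)
            plain = _TOKEN.match(header, pos)
            if quoted:
                value = re.sub(r"\\(.)", r"\1", quoted.group(1))  # undo escapes
                pos = quoted.end()
            elif plain:
                value = plain.group()
                pos = plain.end()
            else:
                raise ValueError(f"expected a value at offset {pos}")
            if name in current:
                raise ValueError(f"duplicate parameter {name!r}")
            current[name] = value
        pos = _skip_ws(header, pos)
        if pos == len(header):
            if current:
                elements.append(current)
            return elements
        separator = header[pos]
        pos += 1
        if separator == ",":      # a comma starts the next hop in the chain
            if current:
                elements.append(current)
            current = {}
        elif separator != ";":    # a semicolon continues the same hop
            raise ValueError(f"unexpected {separator!r} at offset {pos - 1}")


if __name__ == "__main__":
    print(parse_forwarded('for=192.0.2.43, for="[2001:db8:cafe::17]:4711";proto=https'))
```

It is longer than one line, but each branch can be read, tested, and blamed on its own, which is the property the one-line regex gives up.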
Now, why did I say your cloud firewall was loafing? Well, it's a 1-tuple firewall. It allows or denies based on port number alone. It's not getting involved with those headers, normalising them, and so on. It's also not involved in the middle tier of your app (user -> load balancer -> front proxy -> sidecar -> front end -> backend is the normal chain of events; what if it's the ‘backend’ that has the vuln?).
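A toy Python sketch of what "1-tuple" means in practice. Everything here (the `Request` class, `ALLOWED_PORTS`, `port_only_firewall`) is hypothetical and stands in for a cloud security-group rule; the header value is a placeholder, not a real exploit string.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    dst_port: int
    headers: dict = field(default_factory=dict)


# Assumed rule set: allow the usual web ports, deny everything else.
ALLOWED_PORTS = {80, 443}


def port_only_firewall(request: Request) -> bool:
    """Allow or deny on destination port alone; the headers are never read."""
    return request.dst_port in ALLOWED_PORTS


benign = Request(443, {"Content-Type": "application/json"})
crafted = Request(443, {"Content-Type": "<placeholder for a malformed header value>"})
wrong_port = Request(8081, {"Content-Type": "application/json"})

if __name__ == "__main__":
    for req in (benign, crafted, wrong_port):
        print(req.dst_port, port_only_firewall(req))
```

The benign and the crafted request get the same verdict, because the only field the rule looks at is identical in both.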