That stuff's great. But do you know why he's on my list? He doesn't seem to think it's a good idea for a shell to parse its entire environment; he seems to think it's better to use the name of an environment variable, rather than its value, to determine whether it's input to the shell; he seems to feel somewhat less at ease with the thought of millions of HTTP request headers passing through the bash parser, relying on its myriad parsing rules to ensure that no part of them is accidentally executed as a shell command - now or tomorrow.
I must say that this ShellShock thing has shaken my belief in humanity somewhat... Why is it that these are not obvious principles that we all agree on? Am I just too "oldschool" or something? :)
Part of this is reflected in things like the "Robustness Principle". It encourages programs to go out of spec and be creative with inputs to make things "easier"... but it only makes things worse.
SIP, for instance, has very complicated parsing rules (like HTTP) because the messages are designed to be written by hand. This introduces security holes as well as interop problems. But instead of recognizing this and trying to get programs to be strict, the IETF publishes a document where they gleefully list a bunch of crazy things possible in their spec, and encourage programs to guess at the intent of malformed messages.
I'm currently dealing with SIP messages at work. My manager actually told me not to be so pedantic with the parsing, as we got some complaints from other vendors (and privately, I thought: why bother with standards if you aren't going to follow them?). I ended up logging a warning instead of returning an error. Sigh.
You can't actually parse SIP messages unambiguously these days. Even big open source projects like Asterisk and OpenSIPS have different, incompatible interpretations. I'm purely talking about parsing, not actual interpretation of the content. It's that bad. And yes, this opens up security holes when one system is trusted to read the message correctly.
I think there is a fairly convincing argument here that a lot of these bugs might not have occurred if Bash used a hand-written, recursive-descent parser instead of a Yacc autogenerated one:
- The original ShellShock: Reused the general parsing/execution function for the subset task of evaluating a function definition. A hand-written parser would likely have this isolated to its own function. With a Yacc parser, there is neither an obvious nor easy way to call directly into the parser to parse a "nonterminal".
- The ones in the article having to do with error handling: traditionally, error handling in autogenerated parsers has been more difficult, and Yacc is no exception. This is related to the first point: a function that only parses function definitions would never execute its input, even upon encountering an error; the usual abort behaviour means the parser stack unwinds until the original caller - in this case the code parsing environment variables for function definitions - recognises the error, outputs an error message, and ignores the definition.
- The operation of a recursive-descent parser is simpler to trace through manually since each piece of the grammar is logically broken into separate functions, making it easier to audit than the more opaque flow of a Yacc one.
The real problem is that Bash has a huge attack surface and isn't written with security in mind. With the number of new exploits found here without too much effort, there are almost certainly more yet to be found. (Or already known to attackers.)
Yup; as I mentioned before in a previous ShellShock article, it seems like if you want to audit a system against this type of attack, you'd need to examine how every packet coming into the system is handled and what data from that could set envs or invoke a shell.
I have a feeling we'll be seeing different variations of this type of attack for a long time now that people are thinking about this type of vulnerability. The ways that this type of thing can be combined with other exploits are both exciting and terrifying.
> ...if you want to audit a system against this type of attack, you'd need to examine how every packet coming into the system is handled and what data from that could set envs or invoke a shell.
Really? Don't you think it's easier to audit the system to ensure that nobody's parsing the entire environment as if it were shell script / function definitions, threatening to execute parts of it as shell commands?
Here we're talking about a system made up of several different Unix processes, passing data between them through environment variables, command line arguments, input and output. This seems to affect how people think about the problem in an unfortunate way... Pretend instead that all this happened within one process, e.g. a Ruby program - who would then be responsible for the vulnerability: person A, who stored an HTTP request header in a variable, or person B, who decided to call eval() on that variable?
To me it's quite obvious that person B is at fault.
True. Even shells that are written with security in mind are bound to have some similar bugs, though perhaps not as easy to discover or exploit. I think the bash bugs have exposed some general flaws in the way we continue to trust input both from the internet and between processes in general.
Bash is doing a crazy amount of stuff under the hood. I spent a lot of time over the weekend inside gdb trying to trace down one weird behavior that appeared when I added a single printf(), and I got taken on a very wild ride.
(It wasn't necessarily a security issue, but I figured I'd go exploring to learn some bash internals.)
bashcheck (https://github.com/hannob/bashcheck/blob/master/bashcheck) now also includes tests for the last two lcamtuf bugs, so it tests for six known vulns from this ShellShock series, plus a general test for variable function parser safety.