Nick Howard

Lightning Talk: Muskox

In this lightning talk titled "Muskox," presented at the Rocky Mountain Ruby 2013 event, speaker Nick Howard introduces his new parser generator named Muskox, which aims to enhance security in parsing various input formats. Howard begins by discussing the utility of parser generators, mentioning well-known examples like Yacc and Bison. He highlights critical security concerns, particularly in frameworks like Rails, which have faced vulnerabilities due to unusual input scenarios.

Throughout the talk, Howard addresses specific security issues, providing notable examples:

  • YAML Embedded in XML: This situation can lead to the evaluation of unsafe expressions, posing significant risks.
  • Billion Laughs Attack on XML: This attack exploits XML parsing with nested entity definitions that expand exponentially when the document is parsed, resulting in out-of-memory errors.
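
To give a sense of scale for the second example, here is a rough back-of-the-envelope sketch in Ruby. It assumes the commonly cited form of the attack, in which a base entity containing "lol" is referenced ten times by each of nine nested entities. The method names and parameters are illustrative and do not come from the talk.

```ruby
# Number of copies of the base string after all nested entities are expanded.
# Each level references the previous entity `fanout` times, so the count
# multiplies by `fanout` at every level.
def expanded_entity_count(levels:, fanout:)
  fanout**levels
end

# Total size in bytes of the fully expanded text.
def expanded_bytes(base:, levels:, fanout:)
  base.bytesize * expanded_entity_count(levels: levels, fanout: fanout)
end

# Assumed parameters: base entity "lol", 9 nested levels, 10 references each.
count = expanded_entity_count(levels: 9, fanout: 10)
bytes = expanded_bytes(base: "lol", levels: 9, fanout: 10)
puts "#{count} copies, about #{bytes / 1_000_000_000.0} GB after expansion"
```

Under these assumptions a document of well under a kilobyte expands to roughly 3 GB, which is why a parser that expands entities without limits can run out of memory.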

These examples emphasize the necessity for frameworks to take a cautious approach to input handling. Howard then introduces Muskox as a schema-based solution that leverages JSON schemas to define valid input structures, including typing information and validation rules.

Key features of Muskox include:

  • Schema-based validation: Input that does not conform to the defined JSON schema triggers an error, so potentially harmful input is rejected before the application processes it.
  • Definite input policy: By rigorously defining acceptable inputs, applications can avoid inadvertently executing harmful code.
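
The idea behind these features can be sketched in plain Ruby. The code below is a minimal illustration of schema-first parsing, not Muskox's actual API: `strict_parse` is a hypothetical helper, the schema is a hand-written Ruby hash in the shape of a small JSON schema, and only a handful of types are supported.

```ruby
require "json"

# Hypothetical helper for illustration; this is not Muskox's actual API.
# Returns the parsed Hash only when `input` matches `schema` exactly,
# and raises ArgumentError otherwise.
def strict_parse(input, schema)
  # JSON Schema type names mapped to the Ruby classes JSON.parse produces.
  type_classes = {
    "string"  => [String],
    "integer" => [Integer],
    "number"  => [Integer, Float],
    "boolean" => [TrueClass, FalseClass]
  }

  begin
    data = JSON.parse(input)
  rescue JSON::ParserError => e
    raise ArgumentError, "not valid JSON: #{e.message}"
  end
  raise ArgumentError, "expected a JSON object" unless data.is_a?(Hash)

  properties = schema.fetch("properties")

  # Reject keys the schema does not mention instead of passing them along.
  unexpected = data.keys - properties.keys
  raise ArgumentError, "unexpected keys: #{unexpected.join(', ')}" unless unexpected.empty?

  missing = schema.fetch("required", []) - data.keys
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?

  # Check every value against the type its property declares.
  data.each do |key, value|
    type = properties.fetch(key).fetch("type")
    allowed = type_classes.fetch(type) { raise ArgumentError, "unsupported type: #{type}" }
    raise ArgumentError, "#{key} must be a #{type}" unless allowed.any? { |klass| value.is_a?(klass) }
  end

  data
end

# Example schema (assumed for illustration): two required string properties.
schema = {
  "type" => "object",
  "properties" => {
    "name"  => { "type" => "string" },
    "email" => { "type" => "string" }
  },
  "required" => ["name", "email"]
}

# Valid input comes back as a Hash.
user = strict_parse('{"name": "Ada", "email": "ada@example.com"}', schema)
puts user["name"]

# Input with an extra key is rejected before the application sees it.
begin
  strict_parse('{"name": "Ada", "email": "ada@example.com", "admin": true}', schema)
rescue ArgumentError => e
  puts e.message
end
```

The design choice worth noting is that the sketch fails closed: anything the schema does not explicitly describe is treated as an error rather than ignored or passed through.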

Howard grounds Muskox in language security theory. The familiar robustness principle says to be liberal in what you accept and conservative in what you output; language security sharpens this into the advice to be definite about what you accept. Accepting only safe, explicitly defined inputs reduces the risk of attacks.

For additional insights and resources, Howard invites the audience to visit his GitHub page for Muskox and to explore the broader context of language security at langsec.org. The talk concludes with an encouraging note for developers to prioritize security in their input management processes, endorsing the proactive use of schema validation as a powerful method to combat vulnerabilities.

00:00:08.480 Hello everyone! Today I'm excited to talk to you about parser generators. Perhaps you have heard of Yacc and Bison? Well, I'm here to introduce a new parser generator that I've created called Muskox.
00:00:16.880 Parser generators are fantastic tools. Their primary purpose is to generate parsers that can handle various input formats. However, security is a crucial aspect we all care about. Recently, there have been several security issues in frameworks like Rails caused by unusual inputs.
00:00:36.480 An example of such an issue is when users send YAML embedded within XML. This can lead to the evaluation of unsafe expressions, which is problematic. Another concerning case is the billion laughs attack on XML, which allows for an arbitrary expansion of expressions, leading to out-of-memory errors. Frameworks often claim to parse any input sent to them, but that's not a prudent approach.
00:01:10.720 So, how do we address these concerns? Muskox offers a solution. It is a schema-based parser generator that uses JSON schemas. JSON schemas define the structure of JSON objects, including typing information and validation rules. Muskox generates parsers based on these schemas, ensuring that any input that doesn't conform to the defined schema will trigger an error.
00:01:29.680 For instance, if your schema defines two properties that are both strings, the parser will successfully return a hash when the input is valid. However, if the input contains unexpected keys attempting to exploit your server, the parser will raise an error before your application processes it, effectively blocking malicious input.
00:02:10.080 Muskox is based on concepts from language security theory. A familiar principle is to be liberal in what you accept but conservative in what you output. Language security refines this into the advice to be definite about what you accept. By adhering to definite schemas, we can mitigate the risk of inadvertently processing harmful input.
00:02:47.040 If you’re interested in learning more, I encourage you to check out my GitHub page for Muskox and explore the fascinating field of language security at langsec.org.