Thursday, November 15, 2012

TLD Generator

I recently found a very useful project for anyone building a Java library that contains tags or functions to expose within a JSP: TLD Generator. The tool lets you configure your TLD by annotating your code; an annotation processor then generates the TLD. I thought this was useful enough that I worked with the author to get it published to Maven Central. To use it in your Maven project, simply add a dependency like this:
<dependency>
    <groupId>com.google.code.tld-generator</groupId>
    <artifactId>tld-generator</artifactId>
    <version>1.1</version>
    <scope>compile</scope>
    <optional>true</optional>
</dependency>
The optional element is key here: Maven will not pass tld-generator on as a dependency to projects that use your library, but it will still put it on the classpath at compile time so that the annotation processor runs. See the documentation on the TLD Generator's wiki for information on how to annotate your code to generate the TLDs.

Wednesday, November 7, 2012

Remediation of XSS: Nested Contexts (part two)


In part one of this series I covered the correct use of JavaScript encoding and how this already covers the issue of the “nested” contexts. Now, onto a better solution – don’t use HTML Event attributes! Given the vulnerable code:
<div onclick="showError('<%=request.getParameter("error")%>')">
An error occurred, click here to see the details</div>
Instead of adding encoding to a complicated location within the DOM like the onclick event attribute, hook all of your events via JavaScript (the example below uses jQuery):
<div id="errorBanner">An error occurred, click here to see the details
   <div id="errorDetails" style="display:none">
      <%=Encode.forHtml(request.getParameter("error")) %>
   </div>
</div>
<script type="text/javascript">
   // Pass a function so the details are shown on click, not when the page loads
   $('#errorBanner').click(function () { $('#errorDetails').show(); });
</script>
The key here is to avoid placing dynamic data into "nested contexts" such as an event handler. This makes the remediation much simpler in many cases and lowers the amount of security knowledge a developer needs to understand how to fix the vulnerability.

The additional benefit of using JS to hook your events is that you can then externalize your JavaScript and define a Content Security Policy (CSP) for your site. CSP is by no means a magic bullet, but a restrictive policy can limit the damage potential of an XSS exploit.

Remediation of XSS: Nested Contexts (part one)

I have seen some solutions for XSS involving nested contexts that are not ideal: they are complicated, they require a deep understanding of how the browser processes the HTML/DOM, and they are likely inefficient. There are better solutions. This is the first post in a two-part series.

First, what do I mean by nested contexts? One example is writing dynamic data into an event handler such as onclick.
<div onclick="showError('<%=request.getParameter("error")%>')" >An error occurred, click here to see the details</div>
When the browser processes this it will first HTML decode the contents of the onclick attribute and then it will pass the results to the JavaScript Interpreter. As such, the advice I have seen (and previously given) is to apply “layered” encoding: 1) JavaScript encode and then 2) HTML Attribute Encode:
<div onclick="showError('<%= Encode.forHtmlAttribute(Encode.forJavaScript(request.getParameter("error"))) %>')" >An error occurred, click here to see the details</div>
Wow, that is completely unfriendly. After thinking about this solution, it also seems unnecessarily complicated. If you are using a robust JavaScript encoder (i.e. one that will JavaScript encode the '&' character), then even though you are in an “HTML Attribute” context the following should be sufficient:
<div onclick="showError('<%= Encode.forJavaScript(request.getParameter("error")) %>')" >An error occurred, click here to see the details</div>
The above is sufficient because Encode.forJavaScript encodes the '&' character, so when the browser HTML decodes the attribute there is nothing to decode. It also hex-escapes the quote characters, so the value cannot close the attribute early. However, this only applies to robust JavaScript encoders.

The encoder used above is from the OWASP Java Encoder Project.
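
To make the decoding order concrete, here is a minimal sketch in plain Java. It is not the OWASP encoder: NestedContextDemo, htmlDecode, naiveJsEncode and strictJsEncode are hypothetical stand-ins written for this illustration, and htmlDecode only understands decimal numeric entities such as &#39;. It compares what the JavaScript interpreter would receive from a weak encoder (quotes and backslashes only) with what it would receive from one that also escapes '&'.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustration only: simplified stand-ins, not a browser and not the OWASP encoder. */
final class NestedContextDemo {

    private static final Pattern NUMERIC_ENTITY = Pattern.compile("&#(\\d+);");

    /** Decodes decimal numeric entities such as &#39; (a tiny subset of real HTML decoding). */
    static String htmlDecode(String s) {
        Matcher m = NUMERIC_ENTITY.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Weak encoder: backslash-escapes only backslashes and quotes, leaving '&' alone. */
    static String naiveJsEncode(String s) {
        return s.replace("\\", "\\\\").replace("'", "\\'").replace("\"", "\\\"");
    }

    /** Stricter encoder: escapes every character except ASCII letters and digits. */
    static String strictJsEncode(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            boolean alphanumeric = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alphanumeric) {
                out.append(c);
            } else if (c < 256) {
                out.append(String.format("\\x%02x", (int) c));
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    /** The script text left after the browser HTML decodes the onclick attribute. */
    static String scriptSeenByInterpreter(String encodedValue) {
        return htmlDecode("showError('" + encodedValue + "')");
    }

    public static void main(String[] args) {
        String payload = "&#39;);alert(1);//";
        // Weak encoder: the entity survives, is decoded to a quote, and breaks out of the string.
        System.out.println(scriptSeenByInterpreter(naiveJsEncode(payload)));
        // Strict encoder: '&' became \x26, so the HTML decode step changes nothing.
        System.out.println(scriptSeenByInterpreter(strictJsEncode(payload)));
    }
}
```

With the weak encoder the interpreter ends up with showError('');alert(1);//'), which is exactly the break-out the layered encoding was meant to prevent. With the strict encoder the attribute contains no '&' at all, so HTML decoding is a no-op and the payload stays inside the string literal.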

The next post will cover refactoring the use of nested contexts rather than just encoding the data. As we will see, this has some very nice benefits.

Tuesday, January 10, 2012

Content Security Policy (CSP)

Content Security Policy (CSP) is a technology, still a working draft at the time of writing, that allows a web page to limit where external content can be loaded from. It lets the page define which domains image files, CSS files, JavaScript files, etc. can be loaded from. Additionally, inline JavaScript and style are not allowed; all JavaScript and style must be externalized. This externalization of JavaScript and style is the feature I am most excited about; more on this later.
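
As a rough idea of what such a policy looks like, here is a small Java sketch that assembles a policy string. The CspPolicy class and the example.com hosts are made up for this illustration; the directive names (default-src, img-src, script-src, style-src) and the Content-Security-Policy header name come from the specification as later standardized, and browsers implementing earlier drafts used vendor-prefixed header names instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that assembles a CSP header value from directives. */
final class CspPolicy {

    // LinkedHashMap keeps the directives in the order they were added.
    private final Map<String, String> directives = new LinkedHashMap<>();

    /** Records the sources a given directive is allowed to load from. */
    CspPolicy allow(String directive, String... sources) {
        directives.put(directive, String.join(" ", sources));
        return this;
    }

    /** Joins the directives into the form "name sources; name sources". */
    String headerValue() {
        StringJoiner joiner = new StringJoiner("; ");
        directives.forEach((name, sources) -> joiner.add(name + " " + sources));
        return joiner.toString();
    }

    public static void main(String[] args) {
        String value = new CspPolicy()
                .allow("default-src", "'self'")
                .allow("img-src", "'self'", "https://images.example.com")
                .allow("script-src", "'self'", "https://static.example.com")
                .allow("style-src", "'self'")
                .headerValue();
        // In a web application this value would be sent as a response header.
        System.out.println("Content-Security-Policy: " + value);
    }
}
```

Because the policy never lists 'unsafe-inline', inline script and style stay blocked, which is what forces the externalization discussed here.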

Others have pointed out that CSP is not a complete solution to the XSS/Content Injection problem. Two of the better posts about this are Postcards from the post-XSS world by Michal Zalewski and HTML scriptless attacks by Gareth Heyes. Both of these posts discuss what can be done with XSS/Injection attacks that don’t require JavaScript. Even the introduction of the CSP draft states that it is “not intended as a first line of defense against content injection vulnerabilities.” Moral of the story: CSP will definitely help, but developers still need to validate input and encode output.

So why am I a big fan of CSP, specifically with regard to having to externalize JavaScript? Doing this means that several of the complicated encoding scenarios go away. Knowing what type of encoding to perform becomes, in the majority of cases, simple again. You no longer have event attributes such as onclick, where the browser HTML decodes the value of the attribute before passing the data to the JavaScript interpreter. With JavaScript externalized you also no longer have the “javascript:” protocol used in HREFs. Those values are HTML attribute decoded before the browser determines which protocol is in use; then, because the javascript protocol is used, the data is URL decoded before being passed to the JavaScript interpreter. Basically, with JavaScript and style externalized you don’t have nested contexts where the web page is passing data through several different parsers!
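
The “javascript:” case can be sketched the same way. The Java below is a simplified model of that chain, not how any browser is implemented: HrefDecodeChain and its methods are hypothetical names, the HTML step handles only decimal numeric entities, and the URL step handles only ASCII percent escapes.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustration only: models the parsers a javascript: HREF value passes through. */
final class HrefDecodeChain {

    private static final Pattern NUMERIC_ENTITY = Pattern.compile("&#(\\d+);");
    private static final Pattern PERCENT_ESCAPE = Pattern.compile("%([0-9A-Fa-f]{2})");
    private static final String JS_PROTOCOL = "javascript:";

    /** Replaces each match with the character whose code is captured in group 1. */
    private static String decode(Pattern pattern, int radix, String s) {
        Matcher m = pattern.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), radix);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    static String htmlAttributeDecode(String s) {
        return decode(NUMERIC_ENTITY, 10, s);
    }

    static String percentDecode(String s) {
        return decode(PERCENT_ESCAPE, 16, s);
    }

    /** Returns the script that would run, or empty if the value is not a javascript: URL. */
    static Optional<String> scriptFromHref(String rawAttribute) {
        // Stage 1: the HTML parser decodes the attribute value.
        String url = htmlAttributeDecode(rawAttribute);
        // Stage 2: only now can the protocol be determined.
        if (!url.regionMatches(true, 0, JS_PROTOCOL, 0, JS_PROTOCOL.length())) {
            return Optional.empty();
        }
        // Stage 3: the rest is URL decoded before reaching the JavaScript interpreter.
        return Optional.of(percentDecode(url.substring(JS_PROTOCOL.length())));
    }

    public static void main(String[] args) {
        System.out.println(scriptFromHref("javascript&#58;alert%28%27hi%27%29"));
        System.out.println(scriptFromHref("https://example.com/"));
    }
}
```

Neither the protocol nor the script is visible in the raw attribute value; both only appear after two different decoders have run, which is what makes picking the right encoding for this context so awkward.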

The lack of nested contexts in the rendered HTML doesn’t mean all the XSS/Content Injection issues go away. Data still needs to be correctly encoded for whatever context it is being written to (see the OWASP XSS Prevention Cheat Sheet). Additionally, when JavaScript is used to manipulate the DOM, this still needs to be done correctly using safe APIs, such as setting innerText instead of innerHTML; there are several other articles discussing this so I’ll leave it at that.

I also believe there will be issues with dynamically generated style sheets and JavaScript (e.g. CSS and JavaScript generated by templating technologies such as JSP so that dynamic data can be inserted). However, I still believe that overall it becomes easier to choose the correct encoding and filtering techniques when writing data into these types of files.

CSP can limit some of the damage potential of XSS/Content Injection attacks, and it requires cleaner HTML, which makes it easier to write pages that use the correct encoding. This cleaner HTML will hopefully lead to fewer XSS/Content Injection vulnerabilities.

--Jeremy