Validating Redirects with Hyperlinks

Wednesday, December 2, 2015

I came across an application recently that contained an Unvalidated Redirect flaw. The flaw was pretty basic. The login page accepted a next parameter and blindly redirected to the value of the parameter without validating whether or not the value represented a trusted destination. The redirect occurred in client-side logic without the parameter ever hitting the server. My recommendation to the client included a pretty basic JavaScript validation filter, and they quickly implemented a fix and sent the code back for me to validate if the flaw had been remediated. In looking at the code, I realized that they had not implemented my recommended code, but did something that I had not seen before and thought was quite novel. Hence, why I am writing this.

The remediated redirect logic contained a call to a function that consumed the value of the next parameter and the hostname of the current location.

document.location = validate(next, document.location.hostname)

Pretty standard stuff. The interesting bit was in the called function.

function validate(n, c) { var r = document.createElement("a"); return r.href = n, r.hostname === c ? n : "/"; }

Which can be simplified to...

function validate(n, c) { var r = document.createElement("a"); r.href = n; return r.hostname === c ? n : "/"; }

Which can be further simplified to...

function validate(n, c) { var r = document.createElement("a"); r.href = n; if(r.hostname === c) { return r.href; } else { return "/" ; }}

I am including several versions of the same code because the first version can be quite confusing to folks that aren't familiar with the Comma operator.

Just in case you haven't picked up on it yet, let's look at exactly what is going on here. I am using the Chrome Developer Tools JavaScript console on the nVisium web page to demonstrate this if you want to follow along. Paste one of the functions above into the console and assign a value to a variable called next.

var next = "http://lanmaster53.com"

Now let's follow the logic of the first simplified version of the validate function to see how this works. The function first creates a hyperlink tag. Don't type the following into the console. I'll let you know when we're ready to continue with the demonstration.

var r = document.createElement("a")

Hyperlink tags accept an attribute called href that determines the destination of the browser when the hyperlink is clicked. The function then sets the href attribute of the dynamically created hyperlink to the value of the next parameter.

r.href = n

This is where it gets interesting. Like the document object itself, the hyperlink tag object has a hostname property. Once the hyperlink's href attribute has a value, the hostname property will contain a nicely parsed hostname for the assigned href. What the function is essentially doing is using the browser's builtin parser to break apart URLs in a consistent manner. Pretty cool, right?

All that's left for the function to do is compare the hostname of the document (provided to the function) and the hostname derived from the dynamically created hyperlink to determine whether the value of the next parameter is a safe location, in this case local to the application.

return r.hostname === c ? n : "/"

This is a Ternary operator that accepts a conditional expression that evaluates to true or false and returns one of two expressions based on the result. In this case, the function returns the value of the next parameter if the hostnames match, or the root of the web site if they do not, effectively restricting all redirects to locations local to the application.

Let's test the validation function with our values. Enter the following into the console to continue the demonstration.

validate(next, document.location.hostname)

The function should have returned /, which, when assigned to document.location would redirect the browser to the root of the website. Now change the value of next to something local and test.

var next = "https://nvisium.com/blog"
validate(next, document.location.hostname)

The function should have returned the value of next.

Using the browser's builtin parser to break apart URLs is pretty darn cool if you ask me. And you aren't limited to the hostname. Hyperlinks also have the origin property if you want to restrict URLs based on similar restrictions enforced by Same-Origin Policy. In any case, I thought this was worth sharing.

Regex: Regularly Exploitable

Thursday, June 11, 2015

Here's a quick demonstration of why Regular Expressions (regex) can be bad for implementing character whitelisting.

I was reading through an application security assessment report recently and noticed a recommendation for preventing Operating System Command Injection (OSCI) that implemented character whitelisting on a given file name through the following regex.

/^[\/a-zA-Z0-9\-\s_]+\.rpt$/m

At first glance, the regex seems legit, right? It attempts to match any combination of letters, numbers, dashes, underscores, slashes, and whitespace, ending with the ".rpt" extension. Already knowing that there was a flaw here (we'll get to that in a moment), I put together the following proof-of-concept to demonstrate the security (or in-security) of the filter.

<?php
$file_name = $_GET["path"];
if(!preg_match("/^[\/a-zA-Z0-9\-\s_]+\.rpt$/m", $file_name)) {
    echo "regex failed";
} else {
    echo exec("/usr/bin/file -i -b " . $file_name);
}
?>

I tried all the typical attack payloads, and sure enough, the regex prevented injection into the shell command. The key here, and why one must always use caution when implementing regex filters, is understanding what the \s character class represents. Most resources are vague and say that it includes "any whitespace character", but what does that include? In most regex implementations, whitespace includes [ \t\r\n\f], i.e. spaces, tabs, line breaks, and form feeds. See the problem yet?

Many testers don't think about the impact of line breaks when dealing with injections, but when we're dealing with shell commands, line breaks become very important. Consider the following attack payload.

/path/to/file%0Aid%0A.rpt

%0a is a URL encoded line feed/break (whitespace), so according to the regex, this payload is safe. However, what happens when you put this into a shell? Below is the output from copying and pasting the decoded version of the above payload into a terminal prompt.

# /usr/bin/file -i -b /path/to/file
ERROR: cannot open `/path/to/file' (No such file or directory)
# id
uid=0(root) gid=0(root) groups=0(root)
# .rpt
bash: .rpt: command not found
# 

Do you see what happened? Each line break started a new command and we can see that the shell executed our arbitrary id command. Here's what it looks like through a web interface.

So let's fix this. Show of hands for how many people think the below regex solves the injection issue? (I replaced the \s with a space .)

/^[\/a-zA-Z0-9\- _]+\.rpt$/m

If we use the same payloads as before, including the one that resulted in a successful injection, we can see that the issue has been resolved.

Or has it? Consider the following attack payload.

/path/to/file.rpt%0aid

What just happened?! Let's look at the new regex again.

/^[\/a-zA-Z0-9\- _]+\.rpt$/m

See that m at the end of the regex pattern? It means something. At the end of the regex pattern declaration in PHP (available in other frameworks as well, but may be declared differently) there is a spot for modifiers. Regex modifiers change how the regex engine applies the pattern to the string. Discussing the different regex modifiers is outside the scope of this article, but what we want to focus on here is that the filter pattern is using the multiline modifier (m is the flag for multiline). The multiline modifier basically changes the way the beginning (^) and end ($) of line characters behave. When the multiline modifier is absent, the ^ and $ characters act as the beginning and end of the string, as opposed to the line. This is an important distinction, because in the payload, we are able to leverage the multiline modifier's effect on the $ character and a line break to create a match. We can then add anything we want to the end of the string to execute arbitrary commands within the shell.

There are a couple of takeaways here.

First, be mindful of how you build whitelists. Be as explicit as possible. The higher the level of whitelist, the better. For instance, in the above example, the optimal solution would be to build a whitelist of complete file names that are allowed, and ignore regex all together. If the file names are not known and we need to whitelist at the character level, then we would need to build a better regex that accounts for what is included in all allowed character classes within the context of the filter, e.g. %0a and its significance to shell commands and the multiline modifier.

Second, Burp was not able to find this vulnerability when the scanner speed was set to Normal (default). It wasn't until I set the scanner speed to Thorough and hard coded the ".rpt" extension into the payload that Burp was able to find it. There is no replacement for thorough manual testing by someone that knows what they're doing.

A shout out to John Poulin, who taught me a thing or two about exploiting regex that ultimately lead to this article. Thanks John.

Recon-ng Update (v4.6.0)

Thursday, May 21, 2015

Recon-ng v4.6.0 is the largest single framework commit to date. Most of the changes are behind the scenes and won't effect the average user's experience. Below is a summary of the important changes, including the one change that will affect everyone: dependencies.

Dependencies

Starting with Recon-ng v4.6.0, the framework no longer includes dependencies within the code base. Rather, the framework requires users to install dependencies before the dependent modules can be used. I know, I know. I've been quite public about how much I dislike requiring the installation of 3rd party libraries for my tools to function. However, I recently came to the realization that managing the integration of other software packages within my own was a waste of time and effort. Package managers exist for a reason. So I elected to go against my own personal preference and enforce 3rd party dependency installation. For Kali users, this is seamless, as dependencies are installed alongside the framework. For those not using Kali, you'll want to use the Python Package Index (PyPI or PIP) to install the dependencies listed in the REQUIREMENTS file. Follow the guidance on the Recon-ng Wiki Usage Guide for installation instructions. I recommend using virtual environments (virtualenv) to install the required packages and prevent clutter in your local Python instance.

All current dependencies are module specific, so the core framework is still completely functional without the dependencies. The framework conducts a module dependency check at runtime and disables the modules that fail to load due to missing dependencies. The framework will provide a warning for each disabled module.

Module Changes

As always, the update includes lots of module changes. I added threading to several modules, made lots of bug fixes, removed a few defunct modules (jigsaw, breachalarm), and merged a few new modules (unmangle, fullcontact). In addition, all modules were updated to the new module template discussed below.

Developer Changes

The biggest changes in Recon-ng v4.6.0 are for module developers and include new structures for both the framework's core package and the module template.

The files that make up the framework have been rearranged into a Python package. Therefore, importing API elements is a bit different now. Browse the package structure to get an idea of how to access API elements. As the framework moves forward, internal functionality will be moved into modules that can be imported from logical locations within the package. Several utilities have already been moved. Also, in order to properly handle missing module dependencies, some internal functionality has been abstracted out into mixins. The mixins and utilities are located at "recon/mixins" and "recon/utils" respectively.

The framework now loads all of the modules into memory when the framework launches as opposed to when the user invokes the module. This was originally how the framework behaved, but in a lapse of judgment, the developer perspective was favored over the user's, which led to the poor design choice and current loading system. Loading the modules when the framework launches is slightly slower, but affords the opportunity to capture load errors in modules, disable them, and notify the user of issues before they begin using the framework. With this comes the reintroduction of the "reload" command. The reload command is available in the global context and reloads all of the modules without the developer having to restart the framework.

Recon-ng Home Page