Contents

We’ve discovered that SQL injection is to this day not a fully solved problem, even in most popular frameworks. In this post, we’ll explain how these frameworks fail at escaping parts of a query, culminating in the discovery of a critical vulnerability in the popular Laravel framework which affects a large percentage of applications.

Let’s start with an innocent example, which provides the starting point of our journey. This is a typical simple use case: a filterable, sortable list.

# From safe to vulnerable in one step

In FuelPHP, a (much simplified) controller action to fetch a JSON result might look like this:

<?php
public function get_products()
{
$conditions = array('where' => array(array('deleted', '=', false)));$minprice = \Input::get('minprice', null);
if (isset($minprice))$conditions['where'][] = array('price', '>=', $minprice);$result = Model_Product::find('all', $conditions); return$this->response($result); } This would return products, optionally filtered by a minimum price. It uses the input directly, without checking whether the price is syntactically a valid number. Now, that’s pretty lazy, but regardless of not validating, this code is protected by the framework against SQL injection. If the input is non-numerical, and we’re using a sane database, this would cause a data constraint error, resulting in an exception being thrown in PHP. It can be argued that this is fine: there is nothing sane you can do with bogus input, so returning a HTTP status of 500 is acceptable. The main point is: this code is not vulnerable to SQL injection. Now, let’s suppose we wanted to allow the user to sort the product list by any field: <?php public function get_products() {$conditions = array('where' => array(array('deleted', '=', false)));
$minprice = \Input::get('minprice', null); if (isset($minprice))
$conditions['where'][] = array('price', '>=',$minprice);

return sprintf('"%s"', $name); } The implementations in CakePHP, CodeIgniter and FuelPHP before the patch were the worst of all. They were so incredibly large and complex they couldn’t possibly be safe. Just have a look at the name function from CakePHP or the escape_identifiers function from CodeIgniter. I certainly wouldn’t be able to confidently say these functions are secure, even if they did fix all the issues in them. It’s not even clear what exactly they’re trying to do, there’s just too much code getting in the way. And overly clever code, at that: I blinked and had to read the CodeIgniter function several times before I even understood the basic idea. It looks like these frameworks are trying to “do what I mean”. For example, they attempt to quote identifiers which occur inside aggregate expressions, like SUM("x"), and they split x.y into "x"."y". Unfortunately, this causes ambiguities: what if you have a column with the name SUM(x), or x.y, however unlikely? Instead, these functions should be doing only one simple thing: quote the identifier safely for use in a query. Arbitrary SQL fragments can be made in FuelPHP using DB::expr, so there’s already a provision for inserting a literal SUM(x) into a query. This obviates the need for more magic in the identifier quotation function. Finally, if it’s decided that it really is the user’s responsibility to avoid “strange” column names, and identifiers will never get quoted correctly (which is a cheap cop-out, in our opinion), it should at least be mentioned clearly in the documentation. Most of the frameworks we looked at simply didn’t mention anything about this in the manual or API docs. A simple statement along the lines of “only use trusted input as identifiers: they are not always escaped correctly” should be enough. ## It gets worse The above example is a bit of a stretch, because querying on a user-controlled field name wouldn’t (or shouldn’t?) occur much in practice. However, thanks to a tip from Alex Soban we started looking at “create” and “update” functions as well. This led to the discovery that in Laravel, incorrect identifier quotation caused a much bigger problem: its ORM uses the field quotation function for INSERT and UPDATE queries, without checking that the fields exist. This means that the following little snippet opens up a huge vulnerability in your application: <?php$u = User::find($id);$u->fill($_POST);$u->save();

If the POST body contains a key/value pair where the key is email"='foo@example.com',"password, Laravel will simply place quotes around it, which results in the following query:

    UPDATE "users" SET "email"='foo@example.com',"password" = ? WHERE "id" = ?

This allows an attacker to update the password via the first positional argument, which will be filled in with the value corresponding to that weird key in our POST. This works even if you have a $guarded clause in your user model which should prevent that, like this: <?php protected$guarded = array('id', 'password');

The only way to fix this is by using a whitelist instead of a blacklist (which is recommended as a security best practice, anyway):

<?php
protected $fillable = array('name', 'email'); Please note that all models must use $fillable: if there’s even one model that doesn’t, an attacker can insert the result of arbitrary subqueries into records, which allows them to systematically scrape your entire database.

The other frameworks generally check the list of supplied values against the list of columns in the database, and only store those. That means that they can’t be attacked in this particular way, but programs using attacker-controlled column names in queries like we’ve seen in the introduction are still vulnerable. Let’s see how bad the situation is for those frameworks.

First off, there aren’t many cases when arbitrary attacker-controlled strings are sent as identifiers in a query. Furthermore, if it is, it is most likely going to be in a column argument for an ORDER BY clause. This is one of the final clauses that occur in a SELECT query, and the SQL grammar allows only a handful of trailing clauses to an ORDER BY: only LIMIT/FETCH and OFFSET are allowed there, and an optional locking clause. Anything else will cause a syntax error to be signaled: you can’t put a UNION following the ORDER BY, for example (which is one of the most common ways to exploit SQL injection bugs), but you can order by a subquery expression, which makes for an information leak which can be exploited by enumerating things of interested.

Of course, if the SELECT happens to be embedded in a larger query (as a subquery or a common table expression), or when the query procedure being used allows for multiple SQL commands to be sent off to the database (separated by semicolons), there will be plenty of opportunity to abuse it. And of course there may be some (future?) extension of SQL that will allow trailing query commands to be embedded. Even then, there’s always someone more clever than you who may come up with creative ways to squeeze out more information from the database through the ways described above.

# Fixing the problem

Of course, we contacted the various vendors which we found to be affected by this problem. It proved to be more difficult than expected: The Laravel, Lithium and CodeIgniter frameworks did not provide any security contact information on their websites, and when we asked on IRC, we were suggested to file a ticket on GitHub. In our opinion, posting a security issue on a public issue tracker is not a responsible way of disclosing vulnerabilities!

In the end we decided to contact Lithium and CodeIgniter through the websites of the companies behind them. In the case of Laravel we contacted its original author. As fortune would have it, we reported the issue one day before Laracon 2014. He was unavailable due to this conference, so our e-mail went unseen for 5 days. This shows why it is important to have a security team, with contact details clearly posted on the framework’s website.

Since reporting this issue, Laravel has fixed the issue and updated the documentation to more clearly state this pitfall, FuelPHP has fixed the issue, CodeIgniter and CakePHP documented it as a potential developer pitfall. The Lithium core team are still deliberating on what to do.

We’d like to tip our hat to Thijs from WhiteHats for discussing with us how to best report the issue with so many frameworks, and for contacting the Drupal team about improving their documentation about this issue.

Surprisingly few frameworks quote identifiers correctly. To find frameworks which do it correctly, we looked outside the PHP ecosystem, and even there we found that only Ruby on Rails correctly quotes identifiers throughout. Even Django doesn’t quote correctly, but it consistently checks all attributes against database columns which mean it’s probably not vulnerable. Of course, we haven’t been able to look at all frameworks (there are so many of them!), so we would like to suggest that you investigate how your framework of choice does it. While you’re at it, of course you’ll need to know how to properly fix this issue, so let’s finish up by explaining that.

## How to correctly quote identifiers

It turns out that quoting identifiers in plain ANSI SQL under an ASCII character encoding is as simple as quoting values in ANSI SQL with ASCII. As you may know, values must be wrapped in single quotes, with embedded single quotes doubled. For identifiers, the same happens, but with double quotes. So if the value O'Neill gets quoted as 'O''Neill', the identifier This is the "simplest" thing gets quoted as "This is the ""simplest"" thing".

Unfortunately, there are some complications. As you may know, escaping values should be done via PQescapeStringConn() or mysql_real_escape_string() and not via the deprecated PQescapeString() or mysql_escape_string(). The reason is that you can’t treat strings in multi-byte character encodings one byte at a time, because that way you could be blindly replacing bytes within a character: after reinterpretation of those bytes as characters in the chosen encoding, your carefully quoted string might suddenly not be quoted anymore. The database client library knows which encoding is being used to communicate with the server, so it knows how to treat multi-byte characters correctly.

To this end, PostgreSQL’s libpq offers the PQescapeIdentifier() function. Unfortunately, neither MySQL’s nor SQLite’s client libraries offer a function for escaping identifiers. In the corporate enterprise™ world, it gets worse: in the Oracle database it apparently is impossible to safely quote identifiers! The documentation states “neither quoted nor nonquoted identifiers can contain double quotation marks or the null character (\0).”. Microsoft SQL Server has two ways of escaping, the regular ANSI way (which, like with MySQL, is only enabled if an option is set) and its proprietary way of wrapping the identifier in square brackets. The documentation doesn’t mention it, but you can escape square brackets within identifiers by doubling the closing bracket. This can be gleaned from the documentation on the in-database identifier quotation function, which doesn’t seem to be available in client libraries.

Because most DB client libraries have no identifier quotation function, most DB-agnostic abstractions don’t offer it either. This includes PDO, which is currently the most popular PHP database abstraction library. By offering only the lowest common denominator, the power to correctly quote identifiers is taken away. Instead, the library could emulate identifier escaping if it enforces a sane character encoding, and throw an exception in cases where a special character occurs in the string and escaping it is truly impossible (for example when quoting for the Oracle database).

Luckily, it turns out that in practice, multi-byte encodings are rarely used. The most used encoding is probably UTF-8, which also happens to be somewhat self-correcting, so the multi-byte issue might not be that big of a deal, especially because the encoding can usually not be changed by an attacker. So in practice, escaping the quotation ASCII character at the byte level is “good enough”.

Since most modern code uses parameterised queries or prepared statements, the risk of SQL injection is much lower than it used to be. Unfortunately, you can only pass values as parameters; there is no provision for parameterising identifiers, which would solve the problem more thoroughly.

All in all, there’s still some work to be done by the database vendors. The underlying problem, as I’ve argued before, is that we’re trying to build up strings containing statements of a full-fledged query language, with embedded user input. If the database vendors provided a way to bypass SQL strings, and instead provided direct access to the underlying abstract syntax tree, all SQL injection problems would disappear in one fell swoop.