ElasticSearch gotchas

Elastic Search

ElasticSearch allows you to perform complex queries that would be too difficult to index in an RDBMS.

Take, for example, financial instruments with trading details, which might contain up to 200 fields.

When you use an RDBMS, there are two options.

  • 1. Store the instrument's XML in one column and use XPath for filtering. That would be too slow, as the XML in each row typically has to be parsed and evaluated at query time.
  • 2. Create 200 columns and then create indexes for the most common queries. That would be rather inefficient as well.

ElasticSearch can be used as a database in this use case. It lets you perform complex queries in almost constant time. It is scalable, it has an HTTP interface, it allows you to create data types on the fly, and so on.
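
As a rough illustration of such a query, the sketch below builds a request body in the ElasticSearch query DSL that combines exact-match and range filters. The helper `build_filter_query` and the field names (`currency`, `instrument_type`, `maturity_date`) are made up for this example; only the `bool` / `filter` / `term` / `range` structure comes from the query DSL itself.

```python
import json


def build_filter_query(terms, ranges):
    """Build a bool query body from exact-match and range criteria.

    terms  -- {field: exact value}
    ranges -- {field: {"gte": ..., "lte": ...}}
    Field names are hypothetical; adapt them to your own mapping.
    """
    # One "term" clause per exact-match field.
    filters = [{"term": {field: value}} for field, value in terms.items()]
    # One "range" clause per bounded field.
    filters += [{"range": {field: bounds}} for field, bounds in ranges.items()]
    return {"query": {"bool": {"filter": filters}}}


body = build_filter_query(
    terms={"currency": "USD", "instrument_type": "bond"},
    ranges={"maturity_date": {"gte": "2020-01-01"}},
)

# The resulting JSON is what a client would send as the search request body.
print(json.dumps(body, indent=2))
```

The printed JSON would be sent over the HTTP interface to the index's `_search` endpoint. Adding another criterion is one more entry in `terms` or `ranges`, with no schema change or new index to design.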

But are there any drawbacks? We have hit the following.

  • 1. Multiple simultaneous updates of indices can corrupt your DB.
    If two applications update a dynamic mapping at the same time, the data format that ElasticSearch returns is unpredictable and will most probably break the logic in your application. So you would need to create some sort of middleware for ElasticSearch that restricts the number of simultaneous mapping updates, and force your team to use that middleware for loading data into ElasticSearch.
  • 2. It is extremely slow at indexing data.
    In our case it took 4 hours to load ~160k records.
  • 3. It takes a good 2-3 weeks to train all developers to use ElasticSearch the right way.
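
The middleware idea from point 1 can be sketched as a small gate that lets only one mapping update through at a time. This is a minimal sketch, not our production code: `MappingGate`, `FakeSender`, the `instruments` index and the `price` field are all hypothetical names, and the real HTTP call to ElasticSearch is replaced by a stand-in so the example runs on its own.

```python
import threading
import time


class MappingGate:
    """Serializes mapping updates so that only one runs at a time.

    `send_mapping` is a caller-supplied function (hypothetical) that
    performs the actual HTTP call to ElasticSearch.
    """

    def __init__(self, send_mapping):
        self._send_mapping = send_mapping
        self._lock = threading.Lock()

    def update_mapping(self, index, properties):
        # Only one thread at a time may hold the lock and send an update.
        with self._lock:
            return self._send_mapping(index, properties)


class FakeSender:
    """Stand-in for the real HTTP call; records peak concurrency."""

    def __init__(self):
        self.active = 0
        self.peak = 0
        self.calls = []
        self._lock = threading.Lock()

    def __call__(self, index, properties):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
            self.calls.append(index)
        time.sleep(0.01)  # simulate network latency
        with self._lock:
            self.active -= 1
        return {"acknowledged": True}


sender = FakeSender()
gate = MappingGate(sender)
threads = [
    threading.Thread(
        target=gate.update_mapping,
        args=("instruments", {"price": {"type": "double"}}),
    )
    for _ in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("updates sent:", len(sender.calls), "peak concurrency:", sender.peak)
```

An in-process lock like this only helps if every loader goes through the same middleware process, which is why the team has to be forced to use it. With several middleware instances you would need a shared lock or a single writer queue instead.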

Apart from that, ElasticSearch requires a license and a minimum of 16 GB of RAM.

6 December, 2016