How to set up Elasticsearch as a Source with Rivery

At this time Elasticsearch is not yet natively supported as a source (official support is planned for the very near future). In the meantime, you can already fetch this data by setting up a REST API.

What you will need is an Elasticsearch POST _search request with a body (i.e. the request you would use to pull _search data from Elastic).
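As a rough illustration of what such a request looks like, here is a minimal Python sketch that builds (but does not send) a POST _search request with a JSON body. The host https://localhost:9200, the index name my-index, and the match_all query are placeholders chosen for this example and are not taken from Rivery or from your cluster; replace them with your own values and add whatever authentication your cluster requires.

```python
import json
import urllib.request


def build_search_request(base_url: str, index: str, size: int = 100) -> urllib.request.Request:
    """Build (without sending) a POST <index>/_search request with a JSON body."""
    body = {
        "size": size,                 # number of hits returned per request
        "query": {"match_all": {}},   # placeholder query; replace with your own
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and index, for illustration only.
req = build_search_request("https://localhost:9200", "my-index", size=10)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

The same method, URL and body are what you would enter in Postman and in the REST API setup.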

You will also need a target endpoint to load the results into (e.g. Snowflake).

Checks to make it work:

  1. Check that you get results in the REST API test (and the full response body in Postman).
  2. In ‘REST API’ setup > ‘Results’, make sure that you set the Data location to hits.hits. The exact path may vary depending on the output, so use the full JSON response from step 1 to check the path to the records. A JSON path finder can help here: JSON Path Finder

See Pathfinder

Set hits.hits in Result > Data Location
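To see why hits.hits is the right path, it helps to look at the shape of a typical _search response. The sketch below uses a made-up, shortened response (the field values are invented for illustration) and a small hypothetical helper, get_by_path, that walks a dotted path the same way the Data location setting is described above.

```python
# Shortened, made-up example of an Elasticsearch _search response.
sample_response = {
    "took": 3,
    "timed_out": False,
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_index": "my-index", "_id": "1", "_source": {"status": "ok"}},
            {"_index": "my-index", "_id": "2", "_source": {"status": "failed"}},
        ],
    },
}


def get_by_path(doc: dict, dotted_path: str):
    """Follow a dotted path such as 'hits.hits' through nested dictionaries."""
    current = doc
    for key in dotted_path.split("."):
        current = current[key]
    return current


# The records to load are the list found under hits -> hits.
records = get_by_path(sample_response, "hits.hits")
for record in records:
    print(record["_id"], record["_source"])
```

If your own response nests the records elsewhere, the dotted path you need will differ accordingly.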

Enable “Replace invalid Character in header”

That’s it. The river should now work. Note that you might have to flatten the data in the database endpoint as a further cleaning step, so that each field nested under _source becomes a separate field.
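To make "flattening" concrete, here is a minimal Python sketch of the idea, assuming a single hit shaped like the ones Elasticsearch returns. The sample record and the underscore-joined column names are assumptions made for this example; in practice you would do the equivalent in your target database, and the naming there may differ.

```python
def flatten(record: dict, parent_key: str = "", sep: str = "_") -> dict:
    """Turn nested dictionaries into one flat dictionary of column -> value."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, prefixing with the parent name.
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


# Made-up hit, for illustration only.
hit = {"_id": "1", "_source": {"user": {"name": "ann", "age": 30}, "status": "ok"}}
row = flatten(hit)
print(row)
# {'_id': '1', '_source_user_name': 'ann', '_source_user_age': 30, '_source_status': 'ok'}
```

This sketch leaves lists untouched; arrays nested under _source usually need their own handling (for example, one row per element).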