How to paginate with ElasticSearch using Scroll in Rivery

the_bar · January 7, 2022, 12:13am

Source: ElasticSearch Scroll documentation

1. Create two REST APIs

REST API set-up: see the Rest API Source Walkthrough (Rest API Source).

Call 1: Fetch the first _scroll_id

POST https://{endpoint_base_url}/prod/{index}/_search?scroll=5m
{
  "size": {page_size},
  "query": {
    "match_all": {}
  }
}

Call 2: Populate the _scroll_id from Call 1 in the body of this call and paginate over it.
Notice that you do not put the index in the URL here.

POST https://{endpoint_base_url}/_search/scroll
{
  "scroll": "1m",
  "scroll_id": "{_scroll_id_from_body_output_call1}"
}

We will chain these two calls together in the next step and paginate over the second call.
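To make the hand-off between the two calls concrete, here is a minimal Python sketch of how the Call 2 body is derived from a Call 1 response. The sample response is trimmed and its values are made up for illustration, and the helper name build_call2_body is hypothetical; only the _scroll_id and hits.hits fields reflect the actual Elasticsearch response shape.

```python
import json
from typing import Any, Dict

# Trimmed, illustrative Call 1 response. The values are invented;
# a real response carries more fields (took, timed_out, _shards, ...).
call1_response: Dict[str, Any] = {
    "_scroll_id": "EXAMPLE_SCROLL_ID",
    "hits": {"hits": [{"_id": "1", "_source": {"field": "value"}}]},
}


def build_call2_body(response: Dict[str, Any], keep_alive: str = "1m") -> Dict[str, str]:
    """Build the Call 2 request body from a search or scroll response.

    keep_alive mirrors the "scroll" value used in Call 2 above.
    """
    return {"scroll": keep_alive, "scroll_id": response["_scroll_id"]}


body = build_call2_body(call1_response)
print(json.dumps(body, indent=2))
```

In Rivery the same mapping is done with variables rather than code, as described in the next step.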

2. Create a multi-action with pagination:

In step 1: The Multi-Action flow returns the START REST API output (Call 1) and puts the _scroll_id into a variable {_scroll_id}.

In step 2: The {_scroll_id} variable gets populated in the body of the POST request (Call 2).

In the return set-up of step 2, set up pagination so that the next page key _scroll_id is taken from each response and populated in the body parameter "scroll_id".

Make sure to associate the _scroll_id from step 1 with a new variable {scroll_id_page}, which is used in the step 2 request body.
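For anyone who wants to sanity-check the pagination logic outside Rivery, the flow above can be sketched roughly as follows in Python. This is a sketch under assumptions, not what Rivery runs internally: scroll_pages, the post callable, and the URL arguments are hypothetical names, and post stands in for whatever HTTP client you use (it takes a URL and a JSON body and returns the parsed JSON response). The loop stops when a page comes back with no hits, which is how a scroll signals that it is exhausted.

```python
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical HTTP helper type: (url, json_body) -> parsed JSON response.
Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def scroll_pages(
    post: Post,
    search_url: str,   # e.g. the Call 1 URL without the ?scroll= parameter
    scroll_url: str,   # e.g. the Call 2 URL (no index in the path)
    page_size: int,
    keep_alive: str = "1m",
) -> Iterator[List[Dict[str, Any]]]:
    """Yield one list of hits per page until the scroll is exhausted."""
    # Step 1: the initial search opens the scroll and returns the first page.
    response = post(
        f"{search_url}?scroll={keep_alive}",
        {"size": page_size, "query": {"match_all": {}}},
    )
    while True:
        hits = response["hits"]["hits"]
        if not hits:
            break  # an empty page means there is nothing left to fetch
        yield hits
        # Step 2: always send the most recent _scroll_id as the next page key.
        response = post(
            scroll_url,
            {"scroll": keep_alive, "scroll_id": response["_scroll_id"]},
        )
```

With the requests library, post could be as small as lambda url, body: requests.post(url, json=body).json(); authentication and error handling are left out here.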

3. Create a Source to Target River:
That's all! Now you can create a Source to Target River that runs the Multi-Action above and loads the results into your desired database.
