Hi, I’ve read tonnes of documentation but can’t seem to find an answer to this.
I have a REST API connection where the result has thousands of pages, with more being added every minute. My questions are:
- how can I paginate through this request to import all historic data? (first run)
- how can the next run pick up from the last previously read page, rather than reading thousands of pages from the start again? So, incremental run only - is that possible?
Hi @zivile, yes, this is possible using Action rivers with pagination. The exact setup depends on the pagination scheme used by the API you are connecting to.
Here are our docs on Actions that cover pagination options: Pagination Options in Rest API
In addition, you can use time period values to set up an incremental load, so that each run pulls only the data added since the last run. This also depends on what the source API allows or requires in terms of parameters for limiting the data returned (via dates/times, or pages as you mentioned). Would you mind sharing the source API’s documentation? That way we can give a more relevant answer on the specific pagination and incremental loading options.
That would be amazing, thank you! I’m looking at JudgeMe Reviews API:
Hi @zivile, thanks for sending! I took a quick look at the docs and the pagination is definitely doable in Rivery using the ‘pagination by page’ option.
Note that you’ll need to make sure ‘page’ and ‘per_page’ exist as params in your request (‘page’ can be left empty, as Rivery will populate it with incrementing page numbers).
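For anyone who wants to see what ‘pagination by page’ amounts to outside the Rivery UI, here is a minimal Python sketch. Only the ‘page’ and ‘per_page’ parameter names come from the discussion above. The `fetch_page` callable, the `fake_fetch` stand-in, and the rule of stopping at the first empty page are assumptions for illustration, not Rivery’s or Judge.me’s documented behaviour.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical signature: takes request params, returns the records on that page.
FetchPage = Callable[[Dict[str, int]], List[dict]]


def paginate(fetch_page: FetchPage, per_page: int = 10, start_page: int = 1) -> Iterator[dict]:
    """Yield records page by page until an empty page comes back (assumed stop rule)."""
    page = start_page
    while True:
        records = fetch_page({"page": page, "per_page": per_page})
        if not records:
            break
        yield from records
        page += 1


def fake_fetch(params: Dict[str, int]) -> List[dict]:
    """Stand-in for a real HTTP call: serves 25 fake reviews in page-sized slices."""
    data = [{"id": i} for i in range(1, 26)]
    start = (params["page"] - 1) * params["per_page"]
    return data[start:start + params["per_page"]]


all_reviews = list(paginate(fake_fetch, per_page=10))
```

In a real setup `fake_fetch` would be replaced by a function that sends the HTTP request and returns the parsed list of reviews.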
From the docs, it doesn’t seem possible to send a date parameter for incremental loading. One thing to try is setting ‘per_page’ to a larger number, such as 100 or 1000. APIs often document only a recommended per_page value but in practice accept larger ones, which would mean fewer requests per run.
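On the original question of picking up from the last page read: without a date filter, one possible workaround is to save the page number between runs. The sketch below is only that, a sketch. It assumes the API returns reviews oldest-first, so new reviews land on later pages; if the ordering is newest-first this approach would not work. The state file, `fetch_page`, and `fake_fetch` are hypothetical names.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List

FetchPage = Callable[[Dict[str, int]], List[dict]]


def load_checkpoint(path: str) -> int:
    """Return the page to start from; page 1 on the very first run."""
    if not os.path.exists(path):
        return 1
    with open(path) as f:
        return int(json.load(f)["next_page"])


def save_checkpoint(path: str, next_page: int) -> None:
    with open(path, "w") as f:
        json.dump({"next_page": next_page}, f)


def incremental_run(fetch_page: FetchPage, state_path: str, per_page: int) -> List[dict]:
    """Read from the saved page onward and remember where to resume."""
    page = load_checkpoint(state_path)
    records: List[dict] = []
    while True:
        batch = fetch_page({"page": page, "per_page": per_page})
        records.extend(batch)
        if len(batch) < per_page:
            # Partial or empty page: stop here and re-read this page next run,
            # since it may have filled up. The caller must dedupe by id.
            break
        page += 1
    save_checkpoint(state_path, page)
    return records


# Demo with an in-memory stand-in for the API (assumed oldest-first ordering).
store = [{"id": i} for i in range(1, 26)]


def fake_fetch(params: Dict[str, int]) -> List[dict]:
    start = (params["page"] - 1) * params["per_page"]
    return store[start:start + params["per_page"]]


state_path = os.path.join(tempfile.mkdtemp(), "judgeme_state.json")
first = incremental_run(fake_fetch, state_path, per_page=10)   # full historic load
store.extend({"id": i} for i in range(26, 36))                 # new reviews arrive
second = incremental_run(fake_fetch, state_path, per_page=10)  # resumes at page 3
```

Because the last, partially filled page is read again on the next run, the second run returns a few records that were already loaded, so the target table would need an upsert or a dedupe on the review id.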
Overall the way to pull reviews data from this endpoint seems a bit limited, but I noticed they have a Webhooks option for subscribing to events when a review is created: Judge.me API Documentation
Rivery supports Webhooks as a source, which means that once the event occurs, it is sent to Rivery and then loaded to your DWH table based on the schedule set in Rivery. This inherently makes the loads incremental, since events are loaded as they happen. If you’re interested in discussing further options for JudgeMe, we can set up a Zoom meeting offline. Let me know what you think.
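To illustrate why the webhook route is incremental by construction, here is a small Python sketch of the general pattern: events are collected as they arrive and handed over in batches on a schedule. This is not Rivery’s implementation; the `WebhookBuffer` class and the `review_id` payload field are hypothetical.

```python
from typing import Dict, List


class WebhookBuffer:
    """Collects incoming webhook events until the next scheduled load."""

    def __init__(self) -> None:
        self._pending: List[Dict] = []

    def receive(self, event: Dict) -> None:
        # Called once per incoming webhook request.
        self._pending.append(event)

    def flush(self) -> List[Dict]:
        # Called by the scheduler: returns only events received since the last flush.
        batch, self._pending = self._pending, []
        return batch


buffer = WebhookBuffer()
buffer.receive({"review_id": 101})  # hypothetical payload shape
buffer.receive({"review_id": 102})
first_load = buffer.flush()
buffer.receive({"review_id": 103})
second_load = buffer.flush()
```

Each scheduled load sees only the events that arrived since the previous one, so nothing is re-read from the start.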
That worked, thanks so much for your help!