Action Series (Part 1) - Use an Action to Get Data from a REST API

In this series of posts, we’ll walk through Actions, Rivery’s capability for connecting to REST APIs. An Action lets you send REST requests (such as GET or POST) to endpoints on the web.

Before creating an Action, you’ll need to know how to authenticate to the API you plan to connect to. This differs from API to API: some use API keys, some authenticate with a username and password, some use the OAuth 2.0 protocol, and so on.
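To make the difference between these authentication styles concrete, here is a minimal Python sketch of the request headers each one typically produces. It is only an illustration, not how Rivery builds requests internally. The function names and the ‘X-Api-Key’ header name are hypothetical; real APIs choose their own header names, so check the docs of the API you are connecting to.

```python
import base64


def api_key_headers(api_key: str, header_name: str = "X-Api-Key") -> dict:
    """API-key style: the key travels in a custom header.

    The header name is an assumption; every API defines its own.
    """
    return {header_name: api_key}


def basic_auth_headers(username: str, password: str) -> dict:
    """HTTP Basic style: base64-encode 'username:password'."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def bearer_headers(access_token: str) -> dict:
    """OAuth 2.0 style: send a previously obtained access token as a Bearer token."""
    return {"Authorization": f"Bearer {access_token}"}


# Placeholder values only; never hard-code real credentials.
print(api_key_headers("not-a-real-key"))
print(basic_auth_headers("user", "pass"))
print(bearer_headers("not-a-real-token"))
```

Whichever style applies, the sensitive value (key, password, or token) is the part that belongs in a connection rather than in the request definition itself.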

In this first example, we will create an Action River to perform a GET call to the discourse.org API (Discourse is the software we use for this Community) to pull data about the different Rivery Community categories. Once we create the Action River, it will act as the source template for a Data Source to Target river, which will be used for the actual ingestion into our designated target.

Create a New Action River

First, we’ll create a new Action River.

In the Action Steps tab, choose the Rest Action option (we’ll cover the Multi Actions option in a later post).

Now, you’ll see the REST template below.

Set the Action Connection

The ‘Select Connection’ dropdown is optional, depending on the authentication requirements of your API.

You should create a REST connection if either of the following is true:

  • The OAuth 2.0 protocol is required for authentication
  • Sensitive values (e.g. a password) are required to authenticate

Discourse’s API requires a username and an API key to authenticate, so we’ll first create a REST connection. We can do this by clicking ‘New Connection’ and entering the username and API key as parameters.

For the api_key value above, we’ll check the ‘Is Password’ box to ensure that this value is kept private.

Configure the Request

Next, we’ll enter the URL of the endpoint we wish to call and set the call type. In this case, we’ll pull information from Discourse’s ‘categories’ endpoint using a GET call.

We’ll set the Authentication type to ‘None’ since the authentication will be passed through the headers in the Discourse API. In the Headers section, we’ll add our username and API key parameters.

Note - because we added these values to our connection, they are now available as variables that we can reference in the request.
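For readers who want to see what this request looks like outside of Rivery, here is a rough Python equivalent using only the standard library. It assumes the ‘Api-Username’ and ‘Api-Key’ header names and the ‘/categories.json’ path from Discourse’s API docs. The base URL, the credentials, and the function name are placeholders invented for this sketch. The request is only built here, not sent.

```python
import urllib.request


def build_categories_request(base_url: str, api_username: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the Discourse categories endpoint."""
    url = base_url.rstrip("/") + "/categories.json"
    headers = {
        "Api-Username": api_username,  # header names as assumed from the Discourse API docs
        "Api-Key": api_key,
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder base URL and credentials; substitute your own before sending.
request = build_categories_request("https://community.example.com", "system", "not-a-real-key")
print(request.get_method(), request.full_url)

# To actually send it (requires a reachable Discourse site and valid credentials):
# with urllib.request.urlopen(request) as response:
#     body = response.read().decode("utf-8")
```

As in the Action, no separate authentication step is needed because the credentials ride along in the headers of the call itself.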

Configure the Results

Next, we’ll move to the Results area to configure the type of results desired.

In this use case we want to pull data into a table in our data warehouse, so we’ll select ‘Data’ as the Results Type. In the Data Location window we’ll put ‘category_list.categories’, since this is the level of data we would like to retrieve from the Categories endpoint in the Discourse docs outlined here:

This API call returns data in JSON format, so we’ll make sure the Results Structure is set to ‘JSON’.
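To illustrate what a Data Location such as ‘category_list.categories’ selects, here is a small Python sketch that walks a dotted path through a parsed JSON response. The sample body is invented for illustration and is much smaller than a real Discourse response, and the helper name is hypothetical. It shows the idea of the setting, not Rivery’s implementation.

```python
import json


def extract_at_path(payload: dict, dotted_path: str):
    """Follow a dotted path such as 'a.b' down through nested dictionaries."""
    node = payload
    for key in dotted_path.split("."):
        node = node[key]  # raises KeyError if the path does not exist
    return node


# Invented sample body; a real response contains many more fields per category.
sample_body = """
{
  "category_list": {
    "categories": [
      {"id": 1, "name": "General"},
      {"id": 2, "name": "Announcements"}
    ]
  }
}
"""

rows = extract_at_path(json.loads(sample_body), "category_list.categories")
for row in rows:
    print(row)
```

Each element of the list found at that path becomes one record, which is why we point the Data Location at the list of categories rather than at the top of the response.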

Finally, let’s run the Action River to make sure our call is returning the results we expect.

In the next step, we’ll configure the Data Source to Target river that will use the Action we just created as a source template.

Use the Action as a Source

Click ‘Create New River’ and select Data Source to Target. In the Source step of the river configuration, we’ll choose ‘Rest API Source’ as our data source.

In the ‘Rest Action River’ dropdown, we’ll select the Action we just created in the previous steps.

Next, we’ll move to the Target step and select our desired target. In this example we’ll choose Snowflake, but this process would look the same for any cloud data warehouse we select.

We’ll set our connection and configure the target location for the data pull.

Next, we’ll navigate to the Column Mapping tab and automap our table schema.

From the mapping results above, we can see that this matches our expected response from the Discourse docs.
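As a rough mental model of what automapping does, the sketch below infers a column list from a handful of records by collecting the value types seen for each field. This is an assumption-laden illustration only: Rivery’s actual mapping logic is not described in this post, and the function name and sample records are made up.

```python
def infer_columns(records: list) -> dict:
    """Map each field name to the sorted list of Python type names observed for it."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, set()).add(type(value).__name__)
    return {name: sorted(types) for name, types in columns.items()}


# Invented sample records standing in for rows returned by the Action.
sample_records = [
    {"id": 1, "name": "General", "read_restricted": False},
    {"id": 2, "name": "Announcements", "read_restricted": True},
]

for column, types in infer_columns(sample_records).items():
    print(column, types)
```

Comparing an inferred column list like this against the API docs is the same sanity check we are doing by eye with the mapping results.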

All that’s left now is to run the river and then schedule it to automate the data pulls. We can check the Snowflake table after the run completes successfully:

Thanks for reading! Be on the lookout for Part 2 of the Action Series coming soon, where we’ll dive into Multi-Actions.
