Staging

Staging is a feature for our Tabular Storage controller.

Tip

When ingesting large datasets, it makes sense to upload the data in chunks rather than all in one go.
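
One way to split a large collection of rows into chunks is sketched below. This is a generic helper, not part of the SDK; the name `chunked` and the chunk size are illustrative assumptions.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(rows: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` rows (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(rows)
    while True:
        # islice consumes up to `size` items from the shared iterator
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk
```

Each chunk can then be uploaded with its own write call, so a failure partway through only affects the chunk being sent.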


Staging data with the SDK

Once a tabular dataset and its schema have been created, data can be uploaded using staging.

First, we create a stage request with the create_stage_request function. It returns an ID for our stage.

To upload data to the stage we use the write function, which optionally accepts a table_stage parameter. This is where we pass the stage ID we received.

When we are done writing data to a stage we have created, we call commit_stage_request, passing the stage ID as table_stage. This tells the platform that we are finished uploading data to the stage and want everything in it loaded into the dataset.
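
The create, write, commit sequence can be sketched as follows. The class `InMemoryTabularClient` is a hypothetical stand-in written for this example so that it runs on its own; it borrows the function names and the table_stage parameter from the text, but its argument lists are assumptions and will differ from the real SDK (which also needs to know which dataset is being targeted). The sketch writes all chunks to a single stage; creating one stage per chunk would follow the same pattern.

```python
import uuid
from typing import Dict, Iterable, List, Optional


class InMemoryTabularClient:
    """Hypothetical stand-in for the SDK's tabular client; NOT the real API."""

    def __init__(self) -> None:
        self.dataset: List[dict] = []
        self.stages: Dict[str, List[dict]] = {}

    def create_stage_request(self) -> str:
        # Assumed behaviour: open a new, empty stage and return its ID.
        stage_id = str(uuid.uuid4())
        self.stages[stage_id] = []
        return stage_id

    def write(self, rows: List[dict], table_stage: Optional[str] = None) -> None:
        # With table_stage set, rows go to the stage rather than the dataset.
        if table_stage is None:
            self.dataset.extend(rows)
        else:
            self.stages[table_stage].extend(rows)

    def commit_stage_request(self, table_stage: str) -> None:
        # Assumed behaviour: move everything in the stage into the dataset.
        self.dataset.extend(self.stages.pop(table_stage))


def upload_with_stage(client: InMemoryTabularClient,
                      chunks: Iterable[List[dict]]) -> str:
    """Create one stage, write every chunk to it, then commit the stage."""
    stage_id = client.create_stage_request()
    for chunk in chunks:
        client.write(chunk, table_stage=stage_id)
    client.commit_stage_request(stage_id)
    return stage_id
```

Nothing reaches the dataset in this sketch until the commit call, which mirrors the idea that staged data is only loaded once the stage is committed.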

To follow the progress, poll list_stage_request for the dataset, or get_stage_request for each stage by passing the table_stage_identifier. When all stages report the status "completed", their data has been ingested into the dataset.
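
A polling loop along these lines could look like the sketch below. It does not call the SDK directly: `get_status` is a hypothetical callable that you would implement yourself, for example by calling get_stage_request for a stage ID and extracting the status string from the response. The function name, the default interval, and the timeout are assumptions; only the "completed" status value comes from the text above.

```python
import time
from typing import Callable, Iterable


def wait_until_completed(get_status: Callable[[str], str],
                         stage_ids: Iterable[str],
                         poll_interval: float = 5.0,
                         timeout: float = 600.0) -> bool:
    """Poll until every stage reports "completed".

    Returns True when all stages are completed, False if the timeout
    is reached first. Other statuses (such as a possible failure state)
    are not treated specially in this sketch.
    """
    pending = set(stage_ids)
    deadline = time.monotonic() + timeout
    while pending:
        # Keep only the stages that have not finished yet.
        pending = {s for s in pending if get_status(s) != "completed"}
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

Polling at a fixed interval with a deadline keeps the loop from running forever if a stage never reaches "completed".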