Add Data to the Ocean Data Platform (ODP)

This guide explains how to add data to the Ocean Data Platform (ODP). It covers how to create a dataset, upload files and/or ingest tabular data, enrich the dataset with metadata, manage access, and (optionally) publish the dataset to the public catalogue.

What is a Dataset in ODP?

A dataset is the primary container in ODP. It can contain:

Any number of files of any file type
One (or no) table, derived from structured files or ingested directly via code
Rich metadata describing provenance, licensing, spatial/temporal coverage, and attribution
Access control and publication status

Creating a Dataset

Create a dataset via the web UI:

Go to: https://app.hubocean.earth/my_data
Click Add New Dataset
Provide a title and description
Save

You can refine all metadata and add data at any time after creation.

Collections

A collection can be created in almost exactly the same way as a dataset. It acts as a folder that groups related datasets. You can either:

Create a collection first, then create datasets within it
Create a collection, copy its UUID, and use it to add the collection to an existing dataset from your My Data page

Adding Rich Metadata

Rich metadata improves discoverability, usability, and correct attribution. From the dataset UI, you can add or edit:

Title, description, and tags
Geographical coverage (drawn interactively on the globe)
Temporal coverage (time range represented by the data)
Provider
The organisation or individual that owns, authored, or should be credited for the data
License
The terms under which the data may be used
Citation
A citation statement and (if applicable) DOI

Tip

We can create DOIs via our DataCite.org membership for data that is published for the first time via the ODP.

Adding Data to Your Dataset

ODP supports multiple ways to attach data to a dataset. The available features depend on how structured your data is.

Data types and capabilities

Data type	Example formats	How to add	Capabilities
Files only	Any: NetCDF, ZIP, TIFF, Zarr, Excel, proprietary formats	UI or code	Storage, download, metadata
Files with preview	PDF, images, text files	UI or code	Storage, preview within dataset page, download, metadata
Structured files	CSV, GeoJSON, Parquet	UI or code	Can be ingested into a table
Table	Ingested tabular data	Via structured files or direct code ingest	Querying, APIs, interactive maps, column statistics

Note

A table is created either by ingesting a supported structured file or by ingesting tabular data programmatically. The tabular functionality unlocks more advanced capabilities of the ODP, so we encourage using it where possible.

Uploading Files via the Web UI

Click Select Files or drag and drop files into the dataset
Click Upload All (or upload files individually)
For large files, uploading one at a time is recommended

If a file’s MIME type is not detected automatically (e.g. rare or proprietary formats), you may need to specify it manually.
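To find a sensible value before specifying a MIME type manually, you can guess it from the file extension. The sketch below uses only Python's standard `mimetypes` module. It does not reflect how ODP detects types, and the helper name `guess_mime_type` and the sample file names are illustrative assumptions.

```python
import mimetypes


def guess_mime_type(filename: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension, falling back to a generic binary type."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type or default


# Hypothetical file names: one common format and one made-up proprietary extension.
for name in ["cruise_report.pdf", "sonar_dump.xyzraw"]:
    print(name, "->", guess_mime_type(name))
```

Unknown extensions fall back to `application/octet-stream`, which is a conventional generic choice for binary data. If you know the correct type for a proprietary format, prefer that.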

Ingesting Structured Files into a Table

For supported structured formats (.csv, .geojson, .parquet), the UI displays an “ingest file to table” icon next to the file.

To ingest:

Upload the structured file
Click the ingest file to table icon
Confirm ingestion

This converts the file into ODP’s columnar tabular storage and enables:

Fast querying
Map visualisation
API access
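When preparing many files for upload, it can help to check in advance which ones are candidates for table ingestion. The following is a minimal sketch based only on the three extensions listed above. The function name is hypothetical, and matching on the lower-cased extension is an assumption; ODP may apply its own detection rules.

```python
from pathlib import Path

# Extensions this guide lists as ingestible structured formats.
INGESTIBLE_EXTENSIONS = {".csv", ".geojson", ".parquet"}


def can_ingest_to_table(filename: str) -> bool:
    """Return True if the file's final extension is one of the listed structured formats."""
    return Path(filename).suffix.lower() in INGESTIBLE_EXTENSIONS


# Hypothetical file names for illustration.
for name in ["stations.csv", "bathymetry.nc", "tracks.GeoJSON"]:
    print(name, "->", can_ingest_to_table(name))
```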

Editing the Table Schema

After ingestion, use Edit table to refine how the data behaves:

Rename columns
Add human-readable column descriptions (tooltips)
Classify geometry columns (WKB/WKT, WGS84) or latitude/longitude pairs
Choose columns to index (partition) for improved query performance

Tip

Editing the schema alters the way the data is structured. Adding the correct class (e.g. Geometry or Latitude/Longitude) unlocks some of ODP’s features.
For large datasets, restructuring may take some time. It continues until finished, even if you close the browser.

Adding Data Programmatically

For programmatic workflows, large datasets, or automation, use the Python SDK.

Programmatic uploads also allow:

Direct ingestion without intermediate files
More precise schema control
Adding metadata at upload time (UI support is limited and evolving)

Geospatial Data

Tables in ODP can store geospatial data. Defining the geometry correctly allows the platform to unlock powerful features such as map visualisation, spatial queries, OGC Features, and (soon) vector tile endpoints.

Format	Geometry handling
GeoJSON	Geometry is detected automatically
CSV	Geometry must be provided as WKT or latitude/longitude float columns
Parquet	Geometry may be WKT, WKB, or latitude/longitude float columns
Direct table ingest	Geometry may be WKT, WKB, or latitude/longitude float columns
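For CSV input, the table above accepts either latitude/longitude float columns or WKT. The sketch below shows how the two representations relate by deriving a WKT point column from latitude/longitude columns, using only the Python standard library. The column names (`latitude`, `longitude`, `geometry`) and the sample station are assumptions for illustration, not names required by ODP.

```python
import csv
import io


def point_wkt(longitude: float, latitude: float) -> str:
    """Build a WKT point. WKT orders coordinates as x y, i.e. longitude then latitude."""
    if not -180.0 <= longitude <= 180.0:
        raise ValueError(f"longitude out of range: {longitude}")
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"latitude out of range: {latitude}")
    return f"POINT ({longitude} {latitude})"


def add_wkt_column(csv_text: str, lon_col: str = "longitude",
                   lat_col: str = "latitude", wkt_col: str = "geometry") -> str:
    """Return a copy of the CSV text with an extra WKT geometry column appended."""
    reader = csv.DictReader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=[*reader.fieldnames, wkt_col],
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[wkt_col] = point_wkt(float(row[lon_col]), float(row[lat_col]))
        writer.writerow(row)
    return output.getvalue()


# Made-up sample data.
sample = "station,latitude,longitude\nA,59.91,10.75\n"
print(add_wkt_column(sample))
```

The coordinate order is the most common source of error here: WKT expects longitude first, whereas many spreadsheets list latitude first.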

Automatic Table Profiling

When a dataset contains a table, ODP automatically derives column-level statistics, including:

Data types
Value ranges
Spatial and temporal extents

These statistics power search, filtering, and visualisation features. You can view them on the main dataset page, or by clicking Preview from My Data.

Note

Automatic profiling applies to tables, not to files alone.
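To make the idea of column-level statistics concrete, here is a minimal sketch that derives a type name and a value range per column from in-memory rows. It is not ODP's profiling implementation. The function name, the sample rows, and the assumption that each column holds a single comparable type are all illustrative.

```python
from datetime import datetime


def profile_columns(rows: list[dict]) -> dict:
    """Compute a type name and min/max range per column, skipping missing (None) values."""
    stats: dict = {}
    for row in rows:
        for column, value in row.items():
            if value is None:
                continue
            entry = stats.setdefault(
                column, {"type": type(value).__name__, "min": value, "max": value}
            )
            entry["min"] = min(entry["min"], value)
            entry["max"] = max(entry["max"], value)
    return stats


# Made-up sample rows.
rows = [
    {"latitude": 59.9, "longitude": 10.7, "time": datetime(2023, 1, 5), "temperature_c": 4.2},
    {"latitude": 60.4, "longitude": 5.3, "time": datetime(2023, 3, 1), "temperature_c": None},
]
profile = profile_columns(rows)
for column, entry in profile.items():
    print(column, entry)
```

In this sketch, the ranges of the latitude and longitude columns together give a spatial extent, and the range of the time column gives a temporal extent.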

Managing Access

You can control who can view or modify your dataset.

From the dataset UI, open Manage Access and assign roles:

Admin – full control to add permissions, publish, and edit
Editor – can modify data and metadata
Viewer – read-only access

Permissions can be granted to:

Individual users who are already registered on ODP (via email address)
Groups (coming soon)

Public access

A dataset can be public without being published. If you enable Public Sharing: Share via link, anyone with the link will be able to access the dataset, but it will not be findable in the catalogue. Publication controls visibility in the catalogue, not who can access the dataset.

Publishing Your Dataset

Publishing makes a dataset discoverable in the public ODP catalogue. All functionality (files, tables, APIs) works the same before and after publication.

Warning

Once published, only Admins of the dataset can make edits (Editors no longer have edit rights).
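The interaction between roles and publication status can be summarised as a small conceptual model. This sketch only restates the rules given in this guide (Admin, Editor, and Viewer roles, with Editors losing edit rights after publication). The class and function names are hypothetical and do not correspond to any ODP API.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "admin"
    EDITOR = "editor"
    VIEWER = "viewer"


def can_edit(role: Role, published: bool) -> bool:
    """Admins can always edit; Editors can edit only before publication; Viewers never."""
    if role is Role.ADMIN:
        return True
    if role is Role.EDITOR:
        return not published
    return False


def can_manage_permissions(role: Role) -> bool:
    """Only Admins can add permissions or publish."""
    return role is Role.ADMIN


for role in Role:
    print(role.value, "| edit before:", can_edit(role, published=False),
          "| edit after:", can_edit(role, published=True))
```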

What to Check Before Publishing

Before submitting a dataset for publication, ensure that:

You have the right to publish the data
The provider (owner/author) is clearly specified
The license accurately reflects usage rights
Any required citation is included

Rich metadata (spatial/temporal coverage, descriptions, tags) is strongly encouraged, but the most critical requirements are licensing and provenance.
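If you keep a local record of your dataset's metadata, the checklist above can be turned into a simple pre-submission check. The sketch below treats provider and license as required and the remaining items as recommended, following the emphasis in this guide. The field names and the dictionary layout are hypothetical, and the check cannot verify that you actually hold the right to publish.

```python
# Hypothetical field names; ODP's own metadata schema may differ.
REQUIRED_FIELDS = ("provider", "license")
RECOMMENDED_FIELDS = ("citation", "description", "tags",
                      "spatial_coverage", "temporal_coverage")


def publication_gaps(metadata: dict) -> dict:
    """List required and recommended fields that are absent or empty."""
    def is_missing(field: str) -> bool:
        return not metadata.get(field)

    return {
        "missing_required": [f for f in REQUIRED_FIELDS if is_missing(f)],
        "missing_recommended": [f for f in RECOMMENDED_FIELDS if is_missing(f)],
    }


# Made-up draft metadata with an empty license.
draft = {"provider": "Example Institute", "license": "", "tags": ["ctd"]}
print(publication_gaps(draft))
```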

Publishing Process

Review the dataset metadata
Confirm ownership, licensing, and citation
Submit the dataset for review
The ODP team will review and contact you if changes are required
Once approved, the dataset is published to the public catalogue

Licensing Guidance

Choosing the correct license is essential.

If the data has been published previously, the original license must be reproduced unless explicitly agreed otherwise with the provider
If the data is derived from one or more sources, the resulting license must be compatible with the original licenses
If the data is entirely your own, you may choose an appropriate license

If the choice is yours, we recommend a widely adopted open license such as Creative Commons CC BY 4.0 (only attribution required) or, whenever possible, the least restrictive option, CC0 1.0 Universal. Alternatively, you can browse other Creative Commons license options.