SDK
The current Python SDK for the Ocean Data Platform is in an experimental phase and only supports Tabular v2, our new generation of tabular storage. Previous versions and features have been deprecated in preparation for a more streamlined and powerful SDK release coming soon.
Tabular v2
Key features:
- Serialization: JSON is not very efficient for large datasets and has limitations for some data types. The new tabular storage is based on the Arrow IPC format, which is much more efficient in both data size and serialization/deserialization speed.
- Schema: The new tabular storage uses an Arrow Schema for the schema definition, instead of a custom JSON schema.
- Partitions: Data is incrementally partitioned when ingested, without the need to specify a partition key. A custom engine then checks each query against every partition to determine whether that partition might contain candidate rows, and drops the partitions that cannot.
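To make the partition pruning idea concrete, here is a minimal, self-contained sketch in plain Python. It is not the platform's engine: it assumes each partition keeps min/max statistics for one column, which is one common way to decide whether a partition "might have candidates". The names Partition, ingest, and query_depth_range are hypothetical and are not part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """A chunk of ingested rows plus min/max statistics for the depth column."""
    rows: list
    min_depth: float
    max_depth: float


def ingest(rows, partition_size):
    """Split rows into partitions in arrival order; no partition key is needed."""
    partitions = []
    for start in range(0, len(rows), partition_size):
        chunk = rows[start:start + partition_size]
        depths = [r["depth"] for r in chunk]
        partitions.append(Partition(chunk, min(depths), max(depths)))
    return partitions


def query_depth_range(partitions, low, high):
    """Return rows with low <= depth <= high and the number of partitions scanned."""
    scanned = 0
    matches = []
    for p in partitions:
        # Assumed pruning rule: if the partition's [min, max] range does not
        # overlap [low, high], it cannot contain a candidate row, so skip it.
        if p.max_depth < low or p.min_depth > high:
            continue
        scanned += 1
        matches.extend(r for r in p.rows if low <= r["depth"] <= high)
    return matches, scanned


# Illustrative data only: nine rows ingested into partitions of three rows each.
depths = [5, 8, 12, 110, 120, 150, 900, 950, 990]
rows = [{"station": i, "depth": float(d)} for i, d in enumerate(depths)]
partitions = ingest(rows, partition_size=3)

matches, scanned = query_depth_range(partitions, 100, 200)
print(f"{len(matches)} matching rows, scanned {scanned} of {len(partitions)} partitions")
```

In this sketch the query for depths between 100 and 200 only reads the one partition whose range overlaps the filter. The other two are dropped without looking at their rows.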
Getting Started
- quick: a quick overview of how to use the new table_v2(), with Python examples.
- reference: detailed documentation on the new tabular storage in the Python SDK.
Coming soon
We're actively working on the next major version of the SDK, which will offer a unified and intuitive interface for accessing datasets on the Ocean Data Platform. This upcoming release will integrate Tabular v2 as a core component and provide a more consistent and user-friendly developer experience across the platform.
If you're interested in trying a preview version or contributing feedback, we'd love to hear from you. Feel free to reach out through our community channels.