
Dask Quickstart

This is a quick tutorial on using Dask within our workspaces. It mirrors the Dask Gateway documentation, so head over there for more details and examples.

We start in a notebook instance in the workspaces.


img_28.png

First we import Gateway from dask_gateway. It lets us communicate with the gateway and manage our clusters.

# Importing the gateway

from dask_gateway import Gateway

gateway = Gateway()

This code block does not return anything.


Now let's create a cluster. Running this code returns an interactive widget where we can define the processing power needed.

img_29.png

# Creating a new cluster with the gateway

cluster = gateway.new_cluster()
cluster

Tip

All users can create a limited number of clusters and workers to test their code. If more processing power is needed, please reach out!
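Besides the widget, a cluster can also be sized from code. The sketch below assumes the `cluster` object created above and relies on the `scale` and `adapt` methods of Dask Gateway clusters. The helper name `size_cluster` and the worker counts are illustrative only, and the number of workers you can actually request depends on the limits of your workspace.

```python
def size_cluster(cluster, workers=2, adaptive=False, max_workers=4):
    """Request workers for a Dask Gateway cluster.

    `size_cluster` and its default values are illustrative, not part of Dask.
    """
    if adaptive:
        # Let the cluster grow and shrink between `workers` and `max_workers`
        cluster.adapt(minimum=workers, maximum=max_workers)
    else:
        # Ask for a fixed number of workers
        cluster.scale(workers)


# Usage in the notebook, after `cluster = gateway.new_cluster()`:
# size_cluster(cluster, workers=2)
```

Fixed scaling is the simpler choice for a short test run, while adaptive scaling can be useful when the workload varies over time.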


To verify that the cluster was created, we can list the running clusters. Here, we can see its name and status.

img_30.png

# List clusters
gateway.list_clusters()
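The listing is also useful for reattaching to a cluster that is already running, for example after restarting the notebook kernel. The following is a minimal sketch, assuming that each entry returned by `list_clusters()` has a `name` attribute and that `gateway.connect()` accepts that name, as described in the Dask Gateway documentation. The helper `connect_to_first_cluster` is a hypothetical name used for illustration.

```python
def connect_to_first_cluster(gateway):
    """Return a handle to the first running cluster, or None if there is none.

    `connect_to_first_cluster` is an illustrative helper, not part of Dask.
    """
    reports = gateway.list_clusters()
    if not reports:
        return None
    # Reattach to the existing cluster by name instead of creating a new one
    return gateway.connect(reports[0].name)


# Usage in the notebook, after `gateway = Gateway()`:
# cluster = connect_to_first_cluster(gateway)
```

Reusing a running cluster avoids starting a second one, which matters when the number of clusters per user is limited.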

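Let's put the cluster to work. We first connect a client to the cluster so that Dask computations run there instead of in the notebook. Creating a Dask array only builds a lazy task graph. Running .compute() sends the job to the cluster and returns the result to the notebook.

img_31.png

import dask.array as da

# Connect a client so that computations run on the cluster
client = cluster.get_client()

# 1000 x 1000 array of normally distributed values in four 500 x 500 chunks
a = da.random.normal(size=(1000, 1000), chunks=(500, 500))
a.mean().compute()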
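Once the work is done, it is good practice to release the cluster so that its resources become available again. The sketch below is one possible pattern, assuming the `get_client()` and `shutdown()` methods of Dask Gateway clusters and the `close()` method of the Dask client. The helper `run_and_shutdown` and the `job` callback are hypothetical names for illustration.

```python
def run_and_shutdown(cluster, job):
    """Run `job(client)` on the cluster, then always release its resources.

    `run_and_shutdown` is an illustrative helper, not part of Dask.
    """
    client = cluster.get_client()
    try:
        return job(client)
    finally:
        # Clean up even if the job raised an exception
        client.close()
        cluster.shutdown()


# Usage in the notebook, with the array `a` defined above:
# result = run_and_shutdown(cluster, lambda client: a.mean().compute())
```

The `try`/`finally` block ensures the cluster is shut down even when the computation fails.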

This should be enough to get you started with Dask in our workspaces.