Dask Quickstart
This is a quick tutorial on using Dask within our workspace. It mirrors the Dask Gateway documentation, so head over there for more details and examples.
We start in a notebook instance in the workspaces.
First we import the gateway from Dask Gateway. It lets us communicate with and manage the clusters.
# Importing the gateway
from dask_gateway import Gateway
gateway = Gateway()
This code block does not return anything.
Now let's create a cluster. Running this code returns an interactive widget where we can define the processing power needed.
# Creating a new cluster with the gateway
cluster = gateway.new_cluster()
cluster
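The widget is not the only way to ask for workers. As a minimal sketch, the cluster object can also be scaled from code with `cluster.scale()` or `cluster.adapt()`, both part of Dask Gateway. The helper name `request_workers` and the default of 2 workers are assumptions made for illustration, not workspace settings.

```python
def request_workers(cluster, n_workers=2, adaptive=False):
    """Ask the cluster for workers without using the widget.

    `cluster` is the object returned by gateway.new_cluster().
    The default of 2 workers is an illustrative assumption,
    not a documented workspace limit.
    """
    if adaptive:
        # Let the cluster grow and shrink between 0 and n_workers based on load
        cluster.adapt(minimum=0, maximum=n_workers)
    else:
        # Request a fixed number of workers
        cluster.scale(n_workers)
```

Calling `request_workers(cluster)` in the notebook should have the same effect as setting the worker count to 2 in the widget.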
Tip
All users can create a limited number of clusters and workers to test their code. If you need more processing power, please reach out!
To verify that the cluster has been created, we can list the running clusters. Here, we can see its name and status.
# List clusters
gateway.list_clusters()
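The listing is also useful after a notebook restart, when the `cluster` variable is gone but the cluster itself may still be running. The sketch below prints each report and reconnects to the first one with `gateway.connect()`. The helper name `reconnect_first_cluster` is hypothetical, and the `name` and `status` attributes are assumed to match the fields shown in the listing.

```python
def reconnect_first_cluster(gateway):
    """Print the running clusters and reconnect to the first one, if any."""
    reports = gateway.list_clusters()
    for report in reports:
        # Each report is assumed to expose the cluster's name and status
        print(report.name, report.status)
    if not reports:
        # Nothing is running, so there is nothing to reconnect to
        return None
    # Re-attach to an existing cluster by name instead of creating a new one
    return gateway.connect(reports[0].name)
```

Usage in the notebook would be `cluster = reconnect_first_cluster(gateway)`.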
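Let's put the cluster to work. First we connect a client to the cluster. Once a client exists, Dask computations run on the cluster by default.
Running .compute() sends the job to the cluster's workers and returns the result to the notebook.
# Connect a client to the cluster, then compute on a Dask array
import dask.array as da
client = cluster.get_client()
a = da.random.normal(size=(1000, 1000), chunks=(500, 500))
a.mean().compute()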
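To see how the `chunks` argument divides the work, here is a small plain-Python sketch that needs no cluster. It counts the blocks the example array is split into and estimates the size of each block, assuming 8 bytes per element (float64). The helper name `chunk_grid` is made up for this illustration.

```python
import math

def chunk_grid(shape, chunks):
    """Number of blocks along each axis for a given array shape and chunk size."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))

shape = (1000, 1000)
chunks = (500, 500)

grid = chunk_grid(shape, chunks)        # blocks per axis
n_blocks = math.prod(grid)              # total blocks the workers can share
bytes_per_block = math.prod(chunks) * 8 # assumes float64 elements

print(grid, n_blocks, bytes_per_block)
```

Each block can be processed by a different worker, so chunks that are too large limit parallelism, while chunks that are too small add scheduling overhead.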
This should get you going with using Dask in our workspaces.