Adding a Cluster Definition

First, open the Cluster Definitions page from the dropdown menu at the top right of the page.

cluster definitions menu

The following steps require you to have Administrator, Project Owner or Project Contributor permissions.

Select the Cluster Definition Type

There are various types of Cluster Definitions that determine which tasks they can be used in and what options can be configured for them.

Cluster Definition Type | Description | Used by
Databricks Cluster (Azure) | Spin up Apache Spark clusters on-the-fly using Azure based Virtual Machine configurations. | Databricks, Spark SQL Statement
Azure Batch Pool | Creates an Azure Batch Pool. | Azure Batch Task
Azure Batch Container Pool | Creates an Azure Batch Container Pool. | Azure Batch Task
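
The relationship between cluster definition types and the tasks that can use them can be sketched as a small lookup. This is only an illustration built from the table above; the `USED_BY` dictionary and the `types_for_task` helper are hypothetical names and are not part of the product.

```python
# Hypothetical lookup built from the documentation table; not a product API.
USED_BY = {
    "Databricks Cluster (Azure)": ["Databricks", "Spark SQL Statement"],
    "Azure Batch Pool": ["Azure Batch Task"],
    "Azure Batch Container Pool": ["Azure Batch Task"],
}


def types_for_task(task_type):
    """Return the cluster definition types that the given task type can use."""
    return [name for name, tasks in USED_BY.items() if task_type in tasks]


if __name__ == "__main__":
    print(types_for_task("Azure Batch Task"))
```

Looking up "Azure Batch Task" returns both Azure Batch pool types, which is why either can be chosen when creating that kind of task.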

Configure the Cluster Properties

The next page in the form will prompt you to configure the various specs and software that are used in the cluster. If you need a refresher on what makes up a Cluster Definition, read Cluster Definition Concepts.

There are no validation steps required for Cluster Definitions because all information is populated based on the type and is verified against the cloud provider.

How to Create an Azure Batch Pool or Azure Batch Container Pool Cluster Definition

There are three different cluster types.

You can create an Azure Batch Pool or an Azure Batch Container Pool.

For both cluster types you will also need to choose the Region, Connection and an Agent to Query Supported Images.

Choose a Region, Connection and Agent

Choose from the dropdown list of regions. (Your chosen region may affect the available OS Configurations you can choose on the next page.)

Your chosen region must be the same as your Azure Batch account region.

Choose a connection in the next dropdown. The dropdown lists the available connections to Azure Batch.

Then choose an Agent to Query Supported Images.

This agent will be able to access the connection you chose above.

Projects

Then choose if this cluster definition will be available to all projects or only to selected projects.

If you choose selected projects, you can then choose from a list of all projects in this tenant. This cluster definition will only be available in these projects and will not be displayed when creating tasks in other projects.
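
The visibility rule described above can be sketched in a few lines. This is a minimal illustration, not the product's implementation; the `ClusterDefinitionScope` class and its field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterDefinitionScope:
    """Hypothetical model of the 'all projects' / 'selected projects' choice."""

    all_projects: bool = True
    selected_projects: set = field(default_factory=set)

    def is_visible_in(self, project):
        # Visible everywhere when scoped to all projects; otherwise only
        # in the projects that were explicitly selected.
        return self.all_projects or project in self.selected_projects


if __name__ == "__main__":
    scope = ClusterDefinitionScope(all_projects=False, selected_projects={"alpha"})
    print(scope.is_visible_in("alpha"), scope.is_visible_in("beta"))
```

With only "alpha" selected, the definition would be offered when creating tasks in "alpha" and hidden in every other project.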

Selected projects or all projects

After you have chosen your projects, click Next.

OS Configuration

Select an OS Configuration. You can filter the OS Configuration dropdown to hide unverified OS Configurations and those that have expired or will soon expire. Your previously selected region may affect the OS Configurations that are available.

The available OS Configuration options in the dropdown list will differ depending on those selections. (The following image does not contain OS Configurations that are unverified or will soon expire.)

cluster definitions menu
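
The effect of the two dropdown filters can be sketched as follows. This is an illustration under stated assumptions: the `OsConfiguration` record, its `verified` and `end_of_life` fields, and the 90-day "soon" window are all hypothetical, since the page does not say how "will soon expire" is defined.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class OsConfiguration:
    """Hypothetical record for one entry in the OS Configuration dropdown."""

    name: str
    verified: bool
    end_of_life: Optional[date] = None  # None means no known expiry date


def filter_os_configurations(configs, today, hide_unverified=True,
                             hide_expiring=True, soon=timedelta(days=90)):
    """Drop unverified entries and entries that expired or expire within `soon`."""
    kept = []
    for config in configs:
        if hide_unverified and not config.verified:
            continue
        expiring = (config.end_of_life is not None
                    and config.end_of_life <= today + soon)
        if hide_expiring and expiring:
            continue
        kept.append(config)
    return kept


if __name__ == "__main__":
    sample = [
        OsConfiguration("verified-current", True),
        OsConfiguration("unverified", False),
        OsConfiguration("expired", True, date(2023, 6, 1)),
    ]
    for config in filter_os_configurations(sample, today=date(2024, 1, 1)):
        print(config.name)
```

Turning either flag off in this sketch brings the corresponding entries back into the list, mirroring how the dropdown contents change with the filter selections.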

Azure Virtual Machine Type

Choose an Azure Virtual Machine Type. The VMs available will change depending on your chosen hosting Region and capabilities.

If using a custom VM Image, please ensure that the OS Configuration (publisher/offer/SKU) and Azure Virtual Machine Type match the VM Image.

Minimum and Maximum Workers

Then choose the number of Minimum Workers for this cluster definition. This is the minimum number of processes used to run tasks.

You can also choose the number of Maximum Workers. This field is optional, so you can leave it blank if you do not want to set a maximum. If a maximum is specified, the cluster will automatically scale based on the workload.

Providing a number of Maximum Workers may result in higher running costs.
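
The worker settings can be summarised with a small check. This is a sketch only: the `scaling_settings` function is hypothetical, and the rule that the maximum must not be lower than the minimum is an assumption that this page does not state.

```python
from typing import Optional


def scaling_settings(min_workers: int, max_workers: Optional[int] = None) -> dict:
    """Summarise worker settings; a blank maximum is represented as None."""
    if min_workers < 0:
        raise ValueError("Minimum Workers cannot be negative")
    # Assumption: a maximum below the minimum is treated as invalid.
    if max_workers is not None and max_workers < min_workers:
        raise ValueError("Maximum Workers cannot be lower than Minimum Workers")
    return {
        "min_workers": min_workers,
        "max_workers": max_workers,
        # Per the text above, scaling applies only when a maximum is given.
        "autoscale": max_workers is not None,
    }


if __name__ == "__main__":
    print(scaling_settings(2))
    print(scaling_settings(2, 10))
```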

Container Image

You can specify a container image. Provide the name or path of your chosen container image.

container image path

Container Registry Connection

You can choose a connection that connects to a custom container registry, so that you can use container images that are not available from Docker Hub or other container libraries.

Select the connection to the Container Registry from the dropdown.

Container registry connection

Use a Custom VM Image (Optional)

Using a custom VM image is optional. If you do not need one, skip this section and submit the cluster definition as described below.

Check the Use a Custom VM Image? checkbox to expand the VM Image ID field.

When using a custom VM image, you need to use a managed identity on your ‘Azure Batch Account’ connection.

The Batch account must also have Reader permission on the shared image gallery or on the individual image.

Please note that currently, Azure Batch does not support the ‘TrustedLaunch’ feature. You must use the standard security type to create a custom image instead.

Azure Batch only supports Generalized Shared Images. This means when creating your VM Image, the OS System State needs to be Generalized.

When using a custom VM image with a User Assigned managed identity for your Azure Batch Account connection, that identity must be given reader access to the VM image.

Custom VM Image checkbox

Provide the VM Image ID. This is the Resource ID of the VM image version. You can copy it from the ‘Properties’ tab.

This should be in the form of /subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Compute/galleries/{computeGalleryName}/images/{imageName}/versions/{imageVersion}
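
Before pasting the ID into the form, it can be checked against the format above with a short script. This is a minimal sketch: the `parse_vm_image_id` helper is hypothetical, the sample ID in the usage section is made up, and case-insensitive matching is an assumption.

```python
import re

# Pattern derived from the documented form of the VM Image ID.
VM_IMAGE_ID_PATTERN = re.compile(
    r"^/subscriptions/(?P<subscription_id>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.Compute"
    r"/galleries/(?P<gallery>[^/]+)"
    r"/images/(?P<image>[^/]+)"
    r"/versions/(?P<version>[^/]+)$",
    re.IGNORECASE,  # assumption: segment names are compared case-insensitively
)


def parse_vm_image_id(resource_id):
    """Split a VM image version Resource ID into its parts, or raise ValueError."""
    match = VM_IMAGE_ID_PATTERN.match(resource_id.strip())
    if match is None:
        raise ValueError("Not a VM image version Resource ID: " + resource_id)
    return match.groupdict()


if __name__ == "__main__":
    # Made-up example ID, for illustration only.
    sample = (
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/my-rg/providers/Microsoft.Compute"
        "/galleries/myGallery/images/myImage/versions/1.0.0"
    )
    print(parse_vm_image_id(sample))
```

An ID that stops at the image name, without the `/versions/{imageVersion}` segment, is rejected by this check, which matches the requirement that the ID refers to a specific image version.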

Custom VM Image id field

Complete the Cluster Definition

Submit the cluster definition to save it. It will then be ready to use in your tasks.