Remote sessions
From version 2.9.0, Renku supports running remote sessions on compute infrastructure where workloads can be started on behalf of users with an API.
At the moment, Renku only supports starting remote sessions on High-Performance Computing (HPC) resources using the FirecREST API.
The technical details of remote sessions are explained on the Amalthea repository.
Configuring remote sessions at CSCS
This section describes how to set up remote sessions at CSCS, where the FirecREST API is used to start them. The example can be adapted to let users of your Renku instance start sessions on another HPC infrastructure.
1. Prerequisites
As a Renku admin, you must first create or obtain the OAuth 2.0 configuration which will be used to connect Renku users with CSCS.
2. Add a new integration
The first step is to add a new integration using the OAuth 2.0 configuration by clicking "Add Service Provider".
Then, fill in the "Add provider" form as follows:
- Id: cscs.ch
- Kind: Generic OIDC
- Application slug: leave blank
- Display Name: CSCS
- URL: https://cscs.ch
- Use PKCE: leave unchecked
- Client ID: fill in the CLIENT_ID (OAuth 2.0 configuration)
- Client Secret: fill in the CLIENT_SECRET (OAuth 2.0 configuration)
- Scope: default offline_access openid profile email
- Image registry URL: leave blank
- OpenID Connect Issuer URL: https://auth.cscs.ch/auth/realms/cscs
Once the new integration has been created, you should test it by clicking "Connect" on the corresponding entry in the "Integration" page (see testing integrations).
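If the connection test fails, it can help to confirm that the issuer URL entered in the form is reachable and consistent. The following is a minimal sketch, not part of Renku: it relies only on the standard OpenID Connect Discovery convention that a provider publishes its metadata at `<issuer>/.well-known/openid-configuration`. The function names `discovery_url`, `fetch_metadata` and `check_issuer` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request


def discovery_url(issuer: str) -> str:
    """Return the OIDC discovery document URL for the given issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def fetch_metadata(issuer: str, timeout: float = 10.0) -> dict:
    """Download and parse the provider metadata (performs a network request)."""
    with urllib.request.urlopen(discovery_url(issuer), timeout=timeout) as resp:
        return json.load(resp)


def check_issuer(issuer: str, metadata: dict) -> list[str]:
    """Return a list of problems found in the metadata (empty if none)."""
    problems = []
    # The discovery spec requires the advertised issuer to match exactly.
    if metadata.get("issuer") != issuer:
        problems.append(f"issuer mismatch: {metadata.get('issuer')!r}")
    # Endpoints needed for the authorization code flow.
    for key in ("authorization_endpoint", "token_endpoint"):
        if key not in metadata:
            problems.append(f"missing {key}")
    return problems
```

For example, `check_issuer(issuer, fetch_metadata(issuer))` with `issuer = "https://auth.cscs.ch/auth/realms/cscs"` should return an empty list when the issuer URL in the form is correct.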
3. Add resource pools which start remote sessions
The next step is to create resource pools which start remote sessions.
The admin panel does not yet support this feature, so use the Swagger page instead
(usually located at https://<your-renku-domain>/swagger/).
Under the resource_pools section, find the POST /resource_pools API endpoint and
send the following request body:
{
  "quota": {
    "cpu": 2560,
    "memory": 5120,
    "gpu": 0
  },
  "classes": [
    {
      "name": "eiger",
      "cpu": 256,
      "memory": 512,
      "gpu": 0,
      "max_storage": 10,
      "default_storage": 1,
      "default": true,
      "tolerations": [],
      "node_affinities": []
    }
  ],
  "name": "CSCS - Eiger - Debug",
  "public": false,
  "default": false,
  "remote": {
    "kind": "firecrest",
    "provider_id": "cscs.ch",
    "api_url": "https://api.cscs.ch/hpc/firecrest/v2/",
    "system_name": "eiger",
    "partition": "debug"
  }
}
This creates a resource pool which is configured to start remote sessions with the following details:
- The remote.system_name field specifies that session jobs will be submitted to the eiger cluster (see Eiger).
- The remote.provider_id field specifies that Renku will use the cscs.ch integration to submit the session job on behalf of the user.
- The remote.api_url field specifies the FirecREST API URL.
- The optional remote.partition field specifies the SLURM partition to use (here, the debug one).
More resource pools can be configured to give access to different HPC clusters.
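When several pools are needed, the request body can be generated rather than edited by hand. The sketch below is one possible way to do this with the Python standard library. It reuses the field values from the example request; the helper names `build_pool_body` and `build_request`, the API base URL, and the use of a bearer token for authentication are assumptions for illustration, so check the Swagger page of your instance for the actual base path and authentication scheme.

```python
import json
import urllib.request


def build_pool_body(system_name, partition=None, *, name,
                    provider_id="cscs.ch",
                    api_url="https://api.cscs.ch/hpc/firecrest/v2/"):
    """Build a POST /resource_pools body for one remote (FirecREST) pool."""
    remote = {
        "kind": "firecrest",
        "provider_id": provider_id,
        "api_url": api_url,
        "system_name": system_name,
    }
    # The partition field is optional, so only include it when given.
    if partition is not None:
        remote["partition"] = partition
    return {
        "quota": {"cpu": 2560, "memory": 5120, "gpu": 0},
        "classes": [{
            "name": system_name,
            "cpu": 256,
            "memory": 512,
            "gpu": 0,
            "max_storage": 10,
            "default_storage": 1,
            "default": True,
            "tolerations": [],
            "node_affinities": [],
        }],
        "name": name,
        "public": False,
        "default": False,
        "remote": remote,
    }


def build_request(base_url, token, body):
    """Prepare (but do not send) the POST request.

    base_url and the bearer-token header are assumptions; adapt them
    to what the Swagger page of your Renku instance shows.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/resource_pools",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Calling `build_pool_body("eiger", "debug", name="CSCS - Eiger - Debug")` reproduces the request body from the example; passing the prepared request to `urllib.request.urlopen` would send it.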
4. Test the remote sessions
To test that remote sessions at CSCS work properly, first do the following:
- Open the admin panel and add yourself to the remote resource pools you just created.
- Make sure you have an account at CSCS with at least one active project (e.g. you can start jobs with the srun command).
Create a new project and then create a new session launcher using the ghcr.io/swissdatasciencecenter/renku/py-datascience-jupyterlab image.
If the py-datascience-jupyterlab image is not available as one of the global environments, you can create a new external environment with
the following settings:
- Container image: ghcr.io/swissdatasciencecenter/renku/py-datascience-jupyterlab
- Default URL: /
- Mount directory: /home/renku/work
- Work directory: /home/renku/work
- Port: 8888
- GID: 1000
- UID: 1000
- Command: leave blank
- Args: leave blank
- Strip path prefix: leave unchecked
At the next step, select "Session launcher compute resources" and pick one of the remote resource pools you created in step 3.
Now you can launch the session.
Troubleshooting
If the session fails to start because the SLURM account is incorrect, you can specify it as an environment variable in the session launcher.
Open the session launcher's off-canvas, then scroll down to the "Environment Variables" section and add the following:
SLURM_ACCOUNT:<your CSCS group>