RWTHjupyter

Service Description

With RWTHjupyter, the IT Center, in collaboration with the Institute for Automation of Complex Power Systems, provides all users with access to an interactive computing platform.

Costs

The cluster was initially funded by the project Digitale Lehr-/Lerninfrastrukturen DH NRW in 2019 and is therefore provided free of charge to all students and employees of the RWTH.

Hardware

The RWTHjupyter cluster consists of 7 Dell PowerEdge R740xd servers with the following configuration:

  • Dual Socket Systems: 2x 16C / 32T Xeon Gold 5218 2.3 GHz
  • Redundant Dual 10 GigE links
  • 768 GB DDR4 RAM / node (5,376 GB total)
  • 100 TB SSD Storage in Ceph

Additionally, 1 of the nodes is equipped with:

  • 2x NVIDIA Tesla T4 GPGPUs

Software Configuration

RWTHjupyter runs on a highly available Kubernetes cluster using the Zero to JupyterHub project.

News

2020-04-21: Go-live of RWTHjupyter test phase

2020-02-28: Kick-off meeting of the RWTH Jupyter Working group

2020-01-24: The first nodes of the cluster have been provisioned at ITC Wendlinge

2019-11: A GPU extension for the cluster has been specified and ordered

2019-10: The cluster has been specified and ordered

2019-06: The RWTHjupyter proposal has been funded by the project "Digitale Lehr-/Lerninfrastrukturen DH NRW"

Access

RWTHjupyter is accessible worldwide over the internet via jupyter.rwth-aachen.de.

Access to RWTHjupyter is granted via the university's Identity Management using a TIM-ID. By default, all employees and students are given access to the cluster without any additional registration.

The use of this cluster is limited to teaching related activities. Please consult the Terms of Use for details.

Terms of Use

§1 Subject of the rules of use

The following rules of use describe the general conditions for the use of the pilot project RWTHjupyter in higher education (especially for teaching students). RWTHjupyter is a web-based interactive computational environment for creating Jupyter notebook documents.

These rules are agreed upon for the use of the RWTHjupyter pilot project.

In the following, for a better understanding, the users are uniformly referred to as "users". Users can be divided into two subgroups - "teachers" and "students".

§2 Conditions of use

  1. The present rules refer exclusively to the use of the RWTHjupyter.
  2. The use of the RWTHjupyter requires an enrollment or employment at the RWTH, the use of a browser and the use of a client software.
  3. The use of the RWTHjupyter is restricted to teaching related activities. This includes theses, student projects or autodidactic activities.

§3 Costs

Use of the RWTHjupyter is free of charge; there is no legal claim to registration and use.

§4 Duties of care

The user is obliged to treat his access data confidentially and to protect it from access by third parties. The user must therefore take all necessary measures to ensure the security and confidentiality of the access data and passwords generated by him. In case of possible misuse of his access data, the user must inform the RWTH immediately. He is also responsible for the consequences of such misuse.

The user may not take any measures or use any software that could interfere with the functioning of the RWTHjupyter service or otherwise interfere with its availability.

In particular, the amount of data, the number of objects (number of files) and the number of simultaneous connections are subject to the specifications made previously or the standard parameters of the RWTHjupyter.

Deviating uses of the provided software (e.g. the use of encryption technologies) are the sole responsibility of the user.

In case of violation by the user, the access authorization of the user will be blocked.

Furthermore, the user must always observe and comply with the instructions given.

§5 Rights and duties of the user

  1. By using the RWTHjupyter, the user grants the RWTH the right to store the data and work results he/she has stored in the RWTHjupyter and to view them under the following conditions.

  2. The data and work results stored by the user in the RWTHjupyter may be inspected by the RWTHjupyter admins, if

    • the employment relationship between the user and the RWTH is terminated
    • or the user is not available for more than 1 week to disclose the stored data and work results himself;
    • or the user violates the terms and conditions of use and as a consequence is blocked from using the service.
  3. Rights of use and copyrights to the stored information remain unaffected.

  4. Access to the cluster may be suspended due to excessive use, crypto miners or other types of misuse.

§6 User Lifecycle

  1. Users must log in to RWTHjupyter again at the latest 12 months after their last login. During this process a check is made to see whether the access conditions for using the service are still met.
  2. If the access conditions (§2) after this period are not met, the access to the cluster will be disabled.
  3. In case the last user activity was more than 12 months ago, the cluster access will be disabled and the user account will be deactivated. The user will be notified via mail about the pending deletion of the account 6 months after the deactivation.
  4. In case the last user activity was more than 18 months ago, the user account will be finally deleted including all stored data. Shared folders are only deleted after all user accounts with access to the share have expired.

Users who have their account deactivated due to unmet access conditions can contact the administrators to regain access to their user data within the 18 month deletion period.

User Lifecycle Diagram

§7 Liability

The RWTH has unlimited liability in cases of intent or gross negligence, for injury to life, limb or health and according to the regulations of the Product Liability Act.

In case of slightly negligent violation of an obligation that is essential for the achievement of the purpose of the contract (cardinal obligation), the liability of the RWTH is limited to the amount of damage that is foreseeable and typical for the type of business in question.

A further liability of the RWTH does not exist.

The aforementioned limitation of liability also applies to the personal liability of employees, representatives and organs of the RWTH.

§8 Availability of the RWTHjupyter service

A permanent, trouble-free and/or unlimited availability of the service cannot be guaranteed or offered. Especially maintenance work or security aspects as well as force majeure and events beyond the control of the RWTH can lead to disturbances or a temporary suspension of the service.

§9 Data protection

The collection, use and application of personal data is carried out in accordance with the relevant data protection regulations. In particular, no personal data is passed on to third parties without authorisation.

When logging in to the JupyterHub cluster, the following information about the user's person is transferred by the RWTH's identity management system and stored in the user database of the service:

In addition, the following user-identifiable information will be stored for a period of 4 weeks and linked to the user's account:

  • Time stamps of singleuser container spawns and terminations
  • Timestamp of last user activity on the cluster

All user data will be deleted after a total of 18 months of user inactivity (see §6 User Lifecycle)

§10 Miscellaneous

  1. Changes and amendments to this contract must be made in writing. This also applies to the amendment or cancellation of this clause.
  2. General terms and conditions of the user shall not apply.
  3. German law shall apply to this contract to the exclusion of the United Nations Convention on Contracts for the International Sale of Goods of 11.4.1980 (UN Sales Convention).
  4. Place of performance is Aachen. Exclusive place of jurisdiction is Aachen, provided that each party is a merchant or legal entity under public law or has no general place of jurisdiction in Germany.
  5. Should individual provisions of this contract be invalid, this shall not affect the validity of the remaining provisions. The parties to the contract shall endeavour to find a valid provision in place of the invalid provision, which comes closest to the economic meaning of the invalid provision.

Frequently Asked Questions (FAQ)

Are there any resource quotas/limits enforced on singleuser containers on RWTHjupyter?

Yes, we currently provide each user container with a maximum of 64 GiB RAM, 32 CPU cores and 4 GiB of persistent storage space for your home directory.
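
To get a rough idea of how close a home directory is to the storage limit, the file sizes can be summed from within a notebook. The following is a minimal sketch: the names `directory_size` and `HOME_QUOTA_BYTES` are made up for illustration, and the way the cluster itself accounts for storage may differ. The shared mounts (materials, datasets, shared) are skipped on the assumption that they do not count towards the personal limit.

```python
import os

# 4 GiB, the home directory limit stated in this FAQ (illustrative constant).
HOME_QUOTA_BYTES = 4 * 1024**3


def directory_size(path, exclude=("materials", "datasets", "shared")):
    """Sum the sizes in bytes of all regular files below path.

    Top-level directories listed in exclude are not descended into.
    """
    path = os.fspath(path)
    total = 0
    for root, dirs, files in os.walk(path):
        if root == path:
            # Prune the excluded top-level folders in place.
            dirs[:] = [d for d in dirs if d not in exclude]
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks and files that vanished in the meantime.
            if os.path.islink(file_path) or not os.path.isfile(file_path):
                continue
            total += os.path.getsize(file_path)
    return total
```

Calling `directory_size(os.path.expanduser("~"))` and dividing the result by `HOME_QUOTA_BYTES` gives the approximate fraction of the limit in use.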

Can I use RWTHjupyter for my thesis/research/hobby project?

Please consult the Terms of Use for details.

Can I use my own Jupyter environment or install additional packages / kernels?

This is only possible to a limited extent. We encourage all instructors and professors to apply for a customized profile for their course.

For students who wish to use RWTHjupyter outside of their courses, we recommend one of the generic kernel profiles. These generic profiles can be customized by installing additional packages via pip and conda. By default, added packages are not persistent and are only available until the next spawn of your Jupyter container. You can work around this by installing packages into your home directory:

pip install --user pandas

You can also add this line into a Jupyter Notebook cell by prefixing it with an exclamation mark:

!pip install --user pandas
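
The same installation can also be started from Python code, which makes sure that pip belongs to the interpreter of the running kernel. This is a minimal sketch; the helper names `user_install_command` and `install_for_user` are made up for illustration.

```python
import site
import subprocess
import sys


def user_install_command(package):
    """Build the pip command which installs package into the user site of this interpreter."""
    return [sys.executable, "-m", "pip", "install", "--user", package]


def install_for_user(package):
    """Run the installation; raises CalledProcessError if pip fails."""
    subprocess.run(user_install_command(package), check=True)


# Directory below the home directory in which --user installations end up.
print(site.getusersitepackages())
```

Calling `install_for_user("pandas")` corresponds to the commands above. A restart of the kernel may be needed before a freshly installed package can be imported.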

It is currently not possible to load custom conda environments.

How can I rebuild the Docker image which is used for my profile?

Please visit the following site and select your profile to trigger a new build of the Docker image which is used to spawn single user containers:

https://jupyter.rwth-aachen.de/services/profile/

Note: You need to hold the manager role of the profile which you want to rebuild. Please contact us if you don't have the role yet.

What is the purpose of the shared, materials and datasets directories in my home directory?

Please have a look at the following dedicated page: Shared Folders

How can I share a notebook with colleagues, friends, partners or my instructor?

Please have a look at the following dedicated page: Links

Can RWTHjupyter be used by external partners?

Yes. Please use the RWTH IdM Partner Manager to sponsor a RWTH partner account. Partner accounts can access RWTHjupyter but might have reduced privileges or priorities.

How can I create a perma-link to open a specific profile and/or Jupyter notebook?

Please have a look at the following dedicated page: Links

This is cool! How can I contribute or improve RWTHjupyter?

We host most of our code, configuration and more on the RWTH GitLab instance.

Please feel free to contribute by submitting merge requests.

We are also looking for HiWis to support us in improving this service. Feel free to get in touch with us.

Who is behind RWTHjupyter?

The RWTHjupyter infrastructure was created in collaboration between the Institute for Automation of Complex Power Systems (ACS) and the IT Center of RWTH Aachen University.

I've lost my changes to my notebooks. Changes to the course's notebooks are not synchronized properly.

We use a Jupyter extension called nbgitpuller to sync Jupyter notebooks. nbgitpuller uses an "automatic merging behaviour" to sync changes between your local home directory and the upstream Git repo. Please consult the nbgitpuller documentation for details about this merging behaviour.

How can I use GPUs on RWTHjupyter?

Currently, GPUs are only available to selected courses. Please contact us if you wish to use GPU resources.

Instructions

Links

Open a Jupyter environment

  1. Please visit the webpage of the cluster at: https://jupyter.rwth-aachen.de
  2. Click on "Sign in with Shibboleth" and use your RWTH TIM-ID to log in
  3. Choose a profile from the list and click on "Start" at the bottom of the page
    • Alternatively, you can use the search function to narrow down the selection of profiles
  4. Wait until your personal Jupyter environment has been spawned. This can take between 5 and 60 seconds.

Shared folders

By default, every user sees three special folders in their home directory:

  • ~/materials
  • ~/datasets
  • ~/shared

These folders are additional volume mounts and share the same content between all users. They are read-only, with the exception of ~/shared, which can be used as a global scratch pad to exchange files between users on the cluster.
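
Whether the three folders are present and which of them are writable can be checked from a notebook. The following is a minimal sketch using only the standard library; the function name `folder_access` is made up for illustration.

```python
import os
from pathlib import Path

SPECIAL_FOLDERS = ("materials", "datasets", "shared")


def folder_access(home):
    """Map each special folder name to 'missing', 'read-only' or 'writable'."""
    result = {}
    for name in SPECIAL_FOLDERS:
        folder = Path(home) / name
        if not folder.is_dir():
            result[name] = "missing"
        elif os.access(folder, os.W_OK):
            result[name] = "writable"
        else:
            result[name] = "read-only"
    return result
```

On the cluster, `folder_access(Path.home())` would be expected to report only shared as writable.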

materials

This folder contains a selection of Jupyter notebooks which can be used to explore the capabilities of Jupyter Lab. They are also well suited as a starting point for new lectures.

The example materials are managed by a dedicated Git repository. Pull requests to add new materials are welcome:

https://git-ce.rwth-aachen.de/jupyter/example-materials

datasets

This folder contains a collection of well-known machine learning datasets. Due to their size, they are shared between all users.

Feel free to contact us if you wish to add your own examples or datasets.

shared

This folder can be used as a global scratch pad to exchange files between all users of the cluster. All users can read, write, modify and delete files from this folder. No backups are made!

Group Shares / Sciebo Integration

We are currently working on a solution which allows restricting shared folders to a smaller user group, and possibly on the integration of external Sciebo shares. Please feel free to contact us if you are interested in this feature.

Share Notebooks with other users

RWTHjupyter supports several mechanisms to share notebooks between users.

By manual exchange of .ipynb files

The easiest way of sharing notebooks with other users is by downloading/uploading the .ipynb files via the menu in JupyterLab. A downloaded file can then be emailed to colleagues, who can open it in their own JupyterLab (just drag and drop the file into the file browser of JupyterLab).

By using a shared folder

In addition, each user has access to a shared/ folder in their home directory by default. This folder is shared between all users and can therefore be used well for sharing notebooks.

Attention: Please note that all users have access to this folder and can modify or delete the notebooks there as they see fit.

We are still working on a way to also allow private shared folders, similar to what you might already know from Sciebo. However, we cannot make a concrete statement about availability yet.

By granting the other user full access to your Jupyter server

Users can share access to their running Jupyter server by sharing a link with other users.

Attention: Only share the link with trusted friends, colleagues and partners, as it grants the receiving user full access to your Jupyter account!

Copy the following code snippet into your notebook to craft a shareable link. Please note that you can provide a path (optional) and an expiration date for the link.

import rwth_nb.misc.share as share
import rwth_nb.misc.notebook as nb

from IPython.display import HTML

# Create a link to the current notebook which expires after 24 hours
url = share.get_shareable_url(path=nb.get_current_path(), note='access for my colleague at IKS', expires_in=24*60*60)

# Render the link as a clickable anchor in the cell output
HTML(f'<a href="{url}">{url}</a>')
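
Longer or shorter lifetimes are easier to read when they are derived from a `timedelta`. This is a minimal sketch with a made-up helper name `seconds`; it assumes that `expires_in` is given in seconds, as the value `24*60*60` in the snippet above suggests.

```python
from datetime import timedelta


def seconds(**duration):
    """Convert timedelta keyword arguments (days, hours, ...) to whole seconds."""
    return int(timedelta(**duration).total_seconds())


print(seconds(hours=24))  # 86400
print(seconds(days=7))    # 604800
```

The result can be passed as `expires_in`, for example `expires_in=seconds(days=7)`.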

Launch Profiles and Notebooks via Links

This page shows a few examples of permanent links for accessing profiles or notebooks in the Jupyter cluster. They can be used in your README.md, websites or Moodle.

Link creation wizard

Feel free to use the link creation wizard to create links following the schemes described below.

Spawn and open Notebook

Opens a specific notebook using the default profile (Python) or your running Jupyter server.

Schema: https://jupyter.rwth-aachen.de/user-redirect/lab/tree/{{ notebook_path }}
Example: https://jupyter.rwth-aachen.de/user-redirect/lab/tree/gdet3/GDET3%20Faltung%20GUI.ipynb

Arguments

  • notebook_path: the path to the notebook which should be opened. Please note that this path must be URL encoded.
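
The URL encoding can be done with Python's standard library. The following minimal sketch reproduces the example link above; the function name `notebook_link` is made up for illustration.

```python
from urllib.parse import quote

BASE_URL = "https://jupyter.rwth-aachen.de"


def notebook_link(notebook_path):
    """Return a link which opens notebook_path, percent-encoding everything except '/'."""
    return f"{BASE_URL}/user-redirect/lab/tree/{quote(notebook_path)}"


print(notebook_link("gdet3/GDET3 Faltung GUI.ipynb"))
# https://jupyter.rwth-aachen.de/user-redirect/lab/tree/gdet3/GDET3%20Faltung%20GUI.ipynb
```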

Spawn with a specific profile

This link spawns a specific profile.

Note: If a server is already running with a different profile, no action will be taken.

Schema: https://jupyter.rwth-aachen.de/hub/spawn?profile={{ profile_slug }}
Example: https://jupyter.rwth-aachen.de/hub/spawn?profile=gdet3

Arguments

  • profile_slug: a short alphanumeric identifier for the profile. Please take a look at the profile list for available profiles.

Spawn with a specific profile and open a notebook

This version combines the first two types of links:

Schema: https://jupyter.rwth-aachen.de/hub/spawn?profile={{ profile_slug }}&next=/user-redirect/lab/tree/{{ notebook_path }}
Example: https://jupyter.rwth-aachen.de/hub/spawn?profile=gdet3&next=/user-redirect/lab/tree/gdet3/GDET3%20Faltung%20GUI.ipynb
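
Links of this combined form can be generated for a whole set of notebooks, for example when preparing a Moodle course. This is a minimal sketch which follows the schema above literally; the function name `spawn_link` is made up for illustration.

```python
from urllib.parse import quote

BASE_URL = "https://jupyter.rwth-aachen.de"


def spawn_link(profile_slug, notebook_path=None):
    """Return a spawn link for a profile, optionally opening a notebook afterwards."""
    url = f"{BASE_URL}/hub/spawn?profile={quote(profile_slug, safe='')}"
    if notebook_path is not None:
        # The next parameter is written as in the schema above: only the
        # notebook path itself is percent-encoded.
        url += f"&next=/user-redirect/lab/tree/{quote(notebook_path)}"
    return url


print(spawn_link("gdet3", "gdet3/GDET3 Faltung GUI.ipynb"))
```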

Specify username and server name

Using this slightly extended version, a named server for a specific user can be spawned. Please note that user impersonation is only available for administrators.

A normal user can start up to 10 named servers, e.g. to use several profiles in parallel.

Schema: https://jupyter.rwth-aachen.de/hub/spawn/{{ user_name }}/{{ server_name }}?profile={{ profile_slug }}
Example: https://jupyter.rwth-aachen.de/hub/spawn/vzi3jsam/my_et3_server?profile=gdet3

Arguments

  • server_name: an alphanumeric identifier for the named server

Badge

You can use the following Markdown or HTML snippets to embed a badge into your README.md files or Moodle activities for launching a Notebook:

Example

Markdown

[![](https://jupyter.pages.rwth-aachen.de/documentation/images/badge-launch-rwth-jupyter.svg)](https://jupyter.rwth-aachen.de/hub/spawn?profile=pti&next=/user-redirect/lab/tree/pti/index.ipynb)
[![](https://mybinder.org/static/images/badge_logo.svg)](https://mybinder.org/v2/git/https%3A%2F%2Fgit.rwth-aachen.de%2FIENT%2Fpti/master?urlpath=lab/tree/index.ipynb)

HTML

<a href="https://jupyter.rwth-aachen.de/hub/spawn?profile=pti&next=/user-redirect/lab/tree/pti/index.ipynb"><img src="https://jupyter.pages.rwth-aachen.de/documentation/images/badge-launch-rwth-jupyter.svg" /></a>
<a href="https://mybinder.org/v2/git/https%3A%2F%2Fgit.rwth-aachen.de%2FIENT%2Fpti/master?urlpath=lab/tree/index.ipynb"><img src="https://mybinder.org/static/images/badge_logo.svg" /></a>
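
When many notebooks need a badge, the snippets can be generated instead of written by hand. The following is a minimal sketch using the RWTHjupyter badge image from the example above; the function names `badge_markdown` and `badge_html` are made up for illustration.

```python
import html

BADGE_URL = "https://jupyter.pages.rwth-aachen.de/documentation/images/badge-launch-rwth-jupyter.svg"


def badge_markdown(launch_url):
    """Return a Markdown image link which shows the badge and points to launch_url."""
    return f"[![]({BADGE_URL})]({launch_url})"


def badge_html(launch_url):
    """Return the equivalent HTML snippet; '&' in the link is escaped as '&amp;'."""
    href = html.escape(launch_url, quote=True)
    return f'<a href="{href}"><img src="{BADGE_URL}" /></a>'


link = "https://jupyter.rwth-aachen.de/hub/spawn?profile=pti&next=/user-redirect/lab/tree/pti/index.ipynb"
print(badge_markdown(link))
print(badge_html(link))
```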

Instructions for Instructors

Update Notebooks included in the profile

In case you provided a URL to a Git repository while requesting your profile, you can update the contents of the repository at any time using standard Git workflows.

The contents of your repository will be synchronized with the user's local copy in their RWTHjupyter home directory each time the profile is started. We use nbgitpuller to perform this synchronization.

Please note that users need to completely restart their Jupyter service before changes become available in their Jupyter instance. To force a restart, perform the following steps:

  1. Stop your current server by clicking on "Stop my server" on the JupyterHub home page.
  2. Wait a minute, then reload the page
  3. Start the server again by selecting a profile.

Alternatively, you can force a synchronization while the Jupyter server is running:

  1. Access your (running) Jupyter server at: https://jupyter.rwth-aachen.de/.
  2. Open a terminal: File -> New -> Terminal
  3. Run the following command: bash /scripts/start.sh

Rebuild Docker image for your profile

All changes to the repository which affect the runtime environment (installed Jupyter kernels, Conda's environment.yml, Pip's requirements.txt) require a rebuild of the underlying Docker image of your profile.

Such a rebuild needs to be triggered manually by the profile manager:

  1. Check if you have the manager role for your profile by visiting: https://jupyter.rwth-aachen.de/services/whoami/
    • Ensure that you are a member of a group named: manager-{profile_slug}.
  2. Visit the following service to trigger a rebuild: https://jupyter.rwth-aachen.de/services/profile/
  3. Select your profile from the drop-down
  4. Press the button "Trigger new build of Docker image"
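
The group check from the first step can be expressed in a few lines of Python. This is only a sketch: it assumes that the whoami service returns JSON with a "groups" list, as in JupyterHub's user model, which is not confirmed by this documentation. The function names and the example response are made up for illustration.

```python
def manager_group(profile_slug):
    """Name of the group whose members may rebuild the given profile."""
    return f"manager-{profile_slug}"


def is_profile_manager(whoami, profile_slug):
    """Check the groups of a parsed whoami response for the manager group."""
    # Assumption: the response carries a "groups" list.
    return manager_group(profile_slug) in whoami.get("groups", [])


example = {"name": "ab123456", "groups": ["manager-gdet3"]}  # made-up response
print(is_profile_manager(example, "gdet3"))  # True
print(is_profile_manager(example, "pti"))    # False
```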

New Profiles

The following page documents the required steps for integrating new lectures or laboratories into the RWTHjupyter cluster.

We recommend that each course which plans to use Jupyter creates a new profile in the cluster. After login, users are greeted with a profile selection page from which they can choose their desired environment. We will also add a perma-link feature which allows you to generate a link that directly opens a particular profile and notebook. These links are a convenient way to open Jupyter notebooks from your Moodle course. A profile needs to be prepared and installed in the cluster before it can be used.

It consists of:

  • A name identifying purpose of the profile (or the lecture title)
  • A Git repository containing Jupyter notebooks which should be imported into the workspace by default
  • A Dockerfile specifying the runtime environment and optional third-party Python packages

We provide a list of generic profiles for the most common programming languages but offer the possibility to add new profiles for lectures including customizations like custom third-party dependencies, Jupyter kernels, etc.

Separation of runtime environment and Jupyter notebooks

A key point during the preparation of a new profile is the separation of the runtime environment from the notebook content.

We use nbgitpuller to regularly synchronize Jupyter notebooks from a Git repository with the user's workspace. This allows instructors to gradually release notebooks alongside the timeline of the lecture. Each user maintains their own private clone of this Git repository and is able to make changes to the notebooks which will be persisted between sessions.

In contrast, the runtime environment is supposed to be more or less static and should not require frequent updates. All users of a profile share the same runtime environment. Changes outside the home directory are not persistent and will be lost after logout.

Warning for Git repos

Students will get full read-only access to the Git repository you provide. In case you provide us with a token to synchronize private/internal repos, the students will also see this token. This means: only share read-only tokens for repositories which don't contain sensitive information (also in other branches), as students will be able to access the repo.

Procedure

In general, the following steps are required for adding a new profile to the cluster:

  1. Setup a local test environment for preparing your Jupyter Environment
  2. Choose a Jupyter Kernel
  3. Create a new Git repository for Jupyter Notebooks and runtime environment
  4. Prepare Jupyter Notebooks
  5. Collect Python/Third-Party requirements
  6. Adapt Dockerfile
  7. Submit request for inclusion in profile list
  8. Wait until request is reviewed and the new profile is included in the list

Steps 4 to 6 are optional and only required if the course requires a non-standard Jupyter kernel or special Python dependencies.

1. Setup a local test environment for preparing your Jupyter Environment

Please follow the instructions from the official Jupyter website: https://jupyter.org/install

For Windows users, we recommend using the Anaconda Python distribution. Linux and macOS users can use Anaconda as well or simply rely on pip installed via the system's package manager.

2. Choose a Jupyter Kernel

Jupyter supports a variety of programming languages via different kernels. The original and most commonly used kernel uses the IPython interpreter, but others exist as well. An up-to-date list of kernels can be found in the Jupyter wiki.

In principle, almost all of the existing kernels can be used on jupyter.rwth-aachen.de by creating custom runtime environments. Please consult the documentation for the installation of other kernels and add these instructions to the Dockerfile (step 6).

However, in most cases the standard IPython kernel is used and no further steps are necessary.

3. Create a new Git repository for Jupyter Notebooks and runtime environment

We recommend using the RWTH Software Engineering Services / GitLab to manage, track and distribute Jupyter notebooks and the definition of the required runtime environment.

This allows for continuous delivery of updates to Jupyter and invites contributions by students to the Jupyter notebooks.

You can either use an existing Git repository, which might already exist for a lecture, or create a new one by forking our example profile.

Please use the following link to create a new fork of the example profile: create fork of example profile.

After the fork has been completed, you can clone the fresh repo to your local machine:

git clone git@git.rwth-aachen.de:/stvogel/my-new-course.git
cd my-new-course

4. Prepare Jupyter Notebooks

A new profile can include a set of Jupyter notebooks which come along with it. These notebooks will be synchronized from a Git repository every time a user enters the Jupyter environment.

As such, the collection of notebooks for a course can be expanded during the course of the semester.

To start, please launch Jupyter on your local machine:

jupyter lab

A new browser window will open and present the Jupyter web interface, where you should already see the list of existing files in the current Git repository. Add new notebooks and fill them with content as you wish.

5. Collect Python requirements

Depending on the contents of the Jupyter Notebook, additional third-party Python packages might be required. Usually these external dependencies are collected in a requirements.txt or environment.yml file:

Please have a look at the following links for further information:
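
To get a first overview of which packages the notebooks need, their import statements can be collected automatically. The following is a minimal sketch using only the standard library; the function name `imported_modules` is made up for illustration. Note that module names do not always match the package names used by pip or conda, and standard library modules are listed as well, so the result is only a starting point for a requirements.txt or environment.yml.

```python
import ast
import json
from pathlib import Path


def imported_modules(notebook_path):
    """Return the sorted top-level module names imported by the code cells of a notebook."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    modules = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [line for line in source.splitlines()
                 if not line.lstrip().startswith(("%", "!"))]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            continue  # skip cells which still cannot be parsed
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return sorted(modules)
```

For a repository with several notebooks, the function can be applied to every file found by `Path(".").rglob("*.ipynb")` and the results merged.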

6. Adapt Dockerfile

Some profiles (e.g. when using special Jupyter Kernels) may require additional modifications to the runtime environment beyond the installation of new Python requirements.

To accommodate these, you can modify the included Dockerfile to run arbitrary commands during the preparation of the environment. Note that these commands are only executed during the Docker build phase.

7. Publish Jupyter Notebooks and Dockerfile in Git repository

After completing steps 2.-4., the resulting changes need to be committed to the Git repo and published on a Git hoster (e.g. RWTH GitLab or GitHub):

git add .
git commit -m "first version of new profile"
git push

8. Open request for creation of new RWTHjupyter profile

Please open a new request for the inclusion of your profile into the RWTHjupyter cluster using the following link: Create new profile

The link above will open a form for submitting a profile request via the RWTH GitLab system. Please follow the provided template and give us a few days to review your request before we make it available in the cluster.

Support

Support for RWTHjupyter is provided on a best-effort basis by the RWTH Jupyter user-group. Please note that it is not yet provided as an official service by the RWTH IT Center.

You can contact the group via its mailing list: jupyter@lists.rwth-aachen.de. A subscription to this list is possible via the following link: https://lists.rwth-aachen.de/postorius/lists/jupyter.lists.rwth-aachen.de/

You can also contact the administrators of the cluster via a dedicated closed list: jupyter-admin@lists.rwth-aachen.de.

Legal

Imprint

Please refer to the imprint of the IT Center of RWTH Aachen University.

Data Privacy

Please see Terms of Use § 9 until further notice.