Information on HPC

Dear HPC-Team,

My name is Sarah Oberbichler and I am Assistant Professor in Artificial Intelligence and History at the C2DH. I am looking for ways to make the use of AI models at our centre more privacy-preserving and environmentally friendly. I have seen this documentation: https://hpc-docs.uni.lu/services/jupyter/. I used something very similar when I worked at the University of Mainz in Germany; however, it was only for me and my team. I would like to propose providing institutional GPU access through JupyterLab, enabling users to pull models via Ollama or Hugging Face and use them. I would be very interested in discussing the technical possibilities and exploring the available options.

Thank you very much for your time,
Sarah Oberbichler

Dear Sarah,

There are a few requirements that characterise the interactive use of AI models on HPC systems. The models need access to

  • a lot of resources, for
  • an indeterminate amount of time, with
  • low latency, and
  • through specific application interfaces.

When using an HPC system you have to work within some constraints. We maintain a scheduler that assigns resources to a job for a fixed amount of time, to ensure that resources are used efficiently. You can reduce the amount of engineering required to support your jobs by dropping some of your requirements. For instance, if you drop the requirement that your job starts with low latency, we already have the Open OnDemand portal from which you can launch Jupyter notebooks. Your support engineers can configure special notebooks that come preloaded with the models that you need, and we can add them to our system.
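To illustrate the fixed-time allocation model, a scheduler job behind such a notebook might look roughly like the sketch below. This is a generic Slurm example, not our actual configuration: the partition, module, and resource names are placeholders that will differ on our cluster, and Open OnDemand generates and proxies the equivalent of this for you automatically.

```shell
#!/bin/bash -l
# Hypothetical Slurm batch script (illustrative only): the scheduler
# grants the resources for a fixed wall time, after which the job ends.
#SBATCH --job-name=jupyter-gpu     # placeholder job name
#SBATCH --partition=gpu            # placeholder partition name
#SBATCH --gres=gpu:1               # one GPU for the interactive session
#SBATCH --cpus-per-task=8          # example CPU request
#SBATCH --mem=64G                  # example memory request
#SBATCH --time=04:00:00            # fixed allocation: 4 hours, then termination

# Load a Python/Jupyter environment (module name is an assumption)
module load lang/Python

# Start a Jupyter server on the compute node; Open OnDemand automates
# this step and proxies the connection to your browser.
jupyter lab --no-browser --ip=0.0.0.0 --port=8888
```

The key point is the `--time` line: the scheduler can plan around a job with a known end, which is what keeps overall utilisation high.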

However, there have been some requests from C2DH researchers that stress the limits of our capabilities. For instance, there is a request to run a RAG pipeline with DeepSeek-V3.2 (695B parameters). This will require at least 3 of our largest GPUs in our Grid’5000 partition, which does not support Open OnDemand. We cannot ensure that these GPUs will always be instantly available to launch a probabilistic language model for interactive use. Keeping resources on standby prevents other users from using them, and we are really trying to maintain high resource utilisation (throughput over latency) to justify the cost of investing in new hardware.

If you really need something beyond what we can offer at the moment, with Open OnDemand for instance, then there are options.

  • You can have exclusive access to hardware in the HPC centre: the scheduler offers the option to preempt other users’ jobs in specialised queues. If you can buy the hardware that you require, we can host it in the HPC centre with fast access to the network and shared storage, and we can give you preemption rights so that your jobs start with no latency.
  • If you want us to set up a low-latency queue for your jobs, there are options to orchestrate job launches through Kubernetes, but it will require significant engineering to ensure that users have a truly seamless experience when running interactive jobs.

Could you give us some more detail on how you imagine your resource usage, keeping in mind the particularities of the cluster scheduler?

Dear Georgios,

Thank you so much for your quick reply and for clarifying the possibilities and constraints. I didn’t know about the Open OnDemand portal. I have used OnDemand (Mogon) in Germany, and I was very happy with it.

Most of the time, there is no need to use models as large as DeepSeek-V3.2, and keeping those requests separate is only fair. The intention behind my request is to provide access to local and specialized models (maybe up to 70B parameters, or even less) without significant barriers. If researchers can access a readily available platform and test different models, this would definitely help create the conditions for a more responsible use of AI.

Do you think there is a way to set up a dedicated notebook with access to Ollama and Hugging Face models on Open OnDemand for us as a group? I’m not sure whether the system here works the same way as the one I used in Germany, but there I had a shared notebook to which many other people could be added.
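To make the Ollama part concrete, inside such a session the workflow could look roughly like the following sketch. This assumes the `ollama` binary is available on the compute node (as a module or local install), and the model name and storage path are examples only, not recommendations:

```shell
# Hypothetical sketch: serving a local model with Ollama inside a job.
# Assumes ollama is installed on or available to the compute node.

# Keep model weights on shared storage so they are pulled once and
# reused across sessions (the path is an example, not a real location).
export OLLAMA_MODELS="$SCRATCH/ollama-models"

# Start the Ollama server in the background on the compute node.
ollama serve &
sleep 5

# Pull a mid-sized open model (example name) and query it interactively.
ollama pull llama3.1:70b
ollama run llama3.1:70b "Summarise the main argument of this text: ..."
```

The advantage of a shared model directory is exactly the low-barrier experimentation described above: group members can try different models without each of them re-downloading tens of gigabytes of weights.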

The reason I think this is important: Researchers often don’t know from the beginning what they need and want to experiment before requesting a specialized setup. Thank you very much!