Nvidia GPU
GPU consumption is measured in GPU-seconds. It's essential to understand that a GPU-second refers to one second for which one GPU is allocated, not to its utilization: a job that holds two GPUs for ten minutes consumes 1,200 GPU-seconds even if the GPUs sit idle. This is distinct from burstable CPU or memory, where consumption reflects actual usage. GPU-seconds are not included in the Resource Packages and always incur extra charges.
At the moment, Puzl supports NVIDIA A100 (40GB) GPUs only. The number of GPUs available for request is defined by each Resource Package.
How to Request Nvidia GPU
To leverage GPUs for your pipeline jobs in Puzl, specify the KUBERNETES_GPU_REQUEST variable in your .gitlab-ci.yml:

```yaml
variables:
  KUBERNETES_GPU_REQUEST: 2
```
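In a full pipeline configuration, the variable is set per job. Below is a minimal sketch; the job name, image, and script are illustrative assumptions, not Puzl defaults:

```yaml
# Hypothetical job; any CUDA-enabled image works the same way.
train-model:
  image: nvidia/cuda:12.2.0-base-ubuntu22.04
  variables:
    KUBERNETES_GPU_REQUEST: 2   # allocate two A100 GPUs to this job
  script:
    - nvidia-smi -L             # list the GPUs visible to the container
```

Keep in mind that billing starts as soon as the GPUs are allocated, so request only as many as the job actually needs.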
How to Distribute GPUs Across Containers
Puzl allows you to distribute GPU resources across multiple containers within a single pipeline job. By default, every container in a job has access to all the requested GPUs. However, you can restrict or specify which GPUs are visible to each container using the NVIDIA_VISIBLE_DEVICES variable.
For example, in the scenario below:
```yaml
variables:
  KUBERNETES_GPU_REQUEST: 3
  NVIDIA_VISIBLE_DEVICES: "0,1"

services:
  - name: nvidia/gpu-inference
    variables:
      NVIDIA_VISIBLE_DEVICES: "2"
```
The main build container will have access to GPUs with indexes 0 and 1, while the nvidia/gpu-inference service container will only have access to the GPU with index 2.
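To confirm the split at runtime, each container can list its visible devices. A minimal sketch extending the example above, assuming the images include the nvidia-smi tool:

```yaml
variables:
  KUBERNETES_GPU_REQUEST: 3
  NVIDIA_VISIBLE_DEVICES: "0,1"

services:
  - name: nvidia/gpu-inference
    variables:
      NVIDIA_VISIBLE_DEVICES: "2"

script:
  - nvidia-smi -L   # should list exactly two GPUs in the build container
```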
Ensuring Correct GPU Allocation
There are a few things to remember when distributing GPUs across containers:
- To ensure a container does not have access to any GPUs, set the NVIDIA_VISIBLE_DEVICES variable to an empty string ("") for that container.
- The GPU indexes listed in the NVIDIA_VISIBLE_DEVICES variables across all containers should together cover exactly the total number of GPUs requested in KUBERNETES_GPU_REQUEST.
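For instance, a job can keep both of its GPUs in the build container while a helper service runs without any. A sketch, where the service name is a hypothetical placeholder:

```yaml
variables:
  KUBERNETES_GPU_REQUEST: 2
  NVIDIA_VISIBLE_DEVICES: "0,1"   # build container gets both GPUs

services:
  - name: redis:latest            # hypothetical helper service
    variables:
      NVIDIA_VISIBLE_DEVICES: ""  # empty string: no GPU access
```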