Generic Docker Images
Reusable Docker base images published to forge.jde.nz/public/.
CI requires the Forgejo runners gpu and gpubig.
Available Images
pytorch-gpu
PyTorch + CUDA base image for NVIDIA GPU workloads. Supports all NVIDIA GPUs from GTX 1060 to RTX 5090.
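GPU Compute Capabilities: 6.1 (GTX 10xx), 7.5 (RTX 20xx), 8.0 (A100), 8.6 (RTX 30xx, RTX A6000/A40), 8.9 (RTX 40xx), 9.0 (H100), 10.0 (datacenter Blackwell). Note: RTX 50xx cards report compute capability 12.0 (sm_120, as in the nemotron-speech-blackwell section), so check the Dockerfile's architecture list if you depend on natively compiled kernels for them.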
Includes:
- CUDA 12.8.1 runtime
- PyTorch 2.7+ with torchaudio
- transformers, accelerate, safetensors
- Flash Attention 2 (compiled for all target architectures)
- FFmpeg, libsndfile, sox (audio processing)
- Python 3 venv at /opt/venv
Usage:
FROM forge.jde.nz/public/pytorch-gpu:latest
RUN pip install my-project-specific-packages
COPY my_app/ /app/
CMD ["python", "/app/server.py"]
Architecture: x86_64 only (CUDA)
Size: ~8-10GB
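Before choosing this image for a host, it can help to compare the GPU's reported compute capability with the list above. The following is a minimal sketch, not part of the image: the helper name is_supported_cc is hypothetical, the capability list is copied by hand from this section, and the commented nvidia-smi call assumes a driver recent enough to expose the compute_cap query field.

```sh
# Compute capabilities listed for pytorch-gpu in this README (copied by hand,
# not read from the image itself).
SUPPORTED_CCS="6.1 7.5 8.0 8.6 8.9 9.0 10.0"

# is_supported_cc CC: succeeds if CC (e.g. "8.6") appears in SUPPORTED_CCS.
# Hypothetical helper, not shipped with the image.
is_supported_cc() {
    for cc in $SUPPORTED_CCS; do
        if [ "$cc" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Example (needs an NVIDIA driver on the host; assumes the compute_cap
# query field is available in the installed nvidia-smi):
#   cc=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)
#   if is_supported_cc "$cc"; then echo "GPU $cc is listed"; else echo "GPU $cc is not listed"; fi
```

A capability that is not listed may still run through PTX forward compatibility, so treat a negative result as a prompt to check the Dockerfile rather than as a hard failure.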
orthrus-base
PyTorch + Flash Attention 2 + transformers 5.8+ base image for Orthrus-Qwen3 models with diffusion-mode speculative decoding.
Includes:
- CUDA 12.8.1 runtime
- PyTorch 2.x with CUDA 12.8
- transformers >= 5.8.0, accelerate >= 1.13.0
- Flash Attention 2 (compiled for Ampere+ architectures: sm_80, sm_86, sm_89, sm_90, sm_100)
- FastAPI + uvicorn
- Python 3 venv at /opt/venv
Usage:
FROM forge.jde.nz/public/orthrus-base:latest
COPY server.py /app/
CMD ["python3", "/app/server.py"]
Architecture: x86_64 only (CUDA, Ampere+ GPUs)
clabtree-imagetools-base
Pre-built base image for clabtree-imagetools. Layers all heavy ML deps on top of pytorch-gpu.
Includes: torchvision, transformers 4.x (pinned below 5.0 for Flux2KleinPipeline compat), diffusers from git (Flux2KleinPipeline), torchao, onnxruntime-gpu, insightface, gfpgan, realesrgan, opencv-python-headless, pillow-heif
Purpose: Avoids 3+ minute pip builds on every imagetools deploy. Consumer image just copies app code on top.
Architecture: x86_64 only (CUDA, GPU runner only)
clabtree-api-base
Pre-built base image for clabtree-api. Bundles all heavy Python dependencies so API deploys only need to copy app code.
Includes:
- PyTorch CPU (no CUDA — API runs on CPU-only server)
- pipecat-ai with voice pipeline extras (silero, webrtc, openai, smart-turn)
- LiveKit agents + plugins, sherpa-onnx, LocalVQE neural AEC (compiled from source)
- fastembed (ONNX embeddings)
- weasyprint + pandoc (PDF generation)
- ffmpeg, audio/graphics system libraries
- All other API Python deps (fastapi, httpx, aiosqlite, etc.)
- All models pre-baked so the API image and runtime download nothing: LiveKit turn-detector EOU models (via download-files), sherpa STT/TTS, LocalVQE GGUF, and the BAAI/bge-small-en-v1.5 fastembed model. Cache dirs are pinned (HF_HOME, FASTEMBED_CACHE_PATH) so baked models resolve at runtime.
Purpose: Avoids per-deploy Hugging Face fetches. The consumer image (clabtree-api) is pure app code (FROM base, then COPY . .).
Architecture: x86_64 only
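Because the baked models only resolve if the pinned cache variables still point at populated directories, a consumer image could run a quick sanity check at startup. This is a sketch under assumptions: check_baked_cache is a hypothetical helper, and the only names taken from this README are HF_HOME and FASTEMBED_CACHE_PATH.

```sh
# check_baked_cache VAR_NAME: succeeds if the environment variable named
# VAR_NAME is set and points at a non-empty directory.
# Hypothetical helper for illustration; not part of the image.
check_baked_cache() {
    # Indirect lookup of the variable whose name is in $1 (POSIX sh).
    eval "dir=\${$1:-}"
    [ -n "$dir" ] && [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example, run inside a container built from clabtree-api-base:
#   for var in HF_HOME FASTEMBED_CACHE_PATH; do
#       check_baked_cache "$var" || echo "$var is unset or empty" >&2
#   done
```

The check only confirms that something is present in each cache directory; it does not verify which models were baked.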
accelerated_base
Media processing base image with Intel QuickSync / VA-API + FFmpeg hardware acceleration.
Features:
- Debian trixie (Debian 13) — ships the Intel iHD media driver 25.x, which supports modern Intel iGPUs including Arrow Lake-S. (Ubuntu 22.04's iHD 22.x did not, so VA-API hardware encode failed on Arrow Lake.)
- FFmpeg (Debian 7.1, VA-API enabled) + FFmpeg dev libraries
- Intel QuickSync / VA-API via intel-media-va-driver-non-free (x86_64)
- Python venv at /opt/venv (on PATH): consumer images pip install directly, no PEP 668 friction
- OpenCV (headless), numpy, Pillow, imageio; non-root appuser
Usage: layer your app on top and pass --device /dev/dri at runtime for hardware encode/decode:
FROM forge.jde.nz/public/accelerated_base:latest
USER root
RUN pip install --no-cache-dir my-deps
COPY app/ /app/
Architecture: x86_64 (Intel QuickSync) + arm64 (VA-API libraries only — no Intel iGPU driver, software fallback)
Note: the previous Ubuntu CUDA-11.8 NVENC libs (libnvidia-encode-515 etc.) were removed — they were unused by consumers. Use a CUDA base image if you need NVIDIA NVENC.
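The --device /dev/dri flag mentioned in the usage note only helps when the host exposes a DRM render node. One way to make that conditional is sketched below; dri_device_args is a hypothetical helper, my-media-app is a placeholder image name, and the renderD* naming follows the usual Linux DRM convention.

```sh
# dri_device_args [DRI_DIR]: prints the docker run flag that passes the host's
# DRI devices through, or nothing if no render node exists (software fallback).
# Hypothetical helper; DRI_DIR defaults to /dev/dri.
dri_device_args() {
    dri_dir="${1:-/dev/dri}"
    for node in "$dri_dir"/renderD*; do
        # If the glob matched nothing, $node is the literal pattern and -e fails.
        if [ -e "$node" ]; then
            printf '%s\n' "--device $dri_dir"
            return 0
        fi
    done
    return 0
}

# Example (left unquoted on purpose so the flag splits into two arguments):
#   docker run --rm $(dri_device_args) my-media-app
```

On hosts without a render node, such as the arm64 case above, the helper prints nothing and the container starts with software encode/decode.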
nemotron-speech-blackwell
Pre-built base image for clabtree-parakeet-asr on Blackwell GPUs (RTX 50xx, sm_120/sm_121).
Contains everything from the pipecat-ai/nemotron-january-2026 Dockerfile.unified except the app code layer (Phase 11).
Includes: PyTorch from source (CUDA 13.0/13.1), torchaudio, NeMo ASR+TTS, vLLM, llama.cpp, triton, mamba-ssm — all compiled for Blackwell (sm_120 x86_64 / sm_121 arm64).
Purpose: Avoids the 2-3 hour build on every deploy. The consumer image (clabtree-parakeet-asr) just layers the app code on top (COPY src/ + uv pip install -e .).
Architecture: x86_64 + arm64 (CUDA, GPU runner only)
nemotron-speech-ada
Pre-built base image for clabtree-parakeet-asr on Ada Lovelace GPUs (RTX 40xx, sm_89).
ASR only: no vLLM, llama.cpp, or TTS. Pre-built PyTorch cu128 wheels replace the 2-3 hour from-source build in the Blackwell variant.
Includes: PyTorch + torchaudio (pre-built cu128 wheels), NeMo ASR, triton.
Purpose: Same as nemotron-speech-blackwell but for Ada Lovelace. ~20-30 min build time.
Architecture: x86_64 only (CUDA, GPU runner only)
comfyui-cuda
Headless ComfyUI API server with CUDA support. Pre-installs custom nodes for Wan 2.2 GGUF video generation.
Includes:
- ComfyUI (latest from git)
- PyTorch with CUDA 12.8
- ComfyUI-GGUF (city96) — GGUF quantized model loading
- ComfyUI-WanMoeKSampler (stduhpf) — auto HighNoise/LowNoise step splitting
- ComfyUI-VideoHelperSuite (Kosinkadink) — video output handling
- FFmpeg
Usage:
FROM forge.jde.nz/public/comfyui-cuda:latest
COPY workflow.json /comfyui/
COPY app/ /app/
Mount models at runtime with one bind mount per model directory, for example -v /data/models:/comfyui/models/diffusion_models.
Architecture: x86_64 only (CUDA)
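When several model directories need mounting, the -v flags can be generated rather than typed out. The sketch below assumes a host layout with one subdirectory per model type under a common root, which this README does not specify. model_mount_args is a hypothetical helper, my-comfyui-app is a placeholder image name, and vae is an assumed ComfyUI model subdirectory; only /comfyui/models and diffusion_models come from the text above.

```sh
# model_mount_args HOST_ROOT SUBDIR...: prints one "-v" flag per SUBDIR,
# mapping HOST_ROOT/SUBDIR to /comfyui/models/SUBDIR in the container.
# Hypothetical helper; the per-subdirectory host layout is an assumption.
model_mount_args() {
    host_root="$1"
    shift
    for sub in "$@"; do
        printf '%s\n' "-v $host_root/$sub:/comfyui/models/$sub"
    done
}

# Example (left unquoted on purpose so each flag splits into arguments;
# paths must not contain whitespace):
#   docker run --rm --gpus all $(model_mount_args /data/models diffusion_models vae) my-comfyui-app
```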
debian-curl
Minimal Debian image with curl. Multi-arch.
Registry
Images are available at:
- forge.jde.nz/public/<image_name>:latest (multi-arch manifest, or x86_64 for GPU images)
- forge.jde.nz/public/<image_name>:latest-x86_64 (x86_64 specific)
- forge.jde.nz/public/<image_name>:latest-aarch64 (arm64 specific, non-GPU only)
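The tag scheme above is regular enough to build references in scripts. A minimal sketch follows; image_ref is a hypothetical helper, not something shipped in this repository, and it does not check that a given image is actually published for the requested architecture.

```sh
# image_ref NAME [ARCH]: prints the registry reference for an image.
# ARCH is "x86_64" or "aarch64"; omit it for the plain :latest tag.
# Hypothetical helper built from the tag scheme listed above.
image_ref() {
    if [ -n "${2:-}" ]; then
        printf '%s\n' "forge.jde.nz/public/$1:latest-$2"
    else
        printf '%s\n' "forge.jde.nz/public/$1:latest"
    fi
}

# Example:
#   docker pull "$(image_ref debian-curl aarch64)"
```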
Building
Images are automatically built and published by CI on push to main.
To build locally: ./build.sh
To publish: ./publish.sh