E.g. Debian with Curl, Ubuntu Build etc.

HTML 73.3%
Shell 13.7%
Python 13%

Find a file

j 62238d8ce7 All checks were successful Build-Publish-Multi-Arch / build (linux/amd64) (push) Successful in 10s Details Build-Publish-Multi-Arch / build (linux/arm64) (push) Successful in 13s Details Build-Publish-Multi-Arch / create-manifest (push) Successful in 8s Details Build-Publish-GPU / build-gpu (push) Successful in 17s Details Build-Publish-GPUBig / build-gpubig (push) Successful in -1m43s Details Move pocket-tts bake script to a file; heredoc + --mount crashed build		2026-07-04 23:08:31 +12:00
.gitea/workflows	Replace Supertonic TTS with Pocket TTS; drop pipecat and LocalVQE	2026-07-04 22:58:50 +12:00
llm-cpu-gemma-chat/webui	feat: add thinking toggle and live token stats to web UI	2026-06-18 13:29:03 +12:00
bake_pocket.py	Move pocket-tts bake script to a file; heredoc + --mount crashed build	2026-07-04 23:08:31 +12:00
Dockerfile.accelerated_base	make APT_PROXY conditional on proxy host being resolvable	2026-05-25 23:08:42 +12:00
Dockerfile.clabtree-api-base	Move pocket-tts bake script to a file; heredoc + --mount crashed build	2026-07-04 23:08:31 +12:00
Dockerfile.clabtree-faceswap-intel-base	feat: add APT proxy support and python3-dev to faceswap base images	2026-04-07 10:13:19 +12:00
Dockerfile.clabtree-faceswap-nvidia-base	feat: add APT proxy support and python3-dev to faceswap base images	2026-04-07 10:13:19 +12:00
Dockerfile.clabtree-imagetools-base	fix: add peft to imagetools-base for LoRA support	2026-04-06 12:34:49 +12:00
Dockerfile.clabtree-music-pascal	Pin diffusers <0.38.0 to fix infer_schema crash on PyTorch 2.5.1	2026-05-05 22:57:57 +12:00
Dockerfile.clabtree-youtube-tool-base	Add clabtree-youtube-tool-base image with yt-dlp + curl_cffi	2026-06-04 23:30:53 +12:00
Dockerfile.comfyui-cuda	Fix VideoHelperSuite requirements path	2026-05-10 17:17:44 +12:00
Dockerfile.debian-curl	ci: add APT_PROXY build arg for apt caching proxy support	2026-04-06 11:53:25 +12:00
Dockerfile.ds-network-sidecar	Add ds-network-sidecar image with Tailscale and Cloudflare tunnel	2026-04-18 22:37:14 +12:00
Dockerfile.ik_llama.cpp-server-cuda	Update Dockerfile.ik_llama.cpp-server-cuda	2026-05-05 22:39:52 +12:00
Dockerfile.knowledgebase-base	knowledgebase-base: remove stale apt proxy file before apt-get	2026-05-10 23:35:18 +12:00
Dockerfile.llama-cpp-server-sycl	Add llama.cpp server image with Intel SYCL for Arc GPUs	2026-04-16 12:30:34 +12:00
Dockerfile.llm-cpu-gemma-chat	feat: add thinking toggle and live token stats to web UI	2026-06-18 13:29:03 +12:00
Dockerfile.nemotron-speech-ada	feat: add nemotron-speech-ada Dockerfile for Ada Lovelace GPUs	2026-04-06 14:57:41 +12:00
Dockerfile.nemotron-speech-blackwell	feat: split nemotron-speech-blackwell into 3 layered images	2026-04-06 21:51:48 +12:00
Dockerfile.nemotron-speech-blackwell-torch	feat: split nemotron-speech-blackwell into 3 layered images	2026-04-06 21:51:48 +12:00
Dockerfile.nemotron-speech-blackwell-vllm	feat: split nemotron-speech-blackwell into 3 layered images	2026-04-06 21:51:48 +12:00
Dockerfile.orthrus-base	Switch gpubig workflow to gpu64 runner (ripper.home, 64GB), MAX_JOBS=4	2026-05-16 14:54:07 +12:00
Dockerfile.pytorch-gpu	Build flash-attn from source when pre-built wheel unavailable	2026-05-03 20:54:34 +12:00
Dockerfile.pytorch-gpu-pascal	Update Dockerfile.pytorch-gpu-pascal	2026-05-05 19:05:56 +12:00
Dockerfile.qwen3-asr-base	Add gcc/g++ to qwen3-asr-base for vLLM/Triton JIT compilation	2026-04-11 23:35:32 +12:00
entrypoint.ds-network-sidecar.sh	Add ds-network-sidecar image with Tailscale and Cloudflare tunnel	2026-04-18 22:37:14 +12:00
package-lock.json	add package-lock.json	2026-05-25 15:49:39 +12:00
README.md	feat: add thinking toggle and live token stats to web UI	2026-06-18 13:29:03 +12:00

README.md

Generic Docker Images

Reusable Docker base images published to forge.jde.nz/public/.

Requires forgejo runners gpu, gpubig to run the CI.

Available Images

pytorch-gpu

PyTorch + CUDA base image for NVIDIA GPU workloads. Supports all NVIDIA GPUs from GTX 1060 to RTX 5090.

GPU Compute Capabilities: 6.1 (GTX 10xx), 7.5 (RTX 20xx), 8.0 (RTX A6000/A40), 8.6 (RTX 30xx), 8.9 (RTX 40xx), 9.0 (RTX PRO 6000), 10.0 (RTX 50xx/Blackwell)

Includes:

CUDA 12.8.1 runtime
PyTorch 2.7+ with torchaudio
transformers, accelerate, safetensors
Flash Attention 2 (compiled for all target architectures)
FFmpeg, libsndfile, sox (audio processing)
Python 3 venv at /opt/venv

Usage:

FROM forge.jde.nz/public/pytorch-gpu:latest

RUN pip install my-project-specific-packages
COPY my_app/ /app/
CMD ["python", "/app/server.py"]

Architecture: x86_64 only (CUDA)

Size: ~8-10GB

orthrus-base

PyTorch + Flash Attention 2 + transformers 5.8+ base image for Orthrus-Qwen3 models with diffusion-mode speculative decoding.

Includes:

CUDA 12.8.1 runtime
PyTorch 2.x with CUDA 12.8
transformers >= 5.8.0, accelerate >= 1.13.0
Flash Attention 2 (compiled for Ampere+ architectures: sm_80, sm_86, sm_89, sm_90, sm_100)
FastAPI + uvicorn
Python 3 venv at /opt/venv

Usage:

FROM forge.jde.nz/public/orthrus-base:latest
COPY server.py /app/
CMD ["python3", "/app/server.py"]

Architecture: x86_64 only (CUDA, Ampere+ GPUs)

clabtree-imagetools-base

Pre-built base image for clabtree-imagetools. Layers all heavy ML deps on top of pytorch-gpu.

Includes: torchvision, transformers 4.x (pinned below 5.0 for Flux2KleinPipeline compat), diffusers from git (Flux2KleinPipeline), torchao, onnxruntime-gpu, insightface, gfpgan, realesrgan, opencv-python-headless, pillow-heif

Purpose: Avoids 3+ minute pip builds on every imagetools deploy. Consumer image just copies app code on top.

Architecture: x86_64 only (CUDA, GPU runner only)

clabtree-api-base

Pre-built base image for clabtree-api. Bundles all heavy Python dependencies so API deploys only need to copy app code.

Includes:

PyTorch CPU (no CUDA — API runs on CPU-only server)
pipecat-ai with voice pipeline extras (silero, webrtc, openai, smart-turn)
LiveKit agents + plugins, sherpa-onnx, LocalVQE neural AEC (compiled from source)
fastembed (ONNX embeddings)
weasyprint + pandoc (PDF generation)
ffmpeg, audio/graphics system libraries
All other API Python deps (fastapi, httpx, aiosqlite, etc.)
All models pre-baked so the API image and runtime download nothing: livekit turn-detector EOU models (via download-files), sherpa STT/TTS, LocalVQE GGUF, and the BAAI/bge-small-en-v1.5 fastembed model. Cache dirs are pinned (HF_HOME, FASTEMBED_CACHE_PATH) so baked models resolve at runtime.

Purpose: Avoids per-deploy HuggingFace fetches. The consumer image (clabtree-api) is pure app code: FROM base + COPY . ..

Architecture: x86_64 only

accelerated_base

Media processing base image with Intel QuickSync / VA-API + FFmpeg hardware acceleration.

Features:

Debian trixie (Debian 13) — ships the Intel iHD media driver 25.x, which supports modern Intel iGPUs including Arrow Lake-S. (Ubuntu 22.04's iHD 22.x did not, so VA-API hardware encode failed on Arrow Lake.)
FFmpeg (Debian 7.1, VA-API enabled) + FFmpeg dev libraries
Intel QuickSync / VA-API via intel-media-va-driver-non-free (x86_64)
Python venv at /opt/venv (on PATH) — consumer images pip install directly, no PEP 668 friction
OpenCV (headless), numpy, Pillow, imageio; non-root appuser

Usage: layer your app on top and pass --device /dev/dri at runtime for hardware encode/decode:

FROM forge.jde.nz/public/accelerated_base:latest
USER root
RUN pip install --no-cache-dir my-deps
COPY app/ /app/

Architecture: x86_64 (Intel QuickSync) + arm64 (VA-API libraries only — no Intel iGPU driver, software fallback)

Note: the previous Ubuntu CUDA-11.8 NVENC libs (libnvidia-encode-515 etc.) were removed — they were unused by consumers. Use a CUDA base image if you need NVIDIA NVENC.

nemotron-speech-blackwell

Pre-built base image for clabtree-parakeet-asr on Blackwell GPUs (RTX 50xx, sm_120/sm_121). Contains everything from pipecat-ai/nemotron-january-2026 Dockerfile.unified except the app code layer (Phase 11).

Includes: PyTorch from source (CUDA 13.0/13.1), torchaudio, NeMo ASR+TTS, vLLM, llama.cpp, triton, mamba-ssm — all compiled for Blackwell (sm_120 x86_64 / sm_121 arm64).

Purpose: Avoids the 2-3 hour build on every deploy. Consumer image (clabtree-parakeet-asr) just layers the app code on top (COPY src/ + uv pip install -e .).

Architecture: x86_64 + arm64 (CUDA, GPU runner only)

nemotron-speech-ada

Pre-built base image for clabtree-parakeet-asr on Ada Lovelace GPUs (RTX 40xx, sm_89). ASR only — no vLLM, llama.cpp, or TTS. Pre-built PyTorch cu128 wheels replace the 2-3 hour from-source build in the Blackwell variant.

Includes: PyTorch + torchaudio (pre-built cu128 wheels), NeMo ASR, triton.

Purpose: Same as nemotron-speech-blackwell but for Ada Lovelace. ~20-30 min build time.

Architecture: x86_64 only (CUDA, GPU runner only)

comfyui-cuda

Headless ComfyUI API server with CUDA support. Pre-installs custom nodes for Wan 2.2 GGUF video generation.

Includes:

ComfyUI (latest from git)
PyTorch with CUDA 12.8
ComfyUI-GGUF (city96) — GGUF quantized model loading
ComfyUI-WanMoeKSampler (stduhpf) — auto HighNoise/LowNoise step splitting
ComfyUI-VideoHelperSuite (Kosinkadink) — video output handling
FFmpeg

Usage:

FROM forge.jde.nz/public/comfyui-cuda:latest
COPY workflow.json /comfyui/
COPY app/ /app/

Mount models at runtime via -v /data/models:/comfyui/models/diffusion_models etc.

Architecture: x86_64 only (CUDA)

debian-curl

Minimal Debian image with curl. Multi-arch.

llm-cpu-gemma-chat

Fully self-contained, CPU-only LLM image — a standalone equivalent of the llm-cpu-gemma-26B dropshell template with everything baked in and no external dependencies (no model cache, no router, no auth, no config).

Includes:

llama.cpp CPU server (from ghcr.io/ggml-org/llama.cpp:server)
The Gemma 4 26B-A4B QAT GGUF (UD-Q4_K_XL, ~14 GB) downloaded from HuggingFace at build time and stored inside the image
A tiny no-frills chat webpage served at / — streams tokens live, shows a running token/s counter, and has a Thinking toggle (off by default). Conversation lives only in the browser tab's memory (reload = fresh chat); no auth.
The standard OpenAI API at /v1/* (e.g. /v1/chat/completions)

Usage — everything (model path, web UI, port, sane CPU defaults) is baked in, so there are no required flags:

docker run -d -p 8080:8080 forge.jde.nz/public/llm-cpu-gemma-chat:latest
# Chat UI:    http://localhost:8080
# OpenAI API: http://localhost:8080/v1/chat/completions

Tuning. The defaults (--threads 8, --ctx-size 4096) are tuned for a low-power laptop and are overridable — just append a flag (or set -e LLAMA_ARG_*):

# e.g. a beefier box: more threads + a bigger context window
docker run -d -p 8080:8080 forge.jde.nz/public/llm-cpu-gemma-chat:latest \
  --threads 16 --ctx-size 32768

Set --threads to your CPU's physical core count (on hybrid Intel chips, the auto-detect undercounts — only the P-cores — so setting it explicitly matters). Raise --ctx-size for longer conversations if you have RAM headroom.

Bake a different GGUF with --build-arg MODEL_URL=.... CPU-only; needs ~16 GB+ RAM free for the 26B model (less at smaller --ctx-size).

Size: ~15 GB (model is baked in). Architecture: x86_64 + arm64.

Registry

Images are available at:

forge.jde.nz/public/<image_name>:latest — multi-arch manifest (or x86_64 for GPU images)
forge.jde.nz/public/<image_name>:latest-x86_64 — x86_64 specific
forge.jde.nz/public/<image_name>:latest-aarch64 — arm64 specific (non-GPU only)

Building

Images are automatically built and published by CI on push to main.

To build locally: ./build.sh To publish: ./publish.sh