Self-Hosting Stirling PDF on Kubernetes

If you work with PDFs regularly and you’re running your own infrastructure, Stirling PDF is hard to beat. It’s a self-hosted, web-based PDF toolbox with over 50 operations — merge, split, compress, convert, OCR, edit, watermark, rotate, and a whole lot more. It’s built on Java/Spring Boot with LibreOffice and Tesseract doing the heavy lifting under the hood. At 75,000+ GitHub stars and counting, it’s become one of the most popular self-hosted apps in the homelab community.

In this post I’ll walk through how I deploy it on my Talos Linux Kubernetes cluster using Ansible and the official Helm chart, and break down what you get on the free tier versus a paid plan.

What Can It Do?

The feature list is genuinely impressive for a free, self-hosted tool:

  • Page operations — merge, split, rotate, reorder, delete, extract, crop, auto-split on blank pages
  • Conversion — PDF to/from Word, Excel, PowerPoint, HTML, Markdown, images (JPG, PNG, TIFF, WebP, SVG), and more
  • OCR — powered by Tesseract with 40+ language support
  • Compression — reduce file size with configurable quality settings
  • Security — add/remove passwords, set permissions, redact content, sign with certificates
  • Stamping & watermarking — text and image watermarks, page numbering, custom stamps
  • Forms — flatten forms, extract form data
  • Repair & sanitize — fix corrupted PDFs, remove JavaScript, strip metadata
  • Compare — visual comparison of two PDFs
  • Pipelines — chain operations together into automated workflows
  • REST API — every operation is callable via API, great for automation

All of the above is available on every tier — including free. Stirling PDF uses an open-core model, so the core functionality is fully open source and there are no artificial feature limits on the PDF operations themselves.
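
To make the REST API point concrete, here is a minimal sketch of calling one operation from an Ansible task. The endpoint path /api/v1/general/rotate-pdf, the fileInput and angle field names, and the local file paths are assumptions for illustration, so check them against the API documentation for your version. It also assumes login is disabled, so no API key is sent.

```yaml
---
# Sketch only: the endpoint path and form field names are assumptions.
- name: Rotate a PDF through the Stirling PDF REST API
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Upload a PDF and save the rotated result
      ansible.builtin.uri:
        url: "https://stirlingpdf.example.com/api/v1/general/rotate-pdf"
        method: POST
        body_format: form-multipart
        body:
          fileInput:
            filename: /tmp/input.pdf      # hypothetical input file
            mime_type: application/pdf
          angle: "90"
        dest: /tmp/rotated.pdf            # hypothetical output path
        status_code: 200
```

The same pattern should carry over to other operations by swapping the path and form fields.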

Free vs. Paid: What Actually Changes

Stirling PDF recently moved to a three-tier model: Free, Server, and Enterprise. The thing to understand is that the paid tiers are about operational and organizational features — not about unlocking PDF capabilities. You don’t have to pay to split a PDF or run OCR. Here’s how they differ:

Feature | Free | Server | Enterprise
All PDF operations
Self-hosted deployment
User limit | Up to 5 users | Unlimited | Per-seat licensing
Community support
Email support ✅ (priority)
External database
Google Drive integration
OAuth 2.0 / OIDC SSO
SAML 2.0 SSO
Prometheus monitoring endpoint
Audit logs
Custom metadata automation
SLA guarantee
1:1 meetings / dedicated account manager

For a homelab or small self-hosted setup, the Free tier is genuinely excellent. The 5-user cap is the only real constraint. If you’re running it for your household or a small team, you’ll probably never hit it. Once you cross that threshold, that’s when a paid plan becomes relevant.

Licensing is installation-based (tied to a machine fingerprint), managed through the built-in Settings UI since v2.0, and paid via Stripe. Air-gapped Enterprise environments can use offline certificate files.

Deploying on Kubernetes with Ansible

Stirling PDF publishes an official Helm chart, which makes Kubernetes deployment straightforward. I manage my cluster with Ansible, so the deployment lives in a dedicated role that handles the namespace, Helm install, and Gateway API HTTPRoute.

The stack here is:

  • Talos Linux Kubernetes cluster
  • Longhorn for persistent storage (RWO)
  • Gateway API (instead of Ingress) for routing
  • Ansible driving Helm and kubernetes.core.k8s

The Playbook

The top-level playbook is minimal — it just calls the role:

---
# Deploy Stirling-PDF via Helm chart 3.1.0 (appVersion 2.5.1; image tag overridden to 2.8.0 in the role defaults)
# 80 Gi Longhorn RWO PVC at /configs
# Gateway API HTTPRoute → https://stirlingpdf.example.com
- name: Deploy Stirling-PDF
  hosts: localhost
  connection: local
  gather_facts: false
  vars_files:
    - ../group_vars/talos_cluster.yml
  roles:
    - role: stirlingpdf
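
The role tasks further down reference a kubeconfig_path variable, so the vars file loaded here needs to define at least that. A minimal sketch of such a file follows. The path is a placeholder, and a real file would likely hold other cluster-wide settings too.

```yaml
---
# group_vars/talos_cluster.yml (illustrative sketch)
# Only kubeconfig_path is referenced by the role tasks in this post.
# The location below is a hypothetical placeholder; point it at your own kubeconfig.
kubeconfig_path: "{{ lookup('ansible.builtin.env', 'HOME') }}/.kube/config"
```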

Role Defaults

All tunables live in defaults/main.yml, so versions, sizes, and hostnames are not hardcoded in the tasks themselves:

---
# Namespace
stirlingpdf_namespace: stirlingpdf

# Helm configuration
stirlingpdf_helm_repo_name: stirling-pdf
stirlingpdf_helm_repo: "https://stirling-tools.github.io/Stirling-PDF-chart"
stirlingpdf_chart: "stirling-pdf/stirling-pdf-chart"
stirlingpdf_chart_version: "3.1.0"   # appVersion 2.5.1; image overridden to 2.8.0
stirlingpdf_image_tag: "2.8.0"
stirlingpdf_release_name: stirling-pdf

# Ingress / networking
stirlingpdf_ingress_enabled: true
stirlingpdf_ingress_host: "stirlingpdf.example.com"

# Container port (app listens on 8080 inside the container)
stirlingpdf_port: 8080

# Persistent storage (Longhorn)
stirlingpdf_storage_class: longhorn
stirlingpdf_storage_size: "80Gi"

# Resource limits
# Stirling-PDF is a Java/Spring Boot app that spins up LibreOffice and
# Tesseract for conversions — give it reasonable headroom.
stirlingpdf_resources_requests_cpu: "500m"
stirlingpdf_resources_requests_memory: "512Mi"
stirlingpdf_resources_limits_cpu: "2000m"
stirlingpdf_resources_limits_memory: "3072Mi"

# Application settings
stirlingpdf_enable_login: "false"
stirlingpdf_tz: "America/Los_Angeles"
stirlingpdf_langs: "en_US"

A note on storage: /configs is where Stirling PDF keeps everything — its settings database, uploaded pipeline configs, scratch space for large conversions. 80 Gi is a comfortable starting point. The Longhorn RWO access mode is fine since this is a single-replica deployment.
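
Because these are role defaults, any of them can be overridden where the role is called, without touching the role itself. The sketch below shows a variant of the playbook for a smaller cluster. The two override values are hypothetical, not recommendations.

```yaml
---
- name: Deploy Stirling-PDF with a smaller footprint
  hosts: localhost
  connection: local
  gather_facts: false
  vars_files:
    - ../group_vars/talos_cluster.yml
  roles:
    - role: stirlingpdf
      vars:
        # Role vars take precedence over defaults/main.yml.
        stirlingpdf_storage_size: "20Gi"                # hypothetical value
        stirlingpdf_resources_limits_memory: "2048Mi"   # hypothetical value
```

The storage size is easiest to settle before the first install, since an existing PVC cannot be shrunk afterwards.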

Helm Values Template

The Ansible role renders a Helm values file from a Jinja2 template. A few things worth calling out:

# Stirling-PDF Helm values override
# Generated by Ansible — do not edit manually

fullnameOverride: "{{ stirlingpdf_release_name }}"

replicaCount: 1

image:
  registry: docker.stirlingpdf.com
  repository: stirlingtools/stirling-pdf
  tag: "{{ stirlingpdf_image_tag }}"
  pullPolicy: IfNotPresent

service:
  type: ClusterIP
  externalPort: {{ stirlingpdf_port }}

envs:
  - name: SECURITY_ENABLELOGIN
    value: "{{ stirlingpdf_enable_login }}"
  - name: LANGS
    value: "{{ stirlingpdf_langs }}"
  - name: TZ
    value: "{{ stirlingpdf_tz }}"
  # ResourceMonitor thresholds — CPU is measured as a 0.0-1.0 ratio relative
  # to available processors. The JVM startup spike on ARM routinely exceeds
  # 100%, falsely triggering CRITICAL and causing the TempFileShutdownHook
  # constructor to throw (upstream issue #5643).
  # Set to 3.0 (300%) so startup spikes never trip the monitor.
  - name: STIRLING_RESOURCE_CPU_CRITICAL_THRESHOLD
    value: "3.0"
  - name: STIRLING_RESOURCE_CPU_HIGH_THRESHOLD
    value: "2.5"
  - name: STIRLING_RESOURCE_MONITOR_INTERVAL_MS
    value: "120000"

resources:
  requests:
    cpu: "{{ stirlingpdf_resources_requests_cpu }}"
    memory: "{{ stirlingpdf_resources_requests_memory }}"
  limits:
    cpu: "{{ stirlingpdf_resources_limits_cpu }}"
    memory: "{{ stirlingpdf_resources_limits_memory }}"

securityContext:
  enabled: true
  fsGroup: 1000

persistence:
  enabled: true
  accessMode: ReadWriteOnce
  size: "{{ stirlingpdf_storage_size }}"
  path: /configs
  storageClass: "{{ stirlingpdf_storage_class }}"

# Stirling PDF starts Java + LibreOffice/unoserver — this takes 60+ seconds
# on first boot. The chart's default initialDelaySeconds of 5 causes the
# liveness probe to fire before the app is ready, triggering graceful shutdown
# while beans are still initialising (upstream issue #5643).
probes:
  liveness:
    enabled: true
    initialDelaySeconds: 90
    periodSeconds: 10
    failureThreshold: 5
    timeoutSeconds: 5
  readiness:
    enabled: true
    initialDelaySeconds: 60
    periodSeconds: 10
    failureThreshold: 5
    timeoutSeconds: 5

# Disable chart's built-in Ingress — Gateway API HTTPRoute is used instead
ingress:
  enabled: false

Role Tasks

The tasks follow a simple sequence: namespace → Helm repo → render values → Helm install/upgrade → Gateway HTTPRoute:

---
- name: Create stirlingpdf namespace
  kubernetes.core.k8s:
    kubeconfig: "{{ kubeconfig_path }}"
    validate_certs: false
    state: present
    definition:
      apiVersion: v1
      kind: Namespace
      metadata:
        name: "{{ stirlingpdf_namespace }}"
        labels:
          pod-security.kubernetes.io/enforce: baseline
          pod-security.kubernetes.io/audit: baseline
          pod-security.kubernetes.io/warn: baseline

- name: Add Stirling-PDF Helm repository
  ansible.builtin.command:
    cmd: >
      helm repo add {{ stirlingpdf_helm_repo_name }} {{ stirlingpdf_helm_repo }}
  environment:
    KUBECONFIG: "{{ kubeconfig_path }}"
  register: helm_repo_add
  changed_when: "'has been added' in helm_repo_add.stdout"
  failed_when: helm_repo_add.rc != 0 and 'already exists' not in helm_repo_add.stderr

- name: Update Helm repositories
  ansible.builtin.command:
    cmd: helm repo update
  environment:
    KUBECONFIG: "{{ kubeconfig_path }}"
  changed_when: false   # refreshing the local repo index does not change cluster state

- name: Generate Stirling-PDF Helm values file
  ansible.builtin.template:
    src: values.yaml.j2
    dest: /tmp/stirlingpdf-values.yaml
    mode: '0600'

- name: Deploy Stirling-PDF via Helm
  ansible.builtin.command:
    cmd: >
      helm upgrade --install {{ stirlingpdf_release_name }} {{ stirlingpdf_chart }}
      --namespace {{ stirlingpdf_namespace }}
      --version {{ stirlingpdf_chart_version }}
      -f /tmp/stirlingpdf-values.yaml
      --wait
      --timeout 10m
  environment:
    KUBECONFIG: "{{ kubeconfig_path }}"
  register: helm_install
  changed_when: "'has been upgraded' in helm_install.stdout or 'has been installed' in helm_install.stdout"

- name: Create HTTPRoute for Stirling-PDF
  kubernetes.core.k8s:
    kubeconfig: "{{ kubeconfig_path }}"
    validate_certs: false
    state: present
    apply: true
    definition:
      apiVersion: gateway.networking.k8s.io/v1
      kind: HTTPRoute
      metadata:
        name: stirling-pdf
        namespace: "{{ stirlingpdf_namespace }}"
      spec:
        parentRefs:
          - name: default-gateway
            namespace: gateway-api
            sectionName: websecure
        hostnames:
          - "{{ stirlingpdf_ingress_host }}"
        rules:
          - backendRefs:
              - name: "{{ stirlingpdf_release_name }}"
                port: "{{ stirlingpdf_port }}"
  when: stirlingpdf_ingress_enabled | bool
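
The three helm command tasks above could also be written with the Helm modules from the kubernetes.core collection, which report changed status on their own and remove the need to parse stdout. This is a sketch of that alternative, not the role as deployed, and it is untested against this chart. It reuses the same variables and the values file rendered by the template task.

```yaml
---
# Alternative to the `helm repo add`, `helm repo update`, and
# `helm upgrade --install` command tasks.
- name: Add Stirling-PDF Helm repository
  kubernetes.core.helm_repository:
    name: "{{ stirlingpdf_helm_repo_name }}"
    repo_url: "{{ stirlingpdf_helm_repo }}"

- name: Deploy Stirling-PDF via Helm
  kubernetes.core.helm:
    kubeconfig: "{{ kubeconfig_path }}"
    name: "{{ stirlingpdf_release_name }}"
    chart_ref: "{{ stirlingpdf_chart }}"
    chart_version: "{{ stirlingpdf_chart_version }}"
    release_namespace: "{{ stirlingpdf_namespace }}"
    values_files:
      - /tmp/stirlingpdf-values.yaml
    update_repo_cache: true   # replaces the separate `helm repo update` task
    wait: true
    wait_timeout: 10m
```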

A Few Gotchas

Probe timing is critical. Stirling PDF starts a JVM and then brings up LibreOffice and unoserver. On ARM hardware (like a Raspberry Pi or similar compact cluster), this can easily take 60–90 seconds. If you leave the liveness probe at the chart’s default initialDelaySeconds: 5, the probe starts failing while the app is still initializing and the kubelet restarts the container mid-startup. That interrupted startup is what surfaces as TempFileShutdownHook throwing IllegalStateException and a cryptic crash loop. Setting initialDelaySeconds: 90 for liveness and 60 for readiness clears this up entirely.
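
Kubernetes also offers a startupProbe for slow-starting apps: liveness and readiness checks are held back until the startup probe has succeeded once, so there is no fixed delay to guess. Whether chart 3.1.0 exposes a startup probe through its values is not covered here, so the fragment below is a plain container-spec sketch rather than chart values. The health path is an assumption.

```yaml
# Container-spec fragment for a Deployment, not Helm values for this chart.
startupProbe:
  httpGet:
    path: /api/v1/info/status   # assumed health endpoint
    port: 8080
  periodSeconds: 10
  failureThreshold: 18          # 18 x 10 s = up to 180 s allowed for startup
livenessProbe:
  httpGet:
    path: /api/v1/info/status   # assumed health endpoint
    port: 8080
  periodSeconds: 10
  failureThreshold: 5
```

With this shape a fast node becomes ready as soon as the app answers, while a slow ARM node still gets up to three minutes before the kubelet gives up.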

CPU resource monitoring needs tuning on ARM. The built-in ResourceMonitor measures CPU as a ratio relative to available processors. JVM startup on constrained hardware spikes well above 100%, which the default thresholds flag as CRITICAL and can cause the shutdown hook to fire prematurely. Setting STIRLING_RESOURCE_CPU_CRITICAL_THRESHOLD=3.0 gives the app room to breathe at startup without masking real problems at runtime.

fsGroup 1000 is required on the securityContext for the Longhorn PVC to be writable by the container user. Miss this and the app starts but can’t write its config database.

Secrets management. For anything sensitive — login credentials, license keys — I store those in a secrets vault and inject them at deploy time rather than hardcoding in the Ansible defaults. Keep those out of your repo.
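
One way to sketch that injection step is to create a Kubernetes Secret from a value the vault exports into the environment at deploy time. The Secret name, key, and environment variable below are hypothetical. How the Secret is then wired into the pod depends on what the chart supports and is not shown.

```yaml
---
# Sketch only: names are hypothetical, and the value comes from the
# environment of the Ansible run rather than from the repo.
- name: Create Stirling-PDF credentials Secret
  kubernetes.core.k8s:
    kubeconfig: "{{ kubeconfig_path }}"
    state: present
    definition:
      apiVersion: v1
      kind: Secret
      metadata:
        name: stirling-pdf-credentials        # hypothetical name
        namespace: "{{ stirlingpdf_namespace }}"
      type: Opaque
      stringData:
        initial-password: "{{ lookup('ansible.builtin.env', 'STIRLINGPDF_INITIAL_PASSWORD') }}"
  no_log: true   # keep the secret value out of Ansible output
```

The env lookup returns an empty string when the variable is unset, so a real role would want to fail early in that case.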

Wrap Up

Stirling PDF is one of those tools where the free tier genuinely holds its own. If you’re a solo user or a small household, you get every PDF operation available — OCR, conversion, merging, signing, redacting, pipelines, REST API — with no artificial caps beyond the 5-user limit. The paid tiers unlock organizational capabilities (SSO, unlimited users, audit logs, external DB) rather than core features, which is a respectful way to run an open-core product.

On the Kubernetes side, the official Helm chart is well-maintained and the deployment is clean. The main things to watch out for are probe timing on slower hardware and the CPU monitor thresholds.

Up Next

A couple of things I’m looking at for future posts:

omni-tools — A lightweight, privacy-focused alternative for general file manipulation. It covers images, video, audio, PDF, text, data (JSON/CSV/XML), math, and date/time tools — all processed client-side in the browser. No data ever leaves your machine. The Docker image is only 28 MB and it’s MIT licensed. It’s not a replacement for Stirling PDF’s depth on the PDF side, but as a general Swiss Army knife it’s interesting.

OpenRouter.ai with nanobot/Jarvis — I’m looking at integrating OpenRouter.ai into my nanobot (Jarvis) setup to get access to a broader range of models through a single API endpoint. The idea is to be able to route different tasks to different models — keep Claude Haiku for lightweight conversational stuff, route heavier reasoning to something else — all without maintaining separate API keys and client configs for each provider. More on that once I’ve had time to wire it up properly.
