Drop-in replacement for OpenAI. 18 models, GPU-accelerated, hosted in the Netherlands. Your data stays private. GDPR compliant.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.helheim-ai.dev/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="qwen",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
Same API as OpenAI. Change one line of code — your base_url — and you're running on private hardware.
Multi-GPU CUDA inference with 28 GB of combined VRAM (RTX 3060 + RTX 5060 Ti). Models load from RAM in under one second.
Hosted in the Netherlands. Your data never leaves our network. No logging, no tracking, no third-party cloud.
18 models ready for inference. Smart routing picks the best one for your task.
+ 10 more models. Use model="auto" for smart selection or model="fast" for lowest latency.
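The routing aliases are passed in place of a concrete model name. As an illustration only, a caller might map its own priority onto an alias like this; the helper and its mapping are hypothetical, not the server's routing logic:

```python
# Hypothetical helper: choose a Helheim routing alias based on what the
# caller cares about. "auto" and "fast" are the aliases described above;
# the mapping itself is an illustration, not the actual routing logic.
def pick_model(priority: str = "quality") -> str:
    aliases = {
        "quality": "auto",  # smart selection picks the best model
        "latency": "fast",  # lowest-latency model
    }
    return aliases.get(priority, "auto")

print(pick_model("latency"))  # prints "fast"
```

The returned string goes straight into the `model` field of a chat completion request.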
Three steps to your first API call.
```bash
pip install openai
```
Works with the standard OpenAI library.
Set base_url to https://api.helheim-ai.dev/v1 and you're live.
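The base_url swap only changes where requests are sent; the JSON body stays in the standard OpenAI chat-completions shape. A minimal offline sketch of that body (no network call is made, and the endpoint is only referenced as a constant):

```python
# The base_url swap changes the destination, not the request schema.
# This builds the standard OpenAI /chat/completions body offline.
BASE_URL = "https://api.helheim-ai.dev/v1"

def chat_body(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

body = chat_body("qwen", "Hello!")
print(body["model"])  # prints "qwen"
```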
No hidden fees. Pay for what you use.
- Free: try it out
- For indie devs & startups: ~10,000 requests included
- For teams & companies: ~50,000 requests included
| | Helheim AI | OpenAI | Groq |
|---|---|---|---|
| Privacy / GDPR | ✓ EU hosted | ✕ US | ✕ US |
| Data logging | ✓ None | ~ 30 days | ~ Unknown |
| OpenAI-compatible | ✓ | ✓ | ✓ |
| Starting price | Free | $20/mo | Free (limited) |