Harness Machina
Open source · Self-hosted · Multi-model

Your AI engine.
On your hardware. On your terms.

Harness Machina is the open-source AI harness for organizations that can't send all their data to the cloud. Deploy RAG, multi-model orchestration and agentic workflows inside your perimeter — on hardware you own, certified by partners you trust.

100% · On-prem capable
0 · Bytes leaving your VPC
Multi · LLM providers
OSS · Apache 2.0

The problem

Most AI tooling assumes your data can go to the cloud.

For governments, banks, hospitals and defense organizations, it can't. Regulations, sovereignty, classified data and audit requirements rule out the cloud API stack. So these organizations get left behind, or they ship insecure point solutions and pray.

Data residency

Sensitive data never leaves your infrastructure — full stop.

Multi-model freedom

Mix local models with selective external APIs, per workload.

Auditable & open

Open source, inspectable, and aligned with your compliance team.

How it works

Harness · Hardware · Certified partners

Three layers, fully open, fully on your turf.

01

The Harness

We build and maintain the Harness Machina software — model orchestration, RAG, agents, tool-use, evals and observability. Open source, free, yours to inspect and extend.

02

Your hardware

Buy hardware that matches our reference specification — from a single workstation to a multi-node rack. Run as much capacity as your workloads need.

03

Certified install

Hire a certified partner to install, harden and operate it. Start with GreenCo or OrangeCo.

$ harness up --config ./harness.yaml
▸ loading harness spec…
▸ verifying hardware profile: rack-r4 (4x H100, 2TB NVMe)
▸ pulling local models: llama-3.3-70b, qwen2.5-vl, nomic-embed
▸ registering external providers: openai (allow-list: 3 endpoints)
▸ initializing RAG store: 12 corpora, 4.2M documents
▸ enabling audit pipeline → siem://internal
✓ Harness Machina ready · http://harness.internal

Features

Everything you need to run AI inside the perimeter.

RAG, built-in

Document ingestion, chunking, embeddings, vector search and re-ranking — all local by default.
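
As a rough illustration of the stages named above (chunking, embeddings, vector search, re-ranking), here is a minimal Python sketch. It is not Harness Machina's implementation: every function name is hypothetical, and the bag-of-words "embedding" and term-overlap re-ranker are toy stand-ins for a real embedding model and a real re-ranker.

```python
import math
import re
from collections import Counter

def chunk(text, size=8, overlap=2):
    """Split text into overlapping word windows (toy chunker)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def search(query, chunks, k=3):
    """Vector search: keep the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def rerank(query, candidates):
    """Toy re-ranker: prefer candidates covering more distinct query terms."""
    q_terms = set(embed(query))
    return sorted(candidates,
                  key=lambda c: len(q_terms & set(embed(c))),
                  reverse=True)

doc = ("Audit logs are shipped to the internal SIEM every minute. "
       "Model weights are stored on local NVMe drives. "
       "External providers are reachable only through an allow-list.")
question = "where are audit logs shipped"
top = rerank(question, search(question, chunk(doc)))
print(top[0])
```

Nothing here touches the network, which is the point of "local by default": each stage is a plain function over data already inside the perimeter.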

Multi-model

Llama, Qwen, Mistral, DeepSeek, Gemma — plus optional API providers behind allow-lists.
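
To make "behind allow-lists" concrete, the sketch below checks an outbound URL against an explicit set of permitted endpoints. It is an assumption for illustration only: the `ALLOWED` entries and the `is_allowed` helper are hypothetical, not the Harness Machina configuration format.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: exact (host, path) pairs that may be called.
ALLOWED = {
    ("api.openai.com", "/v1/chat/completions"),
    ("api.openai.com", "/v1/embeddings"),
}

def is_allowed(url):
    """Permit only HTTPS calls to an exact allow-listed host and path."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    return (parts.hostname, parts.path) in ALLOWED

print(is_allowed("https://api.openai.com/v1/embeddings"))
print(is_allowed("https://api.openai.com/v1/files"))
```

Matching on exact host and path, rather than on a domain suffix, keeps the default answer "no" for anything not explicitly listed.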

Hybrid mode

Per-workload policy: full local, hybrid, or selective API egress with redaction.
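
One way to picture a per-workload policy is as a small lookup that decides where a prompt may go and whether it is redacted first. The sketch below is an assumption, not the Harness Machina policy format: the three `Mode` values mirror the modes named above, while the workload names, the `route` function and the e-mail-only redaction are hypothetical.

```python
import re
from enum import Enum

class Mode(Enum):
    LOCAL = "local"                # never leaves the perimeter
    HYBRID = "hybrid"              # local first, external as fallback
    API_REDACTED = "api_redacted"  # external, but only after redaction

# Hypothetical workload-to-mode mapping.
POLICIES = {
    "patient-records": Mode.LOCAL,
    "marketing-copy": Mode.HYBRID,
    "support-triage": Mode.API_REDACTED,
}

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def redact(text):
    """Mask e-mail addresses; a real redactor would cover far more."""
    return EMAIL.sub("[REDACTED_EMAIL]", text)

def route(workload, prompt, local_available=True):
    """Return (target, prompt_to_send); unknown workloads stay local."""
    mode = POLICIES.get(workload, Mode.LOCAL)
    if mode is Mode.LOCAL:
        return "local", prompt
    if mode is Mode.HYBRID:
        return ("local" if local_available else "external"), prompt
    return "external", redact(prompt)

print(route("support-triage", "Reply to ana@example.org about her ticket"))
```

Defaulting unknown workloads to `Mode.LOCAL` is a deliberate choice in this sketch: a missing policy should fail closed rather than open an egress path.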

Agents & tools

Tool-use, MCP, sub-agents and scheduled jobs: the same building blocks as modern cloud stacks.

Air-gap ready

Designed to run with zero outbound connectivity. Update via signed offline bundles.
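
The idea behind a signed offline bundle is that an air-gapped node verifies an update before applying it. The sketch below shows only the verify-before-apply step and rests on a loud assumption: HMAC-SHA256 with a shared key stands in for a public-key signature (for example Ed25519), which the Python standard library does not provide. The function names are hypothetical and do not describe the actual bundle format.

```python
import hashlib
import hmac

def sign_bundle(bundle: bytes, key: bytes) -> str:
    """Tag the SHA-256 digest of the bundle (HMAC stands in for a signature)."""
    digest = hashlib.sha256(bundle).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()

def verify_bundle(bundle: bytes, signature: str, key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_bundle(bundle, key), signature)

key = b"demo-key-not-for-production"
bundle = b"model weights and release notes"
signature = sign_bundle(bundle, key)

print(verify_bundle(bundle, signature, key))
print(verify_bundle(bundle + b"tampered", signature, key))
```

In a real deployment the verifying side would hold only a public key, so a compromised node could not forge updates; the shared key here is purely a simplification.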

Audit & evals

Every prompt, retrieval and tool call logged to your SIEM. Built-in eval harness.

Identity & RBAC

SSO via SAML and OIDC. Models, corpora and tools are scoped by role, per team.
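
Role scoping can be sketched as a deny-by-default lookup over the three resource kinds mentioned above (models, corpora, tools). The role names, resource names and the `can_use` helper are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical role definitions: each role lists what it may touch.
ROLES = {
    "clinician": {
        "models": {"llama-3.3-70b"},
        "corpora": {"clinical-notes"},
        "tools": {"search"},
    },
    "analyst": {
        "models": {"llama-3.3-70b", "deepseek-r1"},
        "corpora": {"public-filings"},
        "tools": {"search", "sql"},
    },
}

def can_use(role, kind, resource):
    """Deny by default: unknown roles, kinds or resources are refused."""
    return resource in ROLES.get(role, {}).get(kind, set())

print(can_use("analyst", "tools", "sql"))
print(can_use("clinician", "corpora", "public-filings"))
```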

Open source

Apache 2.0. No paywall, no telemetry, no vendor lock-in.

Supported AI models

The best open-weight models, all running inside your perimeter.

Mix and match the latest open-weight models per workload — 100% local, no vendor lock-in, no per-token fees. Swap a model with one config line.

Llama 3.3 70B · Meta · LLM

The default workhorse. Strong general reasoning, instruction following and tool-use across 8 languages: the safe pick when you don’t know which model you need yet.

License: Llama Community

Qwen 3 235B-A22B · Alibaba · LLM

Best multilingual coverage: Spanish, Portuguese, Chinese and Arabic at near-native quality. The right call for LatAm, APAC and MENA deployments.

License: Apache 2.0

DeepSeek-R1 · DeepSeek · Reasoning

Frontier-tier chain-of-thought reasoning, math and code. Use for agentic workflows, complex analysis and anything that benefits from an explicit thinking trace.

License: MIT

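Mistral Large 2 · Mistral AI · LLM

Built in the EU and well suited to EU hosting. 128K context and excellent function calling make it the natural pick for European regulated industries and GDPR-sensitive workloads.

License: Mistral Research License (commercial use requires a separate commercial license from Mistral AI)
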
Qwen 2.5-VL 72B · Alibaba · Vision-Language

Documents, screenshots, charts, receipts, OCR-heavy pipelines. Reads tables and forms reliably, which suits back-office automation and intake.

License: Qwen License

Gemma 3 27B · Google · Small

Lightweight but capable. Fits on a single GPU and runs fast, ideal for edge nodes, RAG retrievers and high-QPS classification or routing workloads.

License: Gemma

BGE-M3 / Nomic Embed v2 · BAAI / Nomic · Embeddings

The retrieval layer of the local RAG store. Multilingual and multi-granularity (sentence, passage, document), these models are the default backbone for semantic search.

License: MIT (BGE-M3) / Apache 2.0 (Nomic Embed v2)

Whisper Large v3 · OpenAI (open weights) · Speech

Speech-to-text for voice agents, meeting transcription and call-center analytics. 99 languages, robust to accents, runs on a single GPU.

License: MIT

Plus DBRX, Phi-4, Command-R+, Llama 3.2-Vision and your own fine-tunes. Bring any GGUF or safetensors checkpoint.

Certified partners

Installed and operated by people you can vet.

Harness Machina is free. Certified partners handle installation, hardening, operations and ongoing support — under contracts that fit your jurisdiction.

🟢 GreenCo
Founding partner

Hardware procurement, install and 24/7 operations for regulated industries across LatAm.

🟠 OrangeCo
Founding partner

Government and defense-grade deployments, air-gapped installs and compliance support.

Are you a systems integrator or MSP? We're onboarding new partners. Apply for certification →
Use cases

Built for the organizations the cloud forgot.

Government

Citizen services, internal knowledge, classified workflows — fully on-prem.

Finance

Research, compliance, KYC and ops automation under strict data residency rules.

Healthcare

Clinical notes, RAG over patient records, regulatory-grade audit trails.

Defense

Air-gapped agentic systems, multi-classification segregation, signed update channels.

Open source

Inspectable. Forkable. Yours.

Harness Machina is released under the Apache 2.0 license. No usage limits, no telemetry, no “enterprise tier” gating the parts you actually need. We make money through certifications, training and partner support — not by holding your AI hostage.

# Quick start
git clone https://github.com/draix/openharness
cd openharness
./scripts/bootstrap.sh
harness up
# For production, work with a certified partner.

Run AI where your data lives.

Open source today. Certified deployments tomorrow. Get in touch with a partner or start exploring the code.