Agent Infrastructure as Code.

Orloj is an open-source orchestration runtime for multi-agent AI systems. Define agents, tools, policies, and workflows in YAML. Orloj schedules, executes, and governs them.

The Problem

Production agents need governance.

Same agent ambition. Different operational outcomes once runtime constraints are enforced as policy, not convention.

Capability: today vs. with Orloj

Tool Boundaries
  • Today: Agents call tools they should not touch.
  • With Orloj: Tool permissions enforced at execution time.

Cost Controls
  • Today: Token spend spikes without policy limits.
  • With Orloj: Per-agent token caps and model allowlists.

Failure Handling
  • Today: Retries and dead-letter handling are hand-rolled.
  • With Orloj: Lease-based retry, replay, and dead-letter primitives.

System Composition
  • Today: Multi-agent wiring lives in bespoke glue code.
  • With Orloj: Declarative YAML graphs with fan-out and join gates.

Auditability
  • Today: No end-to-end trace when incidents hit production.
  • With Orloj: Full task trace and message lifecycle logging.

Why Orloj

From prototype logic to production runtime guarantees.

The platform is designed for teams that need deterministic execution, policy enforcement, and safe operations under real production load.

01. Agents as declarative manifests, not programs

Version-controlled manifests for agents, tools, models, and workflows. Apply once, diff in PRs, and roll back safely.

  • Git-native change control
  • Apply + rollback via CLI
  • Data contracts over glue code

02. Governance enforced at the execution layer

Policies and permissions are evaluated inline on every turn and tool call. Unauthorized actions fail closed with traceable outcomes.

  • Hard runtime gates
  • Scoped roles and policy sets
  • Structured denial events

03. Production reliability built into the runtime

Lease-based ownership, bounded retries with jitter, dead-letter handling, fan-out/fan-in orchestration, and cron scheduling.

  • Concurrency-safe task ownership
  • Native retry + dead-letter flows
  • Operational primitives included
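The bounded-retry and dead-letter behavior described above can be illustrated with a small sketch. This is not Orloj's implementation: `run_with_retries`, `backoff_delay`, and `DeadLetterQueue` are hypothetical names, and the choice of exponential backoff with full jitter is an assumption about how "bounded retries with jitter" might look in practice.

```python
import random
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class DeadLetterQueue:
    """Hypothetical holder for tasks that exhausted their retry budget."""
    items: List[dict] = field(default_factory=list)


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def run_with_retries(
    task: dict,
    handler: Callable[[dict], str],
    dlq: DeadLetterQueue,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call handler up to max_attempts times; dead-letter the task if all attempts fail."""
    last_error = ""
    for attempt in range(max_attempts):
        try:
            return handler(task)
        except Exception as exc:  # a real runtime would separate retryable from fatal errors
            last_error = str(exc)
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
    # Retry budget exhausted: park the task for inspection or replay.
    dlq.items.append({"task": task, "error": last_error, "attempts": max_attempts})
    return None
```

Jitter spreads retries from many workers over time so they do not all hit a recovering dependency at once; the dead-letter queue keeps failed tasks available for replay instead of dropping them.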
See It

One command. Full agent system.

orlojctl apply -f ./your-system/ reconciles agents, graph, governance, and tasks in a single declarative pass.

Step 1: Define an agent

Agents declare model, tools, permissions, and execution limits as data. No bespoke orchestration code required.

agent.yaml (YAML)
apiVersion: orloj.dev/v1
kind: Agent
metadata:
  name: research-agent
spec:
  model_ref: openai-default
  prompt: |
    You are a research assistant.
    Produce concise, evidence-backed answers.
  tools:
    - web_search
    - vector_db
  roles:
    - analyst-role
  limits:
    max_steps: 6
    timeout: 30s
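
To make the `limits` block concrete, here is a minimal sketch of how a runtime might enforce `max_steps` and `timeout` around an agent loop. It is an illustration under stated assumptions, not Orloj's code: `run_agent` and `parse_duration` are hypothetical names, and the accepted duration units (`ms`, `s`, `m`) are assumed, since the manifest only shows `30s`.

```python
import re
import time
from typing import Callable


def parse_duration(text: str) -> float:
    """Parse a duration such as '30s', '500ms', or '2m' into seconds (assumed format)."""
    match = re.fullmatch(r"(\d+)(ms|s|m)", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = int(match.group(1)), match.group(2)
    return value * {"ms": 0.001, "s": 1.0, "m": 60.0}[unit]


def run_agent(
    step: Callable[[int], bool],
    max_steps: int,
    timeout: str,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call step(i) until it returns True, or stop when a limit is reached."""
    deadline = clock() + parse_duration(timeout)
    for i in range(max_steps):
        if clock() >= deadline:
            return "timeout"
        if step(i):  # True means the agent produced its final answer
            return "completed"
    return "max_steps_exceeded"
```

With the values from the manifest, `run_agent(step, max_steps=6, timeout="30s")` stops after six steps or thirty seconds, whichever comes first. The deadline is only checked between steps, so a single long-running step would need its own cancellation mechanism.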

Step 2: Compose a workflow graph

AgentSystem resources connect specialized agents into deterministic pipelines with explicit handoffs.

agent-system.yaml (YAML)
apiVersion: orloj.dev/v1
kind: AgentSystem
metadata:
  name: report-system
spec:
  agents:
    - planner-agent
    - research-agent
    - writer-agent
  graph:
    planner-agent:
      next: research-agent
    research-agent:
      next: writer-agent
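
The handoff order implied by the `graph` section can be derived mechanically. The sketch below mirrors the manifest as a plain Python dict and follows the `next` edges; `execution_order` is a hypothetical helper, and the validation rules (exactly one entry agent, no cycles, no undeclared agents) are assumptions about what a runtime would plausibly check, not documented Orloj behavior.

```python
from typing import Dict, List, Optional

# Mirrors the `agents` and `graph` sections of the manifest as plain data.
AGENTS: List[str] = ["planner-agent", "research-agent", "writer-agent"]
GRAPH: Dict[str, Dict[str, str]] = {
    "planner-agent": {"next": "research-agent"},
    "research-agent": {"next": "writer-agent"},
}


def execution_order(agents: List[str], graph: Dict[str, Dict[str, str]]) -> List[str]:
    """Follow `next` edges from the single entry agent and return the handoff order."""
    targets = {edge["next"] for edge in graph.values()}
    entries = [agent for agent in agents if agent not in targets]
    if len(entries) != 1:
        raise ValueError(f"expected exactly one entry agent, found {entries}")
    order: List[str] = []
    current: Optional[str] = entries[0]
    while current is not None:
        if current in order:
            raise ValueError(f"cycle detected at {current}")
        if current not in agents:
            raise ValueError(f"graph references undeclared agent {current}")
        order.append(current)
        current = graph.get(current, {}).get("next")
    return order
```

For the manifest shown, this yields planner-agent, then research-agent, then writer-agent. The sketch handles only linear chains; fan-out and join gates would need a general graph traversal.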

Step 3: Enforce governance

Policies are runtime gates. Blocked actions return structured errors and complete audit traces.

policy.yaml (YAML)
apiVersion: orloj.dev/v1
kind: AgentPolicy
metadata:
  name: cost-and-security-policy
spec:
  apply_mode: scoped
  target_systems:
    - report-system
  max_tokens_per_run: 50000
  allowed_models:
    - gpt-4o
  blocked_tools:
    - filesystem_delete
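
As a rough illustration of a policy acting as a runtime gate, the sketch below evaluates one action against the fields from the manifest and returns a structured denial when it is blocked. The names `Policy`, `Denial`, and `check_action`, the denial reason strings, and the reading of `apply_mode: scoped` as "applies only to the listed target systems" are all assumptions for the example, not Orloj's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Policy:
    """Plain-data mirror of the AgentPolicy manifest fields."""
    name: str
    target_systems: List[str]
    max_tokens_per_run: int
    allowed_models: List[str]
    blocked_tools: List[str]


@dataclass(frozen=True)
class Denial:
    """Hypothetical structured error returned when an action is blocked."""
    policy: str
    reason: str
    detail: str


POLICY = Policy(
    name="cost-and-security-policy",
    target_systems=["report-system"],
    max_tokens_per_run=50000,
    allowed_models=["gpt-4o"],
    blocked_tools=["filesystem_delete"],
)


def check_action(policy: Policy, system: str, model: str,
                 tool: Optional[str], tokens_used: int) -> Optional[Denial]:
    """Return None when the action is allowed, or a Denial describing why it is not."""
    if system not in policy.target_systems:
        return None  # assumed meaning of a scoped policy: other systems are unaffected
    if model not in policy.allowed_models:
        return Denial(policy.name, "model_not_allowed", model)
    if tool is not None and tool in policy.blocked_tools:
        return Denial(policy.name, "tool_blocked", tool)
    if tokens_used > policy.max_tokens_per_run:
        return Denial(policy.name, "token_budget_exceeded", str(tokens_used))
    return None
```

Returning a typed denial instead of raising a bare exception keeps the reason machine-readable, which is what makes blocked actions easy to record in an audit trace.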
Capabilities

Built for production. Not prototypes.

Core runtime capabilities exposed as resources and controls your platform team can reason about, review, and operate.

Architecture

Server. Workers. Governance.

Orloj runs as a server/worker architecture that scales from a single process to distributed deployments. Governance is enforced inline at the worker layer.

Orloj runtime

Server (orlojd)
  • API Server: REST, watch, web console
  • Resource Store: memory or Postgres
  • Task Scheduler: assignment, cron, webhooks
  • Services: reconciliation loops per resource

The server assigns tasks to workers.

Governance (enforced inline at the worker layer)
  • AgentPolicy
  • AgentRole
  • ToolPermission

Workers (orlojworker)
  • Model Gateway: OpenAI, Anthropic, Ollama
  • Tool Runtime: sandboxed, container, WASM
  • Message Bus: memory or NATS JetStream
  • Task Worker: lease-based, concurrent
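
Lease-based task ownership, as used by the task worker, can be sketched in a few lines. This is a single-process illustration with hypothetical names (`LeaseTable`, `claim`, `release`); a distributed deployment would need the claim to be an atomic operation in the shared store, which this sketch does not attempt.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Lease:
    owner: str
    expires_at: float


class LeaseTable:
    """In-memory lease registry: one worker owns a task until its lease expires."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._leases: Dict[str, Lease] = {}

    def claim(self, task_id: str, worker: str) -> bool:
        """Claim or renew a lease; fail if another worker holds an unexpired lease."""
        now = self._clock()
        lease = self._leases.get(task_id)
        if lease is not None and lease.expires_at > now and lease.owner != worker:
            return False
        self._leases[task_id] = Lease(worker, now + self._ttl)
        return True

    def release(self, task_id: str, worker: str) -> None:
        """Drop the lease, but only if the caller is the current owner."""
        lease = self._leases.get(task_id)
        if lease is not None and lease.owner == worker:
            del self._leases[task_id]
```

If a worker crashes mid-task, its lease simply expires and another worker can claim the task, which is what allows retries without two workers running the same task at once.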

Single process. In-memory storage. Sequential execution. No external dependencies.

orlojd --embedded-worker --storage-backend=memory
Templates

Starter templates for real operational workflows.

Each template is a ready-to-deploy Orloj manifest for a common infrastructure task. These are on the roadmap and community contributions are welcome.

Incident response triage (coming soon)

Webhook-triggered. Agents pull logs, correlate metrics, check recent deployments. Read-only tool permissions mean investigation agents can look but can't roll back infrastructure.

Compliance evidence collector (coming soon)

Pipeline agents check contracts against regulatory requirements. Model allowlists keep sensitive content off unapproved providers. Every finding is traced and auditable.

CVE investigation pipeline (coming soon)

Researcher, analyst, and editor stages in a hierarchical agent system. The researcher can query CVE databases; only the editor can write to the output. Token budgets enforced per run.

Secret rotation auditor (coming soon)

Agents scan infrastructure for stale or exposed secrets using WASM-isolated tools. Metadata-only access patterns let agents audit secrets without reading secret values.

20 templates planned. See the full roadmap or contribute a template.

Get Started

Running in five minutes.

1. Install CLI and init a project

brew tap OrlojHQ/orloj
brew install orlojctl
orlojctl init example-system

2. Install runtime binaries

curl -sSfL https://raw.githubusercontent.com/OrlojHQ/orloj/main/scripts/install.sh | sh

3. Run Orloj locally

orlojd --storage-backend=memory --embedded-worker

4. Deploy your agent system

orlojctl apply -f example-system

Need a full walkthrough and production setup guidance? Read the full quickstart →
Community

Built in the open. Contribute from day one.

Orloj is licensed under Apache 2.0. The full runtime is open source: governance, orchestration, scheduling, and observability.

GitHub

Star the repo, read the source, open an issue.

github.com/OrlojHQ/orloj →

Discord

Ask questions, share what you're building, join weekly community calls.

discord.gg/orloj →

Contribute

Good first issues labeled. Architecture docs available. PRs welcome.

Contributing guide →

Stop wiring. Start declaring.

Define your agents, enforce your policies, and ship to production.