Customer developer docs

Integration requirements

Know which URLs, secrets, network paths, and gateway decisions your team needs before going live.

Production architecture

This page covers two audiences. SaaS customers integrating against a hosted Ledgix Vault only need the first section. Enterprise customers self-hosting Vault should read all three.

Integration requirements (SaaS customers)

Most customers do not need Ledgix hosting details. You do need a clean picture of what your own application must have before it can integrate safely.

  1. A Vault URL per environment
    Use the correct Ledgix URL for development, staging, and production. Do not mix API keys across environments.
  2. A tenant API key
    Create the key in the customer dashboard and store it in your own secret-management flow. Ledgix shows the raw key only once.
  3. Outbound HTTPS access
    Your application, worker, or server-side route must be able to reach the Vault URL over TLS.
  4. One protected tool boundary
    Decide whether your first rollout will call the protected service directly after approval or send the token to an optional gateway in front of that service.

Customer integration pattern

  1. Application or worker
    Your backend, API route, or job worker makes the Ledgix request before the sensitive action happens.
  2. Ledgix Vault URL
    The public Ledgix endpoint receives the request and returns the approval decision your code must honor.
  3. Protected tool or gateway
    Your system performs the real payment, refund, or admin action only after the Ledgix response is approved.
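
The three steps of the integration pattern can be sketched in Python as follows. This is a minimal illustration, not the Ledgix SDK. The POST /request-clearance path comes from this page. The X-API-Key header on the customer request, the request body fields (tool, tool_args), and the decision field in the response are assumptions made for the example. The helper names http_transport and run_protected are hypothetical.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

# Hypothetical transport type: takes a JSON-able payload and returns the
# parsed JSON response from Vault.
Transport = Callable[[Dict[str, Any]], Dict[str, Any]]


def http_transport(vault_url: str, api_key: str, timeout: float = 10.0) -> Transport:
    """Build a transport that POSTs to /request-clearance over HTTPS.

    Assumptions: the X-API-Key header name and the JSON body shape.
    """
    def send(payload: Dict[str, Any]) -> Dict[str, Any]:
        req = urllib.request.Request(
            vault_url.rstrip("/") + "/request-clearance",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json", "X-API-Key": api_key},
            method="POST",
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    return send


def run_protected(tool: str, args: Dict[str, Any], transport: Transport,
                  action: Callable[[Dict[str, Any]], Any]) -> Any:
    """Ask for clearance first; run the action only on an approved response."""
    response = transport({"tool": tool, "tool_args": args})
    # Assumed response field and value; anything else counts as not approved.
    if response.get("decision") != "yes":
        raise PermissionError(
            f"clearance not granted for {tool!r}: {response.get('decision')!r}"
        )
    return action(args)
```

Passing the transport in as a parameter keeps the approval check testable without network access: a test can supply a function that returns a fixed decision and confirm the action never runs on a blocked or paused request.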

Customer responsibilities

  • store LEDGIX_VAULT_API_KEY securely in your own environment
  • keep LEDGIX_VAULT_URL aligned with the environment you are deploying
  • make sure server-side code, workers, and background jobs use the same customer settings
  • decide who owns manual review and notification routing before going live
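
One way to keep the URL and key aligned across server-side code, workers, and background jobs is to load them through a single function that fails fast. The sketch below reads the two variables named above; the LedgixSettings class and load_settings function are hypothetical names, and the https-only check is an assumption based on the TLS requirement described earlier.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class LedgixSettings:
    vault_url: str
    api_key: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> LedgixSettings:
    """Read the two customer settings and fail fast if either is unusable."""
    env = os.environ if env is None else env
    url = env.get("LEDGIX_VAULT_URL", "").strip()
    key = env.get("LEDGIX_VAULT_API_KEY", "").strip()
    if not url or not key:
        raise RuntimeError("LEDGIX_VAULT_URL and LEDGIX_VAULT_API_KEY must both be set")
    parsed = urlparse(url)
    # Assumption: Vault is only ever reached over TLS, so reject plain http.
    if parsed.scheme != "https" or not parsed.netloc:
        raise RuntimeError("LEDGIX_VAULT_URL must be an absolute https:// URL")
    return LedgixSettings(vault_url=url.rstrip("/"), api_key=key)
```

Calling load_settings() at process start in every runtime surfaces a missing or mismatched setting before the first protected action rather than during it.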

Optional gateway decision

You can start without a gateway if the protected action is simple and your application owns the full execution path. Add a gateway when:

  • more than one service can trigger the same sensitive action
  • you want the protected service to require a Ledgix token centrally
  • you need a clean boundary between approval and execution

Go-live checklist

  • The application can reach the Vault URL from every runtime that performs the protected action.
  • The correct API key is present in that runtime.
  • A policy source has already been uploaded for the first tool you are guarding.
  • Reviewers know where pending requests appear and how Slack or email notifications are routed.
  • The team has tested at least one approved and one blocked or paused request before production traffic.

Common integration failures

  • Wrong Vault URL for the environment.
  • Expired, deleted, or misplaced API key.
  • Policy content uploaded after the team started testing, which makes early results look inconsistent.
  • Threshold and notification settings left at defaults without deciding who handles review.

Runtime architecture (self-hosted enterprise)

A Ledgix production deployment is two services, one TLS ingress, and a backing data plane.

Vault + Judge production topology

  1. Caddy TLS ingress
    Terminates TLS and routes customer traffic to Vault. Shipped in deployments/docker-compose.yml with an automatic Let's Encrypt flow.
  2. Vault (Go)
    Signs A-JWTs with Ed25519, serves JWKS, maintains the per-tenant Merkle ledger, and exposes the customer HTTP API. Optionally fronted by the tool-gateway binary for burn-on-consume flows.
  3. llm-judge (Python)
    FastAPI microservice. Does pgvector RAG over tenant policies and calls a LiteLLM-configured model to return allow/deny/review.
  4. Control plane + tenant databases
    Control plane Postgres stores memberships and a clients table with tenant_secret_ref. Each tenant has its own isolated Postgres for policies and ledger.
  5. AWS Secrets Manager
    Holds per-tenant DB passwords, ledger transport keys, and model credentials under TENANT_SECRET_PREFIX. Fetched on demand, cached briefly.
  6. S3 anchor bucket
    Receives sealed Merkle checkpoints. Bucket versioning is required; object lock is optional. Backfilled on a fixed interval.

Two-database model

Ledgix deliberately splits control-plane state from tenant state.

  • Control plane (Supabase Postgres): memberships, tenant metadata, clients table. clients.tenant_secret_ref points at an AWS Secrets Manager entry — the row itself never contains tenant DB passwords, ledger transport keys, or Confluence tokens.
  • Tenant databases (one Postgres per tenant): policies (with pgvector embeddings) and the tenant's ledger. Credentials come from Secrets Manager at runtime.

This split is what enforces tenant isolation. The control plane holds only references to secrets, so a breach of the control plane does not by itself expose tenant policy content or ledger entries.

A-JWT signing

Approval tokens are signed with Ed25519 (EdDSA). Vault publishes the public key at GET /.well-known/jwks.json.

Claims embedded in every A-JWT:

| Claim | Meaning |
| --- | --- |
| iss | alcv-vault by default (override with VAULT_JWT_ISSUER) |
| aud | ledgix-sdk by default (override with VAULT_JWT_AUDIENCE) |
| exp | iat + VAULT_JWT_TTL (default 300 seconds) |
| jti | Unique token ID. Burned on /consume-token; replay returns 409. |
| tool | The tool the request was approved for |
| agent_id, session_id, policy_id | Context for audit and review |
| decision | yes, no, or review |
| tool_args_hash | SHA-256 of canonical JSON of the approved arguments |
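
A gateway or protected service would typically check these claims after verifying the Ed25519 signature against the JWKS. The sketch below covers only the claim checks and assumes the signature has already been verified by a JWT library. The canonical JSON form (sorted keys, no whitespace) and the hex encoding of the hash are assumptions, since this page does not define them. The function names are hypothetical.

```python
import hashlib
import json
import time
from typing import Any, Dict, Optional


def tool_args_hash(args: Dict[str, Any]) -> str:
    """SHA-256 hex digest over an ASSUMED canonical JSON form:
    sorted keys, compact separators, UTF-8."""
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_claims(claims: Dict[str, Any], tool: str, args: Dict[str, Any],
                 issuer: str = "alcv-vault", audience: str = "ledgix-sdk",
                 now: Optional[float] = None) -> None:
    """Raise ValueError unless already-verified claims authorize this exact call."""
    now = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # JWT aud may be a list
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")
    if audience not in audiences:
        raise ValueError("unexpected audience")
    if float(claims.get("exp", 0)) <= now:
        raise ValueError("token expired")
    if claims.get("decision") != "yes":
        raise ValueError("token does not carry an approval")
    if claims.get("tool") != tool:
        raise ValueError("token was issued for a different tool")
    if claims.get("tool_args_hash") != tool_args_hash(args):
        raise ValueError("arguments differ from the approved arguments")
```

Comparing tool_args_hash against the arguments actually being executed is what stops a token approved for one set of arguments from being reused with different ones.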

Merkle ledger and anchoring

Every decision is persisted to the tenant's ledger on DB commit. A background anchor loop sequences accepted events into an append-only Merkle tree, signs the checkpoint, and exports it to S3.

  • Durability is synchronous on DB commit.
  • Sequencing and anchoring are asynchronous — they run on the VAULT_LEDGER_ANCHOR_BACKFILL_INTERVAL_SECONDS loop (default 30s).
  • The anchor bucket must have versioning enabled. Object lock is recommended but not required.
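
To make the anchoring step concrete, here is a generic binary Merkle root over a batch of ledger entries. It is an illustration of the technique only: the leaf encoding, the 0x00 and 0x01 prefix bytes, and the handling of odd nodes are assumptions, not Ledgix's actual checkpoint format.

```python
import hashlib
from typing import List, Sequence


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(entries: Sequence[bytes]) -> bytes:
    """Root of a binary Merkle tree over the entries, in order.

    Leaves and interior nodes use different prefix bytes so a leaf can
    never be confused with an interior node.
    """
    if not entries:
        return _h(b"")
    level: List[bytes] = [_h(b"\x00" + e) for e in entries]
    while len(level) > 1:
        nxt: List[bytes] = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(_h(b"\x01" + level[i] + level[i + 1]))
            else:
                nxt.append(level[i])  # odd node is promoted unchanged
        level = nxt
    return level[0]
```

Because the root depends on every entry and on their order, a signed checkpoint containing it commits to the whole sequenced batch. Changing, dropping, or reordering any entry produces a different root.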

Async clearance queue

POST /request-clearance is queue-backed so slow judge calls cannot block the HTTP path.

  • Default: in-memory queue (VAULT_CLEARANCE_ASYNC_WORKERS, VAULT_CLEARANCE_ASYNC_QUEUE_SIZE).
  • Production: point VAULT_CLEARANCE_SQS_QUEUE_URL at an SQS FIFO queue. VAULT_CLEARANCE_SQS_FIFO_GROUP_SHARDS controls parallelism across FIFO groups.
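
The two modes can be pictured with a small sketch. The ClearanceQueue class mirrors the in-memory default (a bounded queue drained by a fixed number of workers), and fifo_group_id shows one plausible way to spread tenants across a fixed number of FIFO message groups. Both are hypothetical illustrations written in Python; Vault itself is a Go service and its real sharding scheme is not described on this page.

```python
import hashlib
import queue
import threading
from typing import Any, Callable


def fifo_group_id(tenant_id: str, shards: int = 32) -> str:
    """Map a tenant to one of `shards` message groups (hypothetical scheme)."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return f"clearance-{int.from_bytes(digest[:8], 'big') % shards}"


class ClearanceQueue:
    """Bounded in-memory queue drained by a fixed pool of worker threads."""

    def __init__(self, handler: Callable[[Any], None], workers: int = 16, size: int = 2048):
        self._q: "queue.Queue[Any]" = queue.Queue(maxsize=size)
        self._handler = handler
        for _ in range(workers):
            threading.Thread(target=self._drain, daemon=True).start()

    def submit(self, request: Any) -> bool:
        """Enqueue without blocking the caller; False means the queue is full."""
        try:
            self._q.put_nowait(request)
            return True
        except queue.Full:
            return False

    def _drain(self) -> None:
        while True:
            item = self._q.get()
            try:
                self._handler(item)  # e.g. the slow judge call
            except Exception:
                pass  # a real worker would log and record the failure
            finally:
                self._q.task_done()

    def join(self) -> None:
        """Block until every submitted request has been handled."""
        self._q.join()
```

The point of the non-blocking submit is that the HTTP path either hands off the request immediately or learns that the queue is full; it never waits on the judge.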

Self-hosted configuration reference

Vault reads configuration from environment variables. When AWS_SECRET_NAME is set, Vault pulls the named Secrets Manager bundle on startup and merges it over the environment.
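
The precedence can be illustrated with a few lines of Python. This sketch reads "merges it over the environment" as: values from the secret bundle win when a key exists in both places. It also assumes the bundle is a flat JSON object. Both points are interpretations, and merge_config is a hypothetical name.

```python
import json
from typing import Dict, Mapping


def merge_config(env: Mapping[str, str], secret_bundle_json: str) -> Dict[str, str]:
    """Overlay a flat JSON secret bundle on top of environment values."""
    bundle = json.loads(secret_bundle_json)
    merged = dict(env)
    # Assumed precedence: bundle values replace environment values.
    merged.update({key: str(value) for key, value in bundle.items()})
    return merged
```

Under this reading, a value such as VAULT_PORT set in both the container environment and the bundle resolves to the bundle's value.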

Vault — transport and signing

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| VAULT_HOST | string | No | Bind address. Defaults to 0.0.0.0. |
| VAULT_PORT | int | No | Listen port. Defaults to 8000. |
| VAULT_SIGNER_BACKEND | enum | No | local or aws_kms. Default local. |
| VAULT_PRIVATE_KEY_FILE | path | Conditional | Ed25519 private key path (local backend). |
| VAULT_PRIVATE_KEY_BASE64 | string | Conditional | Base64-encoded Ed25519 private key (local backend). |
| VAULT_KMS_KEY_ID | string | Conditional | KMS key id (aws_kms backend). |
| VAULT_KEY_ID | string | No | JWKS kid header. Default vault-key-001. |
| VAULT_JWT_TTL | seconds | No | A-JWT validity. Default 300. |
| VAULT_JWT_ISSUER | string | No | iss claim. Default alcv-vault. |
| VAULT_JWT_AUDIENCE | string | No | aud claim. Default ledgix-sdk. |
| VAULT_CORS_ALLOWED_ORIGIN | string | No | CORS origin, e.g. https://dashboard.example.com. |
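
The "Conditional" fields depend on VAULT_SIGNER_BACKEND. A pre-deploy check along the following lines can catch a missing key before Vault starts. It is a sketch: signer_config_errors is a hypothetical helper, and treating "at least one of the file or base64 key" as sufficient for the local backend is an assumption.

```python
from typing import List, Mapping


def signer_config_errors(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems with the signer settings, if any."""
    backend = env.get("VAULT_SIGNER_BACKEND", "local")  # documented default
    errors: List[str] = []
    if backend == "local":
        # Assumption: either key source is enough for the local backend.
        if not (env.get("VAULT_PRIVATE_KEY_FILE") or env.get("VAULT_PRIVATE_KEY_BASE64")):
            errors.append("local backend needs VAULT_PRIVATE_KEY_FILE or VAULT_PRIVATE_KEY_BASE64")
    elif backend == "aws_kms":
        if not env.get("VAULT_KMS_KEY_ID"):
            errors.append("aws_kms backend needs VAULT_KMS_KEY_ID")
    else:
        errors.append(f"unknown VAULT_SIGNER_BACKEND: {backend!r}")
    return errors
```

An empty list means the signer settings are internally consistent; it does not confirm that the key itself is valid.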

Vault — data plane

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| DATABASE_URL | DSN | Yes | Control plane Postgres DSN. |
| VAULT_CONTROL_PLANE_DB_MAX_OPEN_CONNS | int | No | Control plane pool size. Default 50. |
| VAULT_TENANT_DB_MAX_OPEN_CONNS | int | No | Per-tenant pool size. Default 25. |
| VAULT_TENANT_DB_SSLMODE | enum | No | Per-tenant SSL mode. Default require. |
| TENANT_SECRET_PREFIX | string | Yes | Secrets Manager prefix for per-tenant bundles. Example: ledgix/tenants/prod. |
| VAULT_TENANT_SECRET_CACHE_TTL_SECONDS | seconds | No | Tenant secret cache TTL. Default 300. |
| AWS_SECRET_NAME | string | No | Name of the Vault bootstrap secret. Pulled on startup and merged over env. |
| AWS_REGION | string | No | Default AWS region. Default us-east-1. |

Vault — clearance queue and rate limiting

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| VAULT_CLEARANCE_ASYNC_WORKERS | int | No | Workers draining the in-memory queue. Default 16. |
| VAULT_CLEARANCE_ASYNC_QUEUE_SIZE | int | No | In-memory queue depth. Default 2048. |
| VAULT_CLEARANCE_SQS_QUEUE_URL | URL | No | When set, switches async clearance to SQS FIFO. |
| VAULT_CLEARANCE_SQS_FIFO_GROUP_SHARDS | int | No | FIFO group shard count. Default 32. |
| VAULT_CLEARANCE_SQS_WAIT_TIME_SECONDS | seconds | No | Long-poll wait. Default 20. |
| VAULT_CLEARANCE_SQS_VISIBILITY_TIMEOUT_SECONDS | seconds | No | Message visibility timeout. Default 60. |
| VAULT_RATE_LIMIT_RPS | int | No | Per-principal rate limit. Default 20. Set 0 to disable. |
| VAULT_RATE_LIMIT_BURST | int | No | Burst allowance. Default 40. |
| VAULT_RATE_LIMIT_TTL_SECONDS | seconds | No | Rate limit window TTL. Default 180. |
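
To show how an RPS value and a burst allowance usually interact, here is a token-bucket sketch using the documented defaults of 20 and 40. The token-bucket algorithm is an assumption chosen for illustration; this page does not state which algorithm Vault uses. The TTL-based eviction of idle principals is left out, and TokenBucketLimiter is a hypothetical name.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-principal token bucket: refills at `rps`, holds at most `burst`."""

    def __init__(self, rps: float = 20.0, burst: int = 40,
                 clock: Callable[[], float] = time.monotonic):
        self.rps = rps
        self.burst = float(burst)
        self.clock = clock
        self._state: Dict[str, Tuple[float, float]] = {}  # principal -> (tokens, last_seen)

    def allow(self, principal: str) -> bool:
        if self.rps <= 0:
            return True  # mirrors "Set 0 to disable"
        now = self.clock()
        tokens, last = self._state.get(principal, (self.burst, now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rps)
        if tokens >= 1.0:
            self._state[principal] = (tokens - 1.0, now)
            return True
        self._state[principal] = (tokens, now)
        return False
```

Under this model a principal that has been idle can send up to 40 requests at once, after which it is held to roughly 20 per second until the bucket refills.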

Vault — ledger anchoring

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| VAULT_LEDGER_ANCHOR_BUCKET | string | Conditional | S3 bucket for manifest anchoring. |
| VAULT_LEDGER_ANCHOR_BUCKET_TEMPLATE | string | No | Per-tenant bucket template, e.g. ledgix-anchor-{client_id}. |
| VAULT_LEDGER_ANCHOR_PREFIX | string | No | Object key prefix inside the bucket. |
| VAULT_LEDGER_ANCHOR_REGION | string | No | Bucket region. Falls back to AWS_REGION. |
| VAULT_LEDGER_REQUIRE_BUCKET_VERSIONING | bool | No | Default true. Startup fails if the bucket does not have versioning enabled. |
| VAULT_LEDGER_REQUIRE_OBJECT_LOCK | bool | No | Default false. Enforces S3 object lock when true. |
| VAULT_LEDGER_ANCHOR_BACKFILL_INTERVAL_SECONDS | seconds | No | Checkpoint export cadence. Default 30. |
| VAULT_LEDGER_ANCHOR_BACKFILL_BATCH_SIZE | int | No | Max entries per export batch. Default 500. |
| VAULT_TRANSPORT_KEY_BASE64 | string | Conditional | 32-byte base64 key for ledger entry encryption at rest. |

Vault — judge integration

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| VAULT_JUDGE_URL | URL | Yes | llm-judge endpoint. |
| VAULT_JUDGE_API_KEY | string | Yes | Service-to-service key sent as X-API-Key. |
| VAULT_ALLOW_STUB_JUDGE | bool | No | Use the deterministic stub judge (development only). |

llm-judge — configuration

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| EMBEDDING_MODEL | string | No | LiteLLM embedding model. Default bedrock/amazon.titan-embed-text-v2:0. Changing this requires re-embedding existing policy chunks. |
| EVAL_MODEL | string | No | LiteLLM evaluation model. Default bedrock/amazon.nova-pro-v1:0. |
| JUDGE_API_KEY | string | Yes | Service-to-service API key matching VAULT_JUDGE_API_KEY. |
| JUDGE_ENDPOINT_RATE_LIMIT | string | No | slowapi rate limit string. Default 600/minute. |
| JUDGE_UVICORN_WORKERS | int | No | Worker process count. Default 2. |
| DATABASE_URL | DSN | Yes | Postgres DSN (control plane access for tenant routing). |
| TENANT_SECRET_PREFIX | string | Yes | Must match the Vault prefix. |
| AWS_SECRET_NAME | string | No | Judge bootstrap secret name. |
| LOG_FORMAT | enum | No | json (default, Datadog/CloudWatch) or text. |

Deploy via docker-compose

The reference stack lives in deployments/docker-compose.yml. It launches vault and llm_judge on a private Docker network with Caddy as the TLS ingress.

    docker compose --env-file .env.prod up -d --build

.env.prod only needs non-sensitive values — primarily VAULT_AWS_SECRET_NAME, JUDGE_AWS_SECRET_NAME, TENANT_SECRET_PREFIX, and AWS_REGION. Each container fetches its secret bundle from AWS Secrets Manager on startup.

For brownfield enterprise rollouts onto pre-existing infrastructure, follow deployments/EXISTING_INFRA_ENTERPRISE_ROLLOUT.md rather than the Terraform stack in deployments/terraform/ (which is greenfield only).