Install · happy path in five steps

One Helm chart. One upstream.

The adapter is stateless and OpenAI-API-compatible. If you already run a gateway and have a Triton endpoint reachable from the mesh, you have everything you need.

Install the chart
Pulls the cosign-signed OCI image, deploys 3 replicas with a PodDisruptionBudget, NetworkPolicy, OPA sidecar, and OTel scrape annotations.
```bash
helm install mamba-nemotron-agw-adapter \
  oci://ghcr.io/yawningmonsoon/charts/mamba-nemotron-agw-adapter \
  --version 0.1.0 \
  -n agent-runtime --create-namespace \
  --set agentos.certificateRef=cert-mamba-nemotron-agw-adapter-0.1.0 \
  --set triton.endpoint=triton-inference.gpu-pool.svc.cluster.local:8001
```
Verify
```bash
kubectl rollout status deploy/mamba-nemotron-agw-adapter -n agent-runtime
kubectl get pods -n agent-runtime -l app.kubernetes.io/name=mamba-nemotron-agw-adapter
```
Expect: all 3 replicas Running, and the certificate annotation present on the Deployment.
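
The two commands above confirm the rollout but do not show the certificate annotation. A minimal sketch of that check follows. This guide does not name the annotation key, so the helper `has_certificate_annotation` (a hypothetical name) dumps every annotation on the Deployment and searches for the `certificateRef` value passed to `helm install`. It assumes the annotation value contains that string verbatim.

```bash
#!/usr/bin/env bash
# Sketch: confirm the Deployment's annotations mention the certificate reference.
# Assumption: the annotation value contains the certificateRef string as-is;
# the annotation key itself is not specified in this guide.

NAMESPACE="agent-runtime"
DEPLOY="mamba-nemotron-agw-adapter"

has_certificate_annotation() {
  local cert_ref="$1"
  # Print all annotations on the Deployment and search them for the reference.
  kubectl get deploy "$DEPLOY" -n "$NAMESPACE" \
    -o jsonpath='{.metadata.annotations}' | grep -qF -- "$cert_ref"
}

# Usage:
#   has_certificate_annotation cert-mamba-nemotron-agw-adapter-0.1.0 \
#     && echo "certificate annotation present"
```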

Register the upstream (Solo.io shown — any OpenAI-compat gateway works)

Drop the upstream into your gateway’s namespace. With Solo.io Agentgateway it’s picked up within 30 seconds.

upstream.yaml
```yaml
apiVersion: gateway.solo.io/v1
kind: Upstream
metadata:
  name: nemotron-on-prem
  namespace: gloo-system
spec:
  kube:
    serviceName: mamba-nemotron-agw-adapter
    serviceNamespace: agent-runtime
    servicePort: 8080
  ai:
    provider:
      openaiCompatible:
        baseUrl: "http://mamba-nemotron-agw-adapter.agent-runtime.svc.cluster.local:8080/v1"
        models:
          - nemotron-mini-4b-instruct
          - llama-3.1-nemotron-70b-instruct
          - nemotron-4-340b-instruct
```

```bash
kubectl apply -f upstream.yaml
```

Smoke-test through the gateway
Send any OpenAI-format chat completion to your gateway with model: nemotron-mini-4b-instruct. The adapter answers within the §17 SLO.
```bash
curl -sS https://agentgateway.local/v1/chat/completions \
  -H "Authorization: Bearer $AGENT_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "nemotron-mini-4b-instruct",
    "messages": [{"role":"user","content":"Hello from the adapter."}]
  }' | jq
```
Verify
```bash
curl -sS http://mamba-nemotron-agw-adapter.agent-runtime.svc.cluster.local:8080/healthz
curl -sS http://mamba-nemotron-agw-adapter.agent-runtime.svc.cluster.local:8080/metrics | grep nemotron_adapter_requests_total
```
Expect: 200 from /healthz; nemotron_adapter_requests_total counter increments per call
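
To confirm that the counter actually increments, rather than eyeballing the `grep` output, one option is to sum the counter before and after the smoke test. The sketch below assumes `/metrics` serves the standard Prometheus text exposition format; `sum_counter` is a hypothetical helper name, not part of the adapter.

```bash
#!/usr/bin/env bash
# Sketch: sum every sample of one counter read from stdin in Prometheus
# text exposition format ("name{labels} value" or "name value").

sum_counter() {
  local metric="$1"
  awk -v m="$metric" '
    # Keep only sample lines for exactly this metric; "# HELP" and "# TYPE"
    # lines and metrics with a longer name do not match.
    index($0, m "{") == 1 || index($0, m " ") == 1 {
      rest = substr($0, length(m) + 1)   # drop the metric name
      sub(/^[{].*[}]/, "", rest)         # drop the label set, if present
      split(rest, parts, " ")            # parts[1] is the sample value
      total += parts[1]
    }
    END { printf "%d\n", total }
  '
}

# Usage (URL taken from the Verify step above):
#   URL=http://mamba-nemotron-agw-adapter.agent-runtime.svc.cluster.local:8080/metrics
#   before=$(curl -sS "$URL" | sum_counter nemotron_adapter_requests_total)
#   ... send the smoke-test request ...
#   after=$(curl -sS "$URL" | sum_counter nemotron_adapter_requests_total)
#   [ "$after" -gt "$before" ] && echo "counter incremented"
```

With 3 replicas behind one Service, the two scrapes may land on different pods, so the before/after comparison is only meaningful against a single pod (for example through `kubectl port-forward`) or against an aggregated metrics backend.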
Verify the governance signals
Within five seconds of the smoke test, all four signals below should be present.
- Audit: An event in s3://agentos-audit-immutable/component=mamba-nemotron-agw-adapter/… queryable via Athena
- Lineage: An OpenLineage RunEvent in Marquez under namespace agentos.llm-calls
- Metrics: nemotron_adapter_requests_total increments per call, labelled by agent_cert_id / lob / model / status
- Traces: A span per request, parented from the gateway, exported via the OTel collector
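
Because the signals arrive asynchronously, a single immediate check can miss them. A minimal sketch of polling against the five-second window is shown below. `wait_for` is a hypothetical helper, and the per-signal check passed to it (an Athena query, a Marquez lookup, a metrics scrape) is left to the reader, since those commands depend on your environment.

```bash
#!/usr/bin/env bash
# Sketch: retry a command once per second until it succeeds or the
# deadline (in seconds) passes. Returns 0 on success, 1 on timeout.

wait_for() {
  local timeout_s="$1"; shift
  local deadline=$(( SECONDS + timeout_s ))   # SECONDS is a bash builtin
  until "$@"; do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep 1
  done
}

# Usage, where metrics_signal_present is a hypothetical check you supply:
#   wait_for 5 metrics_signal_present || echo "metrics signal missing after 5s"
```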
Pin the version in your cluster manifests
The certificate is bound to a specific image digest. Pin image.digest in your Helm values before rolling beyond evaluation, so a re-pull cannot substitute an unsigned image.
values.production.yaml
```yaml
image:
  repository: ghcr.io/yawningmonsoon/mamba-nemotron-agw-adapter
  tag: "0.1.0"
  digest: "sha256:<from-cosign-verify-output>"

agentos:
  certificateRef: cert-mamba-nemotron-agw-adapter-0.1.0
```
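
Since the `digest` value above is a placeholder, it is easy to apply the file with the placeholder still in it. The sketch below is a small format check to run before upgrading. `is_sha256_digest` is a hypothetical helper; it validates only the shape of the string (`sha256:` followed by 64 lowercase hex characters), not that the digest matches the signed image.

```bash
#!/usr/bin/env bash
# Sketch: reject anything that is not a well-formed sha256 digest,
# such as an unreplaced "<from-cosign-verify-output>" placeholder.

is_sha256_digest() {
  [[ "$1" =~ ^sha256:[0-9a-f]{64}$ ]]
}

# Usage, with DIGEST set to the value you intend to pin:
#   is_sha256_digest "$DIGEST" || { echo "not a sha256 digest: $DIGEST" >&2; exit 1; }
```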

Uninstall

```bash
kubectl delete -f upstream.yaml
helm uninstall mamba-nemotron-agw-adapter -n agent-runtime
```

Audit data in S3 Object Lock and lineage data in Marquez persist independently. Uninstalling does not, and cannot, remove either.
