Open-core. Apache-2.0. Self-hosted.

Detect. Remediate. Verify.

The autonomous health autopilot for Kubernetes and the clouds underneath it: AWS, GCP, Azure, and the edge.


$ helm install cha cha/cluster-health-autopilot

One operator. Every cloud. Every stack you already run.

Cloud: AWS, GCP, Azure

K8s distros: Kubernetes, EKS, GKE, AKS, k3s, OpenShift, RKE2

Observability & Ticketing: Prometheus, Alertmanager, Grafana, OpenProject, Jira, ServiceNow, Slack

K8s-native infra: Vault, cert-manager, CNPG, Rook/Ceph, External Secrets

AI providers: OpenAI, Anthropic

The autopilot loop, by default.

Five steps. Re-run on every cycle. Closed-loop is the default mode, not a roadmap milestone.

Install

helm install cha cha/cluster-health-autopilot — works on any conformant K8s 1.27+, including EKS, GKE, AKS, and k3s edge.

$ helm repo add cha https://bionic-ai-solutions.github.io/cluster-health-autopilot
$ helm install cha cha/cluster-health-autopilot
$ kubectl get driftreports -A
NAMESPACE    NAME                                  AGE
kube-system  driftreport-stuck-certreq-xz4q9       12s

Detect

6 K8s probes and 30+ cloud probes across AWS, GCP, and Azure, plus 8 analyzers for the K8s-cloud boundary (secrets, certs, IRSA drift).

Probes  K8s 6   AWS 10  GCP 10    Azure 10
        events  RDS     CloudSQL  SQL
        pods    EBS     GCS       Storage
        nodes   IAM     IAM       Identity
        PVCs    ALB     LB        AppGateway
        certs   ACM     KMS       KeyVault
        ESO     KMS     GKE       AKS         + more

Remediate

5 whitelisted fixers run by default. AI-tier fix proposals require human approval via signed click-to-fix URLs.

DriftReport  StaleErrorPod      fixer ran  OK
DriftReport  StuckJob           fixer ran  OK
DriftReport  StuckRS            fixer ran  OK
DriftReport  StuckCertReq       fixer ran  OK
DriftReport  TLSSecretMismatch  fixer ran  OK
Re-verify in 60s ...
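
The signed click-to-fix URLs mentioned above can be pictured with a small sketch. CHA's actual URL format and signing scheme are not described on this page, so everything here is an assumption: the sketch uses HMAC-SHA256 over the report name and an expiry timestamp, and the names `BASE_URL`, `sign_fix_url`, and `verify_fix_url` are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical approval endpoint; not a real CHA URL.
BASE_URL = "https://approval.example.com/fix"


def _signature(secret: bytes, report: str, expires: int) -> str:
    """HMAC-SHA256 over the report name and the expiry timestamp."""
    message = f"{report}|{expires}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_fix_url(secret: bytes, report: str, expires: int) -> str:
    """Build an approval link that only a holder of `secret` could issue."""
    query = urlencode({
        "report": report,
        "expires": expires,
        "sig": _signature(secret, report, expires),
    })
    return f"{BASE_URL}?{query}"


def verify_fix_url(secret: bytes, url: str, now: int) -> bool:
    """Accept the link only if it is unexpired and the signature matches."""
    params = parse_qs(urlparse(url).query)
    try:
        report = params["report"][0]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    expected = _signature(secret, report, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

Because the report name and expiry are both covered by the signature, editing either one in the link invalidates it, which is the property a human-approval step needs.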

Report

Findings flow to Slack, Alertmanager, OpenProject (OSS), and Jira / ServiceNow (paid). Each finding is also a DriftReport custom resource, so kubectl get shows your cluster’s drift state.

$ kubectl get driftreports -A -o wide
NAMESPACE   NAME                    KIND              STATUS
default     drift-tls-secret-1      TLSSecretMismatch resolved
ai          drift-stuck-job-7       StuckJob          resolved
mcp         drift-stuck-certreq-3   StuckCertReq      open    → ticketed
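
Since the drift state is ordinary kubectl output, it is easy to post-process. The following is a minimal sketch that filters the table above down to reports that are still open. It assumes the four columns shown here and that no cell contains spaces; `parse_reports` and `open_reports` are illustrative names, not part of CHA.

```python
# Sample text copied from the table above (the "ticketed" annotation is omitted).
SAMPLE = """\
NAMESPACE   NAME                    KIND              STATUS
default     drift-tls-secret-1      TLSSecretMismatch resolved
ai          drift-stuck-job-7       StuckJob          resolved
mcp         drift-stuck-certreq-3   StuckCertReq      open
"""


def parse_reports(text):
    """Turn whitespace-aligned table text into a list of dicts keyed by column."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].lower().split()
    # zip() stops at the header length, so trailing extra tokens are ignored.
    return [dict(zip(header, line.split())) for line in lines[1:]]


def open_reports(text):
    """Keep only the reports whose STATUS column is 'open'."""
    return [row for row in parse_reports(text) if row["status"] == "open"]


if __name__ == "__main__":
    for row in open_reports(SAMPLE):
        print(f'{row["namespace"]}/{row["name"]} ({row["kind"]})')
```

In practice the text would come from the kubectl command itself, for example by piping its output into a script like this one.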

Verify

Re-diagnose after every fix. No "the fix maybe worked" — CHA actively re-checks and closes the loop.

diagnose → fix → re-diagnose → resolve
                            ↑        |
                            +--------+
   Closed-loop is the DEFAULT, not a roadmap milestone.
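
The loop in the diagram can be sketched in a few lines. This is an illustration of the diagnose, fix, re-diagnose, resolve pattern under simplifying assumptions, not CHA's implementation: `Finding`, `run_cycle`, and the fixer mapping are hypothetical names, and the 60-second wait before re-verification is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    kind: str            # e.g. "StuckJob"
    target: str          # e.g. "ai/drift-stuck-job-7"
    status: str = "open"


def run_cycle(
    diagnose: Callable[[], List[Finding]],
    fixers: Dict[str, Callable[[Finding], None]],
) -> List[Finding]:
    """Run one detect -> fix -> re-diagnose pass and return every finding."""
    results = []
    for finding in diagnose():
        fixer = fixers.get(finding.kind)
        if fixer is None:
            # Not on the whitelist: leave it open for a human or a ticket.
            results.append(finding)
            continue
        fixer(finding)
        # Re-diagnose: the finding counts as resolved only if it is gone.
        still_present = any(
            f.kind == finding.kind and f.target == finding.target
            for f in diagnose()
        )
        finding.status = "open" if still_present else "resolved"
        results.append(finding)
    return results
```

The design point is that a fixer returning without error proves nothing by itself; the status changes only when a fresh diagnosis no longer reports the problem.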

Why CHA, structurally.

Three architectural commitments that competitors cannot copy without rewriting their product.

In-cluster, on every cloud

The only K8s + cloud health autopilot you can run entirely inside your own cluster — air-gapped, sovereign, k3s edge — that ALSO probes the underlying AWS, GCP, and Azure resources with the workload-identity auth your cluster already uses.

Autonomous remediation, with a safety envelope

CHA is the only product in this category that ships detect → fix → re-verify as the default loop. Everyone else is a copilot for the on-call rotation you already have. We are the autopilot for the on-call rotation you want to retire.

Open-core with predictable pricing

Bottom-up adoption via Helm install. The paid CHA-com tier adds LLM-augmented investigation, ticketing, multi-cluster federation, and an approval server, all at flat per-cluster pricing. No per-investigation surprises.

vs. the competition.

Detect-fix-verify is the default loop. Every other player is a copilot for the on-call rotation you already have.

Product	Where it runs	Closed-loop?	Pricing
CHA (us)	In-cluster operator	Yes, by default	Flat per-cluster (OSS / Team / Enterprise)
NeuBird	SaaS, pulls telemetry	No — "architecturally enforced read-only"	$15–25 per investigation
Resolve AI	SaaS + thin Satellite	Roadmap (their words: "next milestone")	Contact sales
Ciroos	SaaS, zero-copy queries	Opaque "autonomy slider"	Contact sales
OpenSRE (Tracer)	Customer-hosted (docker-compose)	No — code-blocks mutations	OSS only


On-call should be quieter every week.

CHA is how you get there. Helm install in 5 minutes. No telemetry exfiltration. No per-investigation surprises.
