Detect. Remediate. Verify.
The autonomous health autopilot for Kubernetes and the clouds underneath it, across AWS, GCP, Azure, and the edge.
$ helm install cha cha/cluster-health-autopilot
One operator. Every cloud. Every stack you already run.
The autopilot loop, by default.
Five steps. Re-run on every cycle. Closed-loop is the default mode, not a roadmap milestone.
Install
helm install cha cha/cluster-health-autopilot — works on any conformant K8s 1.27+, including EKS, GKE, AKS, and k3s edge.
$ helm repo add cha https://bionic-ai-solutions.github.io/cluster-health-autopilot
$ helm install cha cha/cluster-health-autopilot
$ kubectl get driftreports -A
NAMESPACE NAME AGE
kube-system driftreport-stuck-certreq-xz4q9 12s
Detect
6 K8s probes and 30+ cloud probes across AWS, GCP, and Azure, plus 8 analyzers for the K8s-cloud boundary (secrets, certs, IRSA drift).
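Probes

| K8s (6) | AWS (10) | GCP (10) | Azure (10) |
|---|---|---|---|
| events | RDS | CloudSQL | SQL |
| pods | EBS | GCS | Storage |
| nodes | IAM | IAM | Identity |
| PVCs | ALB | LB | AppGateway |
| certs | ACM | KMS | KeyVault |
| ESO | KMS | GKE | AKS |

Plus more cloud probes per provider.
Remediate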
5 whitelisted fixers run by default. AI-tier fix proposals require human approval via signed click-to-fix URLs.
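The split between default fixers and approval-gated proposals can be pictured as a simple allow-list check. This is a minimal sketch, not CHA's implementation: the function names `is_whitelisted` and `route_finding` are hypothetical, the five kinds are taken from the sample DriftReports in this section, and routing every other kind to approval is an assumption.

```shell
# Hypothetical allow-list gate (illustrative only, not CHA code).
# The five kinds below are the ones shown in this section's sample reports.
is_whitelisted() {
  case "$1" in
    StaleErrorPod|StuckJob|StuckRS|StuckCertReq|TLSSecretMismatch) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: anything outside the allow-list waits for human approval.
route_finding() {
  if is_whitelisted "$1"; then
    echo "auto-fix"
  else
    echo "needs-approval"
  fi
}

route_finding StuckJob        # prints: auto-fix
route_finding UnknownDrift    # prints: needs-approval
```

Keeping the allow-list explicit means a new finding kind never gets an automatic fix by accident; it has to be added on purpose.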
DriftReport StaleErrorPod — fixer ran OK
DriftReport StuckJob — fixer ran OK
DriftReport StuckRS — fixer ran OK
DriftReport StuckCertReq — fixer ran OK
DriftReport TLSSecretMismatch — fixer ran OK
Re-verify in 60s ...
Report
Findings flow to Slack, Alertmanager, OpenProject (OSS), and Jira / ServiceNow (paid). DriftReport CRDs let you kubectl get your cluster’s drift state.
kubectl get driftreports -A -o wide
NAMESPACE NAME KIND STATUS
default drift-tls-secret-1 TLSSecretMismatch resolved
ai drift-stuck-job-7 StuckJob resolved
mcp drift-stuck-certreq-3 StuckCertReq open → ticketed
Verify
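To see what is still open after a cycle, the STATUS column of the wide listing above can be filtered with standard tools. The sketch below assumes the four-column layout shown in that listing (real output may carry more columns), and `unresolved_reports` is a hypothetical helper name, not part of CHA.

```shell
# Sample rows copied from the listing above; real output may differ.
sample_output='NAMESPACE NAME KIND STATUS
default drift-tls-secret-1 TLSSecretMismatch resolved
ai drift-stuck-job-7 StuckJob resolved
mcp drift-stuck-certreq-3 StuckCertReq open → ticketed'

# Hypothetical helper: skip the header row, then print namespace/name
# for every report whose 4th column is not "resolved".
unresolved_reports() {
  awk 'NR > 1 && $4 != "resolved" { print $1 "/" $2 }'
}

printf '%s\n' "$sample_output" | unresolved_reports
# prints: mcp/drift-stuck-certreq-3
```

Against a live cluster the same filter would be fed from `kubectl get driftreports -A -o wide | unresolved_reports`, provided the columns match the assumption above.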
Re-diagnose after every fix. No "the fix maybe worked" — CHA actively re-checks and closes the loop.
diagnose → fix → re-diagnose → resolve
            ↑        |
            +--------+
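The loop in the diagram can be sketched in a few lines of shell. This is an illustration under stated assumptions, not CHA's code: `diagnose`, `fix`, and `closed_loop` are hypothetical stand-ins, the resource is simulated with a variable, and the wait between fix and re-check is omitted.

```shell
# Simulated resource state; a real probe would query the cluster instead.
state="broken"

# Hypothetical probe: prints "drift" or "healthy".
diagnose() {
  if [ "$state" = "broken" ]; then echo "drift"; else echo "healthy"; fi
}

# Hypothetical fixer: here it always succeeds.
fix() {
  state="fixed"
}

# diagnose -> fix -> re-diagnose, up to $1 fix attempts (default 3).
# Prints "resolved" once a diagnosis comes back healthy, otherwise "open".
closed_loop() {
  max_attempts="${1:-3}"
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    if [ "$(diagnose)" = "healthy" ]; then
      echo "resolved"
      return 0
    fi
    fix
    attempt=$((attempt + 1))
  done
  # Final re-diagnose after the last fix attempt.
  if [ "$(diagnose)" = "healthy" ]; then
    echo "resolved"
    return 0
  fi
  echo "open"
  return 1
}

closed_loop 3   # prints: resolved
```

The point of the sketch is that a report is only marked resolved by a fresh diagnosis, never by the fixer's own exit status.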
Closed-loop is the DEFAULT, not a roadmap milestone.
Why CHA, structurally.
Three architectural commitments that competitors cannot copy without rewriting their product.
In-cluster, on every cloud
The only K8s + cloud health autopilot you can run entirely inside your own cluster — air-gapped, sovereign, k3s edge — that ALSO probes the underlying AWS, GCP, and Azure resources with the workload-identity auth your cluster already uses.
Autonomous remediation, with a safety envelope
CHA is the only product in this category that ships detect → fix → re-verify as the default loop. Everyone else is a copilot for the on-call rotation you already have. We are the autopilot for the on-call rotation you want to retire.
Open-core with predictable pricing
Bottom-up adoption via Helm install. The CHA-com paid tier adds LLM-augmented investigation, ticketing, multi-cluster federation, and the approval-server, all at flat per-cluster pricing. No per-investigation surprises.
vs. the competition.
Detect-fix-verify is the default loop. Every other player is a copilot for the on-call rotation you already have.
| Product | Where it runs | Closed-loop? | Pricing |
|---|---|---|---|
| CHA (us) | In-cluster operator | Yes, by default | Flat per-cluster (OSS / Team / Enterprise) |
| NeuBird | SaaS, pulls telemetry | No — "architecturally enforced read-only" | $15–25 per investigation |
| Resolve AI | SaaS + thin Satellite | Roadmap (their words: "next milestone") | Contact sales |
| Ciroos | SaaS, zero-copy queries | Opaque "autonomy slider" | Contact sales |
| OpenSRE (Tracer) | Customer-hosted (docker-compose) | No — code-blocks mutations | OSS only |
On-call should be quieter every week.
CHA is how you get there. Helm install in 5 minutes. No telemetry exfiltration. No per-investigation surprises.