February 2026 - V1.16.0
✨✨✨ The most recent release brings important stability improvements, performance optimizations, and a new maintenance mode capability. Check out what’s changed.
🚀 Highlights
Agent Maintenance Mode
CloudPilot AI now supports Maintenance Mode for remote troubleshooting. When enabled, a secure tunnel is created between your cluster and the CloudPilot AI backend, allowing the support team to use kubectl to diagnose and resolve issues directly. The tunnel is not open to the public internet and is disabled by default.
- Enable or disable maintenance mode via a simple kubectl command from your environment.
- The secure tunnel is fully controlled by the user: it must be manually enabled and should be disabled after the issue is resolved.
- Provides maintenance mode related APIs for integration with external tooling and workflows.
Java Workload Optimization
Workload Autoscaler now provides a dedicated optimization pipeline for Java workloads, upgrading from container-level tuning to JVM + Container joint optimization. Unlike traditional approaches that only adjust Pod Requests/Limits, CloudPilot AI sees inside the JVM and directly manages Heap parameters through a complete “observe → decide → execute → validate” closed loop.
- Automatic runtime detection: cloudpilot-node-agent detects each Pod’s language and runtime profile (Java, Tomcat, etc.) and automatically classifies the workload.
- Deep JVM observability: Collects key JVM metrics including Heap Used/Committed/Max, GC frequency, GC pause time, and GC pressure trends, and correlates them with container-level signals (OOMKill, restarts, memory pressure).
- Joint Heap and container recommendations: Produces coordinated -Xmx target ranges and Pod Requests/Limits recommendations, accounting for non-Heap usage (Metaspace, Code Cache, Thread Stack, Direct Memory) and system overhead.
- GC risk-aware decisioning: Models both stability goals (avoid OOM, reduce Full GC risk) and cost goals (eliminate idle memory waste). Dynamically increases safety margins when recent OOMs or high GC pressure are detected.
- ResourceStartupBoost: Temporarily increases Pod CPU/Memory during startup windows to absorb peak overhead, then automatically falls back to steady-state recommendations after stabilization — no more choosing between startup reliability and steady-state cost.
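The arithmetic behind a joint Heap and container recommendation can be illustrated with a minimal sketch. This is not CloudPilot AI's actual algorithm: the JvmProfile shape, the recommend and startup_boost functions, the margin percentages, and the 64 MiB system overhead are all hypothetical values chosen for illustration. The sketch only shows the general idea of sizing -Xmx from observed heap usage, adding non-Heap regions and overhead to reach a container limit, widening the margin after OOMs or GC pressure, and temporarily boosting memory during a startup window.

```python
from dataclasses import dataclass


@dataclass
class JvmProfile:
    """Observed peaks for one Java workload, in MiB (hypothetical shape)."""
    heap_used_peak: int
    metaspace: int
    code_cache: int
    thread_stacks: int
    direct_memory: int
    recent_oom_kills: int = 0
    high_gc_pressure: bool = False


def recommend(profile: JvmProfile, base_margin_pct: int = 15,
              system_overhead: int = 64) -> dict:
    """Return a coordinated -Xmx and container memory recommendation in MiB."""
    # Assumed policy: widen the safety margin when risk signals are present.
    margin_pct = base_margin_pct
    if profile.recent_oom_kills > 0:
        margin_pct += 10
    if profile.high_gc_pressure:
        margin_pct += 5

    # Heap target: observed peak plus the safety margin (integer math).
    xmx = profile.heap_used_peak * (100 + margin_pct) // 100

    # The container must also hold everything the JVM uses outside the Heap.
    non_heap = (profile.metaspace + profile.code_cache
                + profile.thread_stacks + profile.direct_memory)
    limit = xmx + non_heap + system_overhead

    # Assumption: request equals limit, so the Pod is never overcommitted.
    return {"xmx_mib": xmx, "memory_request_mib": limit, "memory_limit_mib": limit}


def startup_boost(steady_mib: int, boost_pct: int,
                  seconds_since_start: int, window_s: int) -> int:
    """Return boosted memory inside the startup window, steady-state after it."""
    if seconds_since_start < window_s:
        return steady_mib * (100 + boost_pct) // 100
    return steady_mib


profile = JvmProfile(heap_used_peak=1000, metaspace=128, code_cache=64,
                     thread_stacks=50, direct_memory=32)
rec = recommend(profile)
print(rec)
print(startup_boost(rec["memory_limit_mib"], 50, seconds_since_start=10, window_s=120))
```

With these example numbers, a 1000 MiB heap peak and a 15% margin give a 1150 MiB -Xmx target. Adding 274 MiB of non-Heap usage and 64 MiB of overhead yields a 1488 MiB container limit, which the hypothetical startup boost raises by 50% during the first two minutes.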
⚙️ Enhancements
- Use gzip compression when sending time-series data to reduce bandwidth usage.
- Optimize the time cost of Workload Autoscaler metrics collection.
- Enable Universal optimization automatically when there are pending pods.
- Send warnings when Cluster Autoscaler is detected, guiding users toward better optimization.
- Evict pods slowly during optimization to ensure workload stability.
- Add node tags for better visualization in the console.
- Update managed node selection logic for better virtualization.
- Streamline delta sending logic in the Workload Autoscaler processor.
- Add configurable in-place fallback policy for Workload Autoscaler.
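To illustrate why gzip helps with time-series payloads, here is a minimal sketch using only the Python standard library. The sample shape, the metric name, and the compress_samples and decompress_samples helpers are assumptions for illustration; the release notes do not describe the agent's actual wire format.

```python
import gzip
import json


def compress_samples(samples: list) -> bytes:
    """Serialize samples to compact JSON and gzip the result."""
    raw = json.dumps(samples, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def decompress_samples(payload: bytes) -> list:
    """Reverse compress_samples: gunzip, then parse the JSON."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))


# Hypothetical time-series batch: one sample every 15 seconds.
samples = [
    {"ts": 1700000000 + 15 * i,
     "metric": "container_memory_working_set_bytes",
     "value": 512 * 1024 * 1024 + i}
    for i in range(1000)
]

raw_size = len(json.dumps(samples, separators=(",", ":")).encode("utf-8"))
payload = compress_samples(samples)
print(f"raw={raw_size} bytes, gzip={len(payload)} bytes")
```

Time-series batches repeat the same keys and metric names in every sample, so they tend to compress well. Over HTTP, a sender would typically mark such a body with a Content-Encoding: gzip header so the receiver knows to decompress it.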
🛠️ Bug Fixes
- Fix an issue where affinity rules were incorrectly added for DaemonSet workloads.
- Fix metrics processor incorrectly calculating pods that are already gone during scheduling simulation.
- Ignore pending pods when running scheduling simulation to avoid inaccurate results.
- Suppress warnings when Cluster Autoscaler has zero replicas or zero healthy replicas.
These updates further improve CloudPilot AI’s overall reliability and user experience. For questions or support, join our Slack community.
Stay tuned for more updates! 🚀
