HoloInsight is a cloud-native observability platform with a special focus on real-time log analysis and AI integration.
Updated Jul 10, 2025 (Java)
GMSSH: a desktop-grade, AI-driven operations terminal. High performance · Non-intrusive · AI-powered.
AI-powered SRE platform for automated incident investigation
clank your infra
InsightSolver: Colab notebooks for exploring and solving operational issues using deep learning, machine learning, and related models.
An autonomous SRE agent that monitors cloud logs across multiple platforms, leveraging AI models from various providers to detect anomalies, perform root cause analysis, and automate remediation by creating GitHub Pull Requests.
AI that ships your code. Deploy to any cloud with plain English.
ARF is an agentic reliability intelligence platform that separates decision intelligence (OSS) from governed execution (Enterprise), enabling autonomous operations with deterministic safety guarantees.
🏰 Real-time dashboard for monitoring and managing Clawdbot AI agents
Open-source code for AIOpsServing
tpu-doc is a zero-dependency diagnostic binary for Google Cloud TPU environments that instantly validates hardware health, discovers software stack configurations, and provides AI-powered log analysis to eliminate expensive debugging downtime.
🤖 Build and deploy scalable Multi-AI Agent systems with LangGraph and Groq LLMs to enhance intelligence across enterprise applications.
🚀 Enhance Google Cloud operations with the Gemini SRE Agent, automating log monitoring and incident response for smarter site reliability.
Real-world patterns for shipping AI agents to production. Learn versioning, cost optimization, multi-tenancy, guardrails, and observability through runnable TypeScript examples.
An AI-powered DevOps tool that analyzes Linux server logs to detect anomalies and predict failures. It integrates ML models, automated fixes via Ansible, containerization with Docker, and orchestration with Kubernetes, providing a full-stack solution for predictive maintenance.
Production-ready MLOps platform for monitoring and evaluating LLM response quality with automated alerts and real-time analytics
AI-powered alert automation for n8n — unify alerts from monitoring systems, analyze via LLM, and auto-notify DevOps teams on Telegram.
Advanced, end-to-end, enterprise-grade agentic AI pipeline that automates competitor ad intelligence, performs multimodal creative strategy extraction, enables brand-safe adaptation, and generates AI video ads using LLM reasoning, multimodal analysis, and deterministic workflow orchestration with full auditability.
ReliaKit TL-15 is an open-source, planet-grade resilience framework for distributed infrastructure. It integrates automated DDoS protection, geo-aware routing, chaos engineering, and symbolic AI hooks to achieve fault tolerance beyond traditional benchmarks.