Architecture

How GNETiX components fit together

GNETiX is composed of four deployable components that work together to bridge messaging platforms to AI-powered infrastructure operations.

Components

Backend (Cloud)

The FastAPI backend is the central hub. It handles authentication, organization configuration, the Webex/Slack/Teams WebSocket managers, the relay endpoint for agent communication, and the Director orchestration engine. All LLM calls originate here.
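The routing role of the connector manager can be sketched in a few lines. This is illustrative only: names like `ConnectorManager`, `InboundMessage`, and the synchronous call shape are assumptions, not the real (async) backend API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class InboundMessage:
    platform: str   # e.g. "webex", "slack", "teams"
    user_id: str
    text: str

class ConnectorManager:
    """Routes messages arriving on platform WebSockets to a handler
    (the Director, in GNETiX terms)."""
    def __init__(self, director: Callable[[InboundMessage], str]):
        self.director = director

    def on_message(self, msg: InboundMessage) -> str:
        # The real backend is async and replies over the originating
        # platform's WebSocket; here we simply return the reply text.
        return self.director(msg)

# Usage with a stub "Director" that acknowledges the message:
manager = ConnectorManager(lambda m: f"[{m.platform}] ack: {m.text}")
reply = manager.on_message(InboundMessage("webex", "u1", "show bgp summary"))
```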

Frontend (Cloud)

The Next.js portal provides the admin UI for managing organizations, users, agents, MCP servers, LLM providers, guardrails, and the real-time pipeline monitor. Built with the App Router, shadcn/ui, and Tailwind CSS.

Agent (On-Prem)

The agent container runs inside your network. It establishes an outbound WebSocket connection to the GNETiX relay and acts as a bridge between the cloud Director and your local MCP servers. The agent discovers tools from connected MCP servers and reports them to the backend. When the Director dispatches a tool call, the agent executes it against the appropriate MCP server and returns the result.

Agents are pure tool executors. They contain no LLM and make no AI decisions. All intelligence lives in the cloud Director. This keeps the on-prem footprint minimal and eliminates the need for GPU resources at the edge.
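Because the agent is a pure executor, its core dispatch path reduces to a lookup-and-run step. A minimal sketch, assuming a JSON frame shape (`id`/`tool`/`args`) that is hypothetical, not the actual relay protocol:

```python
import json

# Hypothetical registry standing in for tools discovered from MCP servers.
TOOLS = {
    "ping_host": lambda args: {"host": args["host"], "reachable": True},
}

def handle_dispatch(frame: str) -> str:
    """Execute one tool call received over the relay WebSocket.
    The agent holds no model: it only looks up and runs the named tool."""
    call = json.loads(frame)
    tool = TOOLS.get(call["tool"])
    if tool is None:
        return json.dumps({"id": call["id"], "error": "unknown tool"})
    return json.dumps({"id": call["id"], "result": tool(call["args"])})

result = handle_dispatch(json.dumps(
    {"id": "1", "tool": "ping_host", "args": {"host": "10.0.0.1"}}))
```

Keeping this path model-free is what makes the on-prem footprint small: the agent needs no GPU, no prompts, and no provider credentials.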

MCP Servers (On-Prem)

FastMCP-based tool servers expose your infrastructure capabilities over the Model Context Protocol using Streamable HTTP transport. The GNETiX examples repository provides reference implementations for network devices, Kubernetes, Dynatrace, and more. Organizations customize and deploy their own MCP servers for their specific infrastructure.
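The two operations the agent relies on are listing tool definitions and invoking a tool by name. The sketch below mimics that shape in plain Python; it is not the real FastMCP API or the Streamable HTTP transport, just an illustration of the register/list/call pattern.

```python
import inspect

class ToolServer:
    """Toy stand-in for an MCP tool server: register, list, call."""
    def __init__(self):
        self._tools = {}

    def tool(self, fn):
        # Register a function; its name, docstring, and signature
        # become the tool definition reported to the agent.
        self._tools[fn.__name__] = fn
        return fn

    def list_tools(self):
        return [
            {"name": name, "description": fn.__doc__ or "",
             "params": list(inspect.signature(fn).parameters)}
            for name, fn in self._tools.items()
        ]

    def call_tool(self, name, **kwargs):
        return self._tools[name](**kwargs)

server = ToolServer()

@server.tool
def interface_status(device: str, interface: str) -> str:
    """Return the admin/oper status of an interface."""
    return f"{device} {interface}: up/up"   # canned value for the sketch
```

In a real deployment the definitions returned by `list_tools()` are what the agent reports upstream, so the Director can offer them to the LLM without ever touching the server directly.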

Data Flow

The end-to-end path for a user message:

User (Webex / Slack / Teams / Voice)
  |
  v
GNETiX Backend --- receives message via platform WebSocket
  |
  v
Director --- builds context, selects LLM, runs tool-call loop
  |
  v
LLM (via LiteLLM) --- determines which tools to call
  |
  v
WebSocket Relay --- dispatches tool calls to on-prem agent
  |
  v
Agent --- executes tool against local MCP server
  |
  v
MCP Server --- runs operation against infrastructure
  |
  v
Results flow back: MCP Server -> Agent -> Relay -> Director -> LLM -> User

  1. A user sends a message through a supported platform (Webex, Slack, Teams, Voice, or Webhook).
  2. The backend's connector manager receives the message and forwards it to the Director.
  3. The Director loads MCP tool definitions from the database, builds a system prompt with soul content and resource context, and calls the LLM.
  4. The LLM decides which tools to invoke. The Director dispatches each tool call through the WebSocket relay to the appropriate on-prem agent.
  5. The agent executes the tool against its local MCP server and returns the result.
  6. The Director feeds tool results back to the LLM, which may request additional tools or produce a final response.
  7. The synthesized response is sent back to the user through the originating platform.
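Steps 3-6 form a loop: the LLM may request several rounds of tools before producing a final answer. A minimal sketch of that loop, with the LLM and agent replaced by stubs (all names here are illustrative, not the Director's actual interfaces):

```python
def fake_llm(messages):
    """Request one tool call, then finish once a tool result is present."""
    if any(m["role"] == "tool" for m in messages):
        return {"type": "final", "text": "eth0 is up"}
    return {"type": "tool_call", "tool": "link_status", "args": {"if": "eth0"}}

def fake_agent(tool, args):
    return {"status": "up"}   # stands in for the relay round-trip on-prem

def run_director(user_text, llm, agent, max_turns=5):
    messages = [{"role": "user", "content": user_text}]
    for _ in range(max_turns):
        step = llm(messages)
        if step["type"] == "final":
            return step["text"]
        # Dispatch the tool call and feed the result back to the model.
        result = agent(step["tool"], step["args"])
        messages.append({"role": "tool", "content": str(result)})
    raise RuntimeError("tool-call loop did not converge")

answer = run_director("is eth0 up?", fake_llm, fake_agent)
```

The `max_turns` bound matters in practice: it keeps a misbehaving model from looping through tool calls indefinitely.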

No inbound firewall rules are ever required. All agent communication uses outbound WebSocket connections initiated from your network.

LLM Tiers

Every LLM call specifies a tier rather than a specific model. The tier is mapped to a concrete model based on the organization's configured provider. This allows swapping models without code changes.

| Tier | Purpose | Anthropic | OpenAI | Bedrock | Azure |
| --- | --- | --- | --- | --- | --- |
| fast | Low-latency, simple tasks | Claude Haiku 4.5 | GPT-4o-mini | Claude Haiku 3.5 | GPT-4o-mini |
| balanced | General-purpose (default) | Claude Sonnet 4.6 | GPT-4o | Claude Sonnet 4 | GPT-4o |
| powerful | Complex reasoning | Claude Opus 4.6 | GPT-4o | Claude Sonnet 4 | GPT-4o |

Tier-to-model mappings are defined in models.yaml and can be updated without redeploying. The Director currently uses the balanced tier by default.
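The resolution step is essentially a two-level lookup. A sketch, with the mapping inlined rather than loaded from models.yaml, and with model ID strings that are illustrative rather than exact provider identifiers:

```python
# Mirrors two columns of the table above; normally loaded from models.yaml.
TIER_MAP = {
    "anthropic": {"fast": "claude-haiku-4.5", "balanced": "claude-sonnet-4.6",
                  "powerful": "claude-opus-4.6"},
    "openai":    {"fast": "gpt-4o-mini", "balanced": "gpt-4o",
                  "powerful": "gpt-4o"},
}

def resolve_model(provider: str, tier: str = "balanced") -> str:
    """Map an abstract tier to a concrete model for the org's provider."""
    try:
        return TIER_MAP[provider][tier]
    except KeyError:
        raise ValueError(f"no model for provider={provider!r}, tier={tier!r}")

model = resolve_model("anthropic")   # Director default: the balanced tier
```

Because callers name a tier rather than a model, swapping providers or upgrading models is a configuration change, not a code change.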