Running MCP Servers Declaratively with Toolhive on Kubernetes



Short read (~6–7 min) – focused on the operational side of Model Context Protocol (MCP) enablement.

1. Problem in Practice

A single MCP server is easy to run. Operating dozens (search, knowledge, cost calculators, monitors, domain data fetchers) introduces challenges:

  • Drift in versions & schemas
  • Inconsistent auth and network policies
  • Manual onboarding into LiteLLM / orchestrators
  • Limited visibility (which tool is healthy / slow / deprecated?)

You need declarative lifecycle, registry synchronization, and standardized deployment primitives.

2. Why Toolhive?

Toolhive (by Stacklok) positions itself as a control plane for AI tooling inventories. Its core value when paired with MCP:

  • Declarative tool specification (GitOps friendly)
  • Central registry API (discovery contract for orchestrators / gateways)
  • Metadata & policy attachment (owners, categories, cost class, sensitivity)
  • Lifecycle hooks (retire / replace) without touching application code

This collapses per-team custom glue into a platform service.

3. Install the Operator (Prerequisite)

Before defining any MCPServer resources, you must deploy the Toolhive Operator.

Primary docs:

  1. Deploy the operator (Helm): https://docs.stacklok.com/toolhive/guides-k8s/deploy-operator-helm
  2. Then run an MCP server: https://docs.stacklok.com/toolhive/guides-k8s/run-mcp-k8s/

Quick prerequisite checklist:

  • Kubernetes cluster with RBAC + admission controls configured
  • Helm installed and access to the target namespace
  • (Optional) NetworkPolicies enforced if you restrict egress

Once Helm has installed both the CRDs and the operator itself, you can apply a minimal MCPServer definition.
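A typical install sequence, sketched from the linked docs (chart names and OCI locations may change, so verify against the current documentation before use):

# Install the CRDs first, then the operator itself
helm upgrade -i toolhive-operator-crds \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds

helm upgrade -i toolhive-operator \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator \
  -n toolhive-system --create-namespace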

4. Minimal MCPServer Definition

apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: osv
  namespace: my-namespace # Update with your namespace
spec:
  image: ghcr.io/stackloklabs/osv-mcp/server
  transport: streamable-http
  targetPort: 8080
  port: 8080
  permissionProfile:
    type: builtin
    name: network
  resources:
    limits:
      cpu: '100m'
      memory: '128Mi'
    requests:
      cpu: '50m'
      memory: '64Mi'

The Toolhive controller renders a Deployment and Service from this resource and records the server (including a live schema fetch) in its registry.
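Once applied, you can confirm reconciliation with plain kubectl. The label selector below is an assumption (the operator's actual labels may differ); listing the rendered Deployment and Service directly also works:

# Check the custom resource and the objects rendered from it
kubectl -n my-namespace get mcpservers
kubectl -n my-namespace get deploy,svc -l app.kubernetes.io/instance=osv  # assumed label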

Transport Options

The spec.transport field determines how clients interact with the MCP server:

  • stdio – Tight in-cluster or sidecar-style execution where process stdio is proxied. Lower network surface; see https://docs.stacklok.com/toolhive/guides-k8s/run-mcp-k8s#stdio-transport-flow
  • streamable-http – Default for most shared tools needing bidirectional streaming (SSE / incremental outputs). Supports richer interaction patterns; see https://docs.stacklok.com/toolhive/guides-k8s/run-mcp-k8s#streamable-http-and-sse-transport-flow
  • http (if supported separately) – Simple request/response without streaming needs. Use only if streaming adds overhead you do not need.

Selection heuristic:

  • Need incremental updates / long-running operations → streamable-http.
  • Ultra-low latency internal helper with no network exposure → stdio.
  • Legacy/simple endpoints returning fast deterministic payloads → http.

Keep this consistent across similar tool classes to simplify client capability negotiation.
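For contrast with the streamable-http example above, a stdio-based MCPServer might look like the sketch below. Only the field layout mirrors the docs; the name and image are hypothetical, and the builtin 'none' permission profile is an assumption. With stdio, the Toolhive proxy still exposes an HTTP endpoint to in-cluster clients while wrapping the server process's stdio:

apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: internal-helper # hypothetical name
  namespace: my-namespace
spec:
  image: ghcr.io/example/internal-helper-mcp:1.2.3 # hypothetical image
  transport: stdio # the proxy wraps process stdio; the server binds no HTTP port itself
  port: 8080 # port the Toolhive proxy exposes to in-cluster clients
  permissionProfile:
    type: builtin
    name: none # assumption: built-in profile denying network egress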

5. Security Considerations

Securing a growing fleet of MCP servers is about shrinking blast radius, enforcing provenance, and preventing silent drift. Start with namespace isolation: group related MCPServer workloads into bounded network zones and apply Kubernetes NetworkPolicies so only approved egress (or east‑west traffic) is permitted. Pair that with deliberate ServiceAccount scoping—issue separate accounts per sensitivity tier (internal metrics fetchers vs. external data enrichers) so a compromised low‑trust tool cannot laterally escalate.
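As a concrete starting point, a default-deny egress policy that still permits DNS might look like this sketch (plain Kubernetes, independent of Toolhive; the name and selectors are placeholders to adjust for your topology):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-default-deny-egress # hypothetical name
  namespace: my-namespace
spec:
  podSelector: {} # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53 # allow DNS lookups only; add further rules per approved destination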

Supply chain integrity matters early. Require image signatures (e.g. Cosign) and gate admission so only attested MCP images are deployed. This reduces the risk of a malicious or tampered server quietly joining your registry. At the interface boundary, enforce schema stability: run a CI contract test that validates each exported function schema contains only approved field patterns (no accidental PII, secrets, or oversized blobs). Reject PRs that broaden scope without review.
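One common way to gate admission on signatures (an illustration, not Toolhive-specific) is a Kyverno policy verifying keyless Cosign signatures. The policy name and signing identity below are assumptions you would replace with your own CI's values:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-mcp-images # hypothetical name
spec:
  validationFailureAction: Enforce
  rules:
    - name: verify-mcp-signatures
      match:
        any:
          - resources:
              kinds:
                - Pod
              namespaces:
                - my-namespace
      verifyImages:
        - imageReferences:
            - "ghcr.io/stackloklabs/*"
          attestors:
            - entries:
                - keyless:
                    issuer: https://token.actions.githubusercontent.com
                    subject: "https://github.com/stackloklabs/*" # assumed signing identity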

Resource governance closes the loop: apply conservative CPU and memory limits to prevent noisy‑neighbor exhaustion, and tune requests so autoscaling signals remain meaningful. For multi‑tenant clusters, consider Pod Security Standards plus seccomp profiles and read‑only root filesystems for higher‑risk connectors.
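To give every MCP container sane defaults without editing each manifest, a namespace-level LimitRange is one option (the name and values here are arbitrary starting points, not recommendations from the Toolhive docs):

apiVersion: v1
kind: LimitRange
metadata:
  name: mcp-container-defaults # hypothetical name
  namespace: my-namespace
spec:
  limits:
    - type: Container
      default: # limits applied when a container specifies none
        cpu: 200m
        memory: 256Mi
      defaultRequest: # requests applied when a container specifies none
        cpu: 50m
        memory: 64Mi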

Finally, if you want to centrally control which tools are actually exposed to downstream consumers, leverage the Toolhive tool configuration CRD (see: https://docs.stacklok.com/toolhive/guides-k8s/tool-config). This lets platform teams curate an allow/deny set or apply metadata filters before discovery—separating raw deployed MCPServer instances from the subset officially published to orchestrators or gateways.
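Illustratively, curating the published tool set might look like the sketch below; treat the kind and field names as assumptions and confirm the exact schema in the linked tool-config guide:

apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPToolConfig
metadata:
  name: osv-published-tools # hypothetical name
  namespace: my-namespace
spec:
  toolsFilter: # assumed field: allow-list of tools published to clients
    - query_vulnerability # hypothetical tool name

An MCPServer would then reference this config (for example via a toolConfigRef field pointing at osv-published-tools; again, verify the field name against the docs) so that deployment and published availability stay decoupled.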

6. Summary

Toolhive provides the missing control plane layer once MCP usage expands beyond a handful of ad‑hoc deployments. By installing the operator first, you unlock a declarative MCPServer resource that standardizes runtime, metadata, permission profiles, and transport selection (stdio, streamable-http, or simple http). The registry + reconciliation loop eliminates manual wiring while keeping schemas current.

What materially improves:

  • Onboarding speed: a reviewed PR defining an MCPServer becomes a live, discoverable tool with consistent defaults.
  • Consistency & drift reduction: image, transport, and permission conventions enforced centrally instead of per team.
  • Security posture: signed images, namespace segmentation, least‑privilege ServiceAccounts, contract‑tested schemas.
  • Curated exposure: the tool configuration CRD lets you publish only approved tools to downstream orchestrators, separating deployment from availability.

Adopt Toolhive when governance and operational overhead start to dominate—typically around the moment multiple teams are duplicating YAML and arguing over version drift. Until then, a few manual manifests are fine; after that inflection point, a control plane repays itself quickly in reduced toil and safer iteration.
