Running MCP Servers Declaratively with Toolhive on Kubernetes



Short read (~6–7 min) – focused on the operational side of Model Context Protocol (MCP) enablement.

1. Problem in Practice

A single MCP server is easy to run. Operating dozens (search, knowledge, cost calculators, monitors, domain data fetchers) introduces challenges:

  • Drift in versions & schemas
  • Inconsistent auth and network policies
  • Manual onboarding into LiteLLM / orchestrators
  • Limited visibility (which tool is healthy / slow / deprecated?)

You need declarative lifecycle, registry synchronization, and standardized deployment primitives.

2. Why Toolhive?

Toolhive (by Stacklok) positions itself as a control plane for AI tooling inventories. Its core value when paired with MCP:

  • Declarative tool specification (GitOps friendly)
  • Central registry API (discovery contract for orchestrators / gateways)
  • Metadata & policy attachment (owners, categories, cost class, sensitivity)
  • Lifecycle hooks (retire / replace) without touching application code

This collapses per-team custom glue into a platform service.

3. Install the Operator (Prerequisite)

Before defining any MCPServer resources, you must deploy the Toolhive Operator.

Primary docs:

  1. Deploy the operator (Helm): https://docs.stacklok.com/toolhive/guides-k8s/deploy-operator-helm
  2. Then run an MCP server: https://docs.stacklok.com/toolhive/guides-k8s/run-mcp-k8s/

Quick prerequisite checklist:

  • Kubernetes cluster with RBAC + admission controls configured
  • Helm installed and access to the target namespace
  • (Optional) NetworkPolicies enforced if you restrict egress

Once Helm has installed both the CRDs and the operator itself, you can apply a minimal MCPServer definition.
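A typical install sequence, sketched from the linked docs (chart names and OCI locations may change, so verify against the current documentation before use):

# Install the CRDs first, then the operator itself
helm upgrade -i toolhive-operator-crds \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator-crds

helm upgrade -i toolhive-operator \
  oci://ghcr.io/stacklok/toolhive/toolhive-operator \
  -n toolhive-system --create-namespace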

4. Minimal MCPServer Definition

apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: osv
  namespace: my-namespace # Update with your namespace
spec:
  image: ghcr.io/stackloklabs/osv-mcp/server
  transport: streamable-http
  targetPort: 8080
  port: 8080
  permissionProfile:
    type: builtin
    name: network
  resources:
    limits:
      cpu: '100m'
      memory: '128Mi'
    requests:
      cpu: '50m'
      memory: '64Mi'

The Toolhive controller renders a Deployment and Service from this resource and records the server (including a live schema fetch) in its registry.
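Once applied, you can confirm reconciliation with plain kubectl. The label selector below is an assumption (the operator's actual labels may differ); listing the rendered Deployment and Service directly also works:

# Check the custom resource and the objects rendered from it
kubectl -n my-namespace get mcpservers
kubectl -n my-namespace get deploy,svc -l app.kubernetes.io/instance=osv  # assumed label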

Transport Options

The spec.transport field determines how clients interact with the MCP server:

  • stdio – Tight in-cluster or sidecar-style execution where process stdio is proxied. Lower network surface; see https://docs.stacklok.com/toolhive/guides-k8s/run-mcp-k8s#stdio-transport-flow
  • streamable-http – Default for most shared tools needing bidirectional streaming (SSE / incremental outputs). Supports richer interaction patterns; see https://docs.stacklok.com/toolhive/guides-k8s/run-mcp-k8s#streamable-http-and-sse-transport-flow
  • http (if supported separately) – Simple request/response without streaming needs. Use only if streaming adds overhead you do not need.

Selection heuristic:

  • Need incremental updates / long-running operations → streamable-http.
  • Ultra-low latency internal helper with no network exposure → stdio.
  • Legacy/simple endpoints returning fast deterministic payloads → http.

Keep this consistent across similar tool classes to simplify client capability negotiation.
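For contrast with the streamable-http example above, a stdio-based MCPServer might look like the sketch below. Only the field layout mirrors the docs; the name and image are hypothetical, and the builtin 'none' permission profile is an assumption. With stdio, the Toolhive proxy still exposes an HTTP endpoint to in-cluster clients while wrapping the server process's stdio:

apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: internal-helper # hypothetical name
  namespace: my-namespace
spec:
  image: ghcr.io/example/internal-helper-mcp:1.2.3 # hypothetical image
  transport: stdio # the proxy wraps process stdio; the server binds no HTTP port itself
  port: 8080 # port the Toolhive proxy exposes to in-cluster clients
  permissionProfile:
    type: builtin
    name: none # assumption: built-in profile denying network egress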

5. Security Considerations

Securing a growing fleet of MCP servers is about shrinking blast radius, enforcing provenance, and preventing silent drift. Start with namespace isolation: group related MCPServer workloads into bounded network zones and apply Kubernetes NetworkPolicies so only approved egress (or east‑west traffic) is permitted. Pair that with deliberate ServiceAccount scoping—issue separate accounts per sensitivity tier (internal metrics fetchers vs. external data enrichers) so a compromised low‑trust tool cannot laterally escalate.
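As a concrete starting point, a default-deny egress policy that still permits DNS might look like this sketch (plain Kubernetes, independent of Toolhive; the name and selectors are placeholders to adjust for your topology):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-default-deny-egress # hypothetical name
  namespace: my-namespace
spec:
  podSelector: {} # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53 # allow DNS lookups only; add further rules per approved destination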

Supply chain integrity matters early. Require image signatures (e.g. Cosign) and gate admission so only attested MCP images are deployed. This reduces the risk of a malicious or tampered server quietly joining your registry. At the interface boundary, enforce schema stability: run a CI contract test that validates each exported function schema contains only approved field patterns (no accidental PII, secrets, or oversized blobs). Reject PRs that broaden scope without review.
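One common way to gate admission on signatures (an illustration, not Toolhive-specific) is a Kyverno policy verifying keyless Cosign signatures. The policy name and signing identity below are assumptions you would replace with your own CI's values:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-mcp-images # hypothetical name
spec:
  validationFailureAction: Enforce
  rules:
    - name: verify-mcp-signatures
      match:
        any:
          - resources:
              kinds:
                - Pod
              namespaces:
                - my-namespace
      verifyImages:
        - imageReferences:
            - "ghcr.io/stackloklabs/*"
          attestors:
            - entries:
                - keyless:
                    issuer: https://token.actions.githubusercontent.com
                    subject: "https://github.com/stackloklabs/*" # assumed signing identity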

Resource governance closes the loop: apply conservative CPU and memory limits to prevent noisy‑neighbor exhaustion, and tune requests so autoscaling signals remain meaningful. For multi‑tenant clusters, consider Pod Security Standards plus seccomp profiles and read‑only root filesystems for higher‑risk connectors.
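To give every MCP container sane defaults without editing each manifest, a namespace-level LimitRange is one option (the name and values here are arbitrary starting points, not recommendations from the Toolhive docs):

apiVersion: v1
kind: LimitRange
metadata:
  name: mcp-container-defaults # hypothetical name
  namespace: my-namespace
spec:
  limits:
    - type: Container
      default: # limits applied when a container specifies none
        cpu: 200m
        memory: 256Mi
      defaultRequest: # requests applied when a container specifies none
        cpu: 50m
        memory: 64Mi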

Finally, if you want to centrally control which tools are actually exposed to downstream consumers, leverage the Toolhive tool configuration CRD (see: https://docs.stacklok.com/toolhive/guides-k8s/tool-config). This lets platform teams curate an allow/deny set or apply metadata filters before discovery—separating raw deployed MCPServer instances from the subset officially published to orchestrators or gateways.
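Illustratively, curating the published tool set might look like the sketch below; treat the kind and field names as assumptions and confirm the exact schema in the linked tool-config guide:

apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPToolConfig
metadata:
  name: osv-published-tools # hypothetical name
  namespace: my-namespace
spec:
  toolsFilter: # assumed field: allow-list of tools published to clients
    - query_vulnerability # hypothetical tool name

An MCPServer would then reference this config (for example via a toolConfigRef field pointing at osv-published-tools; again, verify the field name against the docs) so that deployment and published availability stay decoupled.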

6. Summary

Toolhive provides the missing control plane layer once MCP usage expands beyond a handful of ad‑hoc deployments. By installing the operator first, you unlock a declarative MCPServer resource that standardizes runtime, metadata, permission profiles, and transport selection (stdio, streamable-http, or simple http). The registry + reconciliation loop eliminates manual wiring while keeping schemas current.

What materially improves:

  • Onboarding speed: a reviewed PR defining an MCPServer becomes a live, discoverable tool with consistent defaults.
  • Consistency & drift reduction: image, transport, and permission conventions enforced centrally instead of per team.
  • Security posture: signed images, namespace segmentation, least‑privilege ServiceAccounts, contract‑tested schemas.
  • Curated exposure: the tool configuration CRD lets you publish only approved tools to downstream orchestrators, separating deployment from availability.

Adopt Toolhive when governance and operational overhead start to dominate—typically around the moment multiple teams are duplicating YAML and arguing over version drift. Until then, a few manual manifests are fine; after that inflection point, a control plane repays itself quickly in reduced toil and safer iteration.
