Beyond Copy-Paste: Building Backstage with AI-Assisted Development


How Claude Sonnet 4.5 and GitHub Copilot helped us navigate the maze of custom Backstage integrations

The Backstage Promise (and the Reality)

Spotify's Backstage platform promises a beautiful vision: a unified developer portal where teams can discover services, create resources from templates, and manage their entire infrastructure through a single pane of glass. The documentation is comprehensive, the community is vibrant, and the screenshots look amazing.

Then you actually try to implement it for your specific use case.

Suddenly, you're deep in YAML configurations, debugging GitHub action parameters, wrestling with ArgoCD sync policies, and wondering why your Crossplane resources aren't being created. The "just follow the quickstart" approach works perfectly... until it doesn't. And when it doesn't, you're on your own, piecing together Stack Overflow posts, GitHub issues, and documentation that assumes you already know what you're doing.

This is where things get tedious.

Not because Backstage is poorly designed - quite the opposite. It's incredibly flexible and powerful. But that flexibility means that ready-made solutions rarely fit your exact use case. You need to understand:

  • How Backstage's scaffolder actions work
  • Which GitHub integration method fits your needs
  • How to structure YAML for your specific workflow
  • What RBAC permissions are required
  • How different components interact

And honestly? Sometimes you just want to build the thing, not become an expert in every technology in your stack.

Enter AI-Assisted Development

Here's what changed when we brought AI into the development process:

Traditional development often works like this:

  1. Read the documentation
  2. Copy example code
  3. Modify it to fit your use case
  4. Debug when it breaks
  5. Repeat steps 1–4 until it works

AI-assisted development changes the workflow:

  1. Describe what you want to achieve
  2. The AI suggests approaches and generates implementations
  3. You validate and test the results
  4. You iterate based on what works and what doesn't
  5. The AI maintains context across the entire conversation

The key difference? You spend less time searching and more time validating. The AI becomes a knowledgeable pair programmer who remembers everything about your project.

A Real Example: Building a GitOps Platform

Let me share what happened when we tried to integrate Backstage with Crossplane and ArgoCD. This wasn't a tutorial scenario - this was real work with real frustrations.

The Initial Attempt: "Just Store Files Locally"

The Ask: "Make Backstage templates create Crossplane resources and store them locally."

Simple enough, right? Create a template, generate some YAML, save it to disk. Classic Backstage scaffolder use case.

The Reality: Backstage's scaffolder runs in a sandboxed temporary directory. Your files exist for about 3 seconds before they're deleted. The documentation mentions this in passing, but you don't realize the implications until you've spent an hour wondering why your files disappear.
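
To see why, consider a step like the one below: it writes into the scaffolder's workspace, and everything it produces lives only in that temporary directory until the task finishes. This is a minimal sketch using the built-in fetch:template action; the paths and parameter names are placeholders.

steps:
  - id: generate
    action: fetch:template
    input:
      url: ./skeleton              # template files shipped alongside the template
      targetPath: ./output         # lands in the temporary workspace
      values:
        name: ${{ parameters.name }}
# Without a publish step, ./output is deleted along with the workspace.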

Traditional Approach:

  • Google "backstage persist scaffolder output"
  • Find 3-year-old GitHub issue
  • Discover someone suggesting a workaround with custom actions
  • Try to implement said custom action
  • Realize you need to rebuild Backstage
  • Consider switching careers

AI-Assisted Approach:

User: "The files are being created but disappear. Why?"

AI: "Backstage's scaffolder uses temporary directories that are cleaned up after execution. For persistence, you have three options:

  1. Push to Git repository (enables GitOps)
  2. Direct API calls to Kubernetes
  3. Custom storage backend

Given your Crossplane use case, option 1 enables GitOps patterns with ArgoCD. Would you like to explore that?"

Boom. We pivoted from a dead-end approach to proper GitOps architecture in one exchange. No Googling, no piecing together Stack Overflow answers, no reading through 50-page design docs.

The Second Challenge: GitHub Actions Maze

Now we needed to push files to an existing GitHub repository. How hard could that be?

The Reality: Backstage has THREE different GitHub-related actions:

  • publish:github - Creates NEW repositories only
  • github:repo:push - Pushes to existing repos (but has limitations)
  • publish:github:pull-request - Creates pull requests

Each has different parameters. Documentation exists but is scattered. Examples are often outdated.

What Actually Happened:

Attempt 1: Used publish:github

steps:
  - id: publish
    action: publish:github
    input:
      repoUrl: github.com?owner=me&repo=my-existing-repo

Result: Error: "Repository already exists"
Time Lost: 15 minutes

Attempt 2: Switched to github:repo:push with force parameter

steps:
  - id: push
    action: github:repo:push
    input:
      force: true  # Found this in a GitHub issue

Result: Error: "force is not a valid parameter"
Time Lost: 20 minutes

Attempt 3: Researched documentation, found correct action

steps:
  - id: pr
    action: publish:github:pull-request
    input:
      repoUrl: github.com?owner=me&repo=my-repo
      branchName: add-resource

Result: It worked! But only after checking the official Backstage GitHub repo.
Time Lost: Another 20 minutes

Total Wasted Time: Nearly an hour of trial-and-error

What Should Have Happened:

User: "I need to push files to an existing GitHub repository."

AI: (searches Backstage documentation) "For existing repositories, use publish:github:pull-request. This creates a PR, which is better for GitOps workflows. Here's the configuration:

[exact YAML configuration]

The publish:github action only creates NEW repos, and github:repo:push requires fast-forward merges which won't work for your use case."

This is where AI shines: it can quickly parse documentation, understand context, and provide the specific solution for your exact scenario.
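
For reference, the shape of the step sequence that finally worked looks roughly like this. It's a sketch rather than the exact configuration from our session: it assumes a fetch:template step renders the files first, and the owner, repo, and parameter names are placeholders.

steps:
  - id: generate
    action: fetch:template
    input:
      url: ./skeleton
      values:
        name: ${{ parameters.name }}

  - id: pr
    action: publish:github:pull-request
    input:
      repoUrl: github.com?owner=me&repo=my-repo
      branchName: add-${{ parameters.name }}
      title: Add ${{ parameters.name }}
      description: Crossplane resource generated by the Backstage scaffolder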

The Third Surprise: The catalog-info.yaml Conflict

Our templates generated two files:

  • xnamespace.yaml - The actual Crossplane resource
  • catalog-info.yaml - Backstage's catalog entry

We pushed both to GitHub. ArgoCD monitored the repo and synced changes to Kubernetes.

The Problem: ArgoCD tried to apply catalog-info.yaml to Kubernetes. Kubernetes said "What is this backstage.io/Component thing?" and rejected it.
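
To make the mismatch concrete, the two generated files look roughly like this. The claim's group and kind come from our own XRD, so treat all names here as placeholders.

# xnamespace.yaml - a Crossplane claim that the cluster (via Crossplane) understands
apiVersion: platform.example.org/v1alpha1
kind: XNamespace
metadata:
  name: team-a
spec:
  parameters:
    name: team-a
---
# catalog-info.yaml - a Backstage catalog entity that Kubernetes does not understand
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: team-a
spec:
  type: resource
  lifecycle: experimental
  owner: team-a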

Traditional Debugging:

  • Check ArgoCD logs (confusing)
  • Check Kubernetes events (even more confusing)
  • Google the error message (no exact matches)
  • Eventually realize ArgoCD is syncing ALL YAML files
  • Learn about ArgoCD's exclude patterns
  • Update configuration

Time Lost: 30-40 minutes

AI-Assisted Debugging:

User: "ArgoCD is failing with 'could not find backstage.io/Resource'"

AI: "That's a Backstage catalog entity, not a Kubernetes resource. ArgoCD is trying to apply it to your cluster. You need to exclude catalog-info.yaml files from ArgoCD sync:

source:
  directory:
    exclude: 'catalog-info.yaml'

This tells ArgoCD to sync everything EXCEPT Backstage catalog entries."

Time Saved: Immediately understood the problem and solution.
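
In context, that exclusion lives in the ArgoCD Application's source block. A minimal sketch, assuming the repository URL and path are placeholders:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: crossplane-resources
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/me/my-repo
    targetRevision: main
    path: .
    directory:
      recurse: true
      exclude: 'catalog-info.yaml'   # skip Backstage catalog entries
  destination:
    server: https://kubernetes.default.svc
  syncPolicy:
    automated: {}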


The Power of AI-Assisted Development

Here's what made working with AI fundamentally different:

1. Context Retention Across the Stack

We were working with:

  • Backstage (templates, scaffolder, catalog)
  • GitHub (repos, PRs, authentication)
  • ArgoCD (sync policies, applications, RBAC)
  • Crossplane (XRDs, compositions, providers)
  • Kubernetes (namespaces, RBAC, service accounts)
  • k3d (cluster creation, certificates)

That's SIX different technologies, each with their own mental model, terminology, and gotchas.

Traditional Approach: You need to context-switch constantly:

  • "Okay, I'm in Backstage mode now... check the scaffolder docs"
  • "Now I need to think about GitHub... what are the API limits?"
  • "Switching to ArgoCD... what was that sync policy syntax again?"

AI-Assisted: The model maintains context across ALL these systems:

User: "The GitHub push worked but ArgoCD isn't syncing."

AI: "Let me check:

  • Template creates PR ✓
  • GitHub repo accessible ✓
  • ArgoCD has GitHub credentials? ← This is likely the issue

You need to create a secret in the argocd namespace with label 'argocd.argoproj.io/secret-type=repository' containing your token."
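
That secret ends up looking something like this (a sketch; the repository URL and token are placeholders):

apiVersion: v1
kind: Secret
metadata:
  name: github-repo
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository   # tells ArgoCD this is a repo credential
stringData:
  type: git
  url: https://github.com/me/my-repo
  username: git                # any value works when the password is a token
  password: <github-token>     # personal access token with repo access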

The AI doesn't context-switch - it sees the whole system.

2. Pattern Recognition Across Technologies

When we finally asked "What about certificates and RBAC for fresh clusters?", the AI immediately recognized:

  • k3d generates new certificates per cluster (handled automatically)
  • Crossplane provider needs ClusterRole bindings (dynamic SA name)
  • Backstage might need read permissions (optional SA)
  • ArgoCD needs GitHub credentials (recreate secret)

This wasn't about knowing Backstage specifically - it was about understanding patterns:

  • Certificate authorities and client certs
  • Service account permissions in Kubernetes
  • Secret management across cluster lifecycles

You can't Google this. It's not in any single doc. It's the accumulated wisdom of having worked with these patterns before.
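
As a concrete instance of that pattern, granting the Crossplane provider the permissions it needs boils down to a binding like the one below. This is a sketch: the provider's service account name is generated per installation (which is why the script detects it dynamically), and cluster-admin is a local-demo shortcut you would scope down for anything real.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: provider-kubernetes-admin
subjects:
  - kind: ServiceAccount
    name: provider-kubernetes-abc123   # placeholder; discovered at install time
    namespace: crossplane-system
roleRef:
  kind: ClusterRole
  name: cluster-admin                  # intentionally broad for a throwaway cluster
  apiGroup: rbac.authorization.k8s.io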

3. Proactive Problem Identification

At the end of our session, we created an automation script. Initially, it just:

  • Created cluster
  • Installed Crossplane
  • Applied resources
  • Started Backstage

But then the question: "What about fresh cluster requirements?"

Traditional Approach: You'd deploy, encounter errors, then debug:

$ ./setup.sh
[cluster created]
[crossplane installed]
[resources applied]
[tests run]
ERROR: Crossplane provider can't create namespace
[spend 20 minutes debugging]
Oh, RBAC permissions!

AI-Assisted Approach:

AI: "For fresh clusters, we need:

  1. Crossplane provider RBAC (dynamic SA detection)
  2. Backstage service account (optional)
  3. ArgoCD GitHub credentials

Let me add functions to handle these automatically."

The AI anticipated problems based on understanding the system architecture, not just the immediate requirements.
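
The optional Backstage service account from that list, for instance, only needs read access so the portal can display cluster resources. A sketch, with placeholder names and a resource list you would adjust to your plugins:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: backstage-reader
  namespace: backstage
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: backstage-read-only
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "namespaces", "configmaps"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: backstage-read-only
subjects:
  - kind: ServiceAccount
    name: backstage-reader
    namespace: backstage
roleRef:
  kind: ClusterRole
  name: backstage-read-only
  apiGroup: rbac.authorization.k8s.io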

The Self-Reflection: What We Learned

After completing the project, I asked the AI to reflect on what went well and what didn't. The results were... surprisingly insightful.

What Went Well

Adaptive Problem-Solving: When local storage failed, we pivoted to GitOps without losing momentum. This is where AI excels - it's not emotionally invested in the first solution.

Comprehensive Documentation: We created 6 detailed guides as we solved problems. The AI documented not just "how" but "why", capturing context while it was fresh.

Final Automation: The setup script was excellent - handling dynamic service accounts, providing clear feedback, considering fresh cluster scenarios.

What Didn't Go Well

Multiple Template Iterations (8+ versions): We edited templates 8 times trying different GitHub actions. We should have researched available actions BEFORE starting implementation.

Delayed RBAC Configuration: We didn't think about fresh cluster requirements until explicitly asked. Should have considered "clean slate" scenarios from the beginning.

Incomplete Requirements Gathering: Started implementing "local storage" without asking "Why? What's the bigger picture?"

The Grade: A-

Destination: Excellent - complete GitOps platform with automation
Journey: Educational but inefficient - could have been more direct
Path: Too many trial-and-error iterations

The self-critique was honest: "We could have reached this destination more efficiently by consulting documentation earlier and asking more clarifying questions upfront."

The Good, The Bad, and The Future

The Good: Where AI Excels

  • Context Retention: Remembering details across hours of conversation
  • Code Generation: Producing correct, idiomatic code with proper error handling
  • Documentation: Creating comprehensive guides that explain "why" not just "how"
  • Adaptability: Pivoting strategies when approaches fail
  • Pattern Recognition: Applying knowledge across different technologies

The Bad: Current Limitations

  • Documentation-First: Should research official docs before trying implementations
  • Assumption Verification: Sometimes assumes parameters exist without checking
  • Production Mindset: Focuses on "make it work" before "make it production-ready"
  • Testing Strategy: Reactive rather than proactive
  • Security Timing: Security considerations came late in the process

The Ugly Truth

Even with AI assistance, building custom Backstage integrations is still tedious. But it's tedious in a different way:

Without AI: Tedious because you're constantly:

  • Searching documentation
  • Reading example code
  • Debugging cryptic errors
  • Context-switching between technologies
  • Piecing together partial solutions

With AI: Tedious because you're:

  • Iterating on specifications
  • Validating generated solutions
  • Asking clarifying questions
  • Reviewing comprehensive code
  • Making architectural decisions

The latter is much more productive tedium. You're spending time on high-level decisions, not low-level implementation details.

Practical Lessons: How to Work This Way

If you want to try AI-assisted development, here are practical patterns that worked:

1. Start with "Why" Not "What"

Don't say: "Create a Backstage template that stores files locally"

Do say: "I want Backstage users to create Crossplane resources. The resources should be version-controlled, automatically applied to the cluster, and visible in a UI. What's the best approach?"

This invites the AI to suggest architecture, not just implement your potentially-flawed idea.

2. Push for Documentation

Don't accept: "Try using the force parameter"

Do ask: "Can you show me in the official documentation where this parameter is defined?"

Make the AI cite sources. This catches assumptions early.

3. Think About Fresh Starts

Don't assume: Current environment will always exist

Do ask: "If I delete everything and start over, what's needed? What about certificates, RBAC, secrets?"

This catches hidden dependencies.

4. Request Test Plans

Don't just: "Make it work"

Do say: "Create a test checklist before implementing. What could go wrong at each step?"

This encourages proactive problem-solving.

5. Ask for Multiple Solutions

Don't just accept: The first implementation

Do ask: "What are alternative approaches? What are the trade-offs?"

This helps you make informed decisions.

The Setup Script: AI-Assisted in Action

Our final deliverable was a 573-line bash script that:

  • Checks prerequisites
  • Creates k3d cluster
  • Installs Crossplane
  • Applies resources
  • Configures RBAC
  • Installs ArgoCD
  • Configures GitHub credentials
  • Starts port-forwards
  • Extracts passwords
  • Displays comprehensive summary
  • Configures Backstage deployment
  • Starts Backstage

Traditional approach: This would take days to build:

  • Day 1: Get basic cluster creation working
  • Day 2: Add Crossplane installation
  • Day 3: Debug RBAC issues
  • Day 4: Add ArgoCD integration
  • Day 5: Polish and document

AI-Assisted approach: Created in one request:

User: "Create a setup script that automates everything we did manually, including handling fresh clusters, dynamic service accounts, and providing clear user feedback."

AI: [Generates 573-line script with:

  • Color-coded output
  • Error handling
  • Dynamic SA detection
  • Comprehensive summary
  • Background process management]

Was it perfect? No - we refined it based on the RBAC discussion. But it was 95% correct on first generation, and we could iterate on the specification ("add RBAC configuration") rather than the code.

Looking Forward: The Future of Platform Engineering

Backstage isn't unique in being "tedious when ready-made solutions don't fit." This describes most of platform engineering:

  • Kubernetes operators require understanding CRDs, controllers, RBAC
  • Service meshes need sidecar injection, mTLS, traffic policies
  • GitOps tools require repository structure, sync policies, secrets
  • CI/CD pipelines need runners, caching, artifact management

Every platform tool promises "simple setup" and delivers "simple for the happy path."

AI doesn't eliminate this complexity. But it changes how you engage with it:

Instead of becoming an expert in 6 different technologies, you become an expert in describing what you need and validating what you get.

Instead of reading documentation for hours, you have conversations about trade-offs and approaches.

Instead of debugging for 30 minutes, you explain the error and get potential solutions in 30 seconds.

The Bottom Line

Building our Backstage + Crossplane + ArgoCD platform took about 6 hours of back-and-forth with Claude Sonnet 4.5. It included:

  • Multiple architecture pivots
  • 8+ template iterations
  • 3 different GitHub actions tried
  • ArgoCD configuration debugging
  • RBAC implementation
  • Complete automation script
  • 6 detailed documentation files

Without AI? I estimate this would have taken 2-3 days of work, possibly more. And the documentation would have been an afterthought.

Was it perfect? No - we could have been more efficient by:

  • Telling the AI to research docs before implementing
  • Asking clarifying questions upfront
  • Considering fresh cluster scenarios earlier

Was it valuable? Absolutely. We went from zero to an actually working, automated GitOps platform in a single day, with comprehensive documentation and a deep understanding of how everything works.

That's the power of AI-assisted development: Spend your time validating and deciding, not searching and debugging.

Try It Yourself

If you want to experience this workflow:

  1. Pick a complex integration (don't start with "hello world")
  2. Describe your goal, not your implementation
  3. Ask "why" before accepting solutions
  4. Push for documentation references
  5. Think about fresh starts and edge cases
  6. Ask for alternatives and trade-offs

And remember: AI is a collaborator, not a magic wand. You still need to:

  • Understand what you're building
  • Validate the generated code
  • Think about production implications
  • Make architectural decisions

But you'll spend your time on high-value activities (architecture, requirements, validation) instead of low-value activities (syntax debugging, documentation hunting, example copying).

That's a trade I'll take every time.

What We Could Have Done Better: Towards Truly Spec-Driven Development

Looking back at our experience, we realized something important: we should have pursued truly spec-driven development.

What Actually Happened

Our workflow looked like this:

  1. Describe what we wanted
  2. AI generated code
  3. We tried it
  4. It didn't work perfectly
  5. We debugged and iterated
  6. Repeat until it worked

This is better than traditional development, but it's still reactive. We were still doing trial-and-error, just faster.

What True Spec-Driven Development Looks Like

True spec-driven development would be:

  1. Define comprehensive specifications upfront
  2. AI consults authoritative sources
  3. AI generates implementation based on verified patterns
  4. Implementation matches specification on first try
  5. You validate against spec, not against "does it work?"

How We Could Have Been More Spec-Driven

Here are specific things we should have done differently:

1. Force Documentation Consultation First

What we did: Asked "How do I push to an existing GitHub repo?" and tried whatever the AI suggested first.

What we should have done:

"Before suggesting anything, search the official Backstage documentation for GitHub integration actions. List all available actions with their intended use cases, then recommend the best fit."

This forces verification before implementation. No guessing, no trial-and-error.

2. Create Requirement Documents Before Coding

What we did: Started with "make templates that store files locally" and pivoted when it didn't work.

What we should have done:

"I want to create Kubernetes resources through Backstage. They need to be:

  • Version controlled
  • Automatically applied to clusters
  • Visible in a UI
  • Auditable

Create a design document that:

  1. Lists architectural options
  2. Compares trade-offs
  3. Recommends an approach with rationale
  4. Documents known limitations

ONLY AFTER I approve the design should you generate code."

3. Demand Test Plans Before Implementation

What we did: Built features, then realized we needed RBAC, then retrofitted it.

What we should have done:

"Before writing any code for the automation script, create a test plan that covers:

  • Fresh cluster scenarios
  • Failure modes for each component
  • Security requirements (RBAC, secrets)
  • Cleanup procedures

For each item, cite the official documentation requirement."

4. Maintain Living Specifications

What we did: Iterated on code directly. Changed YAML, tweaked parameters, debugged in place.

What we should have done: Maintain a specification file that gets updated, then regenerate implementations:

# spec.md

## GitHub Integration Requirements
- Action: publish:github:pull-request (per Backstage docs v1.2.3)
- Target: Existing repository
- Branch strategy: Feature branches
- Source: Official docs link

## When requirements change:
1. Update this spec
2. Ask AI to regenerate implementation based on updated spec
3. Never edit generated code directly

5. Version and Validate Specifications

What we did: Had a conversation that evolved organically. Hard to reproduce.

What we should have done:

Project Structure:
├── specs/
│   ├── v1-initial-requirements.md
│   ├── v2-added-gitops.md
│   ├── v3-added-rbac.md
├── implementations/
│   ├── template-v1.yaml (generated from spec v1)
│   ├── template-v2.yaml (generated from spec v2)
│   ├── template-v3.yaml (generated from spec v3)
└── validation/
    └── test-results-v3.md

This makes the evolution traceable and reproducible.

The Practical Difference

AI-Assisted (what we did):

  • Faster than traditional development
  • Still involves trial-and-error
  • Context retained in conversation
  • Hard to reproduce
  • Specifications implicit

Spec-Driven (what we should have done):

  • Upfront design effort
  • Minimal trial-and-error
  • Specifications explicit and versioned
  • Easy to reproduce
  • AI validates against authoritative sources

When Each Approach Makes Sense

Use AI-Assisted when:

  • Exploring and learning
  • Prototyping
  • You don't know exactly what you want yet
  • Speed matters more than reproducibility

Use Spec-Driven when:

  • Building production systems
  • Working in teams (specs become shared knowledge)
  • You need to reproduce the setup later
  • Compliance and auditability matter
  • You want to minimize technical debt

Our Honest Assessment

We saved enormous time compared to traditional development. But we could have saved even more and built something more maintainable by:

  1. Forcing documentation research before implementation
  2. Writing design documents before code
  3. Creating explicit specifications
  4. Treating specifications as the source of truth
  5. Making AI cite its sources

By adopting truly spec-driven development, we can elevate AI-assisted workflows from "faster trial-and-error" to "precise, reproducible engineering." This is the future of platform engineering, and we're excited to continue refining our approach.
