Beyond Copy-Paste: Building Backstage with AI-Assisted Development


How Claude Sonnet 4.5 and GitHub Copilot helped us navigate the maze of custom Backstage integrations

The Backstage Promise (and the Reality)

Spotify's Backstage platform promises a beautiful vision: a unified developer portal where teams can discover services, create resources from templates, and manage their entire infrastructure through a single pane of glass. The documentation is comprehensive, the community is vibrant, and the screenshots look amazing.

Then you actually try to implement it for your specific use case.

Suddenly, you're deep in YAML configurations, debugging GitHub action parameters, wrestling with ArgoCD sync policies, and wondering why your Crossplane resources aren't being created. The "just follow the quickstart" approach works perfectly... until it doesn't. And when it doesn't, you're on your own, piecing together Stack Overflow posts, GitHub issues, and documentation that assumes you already know what you're doing.

This is where things get tedious.

Not because Backstage is poorly designed - quite the opposite. It's incredibly flexible and powerful. But that flexibility means that ready-made solutions rarely fit your exact use case. You need to understand:

  • How Backstage's scaffolder actions work
  • Which GitHub integration method fits your needs
  • How to structure YAML for your specific workflow
  • What RBAC permissions are required
  • How different components interact

And honestly? Sometimes you just want to build the thing, not become an expert in every technology in your stack.

Enter AI-Assisted Development

Here's what changed when we brought AI into the development process:

Traditional development often works like this:

  1. Read the documentation
  2. Copy example code
  3. Modify it to fit your use case
  4. Debug when it breaks
  5. Repeat steps 1–4 until it works

AI-assisted development changes the workflow:

  1. Describe what you want to achieve
  2. The AI suggests approaches and generates implementations
  3. You validate and test the results
  4. You iterate based on what works and what doesn't
  5. The AI maintains context across the entire conversation

The key difference? You spend less time searching and more time validating. The AI becomes a knowledgeable pair programmer who remembers everything about your project.

A Real Example: Building a GitOps Platform

Let me share what happened when we tried to integrate Backstage with Crossplane and ArgoCD. This wasn't a tutorial scenario - this was real work with real frustrations.

The Initial Attempt: "Just Store Files Locally"

The Ask: "Make Backstage templates create Crossplane resources and store them locally."

Simple enough, right? Create a template, generate some YAML, save it to disk. Classic Backstage scaffolder use case.

The Reality: Backstage's scaffolder runs in a sandboxed temporary directory. Your files exist for about 3 seconds before they're deleted. The documentation mentions this in passing, but you don't realize the implications until you've spent an hour wondering why your files disappear.
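
To see why, consider a step like the one below: it writes into the scaffolder's workspace, and everything it produces lives only in that temporary directory until the task finishes. This is a minimal sketch using the built-in fetch:template action; the paths and parameter names are placeholders.

steps:
  - id: generate
    action: fetch:template
    input:
      url: ./skeleton              # template files shipped alongside the template
      targetPath: ./output         # lands in the temporary workspace
      values:
        name: ${{ parameters.name }}
# Without a publish step, ./output is deleted along with the workspace.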

Traditional Approach:

  • Google "backstage persist scaffolder output"
  • Find 3-year-old GitHub issue
  • Discover someone suggesting a workaround with custom actions
  • Try to implement said custom action
  • Realize you need to rebuild Backstage
  • Consider switching careers

AI-Assisted Approach:

User: "The files are being created but disappear. Why?"

AI: "Backstage's scaffolder uses temporary directories that are cleaned up after execution. For persistence, you have three options:

  1. Push to Git repository (enables GitOps)
  2. Direct API calls to Kubernetes
  3. Custom storage backend

Given your Crossplane use case, option 1 enables GitOps patterns with ArgoCD. Would you like to explore that?"

Boom. We pivoted from a dead-end approach to proper GitOps architecture in one exchange. No Googling, no piecing together Stack Overflow answers, no reading through 50-page design docs.

The Second Challenge: GitHub Actions Maze

Now we needed to push files to an existing GitHub repository. How hard could that be?

The Reality: Backstage has THREE different GitHub-related actions:

  • publish:github - Creates NEW repositories only
  • github:repo:push - Pushes to existing repos (but has limitations)
  • publish:github:pull-request - Creates pull requests

Each has different parameters. Documentation exists but is scattered. Examples are often outdated.

What Actually Happened:

Attempt 1: Used publish:github

steps:
  - id: publish
    action: publish:github
    input:
      repoUrl: github.com?owner=me&repo=my-existing-repo

Result: Error: "Repository already exists"
Time Lost: 15 minutes

Attempt 2: Switched to github:repo:push with force parameter

steps:
  - id: push
    action: github:repo:push
    input:
      force: true  # Found this in a GitHub issue

Result: Error: "force is not a valid parameter"
Time Lost: 20 minutes

Attempt 3: Researched documentation, found correct action

steps:
  - id: pr
    action: publish:github:pull-request
    input:
      repoUrl: github.com?owner=me&repo=my-repo
      branchName: add-resource

Result: It worked! But only after checking the official Backstage GitHub repo.
Time Lost: Another 20 minutes

Total Wasted Time: Nearly an hour of trial-and-error

What Should Have Happened:

User: "I need to push files to an existing GitHub repository."

AI: (searches Backstage documentation) "For existing repositories, use publish:github:pull-request. This creates a PR, which is better for GitOps workflows. Here's the configuration:

[exact YAML configuration]

The publish:github action only creates NEW repos, and github:repo:push requires fast-forward merges which won't work for your use case."

This is where AI shines: it can quickly parse documentation, understand context, and provide the specific solution for your exact scenario.
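
For reference, the shape of the step sequence that finally worked looks roughly like this. It's a sketch rather than the exact configuration from our session: it assumes a fetch:template step renders the files first, and the owner, repo, and parameter names are placeholders.

steps:
  - id: generate
    action: fetch:template
    input:
      url: ./skeleton
      values:
        name: ${{ parameters.name }}

  - id: pr
    action: publish:github:pull-request
    input:
      repoUrl: github.com?owner=me&repo=my-repo
      branchName: add-${{ parameters.name }}
      title: Add ${{ parameters.name }}
      description: Crossplane resource generated by the Backstage scaffolder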

The Third Surprise: The catalog-info.yaml Conflict

Our templates generated two files:

  • xnamespace.yaml - The actual Crossplane resource
  • catalog-info.yaml - Backstage's catalog entry

We pushed both to GitHub. ArgoCD monitored the repo and synced changes to Kubernetes.

The Problem: ArgoCD tried to apply catalog-info.yaml to Kubernetes. Kubernetes said "What is this backstage.io/Component thing?" and rejected it.
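
To make the mismatch concrete, the two generated files look roughly like this. The claim's group and kind come from our own XRD, so treat all names here as placeholders.

# xnamespace.yaml - a Crossplane claim that the cluster (via Crossplane) understands
apiVersion: platform.example.org/v1alpha1
kind: XNamespace
metadata:
  name: team-a
spec:
  parameters:
    name: team-a
---
# catalog-info.yaml - a Backstage catalog entity that Kubernetes does not understand
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: team-a
spec:
  type: resource
  lifecycle: experimental
  owner: team-a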

Traditional Debugging:

  • Check ArgoCD logs (confusing)
  • Check Kubernetes events (even more confusing)
  • Google the error message (no exact matches)
  • Eventually realize ArgoCD is syncing ALL YAML files
  • Learn about ArgoCD's exclude patterns
  • Update configuration

Time Lost: 30-40 minutes

AI-Assisted Debugging:

User: "ArgoCD is failing with 'could not find backstage.io/Resource'"

AI: "That's a Backstage catalog entity, not a Kubernetes resource. ArgoCD is trying to apply it to your cluster. You need to exclude catalog-info.yaml files from ArgoCD sync:

source:
  directory:
    exclude: 'catalog-info.yaml'

This tells ArgoCD to sync everything EXCEPT Backstage catalog entries."

Time Saved: Immediately understood the problem and solution.
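
In context, that exclusion lives in the ArgoCD Application's source block. A minimal sketch, assuming the repository URL and path are placeholders:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: crossplane-resources
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/me/my-repo
    targetRevision: main
    path: .
    directory:
      recurse: true
      exclude: 'catalog-info.yaml'   # skip Backstage catalog entries
  destination:
    server: https://kubernetes.default.svc
  syncPolicy:
    automated: {}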


The Power of AI-Assisted Development

Here's what made working with AI fundamentally different:

1. Context Retention Across the Stack

We were working with:

  • Backstage (templates, scaffolder, catalog)
  • GitHub (repos, PRs, authentication)
  • ArgoCD (sync policies, applications, RBAC)
  • Crossplane (XRDs, compositions, providers)
  • Kubernetes (namespaces, RBAC, service accounts)
  • k3d (cluster creation, certificates)

That's SIX different technologies, each with their own mental model, terminology, and gotchas.

Traditional Approach: You need to context-switch constantly:

  • "Okay, I'm in Backstage mode now... check the scaffolder docs"
  • "Now I need to think about GitHub... what are the API limits?"
  • "Switching to ArgoCD... what was that sync policy syntax again?"

AI-Assisted: The model maintains context across ALL these systems:

User: "The GitHub push worked but ArgoCD isn't syncing."

AI: "Let me check:

  • Template creates PR ✓
  • GitHub repo accessible ✓
  • ArgoCD has GitHub credentials? ← This is likely the issue

You need to create a secret in the argocd namespace with label 'argocd.argoproj.io/secret-type=repository' containing your token."
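
That secret ends up looking something like this (a sketch; the repository URL and token are placeholders):

apiVersion: v1
kind: Secret
metadata:
  name: github-repo
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository   # tells ArgoCD this is a repo credential
stringData:
  type: git
  url: https://github.com/me/my-repo
  username: git                # any value works when the password is a token
  password: <github-token>     # personal access token with repo access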

The AI doesn't context-switch - it sees the whole system.

2. Pattern Recognition Across Technologies

When we finally asked "What about certificates and RBAC for fresh clusters?", the AI immediately recognized:

  • k3d generates new certificates per cluster (handled automatically)
  • Crossplane provider needs ClusterRole bindings (dynamic SA name)
  • Backstage might need read permissions (optional SA)
  • ArgoCD needs GitHub credentials (recreate secret)

This wasn't about knowing Backstage specifically - it was about understanding patterns:

  • Certificate authorities and client certs
  • Service account permissions in Kubernetes
  • Secret management across cluster lifecycles

You can't Google this. It's not in any single doc. It's the accumulated wisdom of having worked with these patterns before.
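
As a concrete instance of that pattern, granting the Crossplane provider the permissions it needs boils down to a binding like the one below. This is a sketch: the provider's service account name is generated per installation (which is why the script detects it dynamically), and cluster-admin is a local-demo shortcut you would scope down for anything real.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: provider-kubernetes-admin
subjects:
  - kind: ServiceAccount
    name: provider-kubernetes-abc123   # placeholder; discovered at install time
    namespace: crossplane-system
roleRef:
  kind: ClusterRole
  name: cluster-admin                  # intentionally broad for a throwaway cluster
  apiGroup: rbac.authorization.k8s.io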

3. Proactive Problem Identification

At the end of our session, we created an automation script. Initially, it just:

  • Created cluster
  • Installed Crossplane
  • Applied resources
  • Started Backstage

But then the question: "What about fresh cluster requirements?"

Traditional Approach: You'd deploy, encounter errors, then debug:

$ ./setup.sh
[cluster created]
[crossplane installed]
[resources applied]
[tests run]
ERROR: Crossplane provider can't create namespace
[spend 20 minutes debugging]
Oh, RBAC permissions!

AI-Assisted Approach:

AI: "For fresh clusters, we need:

  1. Crossplane provider RBAC (dynamic SA detection)
  2. Backstage service account (optional)
  3. ArgoCD GitHub credentials

Let me add functions to handle these automatically."

The AI anticipated problems based on understanding the system architecture, not just the immediate requirements.
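
The optional Backstage service account from that list, for instance, only needs read access so the portal can display cluster resources. A sketch, with placeholder names and a resource list you would adjust to your plugins:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: backstage-reader
  namespace: backstage
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: backstage-read-only
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "namespaces", "configmaps"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: backstage-read-only
subjects:
  - kind: ServiceAccount
    name: backstage-reader
    namespace: backstage
roleRef:
  kind: ClusterRole
  name: backstage-read-only
  apiGroup: rbac.authorization.k8s.io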

The Self-Reflection: What We Learned

After completing the project, I asked the AI to reflect on what went well and what didn't. The results were... surprisingly insightful.

What Went Well

Adaptive Problem-Solving: When local storage failed, we pivoted to GitOps without losing momentum. This is where AI excels - it's not emotionally invested in the first solution.

Comprehensive Documentation: We created 6 detailed guides as we solved problems. The AI documented not just "how" but "why", capturing context while it was fresh.

Final Automation: The setup script was excellent - handling dynamic service accounts, providing clear feedback, considering fresh cluster scenarios.

What Didn't Go Well

Multiple Template Iterations (8+ versions): We edited templates 8 times trying different GitHub actions. We should have researched available actions BEFORE starting implementation.

Delayed RBAC Configuration: We didn't think about fresh cluster requirements until explicitly asked. Should have considered "clean slate" scenarios from the beginning.

Incomplete Requirements Gathering: Started implementing "local storage" without asking "Why? What's the bigger picture?"

The Grade: A-

Destination: Excellent - complete GitOps platform with automation
Journey: Educational but inefficient - could have been more direct
Path: Too many trial-and-error iterations

The self-critique was honest: "We could have reached this destination more efficiently by consulting documentation earlier and asking more clarifying questions upfront."

The Good, The Bad, and The Future

The Good: Where AI Excels

  • Context Retention: Remembering details across hours of conversation
  • Code Generation: Producing correct, idiomatic code with proper error handling
  • Documentation: Creating comprehensive guides that explain "why" not just "how"
  • Adaptability: Pivoting strategies when approaches fail
  • Pattern Recognition: Applying knowledge across different technologies

The Bad: Current Limitations

  • Documentation-First: Should research official docs before trying implementations
  • Assumption Verification: Sometimes assumes parameters exist without checking
  • Production Mindset: Focuses on "make it work" before "make it production-ready"
  • Testing Strategy: Reactive rather than proactive
  • Security Timing: Security considerations came late in the process

The Ugly Truth

Even with AI assistance, building custom Backstage integrations is still tedious. But it's tedious in a different way:

Without AI: Tedious because you're constantly:

  • Searching documentation
  • Reading example code
  • Debugging cryptic errors
  • Context-switching between technologies
  • Piecing together partial solutions

With AI: Tedious because you're:

  • Iterating on specifications
  • Validating generated solutions
  • Asking clarifying questions
  • Reviewing comprehensive code
  • Making architectural decisions

The latter is much more productive tedium. You're spending time on high-level decisions, not low-level implementation details.

Practical Lessons: How to Work This Way

If you want to try AI-assisted development, here are practical patterns that worked:

1. Start with "Why" Not "What"

Don't say: "Create a Backstage template that stores files locally"

Do say: "I want Backstage users to create Crossplane resources. The resources should be version-controlled, automatically applied to the cluster, and visible in a UI. What's the best approach?"

This invites the AI to suggest architecture, not just implement your potentially-flawed idea.

2. Push for Documentation

Don't accept: "Try using the force parameter"

Do ask: "Can you show me in the official documentation where this parameter is defined?"

Make the AI cite sources. This catches assumptions early.

3. Think About Fresh Starts

Don't assume: Current environment will always exist

Do ask: "If I delete everything and start over, what's needed? What about certificates, RBAC, secrets?"

This catches hidden dependencies.

4. Request Test Plans

Don't just: "Make it work"

Do say: "Create a test checklist before implementing. What could go wrong at each step?"

This encourages proactive problem-solving.

5. Ask for Multiple Solutions

Don't just accept: The first implementation

Do ask: "What are alternative approaches? What are the trade-offs?"

This helps you make informed decisions.

The Setup Script: AI-Assisted in Action

Our final deliverable was a 573-line bash script that:

  • Checks prerequisites
  • Creates k3d cluster
  • Installs Crossplane
  • Applies resources
  • Configures RBAC
  • Installs ArgoCD
  • Configures GitHub credentials
  • Starts port-forwards
  • Extracts passwords
  • Displays comprehensive summary
  • Configures Backstage deployment
  • Starts Backstage

Traditional approach: This would take days to build:

  • Day 1: Get basic cluster creation working
  • Day 2: Add Crossplane installation
  • Day 3: Debug RBAC issues
  • Day 4: Add ArgoCD integration
  • Day 5: Polish and document

AI-Assisted approach: Created in one request:

User: "Create a setup script that automates everything we did manually, including handling fresh clusters, dynamic service accounts, and providing clear user feedback."

AI: [Generates 573-line script with:

  • Color-coded output
  • Error handling
  • Dynamic SA detection
  • Comprehensive summary
  • Background process management]

Was it perfect? No - we refined it based on the RBAC discussion. But it was 95% correct on first generation, and we could iterate on the specification ("add RBAC configuration") rather than the code.

Looking Forward: The Future of Platform Engineering

Backstage isn't unique in being "tedious when ready-made solutions don't fit." This describes most of platform engineering:

  • Kubernetes operators require understanding CRDs, controllers, RBAC
  • Service meshes need sidecar injection, mTLS, traffic policies
  • GitOps tools require repository structure, sync policies, secrets
  • CI/CD pipelines need runners, caching, artifact management

Every platform tool promises "simple setup" and delivers "simple for the happy path."

AI doesn't eliminate this complexity. But it changes how you engage with it:

Instead of becoming an expert in 6 different technologies, you become an expert in describing what you need and validating what you get.

Instead of reading documentation for hours, you have conversations about trade-offs and approaches.

Instead of debugging for 30 minutes, you explain the error and get potential solutions in 30 seconds.

The Bottom Line

Building our Backstage + Crossplane + ArgoCD platform took about 6 hours of back-and-forth with Claude Sonnet 4.5. It included:

  • Multiple architecture pivots
  • 8+ template iterations
  • 3 different GitHub actions tried
  • ArgoCD configuration debugging
  • RBAC implementation
  • Complete automation script
  • 6 detailed documentation files

Without AI? I estimate this would have taken 2-3 days of work, possibly more. And the documentation would have been an afterthought.

Was it perfect? No - we could have been more efficient by:

  • Telling the AI to research docs before implementing
  • Asking clarifying questions upfront
  • Considering fresh cluster scenarios earlier

Was it valuable? Absolutely. We went from zero to an actually working, automated GitOps platform in a single day, with comprehensive documentation and a deep understanding of how everything works.

That's the power of AI-assisted development: Spend your time validating and deciding, not searching and debugging.

Try It Yourself

If you want to experience this workflow:

  1. Pick a complex integration (don't start with "hello world")
  2. Describe your goal, not your implementation
  3. Ask "why" before accepting solutions
  4. Push for documentation references
  5. Think about fresh starts and edge cases
  6. Ask for alternatives and trade-offs

And remember: AI is a collaborator, not a magic wand. You still need to:

  • Understand what you're building
  • Validate the generated code
  • Think about production implications
  • Make architectural decisions

But you'll spend your time on high-value activities (architecture, requirements, validation) instead of low-value activities (syntax debugging, documentation hunting, example copying).

That's a trade I'll take every time.

What We Could Have Done Better: Towards Truly Spec-Driven Development

Looking back at our experience, we realized something important: we should have pursued truly spec-driven development.

What Actually Happened

Our workflow looked like this:

  1. Describe what we wanted
  2. AI generated code
  3. We tried it
  4. It didn't work perfectly
  5. We debugged and iterated
  6. Repeat until it worked

This is better than traditional development, but it's still reactive. We were still doing trial-and-error, just faster.

What True Spec-Driven Development Looks Like

True spec-driven development would be:

  1. Define comprehensive specifications upfront
  2. AI consults authoritative sources
  3. AI generates implementation based on verified patterns
  4. Implementation matches specification on first try
  5. You validate against spec, not against "does it work?"

How We Could Have Been More Spec-Driven

Here are specific things we should have done differently:

1. Force Documentation Consultation First

What we did: Asked "How do I push to an existing GitHub repo?" and tried whatever the AI suggested first.

What we should have done:

"Before suggesting anything, search the official Backstage documentation for GitHub integration actions. List all available actions with their intended use cases, then recommend the best fit."

This forces verification before implementation. No guessing, no trial-and-error.

2. Create Requirement Documents Before Coding

What we did: Started with "make templates that store files locally" and pivoted when it didn't work.

What we should have done:

"I want to create Kubernetes resources through Backstage. They need to be:

  • Version controlled
  • Automatically applied to clusters
  • Visible in a UI
  • Auditable

Create a design document that:

  1. Lists architectural options
  2. Compares trade-offs
  3. Recommends an approach with rationale
  4. Documents known limitations

ONLY AFTER I approve the design should you generate code."

3. Demand Test Plans Before Implementation

What we did: Built features, then realized we needed RBAC, then retrofitted it.

What we should have done:

"Before writing any code for the automation script, create a test plan that covers:

  • Fresh cluster scenarios
  • Failure modes for each component
  • Security requirements (RBAC, secrets)
  • Cleanup procedures

For each item, cite the official documentation requirement."

4. Maintain Living Specifications

What we did: Iterated on code directly. Changed YAML, tweaked parameters, debugged in place.

What we should have done: Maintain a specification file that gets updated, then regenerate implementations:

# spec.md

## GitHub Integration Requirements
- Action: publish:github:pull-request (per Backstage docs v1.2.3)
- Target: Existing repository
- Branch strategy: Feature branches
- Source: Official docs link

## When requirements change:
1. Update this spec
2. Ask AI to regenerate implementation based on updated spec
3. Never edit generated code directly

5. Version and Validate Specifications

What we did: Had a conversation that evolved organically. Hard to reproduce.

What we should have done:

Project Structure:
├── specs/
│   ├── v1-initial-requirements.md
│   ├── v2-added-gitops.md
│   ├── v3-added-rbac.md
├── implementations/
│   ├── template-v1.yaml (generated from spec v1)
│   ├── template-v2.yaml (generated from spec v2)
│   ├── template-v3.yaml (generated from spec v3)
└── validation/
    └── test-results-v3.md

This makes the evolution traceable and reproducible.

The Practical Difference

AI-Assisted (what we did):

  • Faster than traditional development
  • Still involves trial-and-error
  • Context retained in conversation
  • Hard to reproduce
  • Specifications implicit

Spec-Driven (what we should have done):

  • Upfront design effort
  • Minimal trial-and-error
  • Specifications explicit and versioned
  • Easy to reproduce
  • AI validates against authoritative sources

When Each Approach Makes Sense

Use AI-Assisted when:

  • Exploring and learning
  • Prototyping
  • You don't know exactly what you want yet
  • Speed matters more than reproducibility

Use Spec-Driven when:

  • Building production systems
  • Working in teams (specs become shared knowledge)
  • You need to reproduce the setup later
  • Compliance and auditability matter
  • You want to minimize technical debt

Our Honest Assessment

We saved enormous time compared to traditional development. But we could have saved even more and built something more maintainable by:

  1. Forcing documentation research before implementation
  2. Writing design documents before code
  3. Creating explicit specifications
  4. Treating specifications as the source of truth
  5. Making AI cite its sources

By adopting truly spec-driven development, we can elevate AI-assisted workflows from "faster trial-and-error" to "precise, reproducible engineering." This is the future of platform engineering, and we're excited to continue refining our approach.
