AI & Automation
16 min read
Mar 16, 2026

How to Write Software with LLMs: A Practical Guide for CTOs

Learn how to write software with LLMs using battle-tested workflows. Fajarix shares practical strategies for shipping faster without sacrificing code quality.

How to Write Software with LLMs — The Definitive Playbook for Shipping Faster in 2025

How to write software with LLMs is the practice of using large language models — such as Claude, GPT, and Gemini — as active coding partners within structured development workflows. Done well, it enables engineers, CTOs, and startup founders to architect, generate, review, and iterate on production-grade code at dramatically higher speed while maintaining the quality, security, and maintainability standards that serious software demands.

Here's a stat that should get your attention: engineering teams that integrate LLM-assisted development workflows report shipping features 40–70% faster than teams relying on traditional coding alone, according to multiple 2024–2025 industry surveys from GitHub, McKinsey, and Stack Overflow. Yet the majority of teams still treat LLMs as glorified autocomplete — a copy-paste tool for boilerplate — and then wonder why the resulting code devolves into an unmaintainable mess after a week.

The difference between teams that thrive with LLMs and teams that struggle isn't the model they use. It's the workflow they wrap around it. At Fajarix AI automation, we've spent thousands of hours refining exactly this workflow — building production systems, internal tools, and client products where LLMs do the heavy lifting but human architecture decisions keep everything clean.

This guide is the result. We're going to break down every layer of a practical, repeatable LLM-assisted development process — from choosing the right harness and models, to structuring prompts, to the review discipline that prevents technical debt from spiralling. Whether you're a CTO evaluating how to roll this out across a team or a solo founder trying to ship your MVP in weeks instead of months, this is the article you bookmark.

Why Most People Get Poor Results Writing Software with LLMs

Misconception #1: LLMs Replace Developers

The most damaging misconception in the industry right now is that LLMs can replace software engineers. They cannot — at least not yet. What LLMs actually replace is the manual labour of translating architectural decisions into syntax. The thinking, the system design, the trade-off analysis — that's still entirely on you, and it matters more than ever.

Consider this analogy: a power drill doesn't replace a carpenter. It replaces the manual effort of turning a screwdriver. A carpenter who doesn't understand joints, load-bearing, or material properties will build something that collapses regardless of how fast they can drill. The same principle applies to LLM-assisted development. Your engineering judgment is the structural integrity; the LLM is the power tool.

Misconception #2: You Can Just Paste a Prompt and Ship the Output

The second misconception is the "single-shot" fallacy — the belief that you give an LLM one prompt and ship whatever comes back. This approach produces code that looks correct on the surface but hides subtle bugs, architectural inconsistencies, and security vulnerabilities underneath. It's the software equivalent of building on sand.

The developers and teams who get exceptional results — including the engineers at Fajarix and practitioners like Stavros Korokithakis, whose original workflow essay inspired parts of this guide — all share one thing in common: they treat LLM interaction as an iterative, multi-step, multi-model conversation embedded inside a disciplined engineering process. That's what we're going to teach you.

The Architecture-First Mindset: How to Write Software with LLMs That Actually Scales

Your Role Has Shifted from Writer to Architect

In the LLM-assisted paradigm, your primary job is no longer writing code correctly. Your primary job is architecting systems correctly and making the right technology and design choices that keep your codebase maintainable as it grows. This is the single most important insight in this entire article.

Here's what this means in practice: before you open your coding harness and start prompting, you should have clear answers to these questions:

  1. What is the system boundary? — Define what this component owns and what it delegates to other services.
  2. What are the data models and their relationships? — Sketch your schemas, even if roughly, before generating any code.
  3. What are the key interfaces and contracts? — Define API shapes, function signatures, and module boundaries upfront.
  4. What are the non-negotiable constraints? — Security requirements, performance budgets, compliance rules, technology mandates.
  5. What is the testing strategy? — Decide whether you want unit tests, integration tests, or both generated alongside the code.
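These five questions can be captured as a lightweight pre-prompt checklist. Here's a minimal Python sketch — the `ArchitectureSpec` class and its field names are our own illustration, not any tool's API:

```python
from dataclasses import dataclass

@dataclass
class ArchitectureSpec:
    """Pre-prompt checklist: fill in every field before generating code."""
    system_boundary: str      # what this component owns vs. delegates
    data_models: list[str]    # rough schema sketches
    interfaces: list[str]     # API shapes, signatures, module boundaries
    constraints: list[str]    # security, performance, compliance rules
    testing_strategy: str     # unit tests, integration tests, or both

    def is_complete(self) -> bool:
        # Refuse to start prompting until every answer is non-empty.
        return all([self.system_boundary, self.data_models,
                    self.interfaces, self.constraints, self.testing_strategy])

# Example: a hypothetical invoicing component.
spec = ArchitectureSpec(
    system_boundary="Owns invoice generation; delegates payment to billing-svc",
    data_models=["Invoice(id, customer_id, line_items, total)"],
    interfaces=["POST /invoices -> 201 {id}", "GET /invoices/{id}"],
    constraints=["No new dependencies", "All amounts in integer cents"],
    testing_strategy="unit + integration",
)
```

Whether you keep this as code, a doc template, or a checklist in your issue tracker matters less than the discipline: no prompting until every field has an answer.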

When you feed this level of architectural clarity into an LLM, the output quality jumps dramatically. The model isn't guessing at your intent — it's executing within well-defined constraints. This is the difference between a 500-line prompt that produces garbage and a 50-line prompt that produces production-ready code.

Domain Expertise Is Your Unfair Advantage

One pattern we've observed repeatedly at Fajarix — and that echoes the experience of seasoned practitioners — is that domain expertise directly correlates with LLM output quality. When you deeply understand the technology stack you're working with (e.g., backend services in Node.js, infrastructure in Terraform, data pipelines in Python), you can evaluate LLM output at the architectural level and catch problems before they compound.

Conversely, when you're working in a domain you don't understand well — say, a backend developer trying to build a complex mobile app — the LLM's output will seem fine but will be riddled with platform-specific anti-patterns that you can't detect. The code still devolves into a mess, not because the LLM is bad, but because you lack the expertise to steer it.

Key insight: LLMs amplify your existing expertise. If you're a strong architect working in your domain, LLMs make you 5–10x more productive. If you're out of your depth, LLMs give you a false sense of confidence that leads to technical debt. Know which situation you're in.

The Complete LLM-Assisted Development Workflow

Now let's get into the operational detail — the specific tools, techniques, and disciplines that make this work day after day, sprint after sprint, across projects that grow to tens of thousands of lines of code.

Step 1: Choose the Right Harness

Your coding harness is the interface layer between you and the LLM. It's the tool that manages context, sends prompts, receives completions, and ideally integrates with your file system, version control, and development environment. Choosing the right harness is foundational.

Based on our experience building production systems, here are the non-negotiable requirements for a serious harness:

  • Multi-model support: You must be able to use models from different providers (Anthropic, OpenAI, Google, open-source). No single model is best at everything, and using only one model is like having only a hammer in your toolbox. Tools like OpenCode, Cursor, aider, and Continue all support this.
  • Context management: The harness should let you control what files, documentation, and system prompts are included in context. Garbage in, garbage out — context quality determines output quality.
  • Custom agent definitions: Advanced harnesses let you define specialized agents (e.g., a "code reviewer" agent, a "test writer" agent, an "architecture critic" agent) that can be invoked autonomously or in sequence.
  • Session and history support: You need to be able to resume conversations, reference previous decisions, and maintain continuity across coding sessions.
  • File system integration: The harness should read and write files directly in your project, not require you to copy-paste code back and forth.

Our top recommendations at the time of writing: Cursor for teams that want an IDE-integrated experience, OpenCode for terminal-native developers who want maximum flexibility, and aider for open-source purists who want git-native workflows. For enterprise teams, we often recommend Continue with a self-hosted model backend for compliance reasons.

Step 2: Use Multiple Models Strategically

This is one of the highest-leverage techniques in LLM-assisted development and one that most developers completely overlook. A single model — no matter how good — has consistent blind spots, stylistic preferences, and failure modes. It will tend to agree with itself even when it's wrong. Using a second model as a reviewer, critic, or alternative implementation source dramatically reduces defect rates.

Here's how we structure multi-model workflows at Fajarix:

  • Primary generation model: We typically use Claude Sonnet or Claude Opus for initial code generation because of their strong instruction-following and nuanced understanding of architecture.
  • Review and critique model: We route the generated code through GPT-4o or Gemini Pro for a second opinion. The review prompt explicitly asks the second model to find bugs, anti-patterns, security issues, and deviations from the architectural spec.
  • Specialist models for specific tasks: For performance-critical code, we sometimes use DeepSeek Coder for its strength in algorithmic reasoning. For infrastructure-as-code, we've found Claude consistently outperforms alternatives.

Pro tip: Think of each model as a team member with different strengths. You wouldn't have the same person write code and review it — the same logic applies to LLMs. Cross-model review catches entire categories of bugs that single-model workflows miss.
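To make the generate-then-review pattern concrete, here's a minimal Python sketch of the loop. The `fake_generator` and `fake_reviewer` callables are stand-ins for illustration — in practice each would wrap an API call to a different provider:

```python
from typing import Callable

def generate_and_review(
    generator: Callable[[str], str],       # primary model (e.g. a Claude wrapper)
    reviewer: Callable[[str], list[str]],  # second model, returns issues found
    prompt: str,
    max_rounds: int = 3,
) -> tuple[str, list[str]]:
    """Generate with one model, critique with another, and regenerate
    with the critique appended until the reviewer finds no issues."""
    code = generator(prompt)
    for _ in range(max_rounds):
        issues = reviewer(code)
        if not issues:
            return code, []
        prompt += "\n\nFix these review findings:\n- " + "\n- ".join(issues)
        code = generator(prompt)
    return code, reviewer(code)

# Stand-in callables so the sketch runs without API keys:
def fake_generator(prompt: str) -> str:
    return "code v2" if "Fix these" in prompt else "code v1"

def fake_reviewer(code: str) -> list[str]:
    return [] if code == "code v2" else ["unvalidated input on line 3"]

final_code, open_issues = generate_and_review(
    fake_generator, fake_reviewer, "Implement POST /invoices")
```

Because the reviewer is a different model, it doesn't share the generator's blind spots — the critique loop converges instead of rubber-stamping.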

Step 3: Structure Your Prompts Like Engineering Specs

The quality of your prompts is the single biggest lever you have over output quality. Vague prompts produce vague code. Precise, structured prompts produce precise, structured code. Here's the prompting framework we use for feature development:

  1. Context block: Describe the system, its purpose, the tech stack, and any relevant architectural decisions. Include file paths and module names the LLM needs to know about.
  2. Task specification: State exactly what you want built — the function, endpoint, component, or module — including input/output contracts, error handling expectations, and edge cases.
  3. Constraints and non-goals: Explicitly state what the LLM should not do. "Do not modify existing function signatures." "Do not add new dependencies." "Do not implement caching in this PR — that's a separate concern." Negative constraints are as important as positive instructions.
  4. Quality criteria: Specify your standards — "Include comprehensive error handling," "Write idiomatic TypeScript," "Follow the existing naming conventions in this file," "Add JSDoc comments for all public functions."
  5. Test expectations: Optionally, include instructions for test generation — "Write unit tests covering the happy path and the three error cases described above."

This may seem like a lot of upfront work, but each of these prompt blocks takes seconds to write when you've already done the architecture-first thinking described earlier. And the ROI is enormous — you'll spend far less time fixing broken output than you would with a lazy prompt.
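The five-block structure can be assembled mechanically. Here's a small Python sketch — the function and section headings are our own illustration of the framework, not a standard:

```python
def build_feature_prompt(context: str, task: str, constraints: list[str],
                         quality: list[str], tests: str = "") -> str:
    """Assemble the five-block prompt structure: context, task,
    constraints/non-goals, quality criteria, optional test expectations."""
    sections = [
        ("Context", context),
        ("Task", task),
        ("Constraints and non-goals", "\n".join(f"- {c}" for c in constraints)),
        ("Quality criteria", "\n".join(f"- {q}" for q in quality)),
    ]
    if tests:
        sections.append(("Test expectations", tests))
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections)
```

Teams that template this — one helper per stack or per repo — stop reinventing prompts and start versioning them alongside the code.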

Step 4: Review at the Right Level of Abstraction

Here's a nuanced insight that separates experienced LLM developers from beginners: the level at which you review LLM output should match the capability of the model. In the early days of LLM coding (GPT-3.5 era), you had to review every single line. With GPT-4 and early Claude, you could review at the function level. With today's frontier models (Claude Opus, GPT-4o, Gemini 2.5 Pro), you can often review at the architectural and integration level.

This means your review checklist looks something like this:

  • Does this module respect the boundaries I defined?
  • Are the data flows correct — is information going where it should and nowhere else?
  • Are the external interfaces (APIs, database queries, third-party calls) correct and secure?
  • Does the error handling strategy match the system's resilience requirements?
  • Are there any unnecessary dependencies or abstractions introduced?

You're not reading every line of implementation code. You're verifying that the structural and integration-level decisions are sound. If you've given good architectural constraints in your prompt, the implementation details are usually correct. When they're not, it's almost always because the architecture-level instruction was ambiguous, not because the model can't write a for loop.

Step 5: Iterate in Tight Loops, Not Big Bangs

One of the fastest ways to accumulate LLM-generated technical debt is to ask for too much in a single prompt. If you ask an LLM to "build the entire authentication system," you'll get something that sort of works but is full of subtle assumptions you didn't validate. Instead, break the work into tight iterative loops:

  • Prompt → Generate → Review → Commit. Each cycle should cover one logical unit of work: a single function, a single API endpoint, a single component.
  • Use version control aggressively. Commit after every successful cycle. This gives you clean rollback points and makes it trivial to identify when something went wrong.
  • Reference previous context. When starting a new cycle, include relevant code from the previous commit so the LLM has accurate context rather than relying on its memory of what it previously generated.
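The cycle above can be sketched as a single gate function. This is a hedged illustration — the four callables would be supplied by your harness, your reviewer, your test runner, and your VCS:

```python
def development_cycle(generate, review_ok, tests_pass, commit, task: str) -> bool:
    """One tight loop: prompt -> generate -> review -> commit.
    Commits only when both human review and the test gate pass."""
    code = generate(task)
    if not review_ok(code):    # architectural review failed: refine the prompt
        return False
    if not tests_pass(code):   # test gate failed: nothing reaches the branch
        return False
    commit(f"feat: {task}")    # clean rollback point for the next cycle
    return True
```

The shape matters more than the code: every cycle either ends in a committed, tested unit of work or in a better prompt — never in a pile of unreviewed changes.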

This discipline is especially critical for projects that grow beyond a few thousand lines of code. The teams we support through Fajarix's web development services have adopted this tight-loop methodology and consistently maintain codebases of 20,000+ lines without the quality degradation that plagues less disciplined LLM workflows.

Real-World Applications: What You Can Build This Way

A common criticism of LLM-assisted development is that it only works for toy scripts and weekend projects. This is flatly wrong. Here are categories of production software that can be — and are being — built with LLM-assisted workflows:

Backend Services and APIs

This is arguably the sweet spot for LLM-assisted development. Backend code tends to follow well-understood patterns (REST/GraphQL endpoints, database CRUD, authentication, authorization, queue processing), and LLMs excel at generating code within these patterns when given clear architectural constraints. We've built complete backend systems with real-time capabilities, multi-tenant data isolation, and complex business logic — all LLM-assisted, all production-grade.

Internal Tools and Admin Dashboards

Internal tools are a perfect use case because the cost of a minor UI imperfection is low, but the cost of not having the tool at all is high. LLMs can generate full CRUD interfaces, data visualization dashboards, and workflow management tools in hours rather than weeks. For our staff augmentation clients, we've used this approach to spin up internal tooling 5x faster than traditional development.

IoT and Hardware-Adjacent Software

Firmware for microcontrollers, data ingestion pipelines for sensor networks, and companion mobile apps — all of these benefit enormously from LLM-assisted development. One practitioner built an entire voice-recording pendant with transcription and webhook integration using this workflow. Another built a wall clock with irregular ticking modes synced via NTP. These aren't toy projects — they're real hardware products with embedded software, built at a fraction of the traditional timeline.

AI Agents and Automation Systems

Perhaps the most meta application: using LLMs to build systems that themselves use LLMs. Personal assistants that manage calendars, do research, write code to extend themselves, and autonomously handle chores. These are complex, stateful, security-sensitive applications — and they're being built with the exact workflow described in this article. At Fajarix AI automation, this is a significant portion of what we build for clients.

The Discipline Layer: Preventing LLM-Generated Technical Debt

Establish and Enforce Coding Standards via System Prompts

One of the most underused features of modern coding harnesses is the system prompt or project rules file. This is a persistent instruction set that gets included in every LLM interaction for a given project. Use it to encode your coding standards, architectural conventions, and quality requirements.

A well-crafted system prompt might include directives like: "Always use structured error types, never throw raw strings." "All database queries must go through the repository layer, never directly from route handlers." "Use dependency injection for all service classes." These rules act as guardrails that prevent the LLM from drifting toward inconsistent patterns across sessions.
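As a concrete illustration, a project rules file might look like the snippet below. The exact file name and syntax vary by harness, and these rules are examples rather than a canonical set:

```
# Project rules — included in every LLM interaction for this repo.
# (File name depends on the harness: e.g. .cursorrules for Cursor,
#  AGENTS.md for OpenCode, CONVENTIONS.md for aider.)

- Always use structured error types; never throw raw strings.
- All database queries go through the repository layer, never route handlers.
- Use dependency injection for all service classes.
- Do not add dependencies without an explicit instruction to do so.
- Follow the existing naming conventions of the file you are editing.
```

Treat this file like code: review changes to it, version it, and update it whenever you catch the model drifting toward a pattern you don't want.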

Automated Testing as Your Safety Net

LLM-generated code should be held to the same testing standards as human-written code — arguably higher, because the person who wrote it (the LLM) can't be called into a meeting to explain their reasoning. We recommend generating tests as part of the same prompt cycle that generates the implementation code, and then running those tests as a gate before committing.

The combination of LLM-generated tests + human-reviewed test coverage + CI/CD enforcement creates a robust safety net that catches regressions early. Frontier models are now remarkably good at writing comprehensive test suites when you specify the edge cases and failure modes you care about.
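Here's what "specify the edge cases you care about" looks like in practice. Both the function and its tests are hypothetical examples of what a single prompt cycle might produce when the prompt names the happy path and two failure modes explicitly:

```python
# Hypothetical implementation generated in the cycle:
def parse_cents(amount: str) -> int:
    """Parse a money string like '12.34' into integer cents."""
    if not amount or amount.startswith("-"):
        raise ValueError("amount must be a non-negative money string")
    whole, _, frac = amount.partition(".")
    frac = (frac + "00")[:2]          # pad '5' -> 500, '12.3' -> 1230
    if not whole.isdigit() or not frac.isdigit():
        raise ValueError("malformed amount")
    return int(whole) * 100 + int(frac)

# Tests requested in the same prompt, one per named case:
def test_happy_path():
    assert parse_cents("12.34") == 1234

def test_missing_fraction():
    assert parse_cents("5") == 500

def test_rejects_negative():
    try:
        parse_cents("-1.00")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

The tests gate the commit: if any of them fail in CI, the cycle's output never lands on the branch.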

Track What You Haven't Read

Here's an honest reality of LLM-assisted development: you will not read every line of code in your project. And that's okay — as long as you know which code you haven't read and have strategies for managing that risk. Automated tests, type checking, linting, and integration tests all serve as proxies for direct code review. If your CI pipeline is green, your type checker is happy, and your integration tests pass, you can be reasonably confident in code you haven't manually inspected.

Rule of thumb: You should be intimately familiar with your project's architecture and interfaces even if you've never read most of the implementation code. If someone asks you how your system works, you should be able to explain every module, every data flow, and every external integration — even if you can't recite the code from memory.

Getting Started: A Step-by-Step Implementation Plan

If you're a CTO or technical founder ready to adopt LLM-assisted development, here's our recommended implementation plan:

  1. Week 1 — Tooling setup: Choose your harness (Cursor, OpenCode, or aider), configure multi-model access (at minimum one Anthropic model + one OpenAI model), and create a project-level system prompt with your coding standards.
  2. Week 2 — Pilot project: Pick a low-risk, real project — an internal tool, a utility service, a migration script. Use the full workflow (architecture first → structured prompts → multi-model review → tight commit loops) and observe the results.
  3. Week 3 — Team training: Share the workflow with your team. Pair program with LLMs — one person prompts, the other reviews. Develop shared prompt templates and system prompts for your organization's tech stack.
  4. Week 4 — Process integration: Integrate LLM-assisted development into your existing CI/CD pipeline, code review process, and sprint planning. Establish metrics: defect rate, velocity, time-to-merge. Compare with your pre-LLM baseline.
  5. Ongoing — Refinement: Continuously update your system prompts, prompt templates, and model choices as new models and tools are released. This field moves fast — what's optimal today may be outdated in three months.

The Future: Where LLM-Assisted Development Is Heading

We've observed a clear trajectory in LLM capabilities: the level at which human review is necessary keeps moving upward. In 2023, you reviewed every line. In 2024, you reviewed every function. In 2025, you review at the module and architecture level. If this trend continues — and there's every reason to believe it will — we may reach a point in 2026 where even architectural-level review becomes optional for certain categories of software.

But that future isn't here yet. Right now, the winning strategy is clear: combine human architectural expertise with LLM execution speed. The teams and founders who master this combination will ship faster, with fewer bugs, and at lower cost than teams on either extreme — those who refuse to use LLMs and those who use them without discipline.

The gap between these groups is widening every month. The best time to adopt a rigorous LLM-assisted development workflow was six months ago. The second best time is today.

Ready to put these insights into practice? The team at Fajarix builds exactly these solutions. Book a free consultation to discuss your project.
