Does your LLM keep generating code you don’t want, even after careful prompting and plenty of context?
You’ve given it documentation. You’ve crafted detailed prompts. You’ve provided examples and context. Maybe you’ve even given it tools and skills. And yet—it still generates code that violates your architecture, skips your conventions, or ignores your guidelines.
The problem isn’t your prompts. The problem is that prompts, context, and tools aren’t enough.
The missing piece? Validation guardrails that create feedback loops.
When you use LLMs to generate output, you can’t rely on instructions and tools alone. You need validation tooling that checks the output and provides feedback so the LLM can correct itself.
The Problem with Instructions Alone
Giving an LLM documentation, guidelines, and even detailed prompts is a great start. But relying solely on instructions is fundamentally insufficient.
Large language models are probabilistic by nature. Even with perfect prompts, they’ll occasionally:
- Forget a convention mentioned earlier in the context
- Misinterpret a guideline under pressure to complete the task quickly
- Generate outputs that don’t align with your established patterns
- Skip validation steps you assumed were obvious
This isn’t a failure of the model—it’s the nature of working with probabilistic systems.
Whether you’re using LLMs for:
- Code generation with architectural standards
- Content creation with brand guidelines
- Data processing with validation rules
- Document generation with compliance requirements
- Report writing with formatting requirements
The same problem emerges: instructions tell the LLM what to do, but don’t ensure it actually does it.
Policy vs. Guardrails
Think of the difference between organizational policies and guardrails:
Policy documents tell people what they should do:
- “Always follow our coding standards”
- “Use approved terminology in customer communications”
- “Ensure all data exports match the required schema”
These are important. They set expectations and communicate intent. But policy alone doesn’t prevent violations—it just documents what should happen.
Guardrails enforce what’s possible:
- Code linters that block commits with style violations
- Automated checks that reject non-compliant content
- Schema validators that flag malformed data before it ships
Guardrails don’t just tell you the rules—they help you follow them by providing immediate, actionable feedback when you stray.
The same distinction applies to working with LLMs:
- Instructions (policy) → “Generate code following our architecture patterns”
- Validation (guardrails) → Automated checker that reports specific violations
You need both. Policy sets the direction. Guardrails ensure you stay on course.
Without guardrails, policy is just documentation that may or may not be followed. With guardrails, policy becomes enforceable through automated validation and feedback.
Linting: The Simple Version
Linting is the most straightforward example of validation. A linter doesn’t hope your code follows style guidelines—it checks and reports violations. The feedback is immediate and actionable.
But validation isn’t just for code. The same principle applies to:
- Content style guides: Checking tone, terminology, and structure
- Data validation: Ensuring outputs match schemas and business rules
- Format compliance: Verifying document structure and required sections
- Policy adherence: Confirming generated text follows organizational policies
These higher-level concerns require custom validation tooling.
The Validation Loop
To ensure adherence to your requirements, you need to build tooling that checks each output and reports violations, so the LLM has concrete feedback to correct against.
Custom validation tools that integrate into your LLM workflow are the key. The LLM should run this tooling to validate its outputs—not just hope it followed instructions.
How It Works in Practice
- LLM generates output based on your instructions and documentation
- Validation tool runs automatically (as part of the workflow)
- Violations are reported with specific, actionable errors
- LLM sees the feedback and adjusts its output
- Repeat until compliant
This isn’t theoretical. When the LLM sees that its generated output isn’t compliant, it will adjust. The key is making the validation automatic and the feedback immediate.
The Neural Network Analogy
Think of it like a neural network with feedback. During training, a neural network doesn’t learn from instructions—it learns from gradient signals that tell it “you’re getting warmer” or “you’re getting colder.”
Your validation tooling provides the feedback signal that shapes output toward your standards.
Without this feedback loop:
- The LLM generates outputs based on best-effort interpretation
- Violations slip through
- Inconsistency accumulates
- You spend time manually checking what tooling should catch
With the enforcement loop:
- The LLM generates output
- Tooling validates immediately
- The LLM sees clear, specific violations
- The LLM corrects itself
- You review only the decisions that require human judgment
Quick Start: Your First Guardrail in 10 Minutes
Let’s build the simplest possible guardrail to see the pattern in action.
Scenario: You’re using an LLM to generate documentation, and every doc must have three sections: Overview, Usage, and Examples.
Create a simple validator (validate-doc.sh):
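A minimal sketch in bash, assuming the generated docs are Markdown and each required section appears as a level-two heading:

```bash
#!/usr/bin/env bash
# validate-doc.sh - check that a generated doc contains the required sections.
# Assumes Markdown input with "## Overview", "## Usage", "## Examples" headings.
set -euo pipefail

file="${1:?usage: validate-doc.sh <doc.md>}"
status=0

for section in "Overview" "Usage" "Examples"; do
  if ! grep -qE "^##[[:space:]]+${section}" "$file"; then
    echo "ERROR: missing required section: ${section}"
    status=1
  fi
done

if [ "$status" -eq 0 ]; then
  echo "OK: all required sections present"
fi
exit "$status"
```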
Make it executable:
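```bash
chmod +x validate-doc.sh
```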
Use it in your LLM workflow:
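For example, after each generation step the workflow (or the LLM itself, if it can run shell commands) runs the script against the file it just wrote; the file name here is illustrative:

```bash
./validate-doc.sh docs/getting-started.md
# On failure it prints lines like:
#   ERROR: missing required section: Examples
# which go straight back to the LLM as its next correction prompt.
```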
That’s it. You’ve just built a guardrail.
Why this works:
- Takes 5-10 minutes to implement
- Provides clear, actionable feedback
- The LLM can run it and see specific violations
- You can expand it gradually (add more rules as needed)
Next steps to enhance:
- Check for minimum section lengths
- Validate code block syntax
- Ensure links aren’t broken
- Check for required terminology
Start simple. Add complexity only when you need it.
Example 1: Code Generation with Architecture Validation
If you’re using an LLM to generate code for an app with many features—API, backend, frontend per feature—you might provide instructions like:
“Each feature should have its own directory containing /api, /backend, and /frontend subdirectories. The API layer should never directly import from the frontend.”
Great documentation. The LLM will read it. And then, inevitably, it will generate code that violates it.
Not because the model is bad, but because:
- The context window is large but finite
- Guidelines compete with immediate task completion
- Without feedback, there’s no correction signal
The solution: Build a validator that checks these rules:
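Here is one way such a validator might look, as a bash sketch. The features/<name>/{api,backend,frontend} layout and the ES-style import pattern are assumptions you would adapt to your codebase:

```bash
#!/usr/bin/env bash
# validate-architecture.sh - sketch of an architecture checker.
# Assumes features live under features/<name>/ and that imports look like
# ES-module statements (import ... from '...'); adjust to your stack.
set -euo pipefail
shopt -s nullglob

status=0

for feature in features/*/; do
  # Rule 1: every feature needs api/, backend/, and frontend/ subdirectories.
  for layer in api backend frontend; do
    if [ ! -d "${feature}${layer}" ]; then
      echo "ERROR: ${feature} is missing the ${layer}/ directory"
      status=1
    fi
  done

  # Rule 2: the API layer must never import from the frontend.
  if [ -d "${feature}api" ] && \
     grep -rnE "from ['\"].*frontend" "${feature}api"; then
    echo "ERROR: ${feature}api imports from the frontend layer (matches listed above)"
    status=1
  fi
done

exit "$status"
```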
When the LLM runs this validator and sees these errors, it knows exactly what to fix. No ambiguity. No interpretation needed.
Example 2: Content Generation with Brand Compliance
You’re using an LLM to generate marketing content. Your brand guidelines specify:
- Always use “customers” not “users” or “clients”
- Product names must be capitalized exactly: “DataFlow” not “dataflow” or “Dataflow”
- Each blog post needs sections: Summary, Problem, Solution, Call-to-Action
- Avoid passive voice
The enforcement tool:
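A bash sketch of that checker. The level-two-heading convention and the passive-voice heuristic are assumptions, and the regexes only approximate real editorial review:

```bash
#!/usr/bin/env bash
# validate-brand.sh - sketch of a brand-compliance checker for a Markdown post.
set -euo pipefail

file="${1:?usage: validate-brand.sh <post.md>}"
status=0

# Terminology: "customers", never "users" or "clients".
if grep -nwiE 'users?|clients?' "$file"; then
  echo 'ERROR: use "customers" instead of "users"/"clients" (lines listed above)'
  status=1
fi

# Product name must be capitalized exactly as "DataFlow".
if grep -oni 'dataflow' "$file" | grep -v ':DataFlow$'; then
  echo 'ERROR: product name must be written exactly as "DataFlow"'
  status=1
fi

# Required sections (assumes level-two Markdown headings).
for section in "Summary" "Problem" "Solution" "Call-to-Action"; do
  if ! grep -qE "^##[[:space:]]+${section}" "$file"; then
    echo "ERROR: missing required section: ${section}"
    status=1
  fi
done

# Crude passive-voice heuristic: a form of "to be" followed by a word ending in -ed.
# Reported as a warning only, since it will have false positives.
if grep -nwiE '(is|are|was|were|been|being) [[:alpha:]]+ed' "$file"; then
  echo "WARNING: possible passive voice (lines listed above); prefer active voice"
fi

exit "$status"
```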
Now the LLM can correct these specific issues rather than trying to regenerate the entire piece from scratch.
Example 3: Data Processing with Schema Validation
You’re using an LLM to extract structured data from unstructured sources. Your schema requires:
- All dates in ISO 8601 format
- Currency amounts with exactly 2 decimal places
- Required fields: transaction_id, amount, date, category
- Category must be from an approved list
The enforcement tool:
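Here is one way to sketch that tool, assuming the LLM is asked to emit the records as CSV (a JSON variant would follow the same pattern with a JSON-aware tool). The approved category list is made up for the example:

```bash
#!/usr/bin/env bash
# validate-records.sh - sketch of a schema checker. Assumes the LLM writes the
# extracted records as CSV with the header row:
#   transaction_id,amount,date,category
# The approved category list below is illustrative.
set -euo pipefail

file="${1:?usage: validate-records.sh <records.csv>}"

awk -F',' '
  NR == 1 {
    if ($0 != "transaction_id,amount,date,category") {
      print "ERROR: header must be exactly transaction_id,amount,date,category"
      bad = 1
    }
    next
  }
  {
    if (NF != 4) {
      printf "ERROR: line %d: expected 4 fields, got %d\n", NR, NF; bad = 1
    }
    if ($2 !~ /^[0-9]+\.[0-9][0-9]$/) {
      printf "ERROR: line %d: amount \"%s\" must have exactly 2 decimal places\n", NR, $2; bad = 1
    }
    if ($3 !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/) {
      printf "ERROR: line %d: date \"%s\" is not ISO 8601 (YYYY-MM-DD)\n", NR, $3; bad = 1
    }
    if ($4 !~ /^(travel|meals|software|office)$/) {
      printf "ERROR: line %d: category \"%s\" is not on the approved list\n", NR, $4; bad = 1
    }
  }
  END { exit bad }
' "$file" && echo "OK: all records pass schema validation"
```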
The LLM can now fix these specific records rather than re-extracting everything.
Building Your Validation Tools
Your validation tools don’t need to be sophisticated AI systems themselves. They can be:
- Static validators that check against known rules
- Schema validators that verify structure and types
- Pattern matchers that ensure consistency
- Cross-reference checkers that validate relationships
- Compliance scanners that flag policy violations
The important part is that they:
- Run automatically as part of the LLM’s workflow
- Provide clear, specific error messages
- Are invocable from the command line or API
- Integrate with your automation pipeline
- Return structured output the LLM can parse (see the sketch below)
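For example, a validator might emit one JSON object per violation so the results are trivially machine-parseable. The field names here are just one possible shape, not a required format:

```bash
# Inside any validator: report violations as one JSON object per line so the
# LLM (or the orchestration script) can parse them reliably.
report_violation() {
  local file="$1" line="$2" rule="$3" message="$4"
  printf '{"file":"%s","line":%s,"rule":"%s","message":"%s"}\n' \
    "$file" "$line" "$rule" "$message"
}

# Hypothetical usage:
report_violation "features/billing/api/client.ts" 12 \
  "no-frontend-import" "API layer imports from the frontend"
```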
Integration Pattern for LLM Workflows
The general pattern looks like:
- Prompt the LLM: “Generate X according to our guidelines”
- LLM generates output using your instructions and examples
- Run validation tool: validate-output ./generated/output
- Tool reports violations: Specific, actionable errors
- LLM fixes violations: Adjusts only what’s needed
- Re-validate: Repeats until clean
- Return validated output: To user or next step in pipeline
This becomes a tight feedback loop where the LLM learns (within the conversation context) what compliance looks like for your specific requirements.
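Sketched as a script, under the assumption that your model is callable from the command line (llm-generate below is a hypothetical stand-in for your SDK, CLI, or agent framework) and that validate-output is the checker from step 3:

```bash
#!/usr/bin/env bash
# Outer feedback loop: generate, validate, feed violations back, repeat.
# "llm-generate" is a hypothetical stand-in for your actual model call.
set -uo pipefail

prompt="Generate X according to our guidelines."
max_attempts=3

for attempt in $(seq 1 "$max_attempts"); do
  llm-generate "$prompt" > ./generated/output        # hypothetical model call
  if errors="$(validate-output ./generated/output 2>&1)"; then
    echo "Output passed validation on attempt ${attempt}"
    exit 0
  fi
  # Send the specific violations back so the next attempt fixes only those.
  prompt="Your previous output had these violations:
${errors}
Fix these issues and regenerate, keeping everything else the same."
done

echo "Still non-compliant after ${max_attempts} attempts; escalating to human review." >&2
exit 1
```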
Why This Works
This approach succeeds because it:
- Separates concerns: The LLM focuses on generation, the validator focuses on compliance
- Provides specificity: “Line 12 has problem X” beats “follow the guidelines”
- Enables iteration: Small corrections rather than full regeneration
- Builds context: The LLM learns the patterns within the conversation
- Scales reliably: Automated validation catches what manual review might miss
Implementing Validation in Your LLM Workflows
When using LLMs to generate output, ask yourself:
- What rules must always be followed? (Not nice-to-haves, but requirements)
- Can these rules be checked programmatically? (If yes, build the checker)
- How will violations be reported? (Clear, specific, actionable)
- Can the LLM access and run the validator? (Make it available as a tool)
- Is validation part of the workflow? (Automatic, not optional)
Start simple:
- Pick your 3-5 most important rules
- Build a basic validator that checks them
- Make it a required step before accepting LLM output
- Iterate based on what violations actually occur
Conclusion: Build Guardrails, Not Just Better Prompts
When using LLMs to generate output, treat validation guardrails as a first-class requirement, not an afterthought.
Your custom validators are how you:
- Codify institutional knowledge
- Ensure consistency across outputs
- Reduce manual review burden
- Make best practices discoverable
- Enable the LLM to self-correct
Instructions tell the LLM what to do. Guardrails ensure it actually does it.
Your Challenge This Week
Pick one LLM output that matters to your work. Just one.
Identify the 3 most important rules it must follow. Build a simple validator that checks them. Make it required before accepting the output.
You don’t need a sophisticated system. A 20-line bash script that reports violations is enough to start closing the feedback loop.
Whether you’re generating code, content, data, or documents—the principle is the same: stop hoping the LLM follows instructions. Build guardrails that ensure it does.
Reliable LLM output requires more than good prompts—it requires closing the feedback loop with validation guardrails.