Software organizations are adopting large language models at an unprecedented pace. What has emerged is a recognizable development pattern: code that compiles, looks plausible, and ships quickly, but carries hidden weaknesses in correctness, security, and maintainability.
This pattern can be called slop driven development. It is not a moral critique of AI tools or the engineers using them. It is an observation about how incentives, constraints, and technology interact in practice.
For more information on how I use LLMs, refer to "A Case Study in LLM Use: Lessons from a Year of Experimentation."
Defining slop
Slop is not code that fails to run. It is code that:
- Passes surface level inspection
- Often satisfies basic test cases
- Reproduces common patterns from training data
- Omits defensive design, explicit invariants, or threat modeling
- Creates technical and security debt that becomes visible only later
Poor quality code is not new. Developers have always written fragile, insecure software under deadline pressure, vague requirements, or misaligned incentives. Brittle legacy systems existed long before AI entered the picture.
What has changed is velocity and volume. AI makes it possible to generate this class of code faster and at greater scale. The underlying problems are not new. They are amplified.
The tool is neutral
Treating slop driven development as an inherent flaw of AI is a misdiagnosis. Current language models operate through pattern completion, not reasoning. They do not model intent, anticipate failure modes, or consider adversarial behavior.
The real problem is how the tools are used:
- Developers treat generated output as authoritative instead of provisional
- Organizations substitute code generation for engineering judgment
- Review standards erode because code appears finished on arrival
- Delivery velocity becomes the primary success metric
AI does not introduce these failure modes. It accelerates them. The same dynamics that produce low quality human authored code apply to AI generated code, just faster.
Organizational incentives create structural pressure
Engineers do not work in isolation. Many organizations now mandate or strongly encourage the use of AI assisted coding tools, framing adoption as a productivity requirement. Public examples include major technology companies like Microsoft, where AI integration is embedded into standard development workflows.
This creates predictable dynamics:
- Performance reviews emphasize output and velocity
- AI tools promise immediate speed improvements
- Rigorous review, refactoring, and threat modeling are treated as friction
- Short term metrics dominate over long term system health
When organizations optimize for speed without adjusting review processes or accountability structures, slop becomes the rational outcome. Developers are responding to the environment they operate in, not acting carelessly.
Legitimate benefits exist
Slop driven development is not uniformly harmful. When applied consciously within appropriate boundaries, it provides real value.
Fast iteration and exploration
AI generated code enables rapid experimentation:
- Validate concepts quickly
- Build disposable prototypes
- Shorten feedback cycles
- Reduce the cost of trying ideas
For early stage work, internal tooling, or proof of concept development, speed often outweighs correctness. Slop is acceptable when the code is temporary or low stakes.
Cross domain productivity
AI lowers barriers to working outside your primary expertise:
- Backend engineers can scaffold frontend components
- Infrastructure engineers can prototype dashboards
- Security teams can build quick internal utilities
This does not replace specialized knowledge, but it enables individuals and small teams to maintain momentum without blocking on expertise in every domain.
Eliminating tedious work
AI handles repetitive, low leverage tasks effectively:
- Boilerplate and scaffolding
- Data transformation code
- Integration glue logic
- Test structure generation
- Configuration templating
Automating these tasks frees engineers to focus on problems that require judgment: architecture, correctness, and security.
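Integration glue of this kind is also the easiest generated code to verify by eye. A hypothetical sketch of the pattern (the raw event schema and every field name here are invented for illustration):

```python
from datetime import datetime, timezone

def normalize_event(raw: dict) -> dict:
    """Flatten a raw API event into an internal schema.

    Typical integration glue: a mechanical field mapping that AI
    tools generate reliably and a reviewer can verify at a glance.
    """
    return {
        "id": str(raw["event_id"]),
        # Default unknown types rather than failing on absent keys.
        "type": raw.get("type", "unknown"),
        # Normalize epoch seconds to an ISO 8601 UTC timestamp.
        "timestamp": datetime.fromtimestamp(
            raw["ts"], tz=timezone.utc
        ).isoformat(),
        # Deduplicate and sort tags for stable downstream comparisons.
        "tags": sorted(set(raw.get("tags", []))),
    }

event = normalize_event({"event_id": 42, "ts": 0, "tags": ["b", "a", "b"]})
print(event["id"], event["type"], event["tags"])
```

The mapping is deterministic and its correctness is obvious on inspection, which is exactly what makes this class of task a good fit for generation.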
Security costs are measurable
Empirical research confirms that AI generated code frequently introduces known vulnerability classes when deployed without careful human oversight. Common issues include improper input validation, insecure authentication patterns, unsafe default configurations, and incorrect handling of sensitive data.
This creates asymmetric advantages for attackers. AI generated code follows predictable patterns and makes consistent mistakes in high risk areas like authentication, cryptography, deserialization, and concurrent state management. When vulnerabilities appear at scale with recognizable signatures, they become easier to discover and exploit systematically.
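String concatenation in SQL construction is a canonical example of such a recognizable signature. A minimal sqlite3 sketch (table and function names are illustrative) contrasting the pattern models frequently reproduce with the parameterized fix:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_user_unsafe(name: str):
    # The commonly generated pattern: attacker-controlled input is
    # spliced directly into the SQL text.
    query = "SELECT role FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(name: str):
    # Parameterized query: the driver treats the input as data,
    # never as SQL, regardless of its content.
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchall()

payload = "' OR '1'='1"
print(find_user_unsafe(payload))  # the injected OR clause leaks every row
print(find_user_safe(payload))    # no user has that literal name
```

Because the vulnerable variant is syntactically uniform wherever it appears, a single working payload generalizes across every codebase that shipped it.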
Quality depends on how you prompt
Not all AI generated code has the same quality. Output varies significantly based on how developers interact with the tool.
Vibe coding, where developers provide minimal context and accept whatever the model produces, generates the lowest quality slop. This treats AI as an oracle that magically understands unstated requirements.
Deliberate prompt engineering produces substantially better results:
- Specify constraints explicitly: require parameterized queries, prohibit string concatenation in SQL construction
- Include relevant context: type signatures, error handling conventions, existing abstractions
- Request defensive coding: mandate input validation and explicit edge case handling for null values, empty collections, and boundary conditions
- Ask for justification: require the model to explain security implications before generating code
- Iterate critically: identify weaknesses in generated output and request revisions with concrete requirements
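As a sketch of what the defensive coding bullet asks for, the function below (a hypothetical example; the name and the empty-input default are illustrative choices, not prescriptions) handles null input, empty collections, and non-finite values explicitly rather than leaving them to chance:

```python
import math
from typing import Optional

def batch_average(values: Optional[list[float]]) -> float:
    """Mean of a batch, with the edge cases a careful prompt
    requires the model to handle explicitly."""
    # Null input: fail loudly instead of guessing intent.
    if values is None:
        raise ValueError("values must not be None")
    # Empty collection: a defined result, not ZeroDivisionError.
    if not values:
        return 0.0
    # Non-finite inputs: reject rather than propagate NaN silently.
    if any(not math.isfinite(v) for v in values):
        raise ValueError("values must be finite numbers")
    return sum(values) / len(values)

print(batch_average([1.0, 2.0, 3.0]))  # 2.0
print(batch_average([]))               # 0.0
```

A vibe-coded version of the same function is typically the single return line; every guard above exists only because the prompt demanded it.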
This approach does not eliminate review requirements, but it shifts the model from producing unconstrained guesses to generating solutions within defined boundaries.
The quality difference is substantial. Vibe coded output routinely omits error handling, uses deprecated APIs, and ignores security considerations. Carefully prompted output at least attempts to address these concerns, reducing review burden and improving the likelihood that generated code serves as a useful foundation rather than a liability.
Prompt engineering is a skill that requires domain knowledge. It does not replace expertise. It applies expertise earlier in the generation process.
Where AI adds clear value
Wholesale rejection of AI tools is neither realistic nor necessary. The more productive question is identifying where slop is acceptable and where it is not.
AI provides consistent value in documentation:
- API reference generation
- Internal design documentation
- Architecture summaries
- Onboarding materials
- Inline code comments after logic is finalized
Documentation is chronically underprioritized relative to feature work. AI efficiently converts existing code and design decisions into readable explanatory text. When the output is imperfect, the consequences are limited and easily corrected.
This is appropriate use of slop.
Where slop fails
Production systems impose requirements that current AI systems do not reliably reason about:
- Security boundaries and threat models
- Failure modes and edge cases under real world conditions
- Performance characteristics under load
- Long term maintainability and evolution
- Compliance and regulatory constraints
These contexts demand judgment, domain knowledge, and accountability. AI can assist, but treating its output as production ready without rigorous human verification is a fundamental category error.
The standard should be consistent regardless of authorship. Code must be understood, justified, and reviewed whether it originates from a human or a model.
A sustainable path forward
Slop driven development is not an indictment of technology or engineers. It is the predictable consequence of powerful tools interacting with organizational incentives that prioritize delivery speed over system integrity.
The bottleneck has shifted. Writing code is no longer the scarce resource. Evaluating code is.
A realistic approach requires deliberate choices:
- Accept low stakes slop where consequences are reversible
- Enforce rigor where failure is expensive
- Use AI to amplify human judgment, not bypass it
The goal is not eliminating slop entirely. It is making conscious decisions about when slop is acceptable and when it is not, particularly given the current limitations of AI systems.
References
- Veracode. AI Generated Code Poses Major Security Risks in Nearly Half of All Development Tasks. Veracode GenAI Code Security Report.
- Pearce, H. et al. Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions. IEEE Symposium on Security and Privacy.
- NIST. Secure Software Development Framework (SSDF), SP 800-218.
- TechRadar. Developers Don't Trust AI Code, But Many Still Don't Check It.
- Microsoft. GitHub Copilot and AI Assisted Development Documentation.