Software organizations are adopting large language models at an unprecedented pace. What has emerged is a recognizable development pattern: code that compiles, looks plausible, and ships quickly, but carries hidden weaknesses in correctness, security, and maintainability.
This pattern can be called slop driven development. It is not a moral critique of AI tools or the engineers using them. It is an observation about how incentives, constraints, and technology interact in practice.
For more information on how I use LLMs, refer to "A Case Study in LLM Use: Lessons from a Year of Experimentation."
Defining slop
Slop is not code that fails to run. It is code that:
- Passes surface level inspection
- Often satisfies basic test cases
- Reproduces common patterns from training data
- Omits defensive design, explicit invariants, or threat modeling
- Creates technical and security debt that becomes visible only later
Poor quality code is not new. Developers have always written fragile, insecure software under deadline pressure, vague requirements, or misaligned incentives. Brittle legacy systems existed long before AI entered the picture.
What has changed is velocity and volume. AI makes it possible to generate this class of code faster and at greater scale. The underlying problems are not new. They are amplified.
The tool is neutral
Treating slop driven development as an inherent flaw of AI is a misdiagnosis. Current language models operate through pattern completion, not reasoning. They do not model intent, anticipate failure modes, or consider adversarial behavior.
The real problem is how the tools are used:
- Developers treat generated output as authoritative instead of provisional
- Organizations substitute code generation for engineering judgment
- Review standards erode because code appears finished on arrival
- Delivery velocity becomes the primary success metric
AI does not introduce these failure modes. It accelerates them. The same dynamics that produce low quality human authored code apply to AI generated code, just faster.
Organizational incentives create structural pressure
Engineers do not work in isolation. Many organizations now mandate or strongly encourage the use of AI assisted coding tools, framing adoption as a productivity requirement. Public examples include major technology companies like Microsoft, where AI integration is embedded into standard development workflows.
This creates predictable dynamics:
- Performance reviews emphasize output and velocity
- AI tools promise immediate speed improvements
- Rigorous review, refactoring, and threat modeling are treated as friction
- Short term metrics dominate over long term system health
When organizations optimize for speed without adjusting review processes or accountability structures, slop becomes the rational outcome. Developers are responding to the environment they operate in, not acting carelessly.
Legitimate benefits exist
Slop driven development is not uniformly harmful. When applied consciously within appropriate boundaries, it provides real value.
Fast iteration and exploration
AI generated code enables rapid experimentation:
- Validate concepts quickly
- Build disposable prototypes
- Shorten feedback cycles
- Reduce the cost of trying ideas
For early stage work, internal tooling, or proof of concept development, speed often outweighs correctness. Slop is acceptable when the code is temporary or low stakes.
Cross domain productivity
AI lowers barriers to working outside your primary expertise:
- Backend engineers can scaffold frontend components
- Infrastructure engineers can prototype dashboards
- Security teams can build quick internal utilities
This does not replace specialized knowledge, but it enables individuals and small teams to maintain momentum without blocking on expertise in every domain.
Eliminating tedious work
AI handles repetitive, low leverage tasks effectively:
- Boilerplate and scaffolding
- Data transformation code
- Integration glue logic
- Test structure generation
- Configuration templating
Automating these tasks frees engineers to focus on problems that require judgment: architecture, correctness, and security.
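Integration glue of this kind is also the easiest generated code to verify by eye. A hypothetical sketch of the pattern (the raw event schema and every field name here are invented for illustration):

```python
from datetime import datetime, timezone

def normalize_event(raw: dict) -> dict:
    """Flatten a raw API event into an internal schema.

    Typical integration glue: a mechanical field mapping that AI
    tools generate reliably and a reviewer can verify at a glance.
    """
    return {
        "id": str(raw["event_id"]),
        # Default unknown types rather than failing on absent keys.
        "type": raw.get("type", "unknown"),
        # Normalize epoch seconds to an ISO 8601 UTC timestamp.
        "timestamp": datetime.fromtimestamp(
            raw["ts"], tz=timezone.utc
        ).isoformat(),
        # Deduplicate and sort tags for stable downstream comparisons.
        "tags": sorted(set(raw.get("tags", []))),
    }

event = normalize_event({"event_id": 42, "ts": 0, "tags": ["b", "a", "b"]})
print(event["id"], event["type"], event["tags"])
```

The mapping is deterministic and its correctness is obvious on inspection, which is exactly what makes this class of task a good fit for generation.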
Security costs are measurable
Empirical research confirms that AI generated code frequently introduces known vulnerability classes when deployed without careful human oversight. Common issues include improper input validation, insecure authentication patterns, unsafe default configurations, and incorrect handling of sensitive data.
This creates asymmetric advantages for attackers. AI generated code follows predictable patterns and makes consistent mistakes in high risk areas like authentication, cryptography, deserialization, and concurrent state management. When vulnerabilities appear at scale with recognizable signatures, they become easier to discover and exploit systematically.
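String concatenation in SQL construction is a canonical example of such a recognizable signature. A minimal sqlite3 sketch (table and function names are illustrative) contrasting the pattern models frequently reproduce with the parameterized fix:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_user_unsafe(name: str):
    # The commonly generated pattern: attacker-controlled input is
    # spliced directly into the SQL text.
    query = "SELECT role FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(name: str):
    # Parameterized query: the driver treats the input as data,
    # never as SQL, regardless of its content.
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchall()

payload = "' OR '1'='1"
print(find_user_unsafe(payload))  # the injected OR clause leaks every row
print(find_user_safe(payload))    # no user has that literal name
```

Because the vulnerable variant is syntactically uniform wherever it appears, a single working payload generalizes across every codebase that shipped it.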
Quality depends on how you prompt
Not all AI generated code has the same quality. Output varies significantly based on how developers interact with the tool.
Vibe coding, where developers provide minimal context and accept whatever the model produces, generates the lowest quality slop. This treats AI as an oracle that magically understands unstated requirements.
Deliberate prompt engineering produces substantially better results:
- Specify constraints explicitly: require parameterized queries, prohibit string concatenation in SQL construction
- Include relevant context: type signatures, error handling conventions, existing abstractions
- Request defensive coding: mandate input validation and explicit edge case handling for null values, empty collections, and boundary conditions
- Ask for justification: require the model to explain security implications before generating code
- Iterate critically: identify weaknesses in generated output and request revisions with concrete requirements
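As a sketch of what the defensive coding bullet asks for, the function below (a hypothetical example; the name and the empty-input default are illustrative choices, not prescriptions) handles null input, empty collections, and non-finite values explicitly rather than leaving them to chance:

```python
import math
from typing import Optional

def batch_average(values: Optional[list[float]]) -> float:
    """Mean of a batch, with the edge cases a careful prompt
    requires the model to handle explicitly."""
    # Null input: fail loudly instead of guessing intent.
    if values is None:
        raise ValueError("values must not be None")
    # Empty collection: a defined result, not ZeroDivisionError.
    if not values:
        return 0.0
    # Non-finite inputs: reject rather than propagate NaN silently.
    if any(not math.isfinite(v) for v in values):
        raise ValueError("values must be finite numbers")
    return sum(values) / len(values)

print(batch_average([1.0, 2.0, 3.0]))  # 2.0
print(batch_average([]))               # 0.0
```

A vibe-coded version of the same function is typically the single return line; every guard above exists only because the prompt demanded it.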
This approach does not eliminate review requirements, but it shifts the model from producing unconstrained guesses to generating solutions within defined boundaries.
The quality difference is substantial. Vibe coded output routinely omits error handling, uses deprecated APIs, and ignores security considerations. Carefully prompted output at least attempts to address these concerns, reducing review burden and improving the likelihood that generated code serves as a useful foundation rather than a liability.
Prompt engineering is a skill that requires domain knowledge. It does not replace expertise. It applies expertise earlier in the generation process.
Where AI adds clear value
Wholesale rejection of AI tools is neither realistic nor necessary. The more productive question is identifying where slop is acceptable and where it is not.
AI provides consistent value in documentation:
- API reference generation
- Internal design documentation
- Architecture summaries
- Onboarding materials
- Inline code comments after logic is finalized
Documentation is chronically underprioritized relative to feature work. AI efficiently converts existing code and design decisions into readable explanatory text. When the output is imperfect, the consequences are limited and easily corrected.
This is appropriate use of slop.
Where slop fails
Production systems impose requirements that current AI systems do not reliably reason about:
- Security boundaries and threat models
- Failure modes and edge cases under real world conditions
- Performance characteristics under load
- Long term maintainability and evolution
- Compliance and regulatory constraints
These contexts demand judgment, domain knowledge, and accountability. AI can assist, but treating its output as production ready without rigorous human verification is a fundamental category error.
The standard should be consistent regardless of authorship. Code must be understood, justified, and reviewed whether it originates from a human or a model.
A sustainable path forward
Slop driven development is not an indictment of technology or engineers. It is the predictable consequence of powerful tools interacting with organizational incentives that prioritize delivery speed over system integrity.
The bottleneck has shifted. Writing code is no longer the scarce resource. Evaluating code is.
A realistic approach requires deliberate choices:
- Accept low stakes slop where consequences are reversible
- Enforce rigor where failure is expensive
- Use AI to amplify human judgment, not bypass it
The goal is not eliminating slop entirely. It is making conscious decisions about when slop is acceptable and when it is not, particularly given the current limitations of AI systems.
References
- Veracode. AI Generated Code Poses Major Security Risks in Nearly Half of All Development Tasks. Veracode GenAI Code Security Report.
- Pearce, H. et al. Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions. IEEE Symposium on Security and Privacy.
- NIST. Secure Software Development Framework (SSDF), SP 800-218.
- TechRadar. Developers Don't Trust AI Code, But Many Still Don't Check It.
- Microsoft. GitHub Copilot and AI Assisted Development Documentation.