The Strategic Software Decision: Code, Train, or Prompt?

I recently watched Andrej Karpathy give a talk to a group of graduates going into software and AI fields. It reminded me of previous conversations I had with industry professionals about how and when to use newer AI/LLM technologies to build software systems.

Tech leaders often grapple with finding the right use cases for AI and frequently struggle to decide whether their teams should code, train a model, or prompt an LLM to accomplish a task.

In the video above, I lay out several heuristics for helping technical leaders think about when to use these various software/AI approaches effectively.

The Spectrum of Software

In the video, I examine the three stages of modern software development that Andrej outlines in his talk (Software 1.0, 2.0, and 3.0) and propose that a missing 1.5 is still relevant.

I connect these stages to three properties (determinism, certainty, and explainability) so that technical leaders can evaluate which approach is right for their specific problems.

Software 1.0 – Code

- High determinism: Same inputs always yield the same outputs
- High certainty: You know exactly what the system will do
- High explainability: Every decision is traceable

When to use it:
- Regulatory compliance (e.g. taxes, finance)
- Safety-critical systems (e.g. aviation, medical devices)
- Business rules that cannot tolerate unexpected results
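
To make this concrete, here is a minimal sketch of a Software 1.0 rule: a progressive tax calculation written as explicit code. The brackets, rates, and the name `tax_owed` are hypothetical and chosen only for illustration. The point is that every result can be traced to a specific line, and the same income always produces the same answer.

```python
# Hypothetical brackets for illustration only: (upper limit, marginal rate).
BRACKETS = [
    (10_000, 0.00),
    (40_000, 0.10),
    (float("inf"), 0.20),
]


def tax_owed(income: float) -> float:
    """Apply each marginal rate to the slice of income inside its bracket."""
    if income < 0:
        raise ValueError("income must be non-negative")
    owed = 0.0
    lower = 0.0
    for upper, rate in BRACKETS:
        if income <= lower:
            break  # no income left in this bracket or any above it
        taxable_slice = min(income, upper) - lower
        owed += taxable_slice * rate
        lower = upper
    return owed


print(tax_owed(50_000))  # same input, same output, every time
```

Under these made-up brackets, an income of 50,000 works out to 0 + 3,000 + 2,000 = 5,000, and an auditor can check each term against the table by hand.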

Software 2.0 – Train (Neural Networks)

Some determinism, probabilistic certainty, low explainability
When to use it:
- When rules are too complex to code manually
- When you have abundant high-quality data
- When predictive performance matters more than understanding how each result was reached
- Examples: image recognition, fraud detection, recommendations
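
As a rough sketch of the "train" mindset, the snippet below learns a fraud rule from labeled examples instead of having a person write it. It is deliberately tiny: a single artificial neuron (a perceptron), the smallest building block of a neural network, not a realistic Software 2.0 system. The data, the feature names, and the function names are all hypothetical.

```python
# Toy labeled data (hypothetical): (normalized amount, is_foreign) -> is_fraud.
DATA = [
    ((0.1, 0.0), 0),
    ((0.2, 1.0), 0),
    ((0.3, 0.0), 0),
    ((0.7, 1.0), 1),
    ((0.8, 0.0), 1),
    ((0.9, 1.0), 1),
]


def predict(weights, bias, features):
    """Output 1 when the weighted sum of the inputs is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(data, learning_rate=0.1, max_epochs=1000):
    """Perceptron rule: nudge the weights after every misclassified example."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in data:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # every training example is now classified correctly
            break
    return weights, bias


weights, bias = train(DATA)
print(weights, bias)  # learned numbers, not human-written rules
print(predict(weights, bias, (0.95, 0.0)))  # classify an unseen transaction
```

Nobody wrote "flag amounts above some threshold" anywhere in this code. The behavior lives in the learned weights, which is why explainability drops as models grow from one neuron to millions.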

Software 3.0 – Prompt (LLMs)

Low determinism, low certainty, high explainability (in natural language)
When to use it:
- When reasoning transparency is valuable
- When you want iterative refinement via natural conversation
- When outputs don’t need to be identical but should follow patterns
- Examples: AI assistants, creative tools, customer support, code analysis
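
Here is a minimal sketch of what "prompting as programming" can look like for a support-ticket classifier. The function `fake_llm` is a placeholder I made up so the example runs without any real model or API, and the category labels are hypothetical. Notice that the code checks that the reply follows a pattern instead of comparing it to one exact string.

```python
import json

ALLOWED_CATEGORIES = {"billing", "technical", "account"}  # hypothetical labels


def build_prompt(ticket: str) -> str:
    """The 'program' is plain English plus the user's text."""
    return (
        "Classify the support ticket below as one of: "
        + ", ".join(sorted(ALLOWED_CATEGORIES))
        + ". Reply with JSON containing the keys 'category' and 'reason'.\n\n"
        + "Ticket: " + ticket
    )


def fake_llm(prompt: str) -> str:
    """Placeholder for a real model call; a real LLM may word 'reason' differently each run."""
    return '{"category": "billing", "reason": "The customer mentions a duplicate charge."}'


def parse_reply(reply: str) -> dict:
    """Check that the reply follows the expected pattern, not an exact text."""
    data = json.loads(reply)
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    if data.get("category") not in ALLOWED_CATEGORIES:
        raise ValueError(f"unexpected category: {data.get('category')!r}")
    if not isinstance(data.get("reason"), str) or not data["reason"].strip():
        raise ValueError("missing reason")
    return data


result = parse_reply(fake_llm(build_prompt("I was charged twice this month.")))
print(result["category"], "-", result["reason"])
```

The `reason` field is where the natural-language explanation shows up, and the validation step is one way to live with low determinism: accept any reply that fits the contract and reject the rest.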

The Missing Middle: Software 1.5

Many overlook the space between rule-based code and deep learning. Think of it as traditional machine learning:

- Algorithms like decision trees, SVMs, and linear regression
- High interpretability and explainability
- Good for regulated domains like credit scoring or risk assessment
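
To illustrate why this middle ground is so explainable, here is a small sketch of a one-feature linear regression fitted with ordinary least squares. The income and credit-limit numbers are invented for the example, and `fit_line` is a hypothetical helper, not a library function. The entire learned model is two numbers that a reviewer can read directly.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: income (tens of thousands) vs. approved credit limit (thousands).
incomes = [2, 4, 6, 8]
limits = [1.1, 1.9, 3.2, 3.8]

slope, intercept = fit_line(incomes, limits)
print(f"limit = {slope:.2f} * income + {intercept:.2f}")
# prints: limit = 0.47 * income + 0.15
```

The model was learned from data like Software 2.0, yet it can be explained like Software 1.0: under these toy numbers, each additional unit of income raises the predicted limit by 0.47.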