GenAI for Engineers (Part 2: From Prompt Engineering to System Design)
Using Prompt Engineering as Software Design
Application for Next Cohort
The next cohort of “2X Your Compensation in Tech” will start in November.
If you are a software engineer with 2-10 years of experience and want to get the offer you deserve, you can check the details below and fill out the application form if interested.
In Part 1, we broke down the anatomy of an LLM request:
Prompt → Model → Output
But here’s the reality:
If you just throw natural language at an LLM, you’ll often get:
inconsistent answers
invalid formats
hallucinations that slip into production
LLMs are like brilliant interns: they can do incredible work, but they need crystal-clear instructions and guardrails.
That’s where prompt engineering comes in.
Not as “prompt hacks” or “secret keywords.”
But as a system design discipline.
1. Prompt engineering is software design
Too many people treat prompts like magic spells: add a few “please” and “act as” statements, and hope for the best.
Engineers should treat prompts like API contracts:
predictable inputs
constrained outputs
validated responses
Prompt design = Specification writing
Bad spec → bad implementation.
Good spec → predictable results.
Example:
# Version A (weak prompt)
"Summarize this log file."
# Version B (strong prompt)
"You are an assistant helping engineers analyze logs.
Task: Summarize this log file into two sentences.
Constraints: Be concise.
Output format: JSON with keys {error_summary, resolution}."
Output A: “The system failed, then recovered.”
Output B:
{
  "error_summary": "Connection timeout",
  "resolution": "Retried and re-established connection"
}
Same model. Better design.
2. The hidden levers inside prompts
Beyond “what you say,” prompts have structural levers that make them more reliable:
Role specification → “You are a security auditor” vs. “You are a helpful chatbot.”
Changes tone, focus, and risk tolerance.
Instruction ordering → LLMs tend to weight the instructions nearest the end of the prompt most heavily, so put critical constraints last.
Constraints → length limits, bullet points, banned words.
Output schemas → forcing JSON/XML with explicit formatting.
Engineers should think of prompts like DSLs (domain-specific languages): you’re writing an input spec, not prose.
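To make this concrete, here’s a minimal sketch of a prompt assembled from those levers. The lever values and the build_prompt helper are illustrative, not from any library:
# A minimal sketch: assembling a prompt from role, task, constraints,
# and output schema. All values here are illustrative.
ROLE = "You are a security auditor reviewing application logs."
TASK = "Identify any suspicious access patterns in the logs below."
CONSTRAINTS = "Be concise. Flag at most 5 findings. Do not speculate beyond the logs."
SCHEMA = 'Respond with JSON only: [{"finding": "...", "severity": "LOW|MEDIUM|HIGH"}]'

def build_prompt(logs: str) -> str:
    # The schema goes last: models tend to weight final instructions most heavily.
    return "\n".join([ROLE, TASK, CONSTRAINTS, "Logs:\n" + logs, SCHEMA])

print(build_prompt("2024-05-01 03:12 login failed for user 'admin' (x47)"))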
3. Few-shot prompting: patterns > words
Humans learn from examples.
LLMs do too—at inference time.
This is few-shot prompting: giving the model explicit input-output pairs so it generalizes the pattern.
Example:
"You are a classifier. Categorize each issue as BUG, FEATURE, or QUESTION.
Examples:
Input: 'App crashes on login'
Output: BUG
Input: 'Can we add dark mode?'
Output: FEATURE
Input: 'What’s the max upload size?'
Output: QUESTION
Now classify: 'Payment button not working on mobile.'"
LLM output → BUG
Why this works: LLMs aren’t truly “reasoning.” They’re pattern-completers. Give them a few correct completions, and the pattern locks in.
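Once you have more than a handful of examples, it pays to assemble the prompt programmatically instead of hand-editing a string. A minimal sketch (the helper name is mine, not from any library):
# A minimal sketch: turning labeled examples into a few-shot prompt.
EXAMPLES = [
    ("App crashes on login", "BUG"),
    ("Can we add dark mode?", "FEATURE"),
    ("What's the max upload size?", "QUESTION"),
]

def few_shot_prompt(query: str) -> str:
    shots = "\n".join(f"Input: '{inp}'\nOutput: {out}" for inp, out in EXAMPLES)
    return (
        "You are a classifier. Categorize each issue as BUG, FEATURE, or QUESTION.\n"
        f"Examples:\n{shots}\n"
        f"Now classify: '{query}'"
    )

print(few_shot_prompt("Payment button not working on mobile."))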
4. Chain-of-thought & self-consistency
LLMs are notoriously inconsistent on reasoning tasks.
Two tricks make them more reliable:
Chain-of-thought (CoT)
Explicitly ask the model to show its reasoning.
"Explain step by step how you arrived at the answer before giving the final result."
This makes models more accurate on logic/math problems.
Self-consistency
Sample multiple outputs (temperature > 0), then aggregate.
For reasoning tasks, this improves accuracy because wrong answers are less consistent than right ones.
Think of this like ensemble learning in ML: boosting reliability via multiple runs.
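Here’s self-consistency in a few lines. This is a minimal sketch: ask_model is a hypothetical callable standing in for whatever client you use, run with temperature > 0 so the samples actually differ:
from collections import Counter

# A minimal sketch of self-consistency via majority voting.
# `ask_model` is a hypothetical callable: prompt in, final answer out.
def self_consistent_answer(ask_model, prompt: str, n: int = 5) -> str:
    answers = [ask_model(prompt) for _ in range(n)]
    # Correct answers tend to recur across samples; wrong ones scatter.
    return Counter(answers).most_common(1)[0][0]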
5. Chaining prompts into systems
Real-world apps don’t rely on a single prompt. They use pipelines of smaller, reliable prompts.
Take a bug triage assistant:
Step 1: Extract entities
“List all errors mentioned in these logs.”
Step 2: Classify severity
“Rate each error as LOW, MEDIUM, or HIGH severity.”
Step 3: Generate report
“Summarize findings in Markdown with a table of errors and severity.”
Instead of one giant prompt, break tasks into atomic steps with validation.
This is exactly what frameworks like LangChain and LlamaIndex help automate, but the principle is what matters.
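A minimal sketch of that pipeline, with call_llm as a hypothetical stand-in for whatever client you use and a validation gate between steps:
# A minimal sketch of the triage pipeline. `call_llm` is a hypothetical
# callable: prompt in, model text out.
def triage(call_llm, logs: str) -> str:
    errors = call_llm("List all errors mentioned in these logs:\n" + logs)
    if not errors.strip():  # validation gate: stop early if step 1 found nothing
        return "No errors found."
    rated = call_llm("Rate each error as LOW, MEDIUM, or HIGH severity:\n" + errors)
    return call_llm(
        "Summarize findings in Markdown with a table of errors and severity:\n" + rated
    )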
6. Hands-on: Structured JSON extractor
Here’s a practical example for engineers:
import json

from openai import OpenAI  # uses the openai>=1.0 client API

client = OpenAI()  # reads OPENAI_API_KEY from the environment

text = """
User: app crashed after payment
User: request for adding export to CSV
User: is there a limit to file uploads?
"""

prompt = """
You are a classifier.
Task: Extract each user request and classify it as BUG, FEATURE, or QUESTION.
Constraints: Return valid JSON only.
Output format: [{"request": "...", "category": "..."}]
"""

# temperature=0 keeps the classification deterministic
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt + text}],
    temperature=0,
    max_tokens=300,
)

try:
    result = json.loads(response.choices[0].message.content)
    print(result)
except json.JSONDecodeError:
    print("Invalid JSON, re-prompt needed.")
Pro tip: Always parse and validate outputs. LLMs love to “almost” follow JSON specs. Production systems often use JSON Schema validation + re-prompting until the schema is satisfied.
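A minimal sketch of that validate-and-retry loop. It assumes the jsonschema package; the schema mirrors the extractor’s output format above, and ask_model is again a hypothetical stand-in for your client:
import json

from jsonschema import ValidationError, validate

# JSON Schema matching [{"request": "...", "category": "..."}]
SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "request": {"type": "string"},
            "category": {"enum": ["BUG", "FEATURE", "QUESTION"]},
        },
        "required": ["request", "category"],
    },
}

def parse_or_retry(ask_model, prompt: str, retries: int = 3):
    for _ in range(retries):
        raw = ask_model(prompt)
        try:
            data = json.loads(raw)
            validate(data, SCHEMA)  # raises ValidationError on schema misses
            return data
        except (json.JSONDecodeError, ValidationError) as err:
            # Feed the failure back so the model can self-correct.
            prompt += f"\nYour last output was invalid ({err}). Return valid JSON only."
    raise ValueError("Model never produced schema-valid JSON.")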
7. Where to go deeper
Prompt Engineering Guide - curated examples.
LlamaIndex - context + RAG pipelines.
Self-Consistency Improves Chain-of-Thought Reasoning - the paper behind the self-consistency trick above.
Final thoughts
Prompt engineering isn’t about clever hacks.
It’s about designing contracts between humans and LLMs:
Clear roles, tasks, constraints, and formats.
Breaking big problems into smaller chains.
Validating outputs with code, not just eyeballs.
Do this well, and your LLM stops behaving like a creative intern and starts acting like a reliable system component.
Next step for you:
Take the JSON extractor example and:
Add few-shot examples to the prompt.
Compare results at temperature=0 vs. temperature=0.7.
Share back - what improved, what broke?
In Part 3, we’ll tackle one of the biggest gaps: how to make LLMs work with your own data (hello, RAG).
Tap the 💙 like button to show some love
Have thoughts or something to add? Leave a comment, I read and reply to every one
Know someone who’d find this helpful? Forward it or share the link
Here’s how I can help:
Join my cohort: “2X Your Compensation in Tech”: a live cohort course to help you prepare better and position yourself right for tech interviews. [Check Details Here]
Sponsor this newsletter: Want to reach 23,000+ senior engineers and tech leaders? [See sponsorship options]
Stay in touch
Find me on LinkedIn, X, Instagram, Threads, or Bluesky.
Want to request a topic? Just email me at hemant.pandey17@gmail.com
This newsletter is funded by the support of readers like yourself
If you’re not already a paid subscriber, consider upgrading to support the work and unlock full access
Thanks for reading