The Prompt Engineering Maturity Model
Most prompt engineering advice stops at "be specific" and "give examples." That's fine for experiments. Production requires more.
Level 1: Structured Prompts
Move beyond freeform text. Structure your prompts:
STRUCTURED_PROMPT = """ # Role You are a customer service agent for {company_name}. # Context Customer: {customer_name} Account Status: {account_status} Previous Interactions: {interaction_summary} # Task Respond to the following customer inquiry. Be helpful, professional, and accurate. # Constraints - Do not discuss competitors - Do not make promises about future features - Escalate billing disputes to human agents # Customer Message {customer_message} # Response Format Provide a response in the following format: - Greeting - Address the main concern - Offer next steps - Closing """
Level 2: Prompt Versioning
Prompts are code. Treat them that way:
class PromptRegistry:
    def __init__(self):
        self.prompts = {}
        self.versions = {}

    def register(self, name: str, prompt: str, version: str):
        self.prompts[name] = prompt
        self.versions[name] = version

    def get(self, name: str) -> str:
        return self.prompts[name]

    def get_version(self, name: str) -> str:
        return self.versions[name]

# Usage: the version lives in the registry, not in the name
registry = PromptRegistry()
registry.register(
    "customer_service",
    STRUCTURED_PROMPT,  # the Level 1 template
    "2.1.0",
)
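Versioning only pays off if you record it at call time. A minimal sketch using the standard logging module; log_completion is a hypothetical helper, not part of any library:

import logging

def log_completion(registry: PromptRegistry, name: str, response: str):
    # Record which prompt version produced which output, so a behavior
    # regression can be traced back to a specific prompt change.
    logging.getLogger("prompts").info(
        "completion generated",
        extra={
            "prompt_name": name,
            "prompt_version": registry.get_version(name),
            "response_chars": len(response),
        },
    )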
Level 3: Dynamic Prompt Assembly
Build prompts from components:
class PromptBuilder:
    def __init__(self):
        self.components = []

    def add_role(self, role: str) -> 'PromptBuilder':
        self.components.append(f"# Role\n{role}")
        return self

    def add_context(self, context: dict) -> 'PromptBuilder':
        ctx_str = "\n".join(f"- {k}: {v}" for k, v in context.items())
        self.components.append(f"# Context\n{ctx_str}")
        return self

    def add_examples(self, examples: list) -> 'PromptBuilder':
        ex_str = "\n\n".join(
            f"Input: {ex['input']}\nOutput: {ex['output']}"
            for ex in examples
        )
        self.components.append(f"# Examples\n{ex_str}")
        return self

    def add_task(self, task: str) -> 'PromptBuilder':
        self.components.append(f"# Task\n{task}")
        return self

    def build(self) -> str:
        return "\n\n".join(self.components)
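Usage then reads as a fluent chain (the values are made up for illustration):

prompt = (
    PromptBuilder()
    .add_role("You are a customer service agent for Acme.")
    .add_context({"Customer": "Jordan Lee", "Account Status": "active"})
    .add_examples([
        {"input": "Where is my order?",
         "output": "Let me look that up for you right away."},
    ])
    .add_task("Respond to the customer's latest message.")
    .build()
)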
Level 4: Prompt Testing
Test prompts like code:
# Minimal supporting types the snippet assumes
from dataclasses import dataclass

@dataclass
class TestCase:
    name: str
    inputs: dict
    checks: list

@dataclass
class TestResult:
    name: str
    passed: bool
    response: str

@dataclass
class TestResults:
    results: list

class PromptTest:
    def __init__(self, prompt: str, test_cases: list):
        self.prompt = prompt
        self.test_cases = test_cases

    async def run(self, model) -> TestResults:
        results = []
        for case in self.test_cases:
            filled_prompt = self.prompt.format(**case.inputs)
            response = await model.generate(filled_prompt)
            passed = all(check(response) for check in case.checks)
            results.append(TestResult(case.name, passed, response))
        return TestResults(results)

# Define test cases
test_cases = [
    TestCase(
        name="handles_refund_request",
        inputs={"customer_message": "I want a refund"},
        checks=[
            lambda r: "refund" in r.lower(),
            lambda r: "policy" in r.lower(),
            lambda r: len(r) > 50,
        ],
    ),
    TestCase(
        name="escalates_billing_dispute",
        inputs={"customer_message": "Your charges are fraudulent!"},
        checks=[
            lambda r: "human" in r.lower() or "representative" in r.lower(),
        ],
    ),
]
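Running the suite requires an async model client. Here FakeModel is a hypothetical stand-in you would replace with your provider's API:

import asyncio

class FakeModel:
    # Hypothetical stand-in for a real model client.
    async def generate(self, prompt: str) -> str:
        return ("I can help with your refund. Per our policy, a human "
                "representative will review any billing dispute.")

async def main():
    test = PromptTest(
        "Customer message: {customer_message}\nRespond helpfully.",
        test_cases,
    )
    results = await test.run(FakeModel())
    for result in results.results:
        print(result.name, "PASS" if result.passed else "FAIL")

asyncio.run(main())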
Level 5: Prompt Optimization
Automatically improve prompts by hill-climbing against an evaluation set:
class PromptOptimizer:
    # Subclasses supply evaluate() and generate_variations(); see the
    # sketch below for one way to fill them in.
    async def optimize(
        self,
        base_prompt: str,
        eval_dataset: list,
        iterations: int = 10,
    ) -> str:
        current_prompt = base_prompt
        current_score = await self.evaluate(current_prompt, eval_dataset)

        for _ in range(iterations):
            # Generate variations of the current best prompt
            variations = await self.generate_variations(current_prompt)

            # Keep any variation that scores better on the eval set
            for variation in variations:
                score = await self.evaluate(variation, eval_dataset)
                if score > current_score:
                    current_prompt = variation
                    current_score = score

        return current_prompt
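The snippet leaves evaluate and generate_variations undefined. One minimal sketch, assuming an async model client and an eval dataset whose items carry their own inputs and a score function (all names here are assumptions, not a fixed API):

class SimplePromptOptimizer(PromptOptimizer):
    def __init__(self, model, n_variations: int = 3):
        self.model = model
        self.n_variations = n_variations

    async def generate_variations(self, prompt: str) -> list:
        # Ask the model itself to propose rewrites of the prompt.
        rewrites = []
        for _ in range(self.n_variations):
            rewrites.append(await self.model.generate(
                "Rewrite this prompt to be clearer and more specific. "
                f"Return only the rewritten prompt.\n\n{prompt}"
            ))
        return rewrites

    async def evaluate(self, prompt: str, eval_dataset: list) -> float:
        # Average score across the eval set.
        scores = []
        for case in eval_dataset:
            response = await self.model.generate(prompt.format(**case["inputs"]))
            scores.append(case["score"](response))
        return sum(scores) / len(scores)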
Common Patterns
Chain of Thought
Think through this step by step:
1. First, identify the main question
2. List relevant information
3. Consider possible approaches
4. Choose the best approach and explain why
5. Provide your answer
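The pattern is easy to make reusable; with_chain_of_thought below is a hypothetical helper that appends the scaffold to any task prompt:

COT_SUFFIX = """Think through this step by step:
1. First, identify the main question
2. List relevant information
3. Consider possible approaches
4. Choose the best approach and explain why
5. Provide your answer"""

def with_chain_of_thought(prompt: str) -> str:
    # Append the reasoning scaffold to an existing task prompt.
    return f"{prompt}\n\n{COT_SUFFIX}"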
Self-Critique
After generating your response, review it for:
- Factual accuracy
- Completeness
- Clarity
- Potential misunderstandings
If you find issues, revise your response.
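The same instruction can also be split into an explicit second pass. A minimal sketch of a draft-critique-revise loop, assuming the same async model interface used above:

async def generate_with_critique(model, prompt: str) -> str:
    draft = await model.generate(prompt)
    critique = await model.generate(
        "Review the following response for factual accuracy, completeness, "
        f"clarity, and potential misunderstandings:\n\n{draft}"
    )
    # One revision pass; loop if your latency budget allows more.
    return await model.generate(
        f"Original response:\n{draft}\n\nCritique:\n{critique}\n\n"
        "Rewrite the response, fixing every issue raised in the critique."
    )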
Format Enforcement
You MUST respond in the following JSON format:
{
  "answer": "your answer here",
  "confidence": 0.0 to 1.0,
  "sources": ["source1", "source2"]
}
Do not include any text outside the JSON object.
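Enforce the contract on the way back in as well. A minimal parser using only the standard library; models sometimes wrap JSON in prose or code fences, so it strips to the outermost braces first:

import json

def parse_structured_response(raw: str) -> dict:
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in response")
    data = json.loads(raw[start:end + 1])
    # Validate the fields the prompt demanded.
    for key in ("answer", "confidence", "sources"):
        if key not in data:
            raise ValueError(f"missing required field: {key}")
    if not 0.0 <= data["confidence"] <= 1.0:
        raise ValueError("confidence out of range")
    return data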
Conclusion
Production prompt engineering is software engineering. Version your prompts, test them rigorously, and continuously improve them based on real-world performance.