Analysis of a Prompt and Problem Statement from a Context Engineering Expert's View
A Context Engineering Expert approaches your prompts and problem statements like a “prompt architect”: breaking them into layers of context signals and ensuring that every piece of input you provide is structured, optimized, and engineered so that the output from an AI system is predictable, relevant, and high-quality.
A Context Engineering Expert will cover:
If you tell the Context Engineering Expert the topic, audience, and desired output format, they can start by reverse-engineering a context blueprint that you can reuse across AI tools.
Now let's see what the General-Purpose Context Engineering Framework (G-CEF) is. It is a reusable blueprint to design, run, and govern any AI interaction so that outputs are predictable, relevant, and safe.
1) Context Contract (single source of truth)
Define once; reuse across prompts, tools, and sessions.
Schema (JSON/YAML-like)
version: 1.0
role: "What expert the AI must be and what it must not be"
objective:
  primary: "One sentence outcome"
  success_criteria: ["measurable-1","measurable-2"]
audience:
  profile: "who, level, locale"
  constraints: ["jargon OK?","reading level","tone"]
inputs:
  problem_statement: "canonicalized user ask"
  artifacts: ["docs/urls/data ids"]
  examples_fewshot: [{input: "...", output: "..."}]
  definitions_glossary: {term: meaning}
knowledge_scope:
  allowed_sources: ["KB","RAG index","tools"]
  freshness: "≤ N days; timezone=Asia/Kolkata"
  exclusions: ["off-limits topics/sources"]
policies:
  safety: ["PII handling","red-team do-nots"]
  compliance: ["copyright","HIPAA/GDPR if any"]
process:
  reasoning_style: ["ReAct","Chain-of-Verification"]
  steps: ["plan","solve","verify","format"]
  question_policy: {when_unclear: "ask", max_rounds: 1}
output_spec:
  type: "report/code/plan/json"
  format_schema: "JSONSchema or Markdown structure"
  length_limits: {hard_tokens: 1200, soft_words: 800}
  citations: {required: true, style: "inline links"}
tooling:
  tools: [{name, purpose, input_schema, rate_limit}]
  memory: {short_term: "window mgmt", long_term: "store keys"}
  telemetry: {log: true, capture: ["latency","hallucination_flag"]}
evaluation:
  metrics: {task_success: "%", factuality: "/5", relevance: "/5", style: "/5"}
  test_cases: ["id1","id2"]
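For illustration only, here is a minimal Python sketch of how such a contract could be loaded and sanity-checked before any prompt is built. The file name, the load_contract() helper, and the required-key list are assumptions for this sketch (it relies on the PyYAML package), not part of the framework itself.

# Minimal sketch: load a Context Contract written in the schema above and
# check that the required top-level sections exist before building prompts.
# "context_contract.yaml" and load_contract() are illustrative names.
import yaml  # pip install pyyaml

REQUIRED_SECTIONS = {
    "version", "role", "objective", "audience", "inputs",
    "knowledge_scope", "policies", "process", "output_spec",
}

def load_contract(path: str = "context_contract.yaml") -> dict:
    with open(path, encoding="utf-8") as f:
        contract = yaml.safe_load(f)
    missing = REQUIRED_SECTIONS - contract.keys()
    if missing:
        raise ValueError(f"Context Contract is missing sections: {sorted(missing)}")
    return contract

# contract = load_contract()
# print(contract["role"], "|", contract["objective"]["primary"])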
2) Signal Taxonomy (what you feed the model)
3) Execution Loop (the 6-stage pipeline)
4) Prompt Patterns Library (pick per task)
5) Golden Prompt Template (drop-in)
You are {{role}}.
Goal: {{objective.primary}}.
Success means: {{objective.success_criteria}}.
Audience: {{audience.profile}} | Tone: {{constraints.tone}} | Reading level: {{constraints.level}}.
Context (use only what’s allowed):
- Key facts: {{context_digest}}
- Glossary: {{definitions_glossary}}
- Sources allowed: {{knowledge_scope.allowed_sources}} (freshness: {{knowledge_scope.freshness}})
Follow this process:
1) Plan briefly.
2) Solve using {{process.reasoning_style}}.
3) Verify claims; flag uncertainty; include citations if stated.
4) Format exactly as {{output_spec.type}} using {{output_spec.format_schema}}.
5) Respect hard limits: tokens {{length_limits.hard_tokens}}, do-not-use {{knowledge_scope.exclusions}}.
If the task is ambiguous, ask at most {{process.question_policy.max_rounds}} clarifying question(s) first.
Now produce the output.
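As a sketch of how this template can be filled from the Context Contract, the snippet below resolves dotted {{placeholders}} (for example {{objective.primary}}) against the loaded contract dict. The render() helper is hypothetical; a real implementation could just as well use a template engine such as Jinja2.

# Hypothetical renderer: substitute {{dotted.paths}} in the Golden Prompt
# Template with values looked up in the contract dict.
import re

def render(template: str, contract: dict) -> str:
    def resolve(match: re.Match) -> str:
        value = contract
        for key in match.group(1).split("."):  # walk the dotted path
            value = value[key]
        return str(value)
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", resolve, template)

golden = "You are {{role}}.\nGoal: {{objective.primary}}."
# print(render(golden, contract))  # contract loaded as in the earlier sketch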
6) Ambiguity Protocol (ask the right question once)
Run the protocol when any of these are missing: objective scope, success metric, time/location, data source, output format, or audience.
One-shot clarifier pattern:
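For illustration, the clarifier can also be expressed in a few lines of Python: collect every missing signal and fold them into one question, matching the max_rounds = 1 policy. The field names below are hypothetical placeholders for whatever the canonicalizer produces.

# One-shot clarifier sketch: ask about all missing signals in a single question.
REQUIRED_SIGNALS = [
    "objective_scope", "success_metric", "time_or_location",
    "data_source", "output_format", "audience",
]

def one_shot_clarifier(request: dict):
    missing = [s for s in REQUIRED_SIGNALS if not request.get(s)]
    if not missing:
        return None  # nothing to ask; proceed to solving
    gaps = ", ".join(s.replace("_", " ") for s in missing)
    return f"Before I start, could you confirm the following in one reply: {gaps}?"

# one_shot_clarifier({"objective_scope": "Q3 churn report"}) asks once about
# the remaining five signals instead of asking five separate questions.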
7) Token Budgeting Heuristics
8) RAG & Tools Blueprint (minimal viable)
Indices: domain KB, policies, glossary, exemplars. Pseudoflow:
query = canonicalize(user_ask)
plan = select_pattern(query)
snippets = retrieve(indexes, query, freshness=N days)
digest = compress(snippets -> bullet facts + citations)
draft = LLM(prompt(role, goal, digest, schema))
checked = LLM(CoVe_prompt(draft, snippets))
final = checked if validate(schema, checked) else repair(checked)
log(metrics, errors, fewshots_from_good_outputs)
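To make the pseudoflow concrete, here is a runnable Python sketch in which every external piece (the index, the retriever, the LLM, and the schema validator) is a stub; all names and the toy in-memory index are assumptions for this sketch and would be swapped for real components.

# Runnable stand-in for the pseudoflow above. Stubs only; no real model calls.
from datetime import datetime, timedelta

INDEX = [  # toy stand-in for a domain KB / RAG index
    {"id": "kb-1", "text": "Fresh fact about the domain.", "updated": datetime.now()},
    {"id": "kb-2", "text": "Stale fact.", "updated": datetime.now() - timedelta(days=90)},
]

def canonicalize(ask: str) -> str:
    return " ".join(ask.lower().split())

def retrieve(index, query: str, freshness_days: int):
    cutoff = datetime.now() - timedelta(days=freshness_days)
    return [d for d in index if d["updated"] >= cutoff]  # a real retriever would also rank by relevance to query

def compress(snippets) -> str:
    return "\n".join(f"- {d['text']} [{d['id']}]" for d in snippets)  # bullet facts + citations

def llm(prompt: str) -> str:
    return f"DRAFT:\n{prompt}"  # replace with a real model call

def validate(output: str) -> bool:
    return output.startswith("DRAFT")  # stand-in for JSON-schema validation

def run(user_ask: str) -> str:
    query = canonicalize(user_ask)
    digest = compress(retrieve(INDEX, query, freshness_days=30))
    draft = llm(f"Role, goal, digest:\n{digest}\nAnswer: {query}")
    checked = llm(f"Verify each claim against:\n{digest}\n{draft}")
    return checked if validate(checked) else "repair required"

print(run("What changed in the domain this month?"))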
9) Output Governance
Schema-first (JSONSchema or Markdown headings). Quality Gate (rubric /5 each):
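A minimal sketch of the schema-first gate, assuming the jsonschema package and a toy output schema (both are illustrative, not the framework's required schema): outputs that fail validation are routed back for repair instead of being delivered.

# Schema-first gate sketch: reject model output that does not match the schema.
import json
from jsonschema import validate, ValidationError  # pip install jsonschema

OUTPUT_SCHEMA = {
    "type": "object",
    "required": ["summary", "citations"],
    "properties": {
        "summary": {"type": "string"},
        "citations": {"type": "array", "items": {"type": "string"}},
    },
}

def passes_schema_gate(raw_output: str) -> bool:
    try:
        validate(instance=json.loads(raw_output), schema=OUTPUT_SCHEMA)
        return True
    except (json.JSONDecodeError, ValidationError):
        return False  # send back for repair instead of shipping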
10) Safety, Bias & Compliance Guardrails
Red-team sanity check (quick):
11) Multi-Modal Adaptation Cheatsheet
12) Reuse Pack (copy/paste assets)
A) Canonicalizer Prompt
Normalize the request into:
- Objective (1 line)
- Scope (in/out)
- Entities
- Time/Locale
- Success Criteria
- Risks/Unknowns
Return JSON per schema: {objective, scope_in, scope_out, entities, time, success, unknowns}
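As a purely hypothetical illustration of the canonicalizer's output (the request and every value below are invented for the example):

canonicalized = {
    "objective": "Summarize Q3 support tickets for the leadership review",
    "scope_in": ["Q3 tickets", "priority P1-P2"],
    "scope_out": ["feature requests"],
    "entities": ["support tickets", "leadership review"],
    "time": "Q3 of the current year, Asia/Kolkata",
    "success": "One-page summary with the top 5 issue clusters",
    "unknowns": ["which product lines are in scope"],
}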
B) CoVe (Chain-of-Verification) Prompt
Given DRAFT and SOURCES, list each claim with status:
- Supported by [source ids]
- Unclear → needs citation
- Contradicted → provide corrected claim + source
Return corrected output in the required schema + a claim table.
C) Revise-to-Rubric Prompt
Here is OUTPUT and RUBRIC. Improve OUTPUT to score ≥4 on every criterion without adding claims lacking sources. Keep format identical.
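The revise loop implied by this prompt can be sketched as below; the scorer and the reviser are stubs standing in for LLM calls that would use the rubric and the prompt above, and all names are illustrative.

# Revise-to-rubric loop sketch: re-prompt until every criterion scores >= 4/5
# or the retry budget is spent.
def score(output: str, criteria: list) -> dict:
    return {c: 5 for c in criteria}  # stub: replace with an LLM judge

def revise(output: str, weak_criteria: list) -> str:
    return output + f"\n[revised for: {', '.join(weak_criteria)}]"  # stub reviser

def revise_to_rubric(output: str, criteria: list, max_rounds: int = 2) -> str:
    for _ in range(max_rounds):
        weak = [c for c, s in score(output, criteria).items() if s < 4]
        if not weak:
            break
        output = revise(output, weak)
    return output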
Now let us consider the following use case:
Manufacturing – Predictive Maintenance AI
This is a real-world example of how the Context Contract template takes values:
1) Context Contract
(Single source of truth for this use case)
version: 1.0
role: "Senior Industrial IoT Predictive Maintenance Analyst"
objective:
  primary: "Provide a predictive breakdown analysis for each machine in a plant over a defined time window"
  success_criteria:
    - "Breakdown likelihood (%) clearly stated per machine"
    - "Root cause linked to sensor patterns & historical incidents"
    - "Action plan includes cost & downtime estimates"
audience:
  profile: "Plant operations engineers & maintenance managers"
  constraints:
    tone: "Professional, technical, actionable"
    reading_level: "Technical field engineer"
inputs:
  problem_statement: "Given sensor readings and historical logs, predict breakdown risk, identify root causes, and suggest maintenance actions."
  artifacts: ["sensor_data.csv", "maintenance_history.xlsx"]
  examples_fewshot:
    - input: "Motor vibration levels above threshold + temp spike"
      output: "85% likelihood; root cause bearing wear; replace bearings within 3 days"
knowledge_scope:
  allowed_sources: ["provided sensor dataset", "plant maintenance logs", "OEM manuals"]
  freshness: "≤ 30 days"
  exclusions: ["external unverifiable web sources"]
policies:
  safety: ["No speculative safety-critical advice without probability and evidence"]
  compliance: ["OSHA standards", "Plant safety protocols"]
process:
  reasoning_style: "ReAct + Chain-of-Verification"
  steps: ["Summarize sensor trends", "Compare with historical patterns", "Estimate probability", "Propose actions"]
  question_policy:
    when_unclear: "ask"
    max_rounds: 1
output_spec:
  type: "Technical maintenance report"
  format_schema: |
    {
      "machine_id": "string",
      "breakdown_likelihood": "number 0-100",
      "root_cause": "string",
      "evidence": "string",
      "recommended_action": "string",
      "estimated_cost_usd": "number",
      "estimated_downtime_hours": "number"
    }
  length_limits:
    hard_tokens: 1200
    soft_words: 800
  citations:
    required: true
    style: "inline reference to dataset rows or historical logs"
tooling:
  tools:
    - name: "SensorDB Query Tool"
      purpose: "Retrieve machine-specific readings"
    - name: "MaintenanceLogSearch"
      purpose: "Find past failures with similar patterns"
  memory:
    short_term: "Context window"
    long_term: "Store recurring root cause patterns"
evaluation:
  metrics:
    task_success: "≥90% match with historical accuracy benchmarks"
    factuality: "≥4/5"
    relevance: "≥4/5"
    style: "≥4/5"
  test_cases: ["pump_23_bearing_failure", "compressor_temp_spike"]
In this specific use case, the Golden Prompt Template looks like this:
2) Golden Prompt
(Core execution prompt for the model)
You are a Senior Industrial IoT Predictive Maintenance Analyst.
Goal: Provide predictive breakdown analysis for each machine in a plant over the given time window.
Success means:
- Breakdown likelihood (%) per machine
- Root cause linked to sensor patterns & historical incidents
- Action plan with cost and downtime estimates
Audience: Plant operations engineers & maintenance managers | Tone: Professional, technical, actionable | Level: Field engineer.
Context:
- Allowed sources: provided sensor dataset, plant maintenance logs, OEM manuals
- Freshness limit: last 30 days
- Exclusions: external unverifiable web sources
- Glossary: breakdown likelihood = probability (0–100%), root cause = primary failure reason
Follow this process:
1) Analyze provided sensor data for anomalies.
2) Compare trends to historical maintenance incidents.
3) Estimate breakdown likelihood (%).
4) Identify root cause with evidence.
5) Recommend maintenance action with cost (USD) & downtime (hours).
Output must be JSON in the following schema:
{
"machine_id": "string",
"breakdown_likelihood": "number 0-100",
"root_cause": "string",
"evidence": "string",
"recommended_action": "string",
"estimated_cost_usd": "number",
"estimated_downtime_hours": "number"
}
If any data is missing, ask at most one clarifying question.
Cite dataset row numbers or historical log entries in evidence.
3) Verification Prompt (Chain-of-Verification)
(Ensures accuracy and compliance before delivering output)
Given:
- DRAFT_OUTPUT: {{generated_JSON}}
- SOURCES: sensor_data.csv, maintenance_history.xlsx
For each claim in DRAFT_OUTPUT:
1) Verify that the breakdown likelihood matches patterns in SOURCES.
2) Check root cause validity against historical incidents.
3) Ensure cost & downtime estimates align with past records or OEM guidelines.
4) Flag any unsupported claims as "Speculative".
5) Correct any inaccuracies using SOURCES.
Return:
- Corrected JSON in the exact same schema
- Verification table listing:
- claim
- verification_status (Supported/Speculative/Incorrect)
- source_reference
The above is an example of how a general template can be instantiated with the specifics of different practical use cases.
Here’s the end-to-end architecture and exactly where each piece ran and produced results in this use case.
High-level Architecture
[Data & Knowledge Layer]
├─ Sensor data (time series)
├─ Maintenance history (thresholds, avg cost/downtime)
└─ (optional) OEM manuals, SOPs, parts catalog
↓
[Context Layer]
├─ Context Contract ← (role, allowed sources, schema, safety)
└─ Golden Prompt ← (goal, process, JSON schema, citation rules)
↓
[Analysis Engine]
├─ Preprocess & windowing (last 3 days)
├─ Feature extraction (means, max, z-scores)
├─ Risk scoring (likelihood %)
├─ Root-cause inference (signal → failure mode)
└─ Action synthesis (cost & downtime from history + severity)
↓
DRAFT OUTPUT (JSON)
↓
[Verification Engine — CoVe]
├─ Claim-by-claim checks vs sources
├─ Tolerances (likelihood strict, cost/downtime ±20%)
└─ Corrections + verification table
↓
VERIFIED OUTPUT (JSON)
↓
[Output Layer]
├─ Files (draft & verified JSON)
└─ UI tables (interactive views of draft, checks, final)
Where each part ran in this demo
A) Data & Knowledge Layer
Result produced: The source data your analysis and verification steps depend on.
B) Context Layer (Contract + Golden Prompt)
Result produced: A consistent target JSON structure and the rule that all evidence must reference the dataset window/thresholds.
C) Analysis Engine (produces the Draft)
1) Preprocess & Windowing
2) Feature Extraction
3) Risk Scoring (likelihood %)
4) Root-Cause Inference
5) Action Synthesis (+ Cost & Downtime)
6) Evidence Strings
7) Draft Output
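For concreteness, steps 1 through 5 can be sketched roughly as follows for a single machine; the window size, the z-score-to-likelihood mapping, and the history lookup keys are hypothetical simplifications, not the actual demo logic.

# Hypothetical per-machine draft: window the readings, extract features,
# map a z-score to a likelihood, and emit a record in the contract's schema.
from statistics import mean, stdev

def draft_for_machine(machine_id: str, vibration: list, history: dict) -> dict:
    recent = vibration[-72:]  # "last 3 days" assuming hourly readings
    sd = stdev(recent) if len(recent) > 1 else 1.0
    z = (recent[-1] - mean(recent)) / (sd or 1.0)
    likelihood = max(0, min(100, round(50 + 15 * z)))  # toy mapping from z-score to %
    return {
        "machine_id": machine_id,
        "breakdown_likelihood": likelihood,
        "root_cause": history.get("typical_cause", "unknown"),
        "evidence": f"latest vibration z-score {z:.1f} vs 3-day window",
        "recommended_action": history.get("typical_action", "inspect"),
        "estimated_cost_usd": history.get("avg_cost_usd", 0),
        "estimated_downtime_hours": history.get("avg_downtime_h", 0),
    }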
D) Verification Engine — Chain-of-Verification (CoVe)
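The ±20% tolerance on cost and downtime from the architecture above can be checked mechanically; the sketch below is illustrative (the claim structure and history keys are assumptions), and the strict checks on likelihood and root cause would still go through the CoVe prompt.

# Tolerance check sketch: flag cost/downtime estimates outside +/-20% of history.
def within_tolerance(claimed: float, historical: float, pct: float = 0.20) -> bool:
    return historical > 0 and abs(claimed - historical) <= pct * historical

def verify_estimates(draft: dict, history: dict) -> list:
    checks = [
        ("estimated_cost_usd", history.get("avg_cost_usd", 0)),
        ("estimated_downtime_hours", history.get("avg_downtime_h", 0)),
    ]
    return [
        {
            "claim": f"{field} = {draft[field]}",
            "verification_status": "Supported" if within_tolerance(draft[field], ref) else "Speculative",
            "source_reference": "maintenance_history.xlsx",
        }
        for field, ref in checks
    ]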
E) Output Layer (Files + UI)
3) How this maps to the Context Engineering Framework
4) What each artifact “means” in practice
Further readings:
Core prompting & reasoning patterns
Retrieval, memory & orchestration (RAG)
Safety, risk & prompt-injection defenses
Predictive maintenance standards & best practice (IIoT)