
Generative AI Series

Context Engineering (1/2)—Getting the best out of Agentic AI Systems

In this blog, we will explore context engineering, an emerging practice that is becoming a crucial skill for getting agentic AI systems to work well.

9 min read · Jul 7, 2025


Context engineering is emerging as a key skill as we move toward building complex agentic AI applications. Providing the right context is critical, because agentic AI applications increasingly try to solve complex problems.

In this blog, I will share my experiences working with agentic AI applications and why I strongly feel context engineering is critical to getting the right results. It is an especially important skill when using tools like Claude Code or Gemini CLI to generate code.

I have covered my experiences with vibe coding in earlier blogs; please give them a read.

Setting the context for Context Engineering

I sometimes spend hours trying to get my prompt right so that I get the results I want, and in the process, I use up all my free-tier tokens. I'm sure everyone has gone through something like this, and it's very annoying. The most critical input for any LLM is the context. Whenever I get the context right and pass it along with my prompt, I see much better results, and sometimes they have surprised me.

As we start building powerful agentic AI applications based on very smart reasoning models, context becomes even more critical for us to get the right results. Defining clear context is as important as giving clear requirements to a software designer/developer.

Context engineering is the discipline of designing, structuring, and optimizing the contextual information provided to AI systems to achieve desired outcomes. It provides a systematic approach toward building the right context. It’s not just about writing good prompts. It’s about creating a systematic approach to communication that ensures consistent, reliable, and high-quality AI responses every single time.

What is context engineering?

A few weeks back, I was trying to build a complete Spring Boot application (a slightly complex enterprise use case) using Claude Code. Since this is not an off-the-shelf use case, it was becoming very difficult for me to define the exact prompt (with the right in-context learning, RAG, etc.). I kept getting inconsistent results—sometimes brilliant, sometimes complete nonsense.

I stumbled upon various papers and blogs about context engineering, which I had heard about earlier but dismissed as just another fad. I soon realized that we really need a systematic approach to defining the context. I did some reading, practiced a few ideas and methods from the internet, and soon saw significantly better results.

I thought the learning was something I should share with my readers.

The problem wasn’t the LLM; it was how I was communicating with it. I was treating it like a search engine rather than a collaborative partner. Once I implemented proper context engineering principles, my success rate jumped from about 30% to over 90%.

Here is what I learned from various sources after practicing these principles for a couple of weeks.

The goal is to create predictable, repeatable interactions that minimize ambiguity and maximize the AI’s ability to understand and respond appropriately. It’s like developing a shared language with your AI assistant—one that both of you understand perfectly.

When done right, context engineering transforms AI from an unpredictable tool into a reliable partner. Instead of doing trial and error with prompts, we can get consistent, high-quality results.

The following flowchart captures my high-level approach to building context.

Let me walk you through each step:

  1. Raw Requirements: This is where most people start—with a vague idea like “I need the LLM to build an application.” It’s unclear and often incomplete.
  2. Context Design: Here’s where the magic begins. We take that vague requirement and start asking the right questions: What kind of application? Who are the users? What’s the goal? What does “better” even mean?
  3. Context Structure: We organize all that information into a logical framework. Think of it like creating an outline for a presentation—everything has its place and purpose.
  4. PRP (Product Requirement Prompt) Implementation: This phase is where we build the actual Product Requirement Prompt—the reusable template that will guide our AI interactions. In the next blog, I'll give more details and examples of PRPs.
  5. Context Validation: We then need to test our context with real scenarios to make sure it works.
  6. AI Response: The AI generates its output based on our carefully crafted context. Sometimes, I have also run this in parallel with other LLMs to see if the results are consistent. The idea is to make sure the context is so well defined that all similarly strong reasoning models should provide similar results. (I wish!! :-D )
  7. Outcome Evaluation: We honestly assess whether the result meets our needs. No sugar-coating here—if it’s not good enough, we admit it. Sometimes, it's also a good idea to test this with another LLM. (Just a wild idea :-D )
  8. The Critical Decision Point: If the outcome isn’t satisfactory, we don’t start over—we refine our context.
  9. Context Refinement: We analyze what went wrong and make targeted improvements. It’s like debugging code—methodical and purposeful.
  10. Context Deployment: Once we are happy with the results, we have a reliable, repeatable process that delivers consistent results.

The key words are “reliable” and “repeatable.” With model drift happening almost every month, this becomes even more critical.

The process is normally iterative, and each cycle makes the context more robust and reliable. I’ve seen simple contexts become incredibly powerful through just a few iterations of this process.
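The iterative loop in the flowchart can be sketched in code. This is a minimal illustration, not a real implementation: `call_llm`, `evaluate`, and `refine` are hypothetical placeholders you would replace with an actual model call, a scoring rubric (or a second LLM acting as judge), and your own failure analysis.

```python
# Sketch of the iterative context-refinement loop described above.
# All three helper functions are illustrative stand-ins, not a real API.

def call_llm(context: str, prompt: str) -> str:
    """Placeholder for a real LLM call via an SDK of your choice."""
    return f"response for: {prompt}"

def evaluate(response: str) -> float:
    """Placeholder scoring function; in practice, a rubric or a judge LLM."""
    return 1.0 if "response" in response else 0.0

def refine(context: str, response: str, score: float) -> str:
    """Placeholder: analyze what went wrong and tighten the context."""
    return context + "\n# Added constraint based on the last failure."

def run_context_loop(context: str, prompt: str,
                     threshold: float = 0.9, max_iters: int = 5) -> str:
    """Generate, evaluate, and refine until the outcome is good enough."""
    for _ in range(max_iters):
        response = call_llm(context, prompt)
        if evaluate(response) >= threshold:
            break  # context is ready to "deploy"
        context = refine(context, response, evaluate(response))
    return context
```

The point is the shape of the loop: the context, not the prompt, is what gets refined on each failed iteration.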

Layers of Context

When I initially began working with AI, I believed that context merely involved providing detailed prompts. I was wrong. Context in AI systems operates on multiple layers, each serving a specific purpose in the communication stack. The following picture captures the various layers of context and how they align to the various components of the context.

Context Stack

The Left Side (Context Stack) shows the five layers working in sequence, like floors in a building:

System Context Layer: This is the foundational layer that defines the AI’s operational parameters—think of it as the AI’s “persona” and basic operating instructions:

  • Core capabilities and limitations: What the AI can and cannot do
  • Behavioral guidelines: How the AI should act and respond
  • Safety constraints: Boundaries that should never be crossed
  • Processing preferences: How the AI should approach problems

Example: When building customer service AI, the System Context Layer established that it should be helpful, polite, never make promises about refunds without human approval, and always escalate complex issues. This layer doesn’t change much—it’s the AI’s core personality.

Domain Context Layer: This is where we tell the AI what domain skills it needs to acquire for this particular job:

  • Domain-specific knowledge: The specialized information relevant to the field
  • Terminology and jargon: The language experts in this field actually use
  • Industry standards: Best practices and accepted norms
  • Relevant methodologies: How work gets done in this domain

Example: When building a medical coding AI, the Domain Context Layer included medical terminology, coding standards like ICD-10, healthcare regulations, and the specific workflow that medical coders follow. Without this layer, the AI would give generic advice instead of expert-level guidance.

Task Context Layer: This is where we get specific about what we want the AI to accomplish:

  • Task requirements: Exactly what needs to be done
  • Success criteria: How we’ll know if the task was completed well
  • Input/output specifications: What goes in and what should come out
  • Performance expectations: Speed, accuracy, and quality standards

Example: For a legal document review AI, the Task Context Layer specified that it needed to identify potential compliance issues, flag any sections requiring human review, and provide a confidence score for each finding. Without clear task definition, the AI would wander aimlessly.

Interaction Context Layer: This layer manages how the conversation flows between human and AI.

  • Communication style: Formal, casual, technical, or conversational
  • Feedback mechanisms: How the AI should ask for clarification
  • Error handling: What to do when something goes wrong
  • Clarification protocols: How to handle ambiguous requests

Example: A financial advisory AI might have an interaction context layer that specified it should always explain complex financial concepts in simple terms, ask follow-up questions when risk tolerance wasn’t clear, and never make specific investment recommendations without proper disclaimers.

Response Context Layer: This final layer shapes how the AI delivers its output:

  • Structure requirements: How the response should be organized
  • Formatting preferences: Headers, bullet points, tables, etc.
  • Delivery constraints: Length limits, technical requirements, accessibility needs
  • Quality standards: What makes a response “good enough” vs. “excellent”

Example: For a technical documentation AI, the Response Context Layer specified that all responses should start with a brief summary, use numbered steps for procedures, include code examples where relevant, and end with common troubleshooting tips. This consistency made the documentation predictable and user-friendly.

Context Components

The Right Side (Context Components) shows the specific tools and materials we use at each layer:

  • Role Definition: connects to the System Layer—what is the AI’s persona, and how is it supposed to think?
  • Knowledge Base: connects to the Domain Layer—what specialized information is available?
  • Constraints: connect to the Task Layer—what are the rules and limitations?
  • Examples: connect to the Interaction Layer—what does good work look like?
  • Output Format: connects to the Response Layer—how should the final result be presented?

Building these layers of context requires a structured way of capturing them.
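One structured way to capture the five layers is a simple data structure that renders them, in order, into a single prompt preamble. This is a sketch under my own assumptions: the field names mirror the layers above, and all the content strings (the customer-service example) are illustrative, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass
class ContextStack:
    """One field per layer of the context stack, top to bottom."""
    system: str       # persona, capabilities, safety constraints
    domain: str       # terminology, standards, methodologies
    task: str         # requirements, success criteria, I/O specs
    interaction: str  # tone, clarification protocols, error handling
    response: str     # structure, formatting, quality bar

    def render(self) -> str:
        """Compose the layers, in stack order, into one prompt preamble."""
        sections = [
            ("System", self.system),
            ("Domain", self.domain),
            ("Task", self.task),
            ("Interaction", self.interaction),
            ("Response", self.response),
        ]
        return "\n\n".join(f"## {name} Context\n{body}" for name, body in sections)

stack = ContextStack(
    system="You are a polite customer-service assistant. Never promise refunds.",
    domain="Retail returns policy; use plain, non-legal language.",
    task="Classify the ticket and draft a reply; flag anything needing escalation.",
    interaction="Ask one clarifying question when the request is ambiguous.",
    response="Reply with a one-line summary followed by numbered steps.",
)
prompt = stack.render()
```

Capturing the layers as named fields, rather than one free-form blob, is what makes the context reviewable, versionable, and reusable across tasks.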

One of the most powerful applications of context engineering is Product Requirement Prompts (PRPs)—specialized prompts designed to extract, analyze, and structure product requirements from stakeholder conversations, documents, and business needs.

We will be doing a deep dive on PRP and how RAG enhances the approach in Part 2 of the blog. Before we close this part, I want to quickly talk about some of the concepts I stumbled into, which I feel are some very useful design patterns and context engineering techniques.

Advanced Context Engineering Techniques

Context Layering

Context layering involves building context incrementally, allowing for more complex and nuanced interactions. This design pattern is a very structured way to achieve the layered context that we discussed in the previous section.
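A minimal sketch of the layering pattern: start from a base context and append labeled layers only as the task demands them, keeping early interactions lightweight. The layer labels and content here are illustrative, not a prescribed format.

```python
def layer_context(base: str, layers: list[tuple[str, str]]) -> str:
    """Build context incrementally by stacking labeled layers on a base."""
    context = base
    for name, content in layers:
        # Each layer is appended as a labeled section on top of the last.
        context += f"\n\n[{name}]\n{content}"
    return context

ctx = layer_context(
    "You are a code-review assistant.",
    [
        ("Domain", "Java / Spring Boot conventions."),
        ("Task", "Review the diff for thread-safety issues."),
    ],
)
```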

Context Chaining

Context chaining allows for complex multi-step processes where the output of one context becomes the input for another; this way, the context builds up, bringing different perspectives into it. Note that this may end up producing a very large context document, which might affect your context window limits and token consumption costs.
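The chaining pattern can be sketched as a loop that folds each step's output back into the context for the next step. `call_llm` is again a hypothetical placeholder, and the step names are illustrative; notice how the context grows on every step, which is exactly the token-cost caveat above.

```python
# Sketch of context chaining: each step's output becomes part of the
# next step's context. `call_llm` is an illustrative stand-in.

def call_llm(context: str, instruction: str) -> str:
    return f"[{instruction} done, given {len(context)} chars of context]"

def chain(base_context: str, steps: list[str]) -> str:
    """Run steps in sequence, folding each output into the context."""
    context = base_context
    for step in steps:
        output = call_llm(context, step)
        # The growing context is the chaining's power and its cost.
        context += f"\n\n### Output of '{step}'\n{output}"
    return context

final = chain(
    "Domain: Spring Boot enterprise app.",
    ["analyze requirements", "design entities", "generate controllers"],
)
```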

Emerging techniques

Context engineering continues to evolve with advances in AI capabilities. Key areas of development include:

  • Adaptive Context Systems: Contexts that learn and adjust based on performance
  • Multi-modal Context Integration: Combining text, visual, and audio context. This is becoming very critical for a lot of enterprise solutions, where it's not just about generating application code but about scanning through various types of data to solve a complex enterprise problem
  • Context Compression Techniques: Optimizing context size while maintaining effectiveness; this helps reduce token consumption and keeps the context within the context window limits
  • Automated Context Generation: AI-assisted context design and optimization
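As a flavor of the compression idea, here is a deliberately naive sketch: keep the newest sections of a context that fit within a token budget, dropping the oldest first. Real systems typically use summarization or embedding-based selection instead, and the "token" count here is approximated as whitespace-split words.

```python
def compress_context(sections: list[str], budget: int) -> list[str]:
    """Keep the newest sections that fit the budget; drop the oldest."""
    kept: list[str] = []
    used = 0
    for section in reversed(sections):  # walk newest-first
        cost = len(section.split())     # crude stand-in for token counting
        if used + cost > budget:
            break
        kept.append(section)
        used += cost
    return list(reversed(kept))         # restore chronological order
```

Even this crude recency heuristic shows the core trade-off: every compression scheme decides what context is cheap enough to lose.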

Best Practices for Context Engineering

  • Clarity and Precision: Use specific, unambiguous language; avoid jargon unless it's domain-appropriate; and define terms that might be misunderstood
  • Structured Organization: Follow consistent formatting patterns, use hierarchical organization for complex contexts, and separate concerns into distinct sections
  • Validation and Testing: Test contexts with various inputs, validate outputs against expected criteria, and iterate based on performance feedback
  • Scalability Considerations: Design contexts that can handle varying input sizes, consider computational complexity, and plan for context reuse and adaptation
  • Documentation and Maintenance: Document context design decisions, track performance metrics, and maintain version control for context evolution

There you go; it's a very interesting topic with a lot to cover, so I have broken it into two parts. In the next part, we will cover PRPs, which are critical for context engineering; there is so much to say about them that a dedicated blog felt like a good idea.

Hope this was useful; please leave your comments, feedback, and your experiences with prompt engineering, vibe coding, and context engineering.

See you soon in the next part… take care and have fun :-D



Written by A B Vijay Kumar

IBM Fellow, Master Inventor, Agentic AI, GenAI, Hybrid Cloud, Mobile, RPi Full-Stack Programmer
