## What Is Context Engineering?
Context engineering is the discipline of deliberately designing and managing what goes into an LLM's context window at runtime. It goes beyond writing a good system prompt — it is about deciding what information the model sees, in what form, and in what order, for every single request.
A model's output quality is directly bounded by its input quality. Context engineering is how you control that input.
```mermaid
flowchart LR
    CE["Context Engineering"]
    CE --> SP["What to put in\nthe system prompt"]
    CE --> HS["How much history\nto include"]
    CE --> RD["Which external data\nto retrieve & inject"]
    CE --> TR["Which tool results\nto surface"]
    CE --> FMT["How to format\nand order it all"]
```
## Why It Matters
LLMs have no persistent memory. Every call is a fresh start — the model knows only what you put in front of it. Poor context leads to:
- Hallucinations caused by missing facts
- Ignored instructions buried under irrelevant text
- Wasted tokens on content that doesn't help the model answer
- Inconsistent behavior across turns
Context engineering is how you prevent all of these systematically rather than patching each symptom individually.
## The Context Budget
Every model has a context window — a hard token limit for a single call. Everything inside that window costs tokens: system prompt, history, retrieved documents, tool outputs, and the user message.
```mermaid
pie title Context Budget Allocation (example)
    "System prompt" : 10
    "Conversation history" : 25
    "Injected / retrieved data" : 40
    "Tool results" : 15
    "Current user message" : 10
```
Context engineering means treating tokens as a budget and deciding consciously how to spend them. The key principle: only include content that changes what the model outputs.
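The budgeting idea can be sketched as a simple allocator. This is a minimal sketch: the 4-characters-per-token estimate is a rough heuristic (real tokenizers give exact counts), and the `ContextPiece` shape and priority scheme are illustrative, not a standard API.

```typescript
// Rough token estimate: ~4 characters per token. A real system would use
// the model's actual tokenizer for exact counts.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// A candidate piece of context, with a priority (lower = more important).
interface ContextPiece {
  name: string;
  text: string;
  priority: number;
}

// Keep the highest-priority pieces that fit inside the token budget;
// anything that would overflow the budget is simply dropped.
function fitToBudget(pieces: ContextPiece[], budget: number): ContextPiece[] {
  const kept: ContextPiece[] = [];
  let used = 0;
  for (const piece of [...pieces].sort((a, b) => a.priority - b.priority)) {
    const cost = estimateTokens(piece.text);
    if (used + cost <= budget) {
      kept.push(piece);
      used += cost;
    }
  }
  return kept;
}
```

With a 70-token budget, a low-priority piece that would overflow the budget gets dropped while higher-priority pieces survive — exactly the "spend tokens consciously" principle above.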
## Core Techniques
### Retrieval-Augmented Generation (RAG)

Instead of relying on training knowledge alone, retrieve relevant documents at query time and inject them into the context. The model gets exactly the facts it needs for this specific question.
```mermaid
flowchart LR
    Q["User query"] --> R["Retrieve relevant\ndocuments"]
    R --> CTX["Inject into context"]
    CTX --> LLM["LLM generates\ngrounded answer"]
```
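The retrieve-then-inject step can be sketched as follows. This is a toy keyword-overlap retriever — production systems typically rank by embedding similarity — and the `Doc` shape and prompt wording are illustrative:

```typescript
interface Doc {
  id: string;
  text: string;
}

// Score each document by how many query words it shares, keep the top K.
// A real retriever would use vector similarity instead of word overlap.
function retrieve(query: string, docs: Doc[], topK: number): Doc[] {
  const queryWords = new Set(
    query.toLowerCase().split(/\W+/).filter(Boolean),
  );
  return docs
    .map((doc) => ({
      doc,
      score: doc.text
        .toLowerCase()
        .split(/\W+/)
        .filter((w) => queryWords.has(w)).length,
    }))
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((s) => s.doc);
}

// Place the retrieved documents above the question so the answer is grounded.
function buildRagPrompt(query: string, docs: Doc[]): string {
  const context = docs.map((d) => `[${d.id}] ${d.text}`).join("\n");
  return `Answer using only the documents below.\n\n${context}\n\nQuestion: ${query}`;
}
```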
### History summarization

Rather than replaying every past turn verbatim, compress older turns into a short summary. Only recent turns stay in full.
```mermaid
flowchart LR
    OLD["Old turns (verbatim)"] --> SUM["Summarize → compact paragraph"]
    RECENT["Recent turns (verbatim)"] --> CTX["Context"]
    SUM --> CTX
```
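A minimal sketch of the compression step. In practice the summary would be written by an LLM call; here it is faked by truncating each old turn, which is enough to show the shape of the technique:

```typescript
interface Turn {
  role: "user" | "assistant";
  text: string;
}

// Keep the last `keepRecent` turns verbatim and collapse everything older
// into a single summary turn. The truncation below is a stand-in for a
// real LLM-generated summary.
function compressHistory(turns: Turn[], keepRecent: number): Turn[] {
  if (turns.length <= keepRecent) return turns;
  const old = turns.slice(0, turns.length - keepRecent);
  const recent = turns.slice(turns.length - keepRecent);
  const summary = old
    .map((t) => `${t.role}: ${t.text.slice(0, 40)}`)
    .join("; ");
  return [
    { role: "assistant", text: `Summary of earlier turns: ${summary}` },
    ...recent,
  ];
}
```

Ten verbatim turns become one summary turn plus the recent tail, which keeps the history's token cost roughly constant as the conversation grows.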
### Structured system prompt

Organize the system prompt into clear sections (role, rules, output format, examples) so the model can parse it predictably. Avoid walls of prose.
```markdown
## Role
You are a senior TypeScript developer assistant.

## Rules
- Answer only about TypeScript and Node.js.
- Always include type annotations in code examples.

## Output format
Respond in plain text. Use fenced code blocks for all code.
```
### Dynamic injection

Build the context programmatically for each request: inject user profile, feature flags, current date, or live API results only when they are relevant to the query.
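One way to structure this is a list of injectors, each pairing a relevance check with a renderer; the context is whatever the relevant injectors produce. The `Injector` shape, the billing regex, and the account string are all illustrative:

```typescript
// Each injector decides whether it applies to this query and, if so,
// what text to contribute.
interface Injector {
  relevant: (query: string) => boolean;
  render: () => string;
}

// Assemble the context from only the injectors relevant to this request.
function buildContext(query: string, injectors: Injector[]): string {
  return injectors
    .filter((i) => i.relevant(query))
    .map((i) => i.render())
    .join("\n");
}

// Illustrative injectors: the date is always included; the (hypothetical)
// account summary is injected only for billing-related questions.
const injectors: Injector[] = [
  {
    relevant: () => true,
    render: () => `Date: ${new Date().toISOString().slice(0, 10)}`,
  },
  {
    relevant: (q) => /billing|invoice|refund/i.test(q),
    render: () => "Account: pro plan, billed monthly",
  },
];
```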
### Context pruning

Actively remove stale, redundant, or off-topic content before each call. A smaller, tighter context outperforms a large, diluted one.
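A minimal pruning pass might deduplicate pieces and drop anything a relevance check rejects. The relevance predicate is a stand-in — in practice it could be a keyword filter, a recency cutoff, or a similarity score:

```typescript
// Remove exact duplicates (case-insensitive) and off-topic pieces before
// the context is assembled. `isRelevant` is a caller-supplied predicate.
function prune(
  pieces: string[],
  isRelevant: (piece: string) => boolean,
): string[] {
  const seen = new Set<string>();
  const kept: string[] = [];
  for (const piece of pieces) {
    const key = piece.trim().toLowerCase();
    if (seen.has(key)) continue; // redundant: already included
    if (!isRelevant(piece)) continue; // off-topic: drop it
    seen.add(key);
    kept.push(piece);
  }
  return kept;
}
```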
## Context Engineering vs Prompt Engineering
```mermaid
flowchart TD
    PE["Prompt Engineering\nCrafting the wording of a single instruction"]
    CE["Context Engineering\nDesigning the full information environment\nfor every model call"]
    PE -->|"is one part of"| CE
```
Prompt engineering is about how you phrase something. Context engineering is about what the model can see at all — which makes it the broader, more impactful discipline for production systems.
## A Practical Example
Imagine a customer support bot. A naive implementation passes the full chat history and a generic system prompt. A context-engineered implementation does this instead:
```mermaid
flowchart TD
    UM["User message"] --> INT["Classify intent"]
    INT --> RET["Retrieve relevant\nFAQ / policy docs"]
    INT --> PROF["Fetch user account\nsummary"]
    RET --> BUILD["Assemble context"]
    PROF --> BUILD
    SYS["Focused system prompt\n(role + rules only)"] --> BUILD
    HIST["Last 3 turns\n(not full history)"] --> BUILD
    BUILD --> LLM["LLM call"]
    LLM --> ANS["Accurate, grounded answer"]
```
Each piece of context is chosen for this specific query. Nothing is included by default.
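The pipeline above can be sketched end to end. This is a toy version: `classifyIntent` is a regex stand-in for a real classifier, and `retrieveDocs` and `fetchAccountSummary` are hypothetical stand-ins for the FAQ store and account API:

```typescript
type Intent = "billing" | "shipping" | "other";

// Stand-in classifier; a real system would use a model or trained classifier.
function classifyIntent(message: string): Intent {
  if (/invoice|refund|charge/i.test(message)) return "billing";
  if (/deliver|shipping|track/i.test(message)) return "shipping";
  return "other";
}

// Hypothetical FAQ lookup keyed by intent.
function retrieveDocs(intent: Intent): string[] {
  const faq: Record<Intent, string[]> = {
    billing: ["Refunds are issued within 5 business days."],
    shipping: ["Orders ship within 24 hours."],
    other: [],
  };
  return faq[intent];
}

// Hypothetical account API.
function fetchAccountSummary(): string {
  return "plan=pro, member since 2022";
}

// Assemble only what this query needs: focused prompt, relevant docs,
// account summary, and just the last 3 turns of history.
function assembleContext(message: string, history: string[]): string {
  const intent = classifyIntent(message);
  const docs = retrieveDocs(intent);
  const profile = fetchAccountSummary();
  const recent = history.slice(-3);
  return [
    "You are a support agent. Answer only from the documents provided.",
    `Relevant docs:\n${docs.join("\n")}`,
    `Account: ${profile}`,
    `Recent turns:\n${recent.join("\n")}`,
    `User: ${message}`,
  ].join("\n\n");
}
```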
## Common Pitfalls
**Injecting everything by default.** More context is not always better. Irrelevant content dilutes the useful signal and increases cost.
**Never trimming history.** Unbounded history eventually consumes the entire budget, leaving no room for retrieved data.
**Burying key instructions in the middle.** LLMs tend to underweight instructions in the middle of a long context. Put critical rules at the start or end of the system prompt.
**Unsanitized user input in the system prompt.** Placing raw user text in a privileged position opens prompt-injection attacks.
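A common mitigation is to keep user text out of the privileged system role entirely and pass it as a separate user-role message. The message shape below follows the common chat-API convention but is illustrative:

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Never splice raw user text into the system prompt; keep it in its own
// user-role message so instructions embedded in it carry less authority.
function buildMessages(
  systemPrompt: string,
  userInput: string,
): ChatMessage[] {
  return [
    { role: "system", content: systemPrompt },
    { role: "user", content: userInput },
  ];
}
```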
**Static context for a dynamic world.** Hardcoding documents or facts into the system prompt means the model works with stale information. Retrieve and inject dynamically instead.
