วิศวกรรมบริบท: คู่มือ LLM อัจฉริยะ | th

Context engineering ถือเป็นก้าวสำคัญในด้านปัญญาประดิษฐ์ (AI) โดยเป็นการเปลี่ยนแปลงจากการใช้ prompts เพียงอย่างเดียวไปสู่การสร้างระบบนิเวศข้อมูลที่ครอบคลุมรอบ ๆ large language models (LLMs) เนื่องจากแอปพลิเคชัน AI มีวิวัฒนาการจาก chatbots ขั้นพื้นฐานไปจนถึง intelligent agents ที่ซับซ้อนซึ่งสามารถดำเนินงานที่ซับซ้อนและหลายขั้นตอนได้ คุณภาพของผลลัพธ์ของแบบจำลองจึงขึ้นอยู่กับข้อมูลที่ให้มามากขึ้น ดังนั้น context engineering จึงมีความสำคัญอย่างยิ่งต่อการสร้างแอปพลิเคชัน AI ที่น่าเชื่อถือและมีประสิทธิภาพซึ่งมอบประสบการณ์การใช้งานที่น่าประทับใจ

การเปลี่ยนแปลงกระบวนทัศน์: จาก Prompts สู่ Systems

จุดสนใจกำลังเปลี่ยนจากการสร้าง prompts แต่ละรายการไปเป็นการสร้างระบบนิเวศข้อมูลที่สมบูรณ์รอบ ๆ large language models (LLMs) อย่างเป็นระบบ เนื่องจากแอปพลิเคชัน AI มีวิวัฒนาการจาก chatbots อย่างง่ายไปจนถึง intelligent agents ที่สามารถดำเนินงานที่ซับซ้อนและหลายขั้นตอนได้ คุณภาพของผลลัพธ์ของแบบจำลองจึงขึ้นอยู่กับคุณภาพของข้อมูลที่ให้มามากขึ้น ผู้นำในอุตสาหกรรมและนักวิจัยด้าน AI ตระหนักถึงความสำคัญของการเปลี่ยนแปลงนี้ โดยเน้นย้ำถึงความจำเป็นในการให้ LLMs ได้รับบริบทที่ครอบคลุมเพื่อแก้ไขงานอย่างมีประสิทธิภาพ Context engineering เกี่ยวข้องกับศิลปะและวิทยาศาสตร์ในการเติม context window ด้วยข้อมูลที่ถูกต้อง ทำให้แบบจำลองสามารถตัดสินใจได้อย่างแม่นยำ

ข้อโต้แย้งหลักคือความล้มเหลวของ intelligent agents ส่วนใหญ่เกิดจากการขาดบริบทมากกว่าความล้มเหลวของแบบจำลอง การยืนยันนี้กำหนดความท้าทายหลักของ AI engineering ใหม่ โดยเปลี่ยนความสนใจจากการปรับแต่งแบบจำลองไปเป็นการพัฒนาระบบสนับสนุนข้อมูล การทำความเข้าใจและเชี่ยวชาญ context engineering ได้กลายเป็นข้อกำหนดเบื้องต้นสำหรับการสร้างแอปพลิเคชัน AI ที่น่าเชื่อถือและ robust

การกำหนด Context Engineering

Context engineering ไม่ใช่แค่ prompt engineering เวอร์ชันที่ปรับปรุงแล้วเท่านั้น แต่เป็นสาขาวิศวกรรมระดับระบบที่ไม่เหมือนใครซึ่งมุ่งเน้นไปที่การสร้างระบบการส่งมอบข้อมูลแบบไดนามิก แทนที่จะเป็นเพียงการเพิ่มประสิทธิภาพอินพุตข้อความ

Context engineering สามารถกำหนดได้ว่าเป็นสาขาวิศวกรรมที่มุ่งเน้นไปที่การออกแบบและสร้างระบบไดนามิกที่ให้ข้อมูลและเครื่องมือที่ LLMs ต้องการเพื่อให้งานเสร็จสมบูรณ์อย่างถูกต้อง ในรูปแบบที่ถูกต้อง และในเวลาที่เหมาะสม

องค์ประกอบสำคัญ:

“การออกแบบและสร้างระบบไดนามิก”: เน้นว่า context engineering เป็นกิจกรรมทางวิศวกรรม โดยมุ่งเน้นไปที่สถาปัตยกรรมระบบมากกว่าแค่การใช้คำ Context คือผลลัพธ์ของระบบที่ทำงานก่อนการเรียก LLM หลัก วิศวกรจำเป็นต้องสร้าง data pipelines, memory modules และ information retrieval mechanisms เพื่อเตรียม working memory ของ LLM
“ข้อมูลและเครื่องมือที่ถูกต้อง”: ครอบคลุมข้อเท็จจริง ข้อมูล เนื้อหา knowledge base (ผ่าน RAG) และความต้องการของผู้ใช้ เครื่องมือหมายถึงความสามารถเช่น API interfaces ฟังก์ชัน หรือ database queries การให้ทั้งความรู้และความสามารถเป็นพื้นฐานสำหรับงานที่ซับซ้อน
“รูปแบบที่ถูกต้อง ในเวลาที่เหมาะสม”: เน้นถึงความสำคัญของการนำเสนอข้อมูลและ timing บทสรุปที่กระชับมักจะดีกว่า raw data และ tool schema ที่ชัดเจนมีประสิทธิภาพมากกว่าคำแนะนำที่คลุมเครือ การให้บริบทตามความต้องการเป็นสิ่งสำคัญเพื่อหลีกเลี่ยงการรบกวนแบบจำลองด้วยข้อมูลที่ไม่เกี่ยวข้อง
“ทำงานให้เสร็จสมบูรณ์ได้อย่างน่าเชื่อถือ”: นี่คือเป้าหมายสูงสุดของ context engineering โดยจะเปลี่ยนแอปพลิเคชัน AI ให้เป็นระบบที่น่าเชื่อถือซึ่งสามารถสร้างเอาต์พุตคุณภาพสูงได้อย่างสม่ำเสมอ ด้วยการจัดการบริบทที่แม่นยำ เอาต์พุตจะสอดคล้องกันมากขึ้น ลด hallucinations และสนับสนุน intelligent agent workflows ที่ซับซ้อนและยาวนาน

วิวัฒนาการจาก Prompt Engineering สู่ Context Engineering

แม้ว่าทั้ง context engineering และ prompt engineering จะมุ่งเป้าไปที่การเพิ่มประสิทธิภาพเอาต์พุต LLM แต่ก็มีความแตกต่างกันในด้านขอบเขต ลักษณะ และเป้าหมาย การเปรียบเทียบระดับระบบเน้นย้ำถึงความแตกต่างเหล่านี้:

ขอบเขต: Prompt engineering มุ่งเน้นไปที่การเพิ่มประสิทธิภาพ single interactions หรือ text strings ในขณะที่ context engineering มุ่งเน้นไปที่ระบบนิเวศข้อมูลทั้งหมด ครอบคลุมวงจรชีวิตของงานทั้งหมด
ความไดนามิก: Prompts มักจะเป็นแบบ static ในขณะที่ context ถูกสร้างขึ้นแบบไดนามิกตามงานและมีวิวัฒนาการระหว่างการโต้ตอบ
Input Composition: Prompt engineers สร้างอินพุตรอบ ๆ user queries ในขณะที่ context engineers มองว่า user queries เป็นเพียงส่วนหนึ่งของ “context package” ที่ใหญ่กว่า ซึ่งรวมถึง system instructions, retrieved documents, tool outputs และ conversation history
การเปรียบเทียบ: ถ้า prompts เป็นเหมือน single line ในละคร context คือ set เรื่องราวเบื้องหลัง และบทภาพยนตร์ทั้งหมดของภาพยนตร์ ซึ่งรวมกันแล้วให้ความลึกซึ้งและความหมาย

ตารางด้านล่างเปรียบเทียบทั้งสองเพิ่มเติม:

Prompt Engineering vs. Context Engineering

มิติ	Prompt Engineering	Context Engineering
ขอบเขต	Single interaction, single input string	Entire intelligent agent workflow, full information ecosystem
ลักษณะ	Static หรือ semi-static, template-based	Dynamic, assembled in real-time, evolves with the task
เป้าหมาย	Guide the LLM to give a high-quality answer	Empower the LLM to reliably complete complex tasks continuously
ผลิตภัณฑ์หลัก	Optimized prompt templates, instruction sets	Data pipelines, RAG systems, memory modules, state managers
ทักษะหลัก	Linguistics, logical reasoning, instruction design	System architecture, data engineering, software development
การเปรียบเทียบหลัก	Asking a precise question	Building a comprehensive library for a researcher

การกำหนด AI Engineering ใหม่

การเปลี่ยนแปลงจาก prompt engineering ไปสู่ context engineering นี้ปรับเปลี่ยนบทบาทของ AI engineers Prompt engineering มุ่งเน้นไปที่การปรับปรุง input strings ซึ่งต้องใช้ทักษะด้านภาษาศาสตร์และตรรกะ อย่างไรก็ตาม เมื่อภารกิจคือการสร้างระบบที่ประกอบ inputs เหล่านี้จาก databases, APIs และ memory แบบไดนามิก ทักษะหลักจะเปลี่ยนเป็นการพัฒนาซอฟต์แวร์และสถาปัตยกรรมระบบ

Frameworks เช่น LangChain และ LlamaIndex เป็นที่นิยมเนื่องจากสนับสนุน context engineering โดยนำเสนอ architectural patterns สำหรับการสร้าง dynamic context assembly systems เช่น Chains, Graphs และ Agents

การเพิ่มขึ้นของ context engineering แสดงถึงการเปลี่ยนแปลงในการพัฒนา AI จาก a model-centric, niche field ไปสู่ mainstream software engineering discipline ความท้าทายหลักไม่ใช่แค่ตัวแบบจำลองเอง แต่เป็น application stack ทั้งหมดที่สร้างขึ้นรอบ ๆ

Context: การวิเคราะห์และหลักการ

ส่วนนี้ให้รายละเอียดเกี่ยวกับองค์ประกอบของ “context” และสรุปหลักการสำหรับการจัดการที่มีประสิทธิภาพ

การแยกส่วน Context Window

Context window is the total information the model can “see” or “remember” when generating a response A complete “context package” is the sum of all information provided

Instructions/System Prompt: This base layer defines the model’s behavior, setting its role, style, rules, constraints, and objectives
User Prompt: The direct question or task instruction that triggers the intelligent agent
Conversation History/Short-Term Memory: Previous exchanges provide direct context, managed through pruning or summarization due to context window limitations
Long-Term Memory: A persistent knowledge base that records information learned from interactions, such as user preferences, project summaries, or facts explicitly told to remember
Retrieved Information/RAG: To overcome knowledge cutoff and ensure fact-based responses, the system dynamically retrieves relevant information from external knowledge sources
Available Tools: Defines the schemas and descriptions of callable functions or built-in tools, giving the model the power to act, not just know
Tool Outputs: Results from tool calls must be re-injected into the context for the model to use in subsequent reasoning and actions
Structured Output Schema: Defines the expected output format (like JSON Schema) to guide structured, predictable results

The “LLM as an Operating System” Framework

This analogy provides a solid theoretical framework for understanding and practicing context management

LLM as CPU, Context Window as RAM: This analogy positions the context window as a limited and valuable resource Context engineering is like OS management, efficiently loading the right information at the right time into working memory
Kernel Context vs User Context: This framework divides context into two layers; similar to kernel space and user space
- Kernel Context: Represents the managed, variable, persistent state of the intelligent agent It includes core memory blocks and file systems that the LLM can observe, but only modify through controlled “system calls”
- User Context: Represents the “user space” or message buffer, where dynamic interactions occur It includes user messages, assistant responses, and calls to non-privileged “user program” tools
System Calls and Custom Tools: This distinction clarifies how the agent interacts with its internal state and the external world System calls modify the kernel context, altering the agent’s persistent state, while custom tools bring external information into the user context

Guiding Principles of Context Engineering

Effective context engineering follows core principles, derived from practitioners, to build reliable intelligent agent systems

Continuous and Comprehensive Context: Also known as “See Everything,” this principle requires that the agent has access to its full operational history, including previous user interactions, tool call outputs, internal thinking processes, and intermediate results
Avoid Uncoordinated Parallelism: Allowing multiple sub-agents or sub-tasks to work in parallel without a shared, continuously updated context almost inevitably leads to output inconsistencies, conflicting goals, and failures
Dynamic and Evolving Context: Context should not be a static information block It must be assembled and evolved dynamically based on task progress, acquiring or updating information at runtime
Full Contextual Coverage: The model must be provided with all the information it might need, not just the latest user question The entire input package (instructions, data, history, etc ) must be carefully designed

Context Management Strategies:

Writing: Persisting Context:

This involves storing information beyond the immediate context window for future use, building the agent’s memory capabilities

Scratchpads: Used for storing short-term memory within the session
Memory Systems: Used for building long-term memory across sessions

Selecting: Retrieving Context:

This involves pulling the right information from external storage into the context window at the right time

Selecting from Memory/Scratchpads: The agent must be able to effectively query its persisted memory and scratchpads when it needs to recall past knowledge
Selecting from Tools: When the agent has many available tools, it is efficient to apply RAG techniques to the tool descriptions themselves, dynamically retrieving and providing only the most relevant tools based on the current task
Selecting from Knowledge: This is the core function of Retrieval-Augmented Generation (RAG), dynamically acquiring factual information from external knowledge bases to enhance the model’s answering capabilities

Compressing: Optimizing Context:

This involves reducing the number of tokens used in the context while retaining core information

Summarization: Using the LLM to summarize lengthy conversation histories, documents, or tool outputs, extracting key information
Trimming: Using heuristic rules to cut back the context, such as simply removing the earliest dialogue rounds when the conversation history is too long

Isolating: Partitioning Context:

This involves decomposing the context into different parts to improve the model’s focus and manage task complexity

Multi-agent Systems: Large tasks can be split among multiple sub-agents, each with its own dedicated, isolated context, tools, and instructions
Sandboxed Environments: Operations that consume a large number of tokens can be run in an isolated environment, returning only the final key results to the main LLM’s context

Advanced Memory Architectures

Memory is key to building intelligent agents that can learn and adapt Key components include short-term memory through dialogue history buffers and scratchpads, and long-term memory for persistence and personalization

Implementation Techniques:
- Automated Memory Generation: The system can automatically generate and store memories based on user interactions
- Reflection Mechanisms: The agent can self-reflect on its behavior and results after completing tasks, synthesizing learned lessons into new memories
- Dialogue Summarization: Regularly summarize past conversations and store the summaries as part of long-term memory
Structured Memory (Temporal Knowledge Graphs): A more advanced memory architecture that stores not just facts but relationships between facts and timestamps for each piece of information

Retrieval-Augmented Generation (RAG): The Cornerstone of Context Engineering

RAG is a core technique for “selecting” external knowledge in context engineering, connecting LLMs to external knowledge bases A typical RAG system has three stages:

Indexing: Documents are split into semantic chunks, then converted into high-dimensional vectors using an embedding model These vectors and source texts are stored in the vector database
Retrieval: The user converts a query to a vector with the same embedding model and searches the vector database for other close vectors with similar queries
Generation: The system combines the original query and the related text chunks into a prompt, then submits it to the LLM to generate a suitable answer

Advanced Retrieval and Ranking Strategies

The basic RAG architecture often needs more complex strategies to improve retrieval quality in the real world Combining semantic search with keyword indexes and ranking is crucial for improving search quality Anthropic’s contextual information retrieval will improve the context of LLMs

Hybrid Search: Combines semantic search (based on vectors) and keyword search to leverage complementary strengths
Contextual Retrieval: Uses an LLM to generate a short summary of the context of each text block
Re-ranking: Adds a re-ranking step, using a stronger model to re-sort the results based on relevance

RAG vs Fine-tuning: A Strategic Decision Framework

Choosing between RAG and fine-tuning is a key decision The choice depends on the requirements of the project

Advantages of RAG:
- Suitable for integration of real-time knowledge
- Reduces hallucinations by providing verifiable facts
- Allows enterprises to keep proprietary data within secure internal databases
Advantages of Fine-tuning:
- Best for teaching a model a new behavior, speech style, or specialized terminology
- Can align the model’s output with the organization’s brand image
Hybrid Approaches: In order to get the best performance with models, you should use both fine-tuning for performance and RAG for accuracy

Context Optimization and Filtering

Even by using powerful retrieval mechanisms, managing the context window and avoiding common failures, you will still run into errors

Common failure modes:

Context Poisoning: when a seemingly factual error is presented, it will corrupt the entire system from that point forward
Context distraction: Models get distracted when presented with irrelevant information
Context confusion: Context information can be overwhelming with the model leading it away from the correct answer
Context Clash: Models get confused with conflicting information and may produce a contradictory answer

Solutions:

Engineers need to adopt filtering techniques to mitigate these failures Ensuring the model’s working memory is full of highly relevant and completely optimized information becomes essential for practice and theory

Context Engineering in Practice: Case Studies

Analyzing different applications provides a deeper understanding of the value and implementation of context engineering

AI Programming Assistants

The Problem: Early attempts at AI programming were often chaotic, relying on vague prompts with little understanding of the larger codebase
The Solution: Treat the project documentation, code guidelines, design patterns, and requirements like any engineering resource

Enterprise Search and Knowledge Management

The Problem: Traditional enterprise search engines rely on keyword matching, failing to understand user intent, job role, or the reason for their search
The Solution: Build intelligent search systems using context to understand each search

Automated Customer Support

The Problem: General LLMs are unaware of product specifics, return policies, or customer history, leading to inaccurate or unhelpful responses
The Solution: Use RAG-based chatbots, systems that retrieve information from the company’s knowledge base, to ensure accurate, personalized, and up-to-date assistance

Personalized Recommendation Engines

The Problem: Traditional recommendation systems struggle to grasp the immediate, specific intent of users, resulting in generic recommendations
The Solution: Context engineering uses RAG to make the experience more conversational

Mitigating Fundamental Flaws of Large Language Models

Context engineering is a key means of addressing two fundamental LLM shortcomings: hallucinations and knowledge cutoff

Countering Hallucinations

The Problem: When LLMs are uncertain or lack relevant knowledge, they tend to fabricate plausible but untrue information
The Solution: Context Engineering, especially RAG, are the most effective strategies
- Provide Factual Basis: By providing verifiable documents from a trusted source during answering, hallucinations can be avoided effectively
- Honesty “I don’t know.”: In order to be transparent, indicate to models to show “I Dont Know” when no information is available

อัปเดตเมื่อ 2025-07-09

# AI # LLM # RAG