Glossary
This glossary normalizes vocabulary across agent frameworks, serving runtimes, vector databases, training libraries, and LLMOps tools.
| Term | Meaning |
|---|---|
| Agent | A runtime entity that chooses actions, calls tools, manages context, and produces outputs. |
| Agent loop | A repeated cycle of observe, plan, act, inspect result, and continue or stop. |
| Workflow graph | A structured flow of nodes and edges, often more deterministic than an agent loop. |
| Tool | A callable capability exposed to a model or agent, usually with a schema and side effects. |
| Handoff | Transfer of control from one agent or role to another. |
| Guardrail | A policy or validation layer that blocks, routes, or modifies unsafe or invalid behavior. |
| Memory | Persistent or session-level state used by an agent or application. |
| RAG | Retrieval-augmented generation: retrieve external context before or during generation. |
| Embedding | A vector representation of text, image, or other data used for similarity search. |
| Chunk | A document segment indexed for retrieval. |
| Vector database | Storage and query system for embeddings plus metadata. |
| Hybrid search | Retrieval that combines vector similarity with lexical or structured filtering. |
| Payload / metadata | Non-vector fields used for filtering, tenancy, access control, or ranking. |
| Inference runtime | Software that loads a model and executes generation or prediction. |
| Token streaming | Incremental delivery of generated tokens to the caller. |
| KV cache | Attention keys and values cached from previously processed tokens, reused to avoid recomputation during autoregressive decoding. |
| Quantization | Reducing the numeric precision of model weights or activations to save memory or improve speed. |
| Adapter | A small trainable module attached to a base model for task/domain adaptation. |
| PEFT | Parameter-efficient fine-tuning, including LoRA-style adapter methods. |
| ZeRO | Zero Redundancy Optimizer: a DeepSpeed optimization family that partitions optimizer states, gradients, and parameters across data-parallel workers. |
| Checkpoint | Saved model/training state used for recovery or deployment. |
| Trace | Structured record of an AI interaction, including spans for model calls, tools, retrieval, and scores. |
| Span | One timed operation inside a trace. |
| Evaluation dataset | A set of examples used to measure quality or regressions. |
| Feedback function | A scoring function that measures properties such as groundedness or relevance. |
| Lineage | Provenance chain connecting dataset, prompt, model, adapter, retrieval config, run, and deployment. |
| MCP | Model Context Protocol, a standard interface for connecting models/agents to tools and resources. |
| Gateway | A layer that routes requests to providers, tools, models, or policies. |
| Production readiness | Evidence that a system is safe, observable, governable, scalable, and recoverable enough to operate. |
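
The agent loop, tool, and guardrail rows above can be illustrated with a small sketch. It is a toy under stated assumptions, not any framework's API: `Tool`, `guardrail`, `scripted_policy`, and `run_agent_loop` are hypothetical names, and the policy is a scripted stand-in for a model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tool:
    """A callable capability with a name and a minimal argument schema (hypothetical)."""
    name: str
    schema: dict  # argument name -> expected Python type
    fn: Callable[..., str]


def guardrail(tool: Tool, args: dict) -> bool:
    """Validation layer: accept the call only if the args match the tool schema."""
    return set(args) == set(tool.schema) and all(
        isinstance(args[key], expected) for key, expected in tool.schema.items()
    )


def scripted_policy(observations: list) -> Optional[tuple]:
    """Stand-in for a model: choose the next action, or return None to stop."""
    if not observations:
        return ("add", {"a": 2, "b": 3})
    return None


def run_agent_loop(policy, tools: dict, max_steps: int = 5) -> list:
    """Observe, plan, act, inspect the result, then continue or stop."""
    observations = []
    for _ in range(max_steps):
        action = policy(observations)  # observe and plan
        if action is None:  # the policy decided to stop
            break
        name, args = action
        tool = tools[name]
        if not guardrail(tool, args):  # block invalid calls before they run
            observations.append(f"blocked: {name}")
            continue
        observations.append(tool.fn(**args))  # act and record the result
    return observations


tools = {"add": Tool("add", {"a": int, "b": int}, lambda a, b: str(a + b))}
result = run_agent_loop(scripted_policy, tools)
print(result)
```

A workflow graph would replace `scripted_policy` with fixed edges between steps; the loop here leaves the choice of the next action to the policy on every iteration.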
Vocabulary By Layer
mindmap
  root((Vocabulary))
    Agent layer
      Agent loop
      Tool
      Handoff
      Guardrail
      Memory
    Serving layer
      Runtime
      Tokenizer
      Streaming
      KV cache
      Quantization
    Data layer
      RAG
      Embedding
      Chunk
      Metadata
      Hybrid search
    Training layer
      Adapter
      PEFT
      ZeRO
      Checkpoint
    LLMOps layer
      Trace
      Span
      Score
      Dataset
      Lineage
    Platform layer
      MCP
      Gateway
      Policy
      Audit log
Terms That Are Often Confused
| Pair | Distinction |
|---|---|
| Agent vs workflow | Agents choose actions dynamically; workflows encode more explicit control flow. |
| RAG vs fine-tuning | RAG changes context at runtime; fine-tuning changes model behavior through training. |
| Trace vs log | A trace preserves structured causality across spans; a log is usually a flat stream of independent events. |
| Evaluation vs monitoring | Evaluation measures quality against examples or criteria; monitoring watches runtime health and drift. |
| Adapter vs checkpoint | An adapter is a small learned module; a checkpoint may contain full model/training state. |
| Vector search vs hybrid search | Vector search uses embedding similarity; hybrid search mixes vector, lexical, and structured signals. |
| Tool server vs gateway | A tool server exposes actions/resources; a gateway routes and governs access to models/tools/providers. |
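
The vector search vs hybrid search distinction can be made concrete with a toy example. This is a sketch under loud assumptions: the two-dimensional "embeddings" are hand-written, `lexical_overlap` is a simplistic stand-in for a real lexical scorer such as BM25, and the `tenant` field and `alpha` weight are hypothetical illustrations of payload filtering and score fusion.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lexical_overlap(query, text):
    """Fraction of query terms that appear in the text (toy lexical signal)."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms) if query_terms else 0.0


# Toy chunks: hand-written vectors plus payload metadata (all values are made up).
chunks = [
    {"id": "a", "text": "kv cache speeds decoding", "vec": [1.0, 0.0], "tenant": "t1"},
    {"id": "b", "text": "quantization saves memory", "vec": [0.9, 0.1], "tenant": "t2"},
    {"id": "c", "text": "quantization and kv cache", "vec": [0.0, 1.0], "tenant": "t1"},
]


def vector_search(query_vec, items):
    """Rank by embedding similarity only."""
    return sorted(items, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)


def hybrid_search(query, query_vec, items, tenant, alpha=0.5):
    """Filter on metadata, then rank by a weighted mix of vector and lexical scores."""
    allowed = [c for c in items if c["tenant"] == tenant]  # structured filter

    def score(c):
        return alpha * cosine(query_vec, c["vec"]) + (1 - alpha) * lexical_overlap(query, c["text"])

    return sorted(allowed, key=score, reverse=True)


query, query_vec = "quantization kv cache", [0.9, 0.1]
vector_ids = [c["id"] for c in vector_search(query_vec, chunks)]
hybrid_ids = [c["id"] for c in hybrid_search(query, query_vec, chunks, tenant="t1")]
print(vector_ids, hybrid_ids)
```

Pure vector search ranks chunk `b` first because its vector is closest to the query, while the hybrid variant never returns it: the tenancy filter removes it before scoring, and the remaining chunks are ordered by the combined signal.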