AI Solution Architecture

Glossary

This glossary normalizes vocabulary across agent frameworks, serving runtimes, vector databases, training libraries, and LLMOps tools.

| Term | Meaning |
| --- | --- |
| Agent | A runtime entity that chooses actions, calls tools, manages context, and produces outputs. |
| Agent loop | A repeated cycle of observe, plan, act, inspect result, and continue or stop. |
| Workflow graph | A structured flow of nodes and edges, often more deterministic than an agent loop. |
| Tool | A callable capability exposed to a model or agent, usually with a schema and side effects. |
| Handoff | Transfer of control from one agent or role to another. |
| Guardrail | A policy or validation layer that blocks, routes, or modifies unsafe or invalid behavior. |
| Memory | Persistent or session-level state used by an agent or application. |
| RAG | Retrieval-augmented generation: retrieve external context before or during generation. |
| Embedding | A vector representation of text, image, or other data used for similarity search. |
| Chunk | A document segment indexed for retrieval. |
| Vector database | Storage and query system for embeddings plus metadata. |
| Hybrid search | Retrieval that combines vector similarity with lexical or structured filtering. |
| Payload / metadata | Non-vector fields used for filtering, tenancy, access control, or ranking. |
| Inference runtime | Software that loads a model and executes generation or prediction. |
| Token streaming | Incremental delivery of generated tokens to the caller. |
| KV cache | Cached attention keys and values used to speed autoregressive decoding. |
| Quantization | Reducing model numeric precision to save memory or improve speed. |
| Adapter | A small trainable module attached to a base model for task/domain adaptation. |
| PEFT | Parameter-efficient fine-tuning, including LoRA-style adapter methods. |
| ZeRO | DeepSpeed optimization family that partitions optimizer states, gradients, and parameters across workers. |
| Checkpoint | Saved model/training state used for recovery or deployment. |
| Trace | Structured record of an AI interaction, including spans for model calls, tools, retrieval, and scores. |
| Span | One timed operation inside a trace. |
| Evaluation dataset | A set of examples used to measure quality or regressions. |
| Feedback function | A scoring function that measures properties such as groundedness or relevance. |
| Lineage | Provenance chain connecting dataset, prompt, model, adapter, retrieval config, run, and deployment. |
| MCP | Model Context Protocol, a standard interface for connecting models/agents to tools and resources. |
| Gateway | A layer that routes requests to providers, tools, models, or policies. |
| Production readiness | Evidence that a system is safe, observable, governable, scalable, and recoverable enough to operate. |
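
The Agent loop and Tool entries above can be made concrete with a small sketch. This is an illustration only, not the API of any particular agent framework: `Tool`, `AgentState`, `plan`, and `agent_loop` are hypothetical names, and `plan` is a hard-coded stand-in for what would normally be a model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Tool:
    """A callable capability with a minimal schema (hypothetical shape)."""
    name: str
    description: str
    run: Callable[[str], str]


@dataclass
class AgentState:
    """Session-level memory: the goal plus everything observed so far."""
    goal: str
    observations: list = field(default_factory=list)


def plan(state: AgentState) -> Optional[tuple]:
    """Stand-in for a model call: return (tool_name, argument) or None to stop."""
    if not state.observations:
        return ("lookup", state.goal)
    return None  # toy policy: one observation is enough


def agent_loop(goal: str, tools: dict, max_steps: int = 5) -> list:
    state = AgentState(goal)
    for _ in range(max_steps):              # hard cap so the loop always ends
        action = plan(state)                # plan
        if action is None:                  # stop
            break
        tool_name, argument = action
        result = tools[tool_name].run(argument)  # act
        state.observations.append(result)        # inspect result, then continue
    return state.observations


# Assumed example tool; a real tool would usually have side effects.
tools = {
    "lookup": Tool("lookup", "Return a canned definition.",
                   lambda query: f"definition of {query}"),
}
print(agent_loop("KV cache", tools))
```

The `max_steps` cap is a design choice worth keeping in any real loop: it bounds cost and latency even when the planner never decides to stop.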

Vocabulary By Layer

mindmap
  root((Vocabulary))
    Agent layer
      Agent loop
      Tool
      Handoff
      Guardrail
      Memory
    Serving layer
      Runtime
      Tokenizer
      Streaming
      KV cache
      Quantization
    Data layer
      RAG
      Embedding
      Chunk
      Metadata
      Hybrid search
    Training layer
      Adapter
      PEFT
      ZeRO
      Checkpoint
    LLMOps layer
      Trace
      Span
      Score
      Dataset
      Lineage
    Platform layer
      MCP
      Gateway
      Policy
      Audit log

Terms That Are Often Confused

| Pair | Distinction |
| --- | --- |
| Agent vs workflow | Agents choose actions dynamically; workflows encode more explicit control flow. |
| RAG vs fine-tuning | RAG changes context at runtime; fine-tuning changes model behavior through training. |
| Trace vs log | A trace preserves structured causality across spans; a log is usually an event stream. |
| Evaluation vs monitoring | Evaluation measures quality against examples or criteria; monitoring watches runtime health and drift. |
| Adapter vs checkpoint | An adapter is a small learned module; a checkpoint may contain full model/training state. |
| Vector search vs hybrid search | Vector search uses embedding similarity; hybrid search mixes vector, lexical, and structured signals. |
| Tool server vs gateway | A tool server exposes actions/resources; a gateway routes and governs access to models/tools/providers. |
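
The trace vs log distinction is easier to see in data. The sketch below is a minimal, assumed model of a trace, not the API of OpenTelemetry or any LLMOps tool: `Span`, `Trace`, `start_span`, and `children_of` are hypothetical names. Each span carries a `parent_id`, which is what lets a reader recover causality; the flat `log_lines` list built from the same spans keeps only the order of events.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One timed operation inside a trace (hypothetical shape)."""
    name: str
    trace_id: str
    parent_id: Optional[str] = None
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    start: float = field(default_factory=time.monotonic)
    end: Optional[float] = None

    def finish(self) -> None:
        self.end = time.monotonic()


class Trace:
    """Groups spans under one trace_id and records parent links."""

    def __init__(self) -> None:
        self.trace_id = uuid.uuid4().hex
        self.spans = []

    def start_span(self, name: str, parent: Optional[Span] = None) -> Span:
        parent_id = parent.span_id if parent else None
        span = Span(name, self.trace_id, parent_id)
        self.spans.append(span)
        return span

    def children_of(self, span: Span) -> list:
        return [s.name for s in self.spans if s.parent_id == span.span_id]


trace = Trace()
root = trace.start_span("answer_question")
retrieval = trace.start_span("retrieve_chunks", parent=root)
retrieval.finish()
model_call = trace.start_span("model_call", parent=root)
model_call.finish()
root.finish()

# A log of the same run: ordered events, but no parent/child structure.
log_lines = [f"{s.name} started" for s in trace.spans]

print(trace.children_of(root))
print(log_lines)
```

From the trace, one can ask which operations ran under `answer_question`; from `log_lines` alone, that relationship has to be guessed from ordering.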