AI Solution Architecture

Learn AI Solution Architecture

A bilingual, repository-grounded course on how modern AI systems are designed, from agent applications to serving, retrieval, training, observability, tools, and production governance.

English · Tiếng Việt · 17 Repositories · 6 Domains · 12 Lessons · 6 Projects · Templates

This knowledge system is synthesized from the bilingual deep-dive architecture documents in repo-architecture-docs/. It is written as a learning path, not as another per-repository inventory.



English

The goal is simple: learn how to design a complete AI solution, not just one library call. A production AI system normally needs six cooperating layers:

  1. Application and agent architecture: agent loops, tools, memory, workflows, human-in-the-loop, orchestration.
  2. Model serving and inference: model loading, batching, scheduling, token streaming, local or distributed serving.
  3. Training and adaptation: parameter-efficient tuning, distributed training, checkpointing, optimizer state, reproducibility.
  4. RAG and vector data: ingestion, embeddings, indexing, hybrid search, metadata filters, tenancy, durability.
  5. Observability, evaluation, and LLMOps: tracing, scoring, datasets, feedback, prompt/version governance, experiment lineage.
  6. Tooling, MCP, and AI platform: tool servers, UI gateways, model/provider routing, admin controls, workspace integration.
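One way to use the six layers above is as a coverage check on a draft design. The sketch below is illustrative only: the `Layer` enum mirrors the list, while `missing_layers` and the `draft` contents are hypothetical names and values, not part of the course material.

```python
from enum import Enum


class Layer(Enum):
    """The six cooperating layers, in the order listed above."""
    APP_AGENT = "Application and agent architecture"
    SERVING = "Model serving and inference"
    TRAINING = "Training and adaptation"
    RAG = "RAG and vector data"
    LLMOPS = "Observability, evaluation, and LLMOps"
    PLATFORM = "Tooling, MCP, and AI platform"


def missing_layers(design: dict) -> list:
    """Return the layers that have no recorded decision yet."""
    return [layer for layer in Layer if not design.get(layer, "").strip()]


# Hypothetical draft: only three layers have a decision so far.
draft = {
    Layer.APP_AGENT: "single agent loop with tool calls",
    Layer.SERVING: "hosted inference endpoint",
    Layer.RAG: "vector store with metadata filters",
}

for layer in missing_layers(draft):
    print(f"No decision recorded for: {layer.value}")
```

A layer can legitimately be out of scope (for example, no training at all), but recording that as an explicit decision keeps the gap visible in review.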

Read the full English course:


Tiếng Việt

Mục tiêu của bộ tài liệu này là giúp bạn học cách thiết kế một giải pháp AI hoàn chỉnh, không chỉ gọi một thư viện đơn lẻ. Một hệ thống AI production thường cần sáu lớp phối hợp:

  1. Kiến trúc ứng dụng và agent: vòng lặp agent, tool, memory, workflow, human-in-the-loop, orchestration.
  2. Serving và inference mô hình: nạp model, batching, scheduling, streaming token, serving local hoặc distributed.
  3. Training và adaptation: fine-tuning tiết kiệm tham số, training phân tán, checkpoint, optimizer state, khả năng tái lập.
  4. RAG và dữ liệu vector: ingestion, embedding, indexing, hybrid search, metadata filter, tenancy, durability.
  5. Observability, evaluation và LLMOps: tracing, scoring, dataset, feedback, quản trị prompt/version, lineage thí nghiệm.
  6. Tooling, MCP và nền tảng AI: tool server, UI gateway, routing provider/model, admin control, tích hợp workspace.

Đọc khóa học tiếng Việt:


The AI Solution Architecture Map

flowchart LR
    User[Users and business workflows] --> App[AI app and agent layer]
    App --> Tools[Tools, MCP, platform gateway]
    App --> RAG[RAG and vector data]
    App --> Serving[Model serving and inference]
    Serving --> Models[Model artifacts and runtimes]
    Training[Training and adaptation] --> Models
    RAG --> Data[Documents, metadata, embeddings]
    App --> LLMOps[Tracing, evaluation, feedback]
    Serving --> LLMOps
    RAG --> LLMOps
    LLMOps --> Governance[Security, governance, operations]
    Tools --> Governance
    Governance --> User
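The map can also be read as a directed graph. The following sketch copies the edges from the flowchart and answers two simple questions about it; the helper names `sources_of` and `reachable_from` are assumptions made for illustration.

```python
# Edges copied from the flowchart above; node names are the diagram's short IDs.
EDGES = [
    ("User", "App"),
    ("App", "Tools"),
    ("App", "RAG"),
    ("App", "Serving"),
    ("Serving", "Models"),
    ("Training", "Models"),
    ("RAG", "Data"),
    ("App", "LLMOps"),
    ("Serving", "LLMOps"),
    ("RAG", "LLMOps"),
    ("LLMOps", "Governance"),
    ("Tools", "Governance"),
    ("Governance", "User"),
]


def sources_of(node: str) -> list:
    """Components with an edge pointing directly at `node`."""
    return sorted(src for src, dst in EDGES if dst == node)


def reachable_from(start: str) -> set:
    """Every component reachable from `start` by following edges."""
    seen = set()
    stack = [start]
    while stack:
        current = stack.pop()
        for src, dst in EDGES:
            if src == current and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


print(sources_of("LLMOps"))                  # ['App', 'RAG', 'Serving']
print(sources_of("Models"))                  # ['Serving', 'Training']
print("Training" in reachable_from("User"))  # False
```

Read this way, the map shows that the app, serving, and RAG layers all report into LLMOps, and that training has no inbound edge: it sits off the user request path and affects the system only through the model artifacts it produces.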

The repository set is organized around that map:

| Domain | Repositories | What You Learn |
| --- | --- | --- |
| AI app / agent architecture | OpenAI Agents Python, LangChain, AutoGen, LlamaIndex | Agent loops, workflow graphs, multi-agent boundaries, retrieval orchestration |
| Model serving / inference | vLLM, llama.cpp, Transformers | Runtime selection, scheduling, quantization, token streaming, compatibility |
| Fine-tuning / training | PEFT, DeepSpeed | Adapter tuning, distributed training, memory partitioning, checkpoint governance |
| RAG / vector database | Qdrant, Chroma | Vector search, indexing, tenancy, durability, query semantics |
| Observability / evaluation / LLMOps | Langfuse, Phoenix, MLflow, TruLens | Tracing, scoring, experiment lineage, eval design, feedback loops |
| Tooling / MCP / AI platform | MCP servers, Open WebUI | Tool contracts, provider gateways, admin surfaces, self-hosted AI workspaces |
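If you keep your own notes in code, the table above fits in a plain dictionary. This is a minimal sketch: `ATLAS` and `domain_of` are hypothetical names, and the entries are copied from the table.

```python
from typing import Optional

# Domain -> repositories, copied from the table above.
ATLAS = {
    "AI app / agent architecture": ["OpenAI Agents Python", "LangChain", "AutoGen", "LlamaIndex"],
    "Model serving / inference": ["vLLM", "llama.cpp", "Transformers"],
    "Fine-tuning / training": ["PEFT", "DeepSpeed"],
    "RAG / vector database": ["Qdrant", "Chroma"],
    "Observability / evaluation / LLMOps": ["Langfuse", "Phoenix", "MLflow", "TruLens"],
    "Tooling / MCP / AI platform": ["MCP servers", "Open WebUI"],
}


def domain_of(repository: str) -> Optional[str]:
    """Return the domain a repository is filed under (case-insensitive), or None."""
    wanted = repository.strip().lower()
    for domain, repositories in ATLAS.items():
        if wanted in (name.lower() for name in repositories):
            return domain
    return None


print(len(ATLAS))                                # number of domains
print(sum(len(repos) for repos in ATLAS.values()))  # number of repositories
print(domain_of("vllm"))
```

The two counts give a quick consistency check against the 6 domains and 17 repositories the course advertises.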

Quick Start

If you want the fastest path, do this:

  1. Read the English course homepage or the Vietnamese page (trang tiếng Việt).
  2. Use the repository atlas to locate the stack you care about.
  3. Pick one project from Projects (or Dự án, the Vietnamese project track).
  4. Copy a template from the AI architecture toolkit.
  5. Go back to the deep-dive source docs in repo-architecture-docs/ when you need repository-level architecture details.
  6. Use the glossary to normalize vocabulary before comparing libraries.

FAST PATH
=========
Course map
  |
  v
Repository atlas --> choose layer --> read deep-dive docs
  |
  v
Project lab --> design decision log --> production checklist
  |
  v
Capstone architecture

Learning Path

flowchart TB
    P1[Phase 1: Agent applications] --> P2[Phase 2: Inference runtime]
    P2 --> P3[Phase 3: RAG and vector storage]
    P3 --> P4[Phase 4: Training and adaptation]
    P4 --> P5[Phase 5: Observability and evaluation]
    P5 --> P6[Phase 6: Platform, tools, and governance]
    P6 --> Capstone[Capstone: production AI solution architecture]

Each phase has two outputs: a mental model and a practical architecture artifact. Do not stop at reading APIs. The useful skill is knowing where boundaries should be drawn, what failure modes matter, and how to verify that the system is production-ready.
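A small tracker can make the two-outputs rule concrete. The sketch below is an assumption about how a learner might record progress: the `Phase` class, its fields, and the sample artifact path are hypothetical, while the phase names come from the learning path above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Phase:
    name: str
    mental_model: str = ""  # short summary of the phase in your own words
    artifact: str = ""      # title or path of the architecture artifact produced

    def is_complete(self) -> bool:
        """A phase counts as done only when both outputs exist."""
        return bool(self.mental_model.strip()) and bool(self.artifact.strip())


PHASES = [
    Phase("Agent applications"),
    Phase("Inference runtime"),
    Phase("RAG and vector storage"),
    Phase("Training and adaptation"),
    Phase("Observability and evaluation"),
    Phase("Platform, tools, and governance"),
]


def next_phase(phases: list) -> Optional[Phase]:
    """Return the first phase, in order, that is missing either output."""
    for phase in phases:
        if not phase.is_complete():
            return phase
    return None


# Hypothetical progress: phase 1 has both outputs, phase 2 only has notes.
PHASES[0].mental_model = "Agent loop = model call + tool dispatch + stop condition"
PHASES[0].artifact = "decision-log/agent-boundaries.md"
PHASES[1].mental_model = "Runtime choice drives batching and latency"

print(next_phase(PHASES).name)  # Inference runtime
```

Treating notes without an artifact as incomplete reflects the point above: reading APIs is not the deliverable, a defensible design artifact is.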


Syllabus

| Lesson | Question | Primary Repositories |
| --- | --- | --- |
| L01 | What does an AI solution architecture contain end to end? | All repositories |
| L02 | How should agent applications be decomposed? | OpenAI Agents Python, LangChain, AutoGen, LlamaIndex |
| L03 | When do you choose workflow graphs, agent loops, or multi-agent teams? | LangChain, AutoGen, OpenAI Agents Python |
| L04 | How do model runtimes change architecture decisions? | Transformers, vLLM, llama.cpp |
| L05 | What makes serving production-grade? | vLLM, llama.cpp, Open WebUI |
| L06 | When should you fine-tune, adapt, or avoid training? | PEFT, DeepSpeed |
| L07 | How should RAG data be modeled and operated? | Qdrant, Chroma |
| L08 | How do retrieval and agent orchestration interact? | LlamaIndex, LangChain, Qdrant, Chroma |
| L09 | What should be traced, scored, and evaluated? | Langfuse, Phoenix, TruLens |
| L10 | How do experiment lineage and model lifecycle fit into LLMOps? | MLflow, Langfuse, Phoenix |
| L11 | How should tools and MCP servers be governed? | MCP servers, Open WebUI, AutoGen |
| L12 | What does a production readiness review look like? | All repositories |
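To find the lessons that touch a repository you already use, the syllabus can be inverted. This is a rough sketch with hypothetical names (`SYLLABUS`, `lessons_for`); the lesson-to-repository mapping is copied from the table above.

```python
ALL = "All repositories"

# Lesson -> primary repositories, copied from the syllabus table.
SYLLABUS = {
    "L01": [ALL],
    "L02": ["OpenAI Agents Python", "LangChain", "AutoGen", "LlamaIndex"],
    "L03": ["LangChain", "AutoGen", "OpenAI Agents Python"],
    "L04": ["Transformers", "vLLM", "llama.cpp"],
    "L05": ["vLLM", "llama.cpp", "Open WebUI"],
    "L06": ["PEFT", "DeepSpeed"],
    "L07": ["Qdrant", "Chroma"],
    "L08": ["LlamaIndex", "LangChain", "Qdrant", "Chroma"],
    "L09": ["Langfuse", "Phoenix", "TruLens"],
    "L10": ["MLflow", "Langfuse", "Phoenix"],
    "L11": ["MCP servers", "Open WebUI", "AutoGen"],
    "L12": [ALL],
}


def lessons_for(repository: str, include_overview: bool = True) -> list:
    """List lesson IDs, in syllabus order, whose primary repositories include `repository`.

    Lessons marked "All repositories" are included unless include_overview is False.
    """
    matches = []
    for lesson, repositories in SYLLABUS.items():
        if repository in repositories or (include_overview and ALL in repositories):
            matches.append(lesson)
    return matches


print(lessons_for("Qdrant"))                          # ['L01', 'L07', 'L08', 'L12']
print(lessons_for("Qdrant", include_overview=False))  # ['L07', 'L08']
```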

Projects

| Project | Build | Main Decision |
| --- | --- | --- |
| P01 | Agent architecture comparison | Agent loop vs workflow graph vs multi-agent team |
| P02 | Serving runtime selection | Transformers vs vLLM vs llama.cpp |
| P03 | RAG system design | Qdrant vs Chroma, ingestion and query lifecycle |
| P04 | Fine-tuning and deployment plan | PEFT adapters, DeepSpeed scaling, serving handoff |
| P05 | LLMOps and evaluation layer | Langfuse, Phoenix, MLflow, TruLens boundaries |
| P06 | Capstone production AI platform | End-to-end architecture with governance and failure drills |

Open the full project track:


Toolkit

The toolkit contains copy-ready architecture artifacts:


Capstone And Assessment

The capstone gives the course one concrete product scenario. The assessment pack gives learners a way to test whether they can defend architecture decisions with evidence.


Repository Atlas

The atlas is the fastest way to compare repositories by role, decision point, integration surface, and production risk:

mindmap
  root((AI solution architecture))
    Agent apps
      OpenAI Agents Python
      LangChain
      AutoGen
      LlamaIndex
    Serving
      Transformers
      vLLM
      llama.cpp
    Training
      PEFT
      DeepSpeed
    RAG
      Qdrant
      Chroma
    LLMOps
      Langfuse
      Phoenix
      MLflow
      TruLens
    Platform
      MCP servers
      Open WebUI

Local Use

This is a Markdown-first documentation set. No build step is required.

Run validation from the workspace root:

powershell -ExecutionPolicy Bypass -File learn-ai-solution-architecture\validate-knowledge-system.ps1

Source deep dives remain in repo-architecture-docs/. This course layer synthesizes them into a learning system.