The Ultimate Guide to AI Concepts for Building LLM Apps and Agents (with Open Source Examples)

by Manvir Singh Basra | Apr 13, 2025 | Tech Playbook

In the fast-moving world of AI development, understanding the landscape of concepts and tools is essential. Whether you’re building a chatbot, an AI agent, or a knowledge assistant, these foundational concepts form the building blocks of a powerful LLM application.

Below is a categorized, concept-driven guide to help you build AI-native apps with clarity.

1. LLM (Large Language Models)

What it is: The core engine that generates, understands, and transforms language.
Purpose: Power conversation, summarization, code generation, reasoning, etc.
Top Open Source Example:

LLama 3 / 4 (by Meta)
Mistral

2. Prompt Engineering

What it is: Designing inputs to get optimal outputs from an LLM.
Includes: Few-shot, zero-shot, chain-of-thought, system prompts.
Open Source Tools:

PromptLayer (CMS for prompts)
Guidance by Microsoft
LangChain prompt templates

3. RAG (Retrieval-Augmented Generation)

What it is: Combines LLMs with external data (like documents or databases) for up-to-date, grounded responses.
Includes: Chunking, embedding, retrieval, context injection.
Top Open Source Examples:

LlamaIndex
LangChain RAG pipeline
Haystack
RAGStack by Hugging Face

4. Embeddings

What it is: Numeric representations of text used for semantic similarity and search.
Use Cases: Similarity search, retrieval, memory.
Popular Open Source Models:

BGE by BAAI
E5 by MTEB
Instructor
GTE (Google)

5. Vector Database

What it is: Stores and retrieves embeddings for fast similarity search.
Use Cases: Powering RAG, recommendations, memory for agents.
Top Open Source Examples:

Qdrant
Weaviate
Milvus
FAISS (local, in-memory)

6. Agents

What it is: Autonomous AI entities that decide what tools to use, when to reason, and how to act.
Includes: Planning, reasoning, tool use, memory.
Top Open Source Frameworks:

LangGraph
AutoGen (Microsoft)
CrewAI
MetaGPT

7. State Management / Memory

What it is: Persistence of context, past interactions, and state across agent steps.
Types: Short-term (per session), long-term (persistent), episodic.
Open Source Patterns:

LangGraph state object
Redis memory stores
LlamaIndex context memory

8. Tool Use / Function Calling

What it is: Agents call APIs or tools (weather, search, DB) to gather info or act.
Types: Function Calling, JSON-based tools, plugins.
Examples:

LangChain tools
AutoGen tool wrappers
ReAct agent pattern

9. Orchestration Framework

What it is: Manages flows between LLMs, tools, memory, users.
Purpose: Build complex LLM apps with modular logic.
Top Open Source Examples:

LangChain
LangGraph
Semantic Kernel
Haystack

10. Tool Integration / Plugins

What it is: External utilities agents can use (e.g., code interpreter, SQL, browser).
Popular Plugins:

Python REPL
SQL Database tools
Web search tools

11. Chunking & Text Splitting

What it is: Breaking documents into digestible pieces for embedding and context injection.
Tools:

RecursiveTextSplitter (LangChain)
SentenceSplitters in LlamaIndex

12. Guardrails & Validation

What it is: Ensure outputs are safe, correct, and within bounds.
Includes: JSON schema validation, regex, classification, moderation.
Top Tools:

Guardrails.ai
Rebuff
Flowise validators
LMQL (LLM query language with constraints)

13. Observability & Tracing

What it is: Track what LLMs do, debug reasoning paths, and improve performance.
Tools:

LangSmith
Traceloop
Helicone

14. Agent Memory Graphs / Cognitive Architectures

What it is: Structure agents with working memory, long-term memory, task queues.
Open Source Ideas:

LangGraph state trees
MemGPT
CAMEL agents
DSPy (Stanford)

15. Deployment & Serving LLMs

What it is: Host models or agents locally or via APIs.
Open Source Options:

llama.cpp
Ollama

16. Multi-Agent Systems

What it is: Multiple agents collaborating or debating to solve a problem.
Patterns:

Planner → Tool Selector → Executor
Debate → Finalizer

Frameworks:

AutoGen
CrewAI
LangGraph
MetaGPT

17. Frontend for LLM Apps

What it is: Chat interfaces, dashboards, inputs for end-users.
Popular Open Source UIs:

LangFlow (visual LangChain builder)
Flowise

18. Multi-modal & Vision Models

What it is: LLMs that understand both text and images (or more).
Top Models:

MiniGPT-4
Mistral multimodal

19. Agents-as-APIs (AgentOps)

What it is: Serve agents as RESTful APIs, SaaS logic, or functions.
Use Cases: CRM bots, assistants, devtools.
Tools:

FastAPI + LangChain
CrewAI + Flask

20. Data for LLM Apps

What it is: Source material for RAG, fine-tuning, evals.
Includes: PDFs, Notion, Confluence, SQL, CSVs
Tools:

LlamaIndex connectors
LangChain loaders
Unstructured.io

Final Thoughts

Building AI agents is no longer just about using a language model — it’s about orchestrating tools, data, memory, and workflows into reliable, explainable, and user-friendly systems.

This guide covered the foundational concepts and the best open-source options for each one. Whether you’re building a real estate assistant, internal agent platform, or dev AI co-pilot — these are the blocks you’ll work with.

Pre Construction Property Finder

Developers

Types

Cities

Occupancies

Status

Launch

Lot Sizes

Disclaimer

The information contained on this site is for general guidance only and is not to be construed as legal or other professional advice. It should not be used as a substitute for consultation with legal or other competent advisers. Before making any decision or taking any action, you should consult a professional.

Manvir Singh Basra is not responsible for any errors or omissions in connection with the use of this information. All information on this site is provided “as is,” with no guarantee of completeness or accuracy.

Manvir Singh Basra won’t be liable to you or anyone else for any decision made or action taken in reliance of the information on this site.