Phase 5 · 28 Days · Advanced
Phase 5 – Generative AI
Build production-grade LLM applications using RAG, fine-tuning, and prompt engineering – with evaluation, safety, and cost control built in from the start.
- Build a RAG pipeline that is grounded, measurable, and failure-aware.
- Fine-tune an open LLM with QLoRA on a custom task under low-VRAM constraints.
- Evaluate generative outputs for quality, groundedness, and safety.
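The low-VRAM constraint in the QLoRA outcome comes from training only a pair of small low-rank matrices instead of the full weight. A toy NumPy sketch of the LoRA update (all dimensions and the rank below are illustrative, not from any specific model):

```python
import numpy as np

# LoRA update: keep the pretrained weight W (d x k) frozen and train
# only A (r x k) and B (d x r); the adapted weight is
# W + (alpha / r) * (B @ A). Dimensions here are made up.
d, k, r = 1024, 1024, 8           # hidden dims and LoRA rank
rng = np.random.default_rng(0)

W = rng.normal(size=(d, k))       # frozen pretrained weight
A = rng.normal(size=(r, k)) * 0.01
B = np.zeros((d, r))              # B starts at zero, so the adapter is a no-op at init
alpha = 16                        # LoRA scaling hyperparameter

W_adapted = W + (alpha / r) * (B @ A)

full_params = d * k               # what full fine-tuning would update
lora_params = r * (k + d)         # what LoRA actually trains
print(full_params, lora_params)   # 1048576 vs 16384 (~1.6% of full)
```

QLoRA goes one step further by holding the frozen W in 4-bit precision, which is why a consumer GPU can fit the whole setup.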
⚡ Must Know
- LLM Architecture – decoder-only, autoregressive
- Tokenization – BPE, SentencePiece
- Prompt Engineering – zero-shot, few-shot, CoT
- Embeddings + Semantic Similarity
- Vector Databases – Chroma, FAISS, Pinecone
- RAG Pipeline – chunk, embed, store, retrieve, generate
- LangChain + LlamaIndex basics
- OpenAI API + Function Calling
- Hugging Face Transformers – pipeline API
- Fine-Tuning vs RAG vs Prompting – decision matrix
- LoRA + QLoRA – PEFT methods
- LLM Evaluation – ROUGE, BLEU, LLM-as-judge
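The embed-and-retrieve core of the RAG pipeline above can be sketched with bag-of-words vectors and cosine similarity standing in for learned embeddings. The corpus and query are made up; a real pipeline would embed with a model (e.g. a sentence-transformer) and query a vector store such as Chroma or FAISS:

```python
import math
import re
from collections import Counter

# Toy corpus for illustration only.
docs = [
    "LoRA adds low-rank adapters for parameter-efficient fine-tuning.",
    "Vector databases store embeddings for fast similarity search.",
    "Chain-of-thought prompting elicits step-by-step reasoning.",
]

def embed(text):
    # Stand-in "embedding": a sparse token-count vector, not a learned model.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    dot = sum(u[t] * v[t] for t in u)
    norm = lambda c: math.sqrt(sum(x * x for x in c.values()))
    return dot / (norm(u) * norm(v))

def retrieve(query, k=1):
    # Rank all docs by similarity to the query and return the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

print(retrieve("how do embeddings enable similarity search?"))
```

Swapping `embed` for a real embedding model and the sorted list for an approximate-nearest-neighbor index is the whole step from this sketch to a production retriever.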
✨ Good to Know
- Diffusion Models – DDPM, Stable Diffusion
- GANs + VAEs – generative architectures
- Ollama + vLLM – local LLM hosting
- RLHF – alignment concepts
- LLM Scaling Laws – compute, data, params
- Anthropic Claude API + system prompts
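The compute/data/params relationship in the scaling-laws bullet reduces to simple arithmetic: training compute is commonly approximated as C ≈ 6·N·D FLOPs (N parameters, D training tokens), and the Chinchilla result suggests compute-optimal training at roughly D ≈ 20·N tokens. The model size below is illustrative:

```python
# Back-of-envelope scaling-law arithmetic.

def training_flops(n_params, n_tokens):
    # Standard approximation: ~6 FLOPs per parameter per token.
    return 6 * n_params * n_tokens

def chinchilla_optimal_tokens(n_params):
    # Chinchilla heuristic: ~20 training tokens per parameter.
    return 20 * n_params

N = 7_000_000_000                  # a 7B-parameter model, for example
D = chinchilla_optimal_tokens(N)   # ~140B tokens
C = training_flops(N, D)

print(f"tokens: {D:.2e}, compute: {C:.2e} FLOPs")
```

These are rough heuristics, not exact laws, but they are good enough to sanity-check whether a proposed training run is compute-starved or data-starved.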
📚 Resources
Hugging Face NLP Course
Best resource for transformers, tokenizers, and fine-tuning.
huggingface.co/learn
PEFT Docs – LoRA Guide
Official guide to parameter-efficient fine-tuning methods.
huggingface.co/docs/peft
DeepLearning.AI – LLM Short Courses
Short, focused courses on RAG, agents, and fine-tuning.
deeplearning.ai
🛠️ Projects
RAG Document Chatbot
Q&A on custom docs with source-grounded responses and failure handling.
Fine-tune Llama 3 (QLoRA)
Adapt an open LLM with parameter-efficient fine-tuning under low VRAM.
Semantic Search Engine
Embedding-based search with vector indexing and retrieval quality metrics.
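The chunking step shared by the RAG chatbot and semantic search projects can be sketched as a sliding word window with overlap, so context is not lost at chunk boundaries. Sizes below are illustrative; production chunkers often split on tokens or sentences instead of raw words:

```python
def chunk_words(text, chunk_size=200, overlap=50):
    # Split text into overlapping windows of `chunk_size` words,
    # advancing by (chunk_size - overlap) words each step.
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the final window already reaches the end
    return chunks

# A synthetic 500-word document: w0 w1 ... w499.
doc = " ".join(f"w{i}" for i in range(500))
chunks = chunk_words(doc, chunk_size=200, overlap=50)
print(len(chunks), [len(c.split()) for c in chunks])  # 3 [200, 200, 200]
```

The 50-word overlap means a sentence straddling a boundary still appears whole in at least one chunk, which measurably improves retrieval recall in document Q&A.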