hands-on-llms
🦖 Learn about LLMs, LLMOps, and vector DBs for free by designing, training, and deploying a real-time financial advisor LLM system ~ source code + video & re…
Overview
Hands-on LLMs is an educational open-source course that teaches practitioners how to build production-ready LLM systems through a real-world financial advisor project. The course covers the complete MLOps lifecycle, including training, deployment, and real-time inference, using modern tools such as QLoRA for fine-tuning, vector databases, and serverless GPU infrastructure. Students implement a three-pipeline architecture: a training pipeline that fine-tunes open-source LLMs on a proprietary Q&A dataset, a streaming pipeline that processes real-time financial data, and an inference pipeline that serves the model. The course emphasizes practical LLMOps, including experiment tracking with Comet ML, model registry management, and serverless deployment with Beam. Note that this original course has been archived and superseded by a newer 'LLM Twin' course offering an improved learning experience.
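The QLoRA fine-tuning the course teaches rests on LoRA's low-rank update: instead of training a full weight matrix W, only two small adapter matrices B and A are learned and added back as W' = W + (alpha / r) * B @ A. A dependency-free sketch of that update (all names here are illustrative, not the course's actual code):

```python
# Minimal sketch of the low-rank update at the heart of (Q)LoRA.
# A full weight matrix W (d_out x d_in) stays frozen; LoRA learns
# B (d_out x r) and A (r x d_in) with rank r << d and applies
# W' = W + (alpha / r) * B @ A.

def matmul(X, Y):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_update(W, A, B, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank."""
    r = len(A)  # A is r x d_in, so the rank is its row count
    BA = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Tiny example: a 2x2 frozen weight with a rank-1 adapter.
W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[1.0, 2.0]]         # 1 x 2 adapter
B = [[0.5], [0.25]]      # 2 x 1 adapter
W_adapted = lora_update(W, A, B, alpha=1.0)
```

QLoRA adds one more trick on top of this: the frozen base weights are stored in 4-bit precision, which is what lets the course's training pipeline fit on a single consumer GPU.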
Deep Analysis
vs generic LLM tutorials: 3-pipeline production architecture (training + streaming + inference) with real financial data — teaches QLoRA fine-tuning, real-time embeddings, and RAG deployment end-to-end
⚡ Capabilities
- • End-to-end LLM course: training, streaming, and inference pipelines
- • QLoRA fine-tuning of open-source LLMs on financial Q&A data
- • Real-time financial news embedding pipeline with Bytewax
- • RAG inference with LangChain + Qdrant vector store
- • Experiment tracking with Comet ML LLMOps
- • Serverless GPU deployment via Beam
- • Fine-tuning with distillation (GPT-3.5 → smaller LLM)
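The RAG inference capability above is built in the course with LangChain and Qdrant; the core retrieval step can be sketched without either dependency — embed the question, rank stored chunks by cosine similarity, and prepend the top matches to the prompt. The toy embeddings and in-memory store below are stand-ins, not the course's actual code:

```python
# Dependency-free sketch of the RAG retrieval step: rank stored news
# chunks by cosine similarity to the query vector and build a
# context-augmented prompt from the top-k hits.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query_vec, store, k=2):
    """Return the texts of the k chunks closest to the query vector."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item["vec"]),
                    reverse=True)
    return [item["text"] for item in ranked[:k]]

def build_prompt(question, context_chunks):
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using the news context below.\n{context}\nQuestion: {question}"

# Toy in-memory "vector store" with pre-computed 3-d embeddings.
store = [
    {"text": "Fed holds rates steady.", "vec": [1.0, 0.0, 0.0]},
    {"text": "Tech stocks rally on earnings.", "vec": [0.0, 1.0, 0.0]},
    {"text": "Oil prices dip slightly.", "vec": [0.0, 0.0, 1.0]},
]
query_vec = [0.9, 0.1, 0.0]  # pretend embedding of a rates question
prompt = build_prompt("What did the Fed do?", retrieve(query_vec, store))
```

In the course itself, the embedding model, similarity search, and prompt templating are handled by the sentence-embedding model, Qdrant, and LangChain respectively; the flow is the same.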
✓ Best For
- ✓ ML engineers wanting to learn production LLM deployment end-to-end
- ✓ Practitioners building real-time RAG systems with streaming data
- ✓ Teams learning QLoRA fine-tuning with LLMOps best practices
✗ Not Ideal For
- ✗ Beginners without ML/Python experience
- ✗ Production financial advisory systems (educational)
- ✗ Users without GPU access or cloud budget
⚠ Known Limitations
- ⚠ Archived — superseded by LLM Twin course
- ⚠ Requires CUDA-enabled Nvidia GPU (10GB+ VRAM for training)
- ⚠ Multiple external service accounts needed (Alpaca, Qdrant, Comet ML, Beam, AWS)
- ⚠ Financial domain specific — not general purpose
Pros
- + Complete end-to-end LLM system architecture with real production deployment examples using modern MLOps tools
- + Hands-on approach with practical financial advisor use case that demonstrates real-world application patterns
- + Comprehensive coverage of LLMOps including experiment tracking, model registry, and serverless GPU infrastructure deployment
Cons
- - Requires significant hardware resources (10GB VRAM, CUDA GPU) for local training, though cloud alternatives are provided
- - Course has been archived in favor of a newer 'LLM Twin' course, potentially indicating outdated content or approaches
Use Cases
- • Learning to build production LLM systems with proper MLOps practices for financial or advisory applications
- • Understanding QLoRA fine-tuning techniques for customizing open-source models on proprietary datasets
- • Implementing real-time LLM inference pipelines with streaming data processing and vector database integration
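The last use case — a real-time pipeline feeding a vector database — is implemented in the course with Bytewax and Qdrant. Its shape can be sketched with plain Python: each incoming news event is chunked, embedded, and upserted into a store keyed by document and chunk id. The hash-style "embedding" and dict-backed store below are illustrative stand-ins only:

```python
# Dependency-free sketch of the streaming ingestion flow: consume
# (doc_id, text) events, chunk each document, embed every chunk, and
# upsert the result into a vector store keyed by "doc_id:chunk_no".

def chunk(text, size=40):
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def toy_embed(text, dim=4):
    """Deterministic stand-in for a real sentence-embedding model."""
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch)
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm else vec

def run_pipeline(news_stream, store):
    """Consume (doc_id, text) events and upsert embedded chunks."""
    for doc_id, text in news_stream:
        for n, piece in enumerate(chunk(text)):
            store[f"{doc_id}:{n}"] = {"text": piece, "vec": toy_embed(piece)}
    return store

store = {}
events = [("news-1", "Central bank signals no change to interest rates."),
          ("news-2", "Quarterly earnings beat analyst expectations.")]
run_pipeline(events, store)
```

In the actual course, Bytewax supplies the dataflow runtime (so the stream is consumed continuously rather than from a list), and the upsert targets Qdrant instead of a dict; the chunk-embed-upsert structure is the same.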