lumos
Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs"
Overview
Lumos is an open-source language agent framework that trains agents for complex interactive tasks using a unified data format and a modular design. Built on LLAMA-2-7B/13B models, it has a three-module architecture: a planning module, a grounding module, and an execution module. The framework is trained on approximately 56,000 high-quality subgoal and action annotations derived from ground-truth reasoning steps across multiple benchmarks. Lumos is competitive with GPT-4/3.5-based agents on web navigation, complex question answering, mathematical reasoning, and multimodal tasks. Because every task is expressed in the same data format, a single framework can support diverse interactive tasks, which makes it useful for researchers developing open-source agents. Lumos outperforms contemporaneous fine-tuned agents such as FireAct, AgentLM, and AutoAct on benchmarks such as Mind2Web, HotpotQA, WebShop, and InterCode_SQL, while keeping the transparency and accessibility of open-source models.
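To make the planning, grounding, and execution split concrete, here is a minimal Python sketch of how three such modules could be chained. It is not the Lumos code or API: the function names (`plan`, `ground`, `execute`, `run_agent`), the `Action` type, and the toy `TOOLS` table are all hypothetical, and simple string rules stand in for the fine-tuned LLMs that would do planning and grounding in the real framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Action:
    """A low-level action: which tool to call and with what argument."""
    tool: str
    argument: str


def plan(task: str) -> List[str]:
    """Planning stand-in: break a task into ordered subgoals.

    Hypothetical rule: subgoals are separated by the word "then".
    A real planner would be a fine-tuned language model.
    """
    return [part.strip() for part in task.split(" then ") if part.strip()]


def ground(subgoal: str) -> Action:
    """Grounding stand-in: turn a subgoal into an executable action.

    Hypothetical rule: the first word names the tool, the rest is its argument.
    """
    verb, _, rest = subgoal.partition(" ")
    return Action(tool=verb.lower(), argument=rest)


# Toy tool table (an assumption for illustration, not the Lumos tool set).
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for '{query}'",
    "calculate": lambda expr: str(sum(int(x) for x in expr.split("+"))),
}


def execute(action: Action) -> str:
    """Execution stand-in: dispatch the action to a tool and return its output."""
    tool = TOOLS.get(action.tool)
    if tool is None:
        return f"unsupported tool: {action.tool}"
    return tool(action.argument)


def run_agent(task: str) -> List[str]:
    """Chain the three modules: plan subgoals, ground each one, execute it."""
    return [execute(ground(subgoal)) for subgoal in plan(task)]


if __name__ == "__main__":
    for output in run_agent("search Lumos paper then calculate 2+3"):
        print(output)
```

Keeping each stage behind its own function is what makes the modular design easy to debug: any one stage can be swapped or inspected without touching the other two.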
Pros
- Modular architecture with separate planning, grounding, and execution components, which allows flexible customization and easier debugging
- Unified data format that supports multiple task types (web navigation, QA, math, multimodal) within a single framework
- Competitive with much larger proprietary models while remaining fully open-source and built on smaller LLAMA-2 models
Cons
- Based on the LLAMA-2 architecture, which is older and may not incorporate the latest language model advances
- Primarily research-focused, with limited documentation for production deployment
- Requires significant computational resources for training and may need further fine-tuning for domain-specific applications
Use Cases
- Research into open-source language agents and comparative studies against proprietary models
- Web navigation and automation tasks requiring multi-step planning and execution
- Complex question answering systems that need to break down problems into actionable subgoals