lumos

Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs"

Overview

Lumos is an open-source language agent framework that trains agents for complex interactive tasks using a unified data format and a modular design. Built on LLAMA-2-7B/13B models, it has a three-module architecture made up of planning, grounding, and execution components. The framework is trained on approximately 56,000 high-quality subgoal and action annotations derived from ground-truth reasoning steps across multiple benchmarks. Lumos performs competitively with GPT-4/3.5-based agents on web navigation, complex question answering, mathematical reasoning, and multimodal tasks. Because every task is expressed in the same data format, a single framework can support all of these interactive task types, which makes it useful for researchers developing open-source agents. It outperforms contemporaneous fine-tuned agents such as FiReAct, AgentLM, and AutoAct on benchmarks including Mind2Web, HotpotQA, WebShop, and InterCode_SQL, while keeping the transparency and accessibility of open-source models.
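To make the planning, grounding, and execution split concrete, here is a minimal Python sketch of how three such modules could hand work to one another. It is not the Lumos code: in Lumos the planning and grounding modules are fine-tuned LLMs, whereas the `plan`, `ground`, and `execute` functions, the `Action` class, and the `TOOLS` registry below are hypothetical rule-based stand-ins chosen only to show the data flow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Action:
    """Hypothetical executable action produced by the grounding step."""
    tool: str      # name of a tool registered with the executor
    argument: str  # plain-text argument passed to that tool


def plan(task: str) -> List[str]:
    """Planning stand-in: split a task into ordered subgoals.

    A real planning module would be a language model; this stub just
    splits the task text on the word 'then'.
    """
    return [part.strip() for part in task.split(" then ") if part.strip()]


def ground(subgoal: str) -> Action:
    """Grounding stand-in: map one subgoal to one executable action.

    The first word of the subgoal is treated as the tool name.
    """
    verb, _, rest = subgoal.partition(" ")
    return Action(tool=verb.lower(), argument=rest)


def execute(action: Action, tools: Dict[str, Callable[[str], str]]) -> str:
    """Execution stand-in: look up the tool and call it."""
    if action.tool not in tools:
        raise KeyError(f"no tool registered for {action.tool!r}")
    return tools[action.tool](action.argument)


def run_agent(task: str, tools: Dict[str, Callable[[str], str]]) -> List[str]:
    """Run plan -> ground -> execute for every subgoal, in order."""
    results = []
    for subgoal in plan(task):
        action = ground(subgoal)
        results.append(execute(action, tools))
    return results


# Hypothetical toy tools; a real agent would call search engines, APIs, etc.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"search results for '{query}'",
    "calculate": lambda expr: str(sum(int(x) for x in expr.split("+"))),
}

if __name__ == "__main__":
    print(run_agent("Search population of Paris then Calculate 2+3", TOOLS))
```

Keeping the three stages behind separate functions means any one of them can be swapped out (for example, replacing the rule-based `plan` with a model call) without touching the other two, which is the kind of flexibility a modular design is meant to give.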

Pros

  • + Modular architecture with separate planning, grounding, and execution components enables flexible customization and debugging
  • + Unified data format supports multiple task types (web navigation, QA, math, multimodal) within a single framework
  • + Competitive performance with much larger proprietary models while being fully open-source and based on smaller LLAMA-2 models
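As an illustration of what a unified data format can look like, the sketch below stores annotations for two different task types in one shared schema and serializes them to JSON Lines. The `Step` and `Annotation` classes, their field names, and the example records are all hypothetical; they are not taken from the Lumos dataset and only show the general idea of one schema covering several task types.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Step:
    """One reasoning step: a natural-language subgoal and its action."""
    subgoal: str
    action: str


@dataclass
class Annotation:
    """Hypothetical unified record shared by every task type."""
    task_type: str
    task: str
    steps: List[Step]


def to_jsonl(annotations: List[Annotation]) -> str:
    """Serialize annotations to JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(a)) for a in annotations)


# Made-up examples: a math task and a QA task in the same schema.
EXAMPLES = [
    Annotation(
        task_type="math",
        task="What is 12 * 3 + 4?",
        steps=[
            Step("Multiply 12 by 3", "Calculator(12 * 3)"),
            Step("Add 4 to the product", "Calculator(36 + 4)"),
        ],
    ),
    Annotation(
        task_type="qa",
        task="Who wrote the novel that the film is based on?",
        steps=[Step("Find the source novel", "Search(film source novel)")],
    ),
]

if __name__ == "__main__":
    print(to_jsonl(EXAMPLES))
```

Because both records expose the same keys, one training or evaluation loop can read them without task-specific parsing.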

Cons

  • - Based on the older LLAMA-2 architecture, which may not incorporate the latest language model advances
  • - Primarily research-focused with limited documentation for production deployment
  • - Requires significant computational resources for training and may need fine-tuning for domain-specific applications

Getting Started

Clone the repository and install the dependencies from the provided requirements. Download the pre-trained Lumos models from Hugging Face, or train your own with the provided training scripts. Then run the demo interface, or integrate the modular components into your application for specific interactive tasks.