AutoAct
[ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning
Overview
AutoAct is an automatic agent-learning framework for question answering that removes the dependency on costly closed-source models such as GPT-4 and on large-scale annotated datasets. Published at ACL 2024, it trains language agents from scratch via self-planning: the framework automatically synthesizes planning trajectories without human annotation or guidance from proprietary models, making it accessible to researchers and developers with limited resources. A division-of-labor strategy then differentiates the task according to the target information and the synthesized trajectories, producing specialized sub-agent groups that solve complex problems collaboratively. AutoAct achieves performance competitive with existing baselines while remaining cost-effective and reproducible, and it supports a range of open-source language models, making it practical for teams without access to commercial APIs or large annotation budgets.
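As a rough illustration of the self-planning data flow described above, the sketch below shows the idea of starting from a few seed examples and letting the model augment its own training data before synthesizing trajectories. All function names here are hypothetical, and a trivial paraphrase stub stands in for the open-source LLM the real framework would call.

```python
import random

# Hedged sketch of AutoAct's self-instruct-style data augmentation,
# not the actual implementation: from a handful of seed QA examples,
# the meta-agent expands the task dataset before synthesizing
# planning trajectories for it. `generate` is a stand-in for an
# open-source LLM call; no GPT-4 or human labels are involved.

def augment_examples(seeds, n, generate=None):
    # Stub "LLM": paraphrases a randomly chosen seed example.
    generate = generate or (lambda ex: {
        "question": ex["question"] + " (variant)",
        "answer": ex["answer"],
    })
    pool = list(seeds)
    while len(pool) < n:
        pool.append(generate(random.choice(seeds)))
    return pool

seeds = [{"question": "Who wrote Hamlet?", "answer": "Shakespeare"}]
data = augment_examples(seeds, 4)  # 1 seed + 3 synthesized variants
```

In the real framework the augmented examples would then be fed to the zero-shot trajectory-synthesis step; this stub only shows the augmentation loop's shape.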
Deep Analysis
vs ReAct/Reflexion/BOLAA: the division-of-labor strategy automatically differentiates specialized Plan/Tool/Reflect sub-agents from self-synthesized trajectories, with zero dependency on closed-source model data or human annotations
⚡ Capabilities
- • Automatic agent learning without large-scale annotated data
- • Self-synthesized planning trajectory generation
- • Division-of-labor with specialized sub-agents (Plan, Tool, Reflect)
- • Tool library integration with automatic selection
- • LoRA fine-tuning for agent specialization
- • Multi-hop QA task solving (HotpotQA, ScienceQA)
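The division-of-labor among Plan, Tool, and Reflect sub-agents listed above can be sketched as a simple control loop. Note this is an assumed shape of the interaction, with stub functions in place of the LoRA-tuned sub-agent models; the real framework drives each role with its own fine-tuned LLM.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the Plan/Tool/Reflect loop: the Plan agent
# chooses the next action, the Tool agent executes it, and the Reflect
# agent judges the finished trajectory. Stubs replace the actual
# LoRA-tuned sub-agents.

@dataclass
class Trajectory:
    steps: list = field(default_factory=list)

def plan_agent(question, history):
    # Stub planner: issue one search, then finish with the observation.
    if not history:
        return ("tool", f"search[{question}]")
    return ("finish", history[-1][1])

def tool_agent(action):
    # Stub tool executor: a real system would call e.g. a search API.
    return f"observation for {action}"

def reflect_agent(trajectory):
    # Stub reflector: accept once at least one tool call has run.
    return any(kind == "tool" for kind, _ in trajectory.steps)

def run_episode(question, max_steps=5):
    traj, history = Trajectory(), []
    for _ in range(max_steps):
        kind, payload = plan_agent(question, history)
        if kind == "finish":
            traj.steps.append(("finish", payload))
            break
        obs = tool_agent(payload)
        traj.steps.append(("tool", (payload, obs)))
        history.append((payload, obs))
    return traj, reflect_agent(traj)

traj, accepted = run_episode("Who wrote Hamlet?")
```

The point of the structure is that each role can be trained and swapped independently, which is what the LoRA fine-tuning capability above enables.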
✓ Best For
- ✓ Research on automatic agent learning without GPT-4 dependency
- ✓ Multi-hop QA requiring complex question decomposition
- ✓ Teams wanting to train specialized sub-agents from self-generated data
✗ Not Ideal For
- ✗ Production real-time applications
- ✗ Tasks without clear tool libraries
- ✗ Teams unable to afford Bing API costs or GPU resources
⚠ Known Limitations
- ⚠ Requires Bing Search API key (paid)
- ⚠ Trajectory filtering needed for quality (reward ≥ 1)
- ⚠ Context length limited to 4096 tokens
- ⚠ Dependent on tool library quality for performance
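The "reward ≥ 1" filtering mentioned above can be illustrated with a minimal sketch. The exact-match `reward` function here is an assumption for illustration, not AutoAct's actual scoring code; the idea is simply that only self-synthesized trajectories whose final answer matches the gold label are kept for fine-tuning.

```python
# Sketch of the trajectory-filtering step: keep only trajectories
# whose reward meets the threshold (>= 1). The reward is modeled
# here as a hypothetical exact-match score against the gold answer.

def reward(trajectory, gold_answer):
    # 1.0 if the trajectory's final answer matches gold (case-insensitive).
    final = trajectory.get("answer", "").strip().lower()
    return 1.0 if final == gold_answer.strip().lower() else 0.0

def filter_trajectories(trajectories, gold_answers, threshold=1.0):
    # Discard low-quality trajectories before fine-tuning.
    return [t for t, g in zip(trajectories, gold_answers)
            if reward(t, g) >= threshold]

trajs = [{"answer": "Paris"}, {"answer": "Lyon"}]
kept = filter_trajectories(trajs, ["paris", "paris"])
# kept contains only the first trajectory
```

Filtering like this trades training-set size for quality, which is why the limitation notes that it is needed at all.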
Pros
- + Eliminates dependency on expensive closed-source models like GPT-4, making agent development more accessible and cost-effective
- + Automatically synthesizes planning trajectories without requiring human annotation or manual trajectory creation
- + Implements division-of-labor strategy with specialized sub-agents for improved task decomposition and completion
Cons
- - Primarily focused on question answering tasks, which may limit applicability to other agent use cases
- - Requires an existing tool library to function effectively, adding setup complexity
- - Performance may vary significantly depending on the quality and capabilities of the underlying open-source language model used
Use Cases
- • Building cost-effective QA agents for organizations without access to expensive closed-source language models
- • Creating reproducible agent systems in research environments with limited annotated training data
- • Developing multi-agent systems that require automatic task decomposition and specialized sub-agent coordination