AutoAct
[ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning
Overview
AutoAct is an automatic agent learning framework for question answering that removes the dependency on expensive closed-source models such as GPT-4 and on large-scale annotated datasets. Published at ACL 2024, it builds language agents from scratch via self-planning: planning trajectories are synthesized automatically, without human annotation or guidance from proprietary models, making the approach practical for researchers and developers with limited resources.

The framework applies a division-of-labor strategy: based on the target task information and the synthesized trajectories, it differentiates the agent into a group of specialized sub-agents that handle complex problems collaboratively. This yields performance competitive with existing baselines while remaining cost-effective and reproducible. AutoAct supports a range of open-source language models, offering a path to capable agent systems without commercial APIs or extensive manual data annotation.
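The division-of-labor loop described above can be sketched roughly as follows. This is an illustrative toy, not AutoAct's actual API: the sub-agent roles (plan, tool, reflect), the `answer_question` driver, and the stubbed lookup table are all assumptions standing in for LLM-backed components.

```python
from dataclasses import dataclass, field

@dataclass
class Trajectory:
    """Records each sub-agent's contribution, analogous to a synthesized planning trajectory."""
    steps: list = field(default_factory=list)

    def log(self, role, content):
        self.steps.append((role, content))

class PlanAgent:
    """Decomposes the question into sub-goals (stubbed; an LLM would do this in practice)."""
    def plan(self, question, traj):
        subgoals = [f"look up: {question}", "synthesize answer"]
        traj.log("plan", subgoals)
        return subgoals

class ToolAgent:
    """Executes a sub-goal by invoking a tool (stubbed with a toy lookup table)."""
    KB = {"look up: capital of France": "Paris"}

    def act(self, subgoal, traj):
        observation = self.KB.get(subgoal, "no result")
        traj.log("tool", observation)
        return observation

class ReflectAgent:
    """Checks whether the gathered observations support a final answer."""
    def reflect(self, observations, traj):
        answer = next((o for o in observations if o != "no result"), None)
        traj.log("reflect", answer)
        return answer

def answer_question(question):
    """One collaborative pass: plan -> act per sub-goal -> reflect."""
    traj = Trajectory()
    plan, tool, reflect = PlanAgent(), ToolAgent(), ReflectAgent()
    observations = [tool.act(g, traj) for g in plan.plan(question, traj)]
    return reflect.reflect(observations, traj), traj

answer, traj = answer_question("capital of France")
```

The design point being illustrated is that no single agent carries the whole task: each sub-agent sees only its role, and the shared trajectory is what ties the steps together, mirroring how AutoAct differentiates one self-trained agent into cooperating specialists.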
Pros
- + Eliminates dependency on expensive closed-source models like GPT-4, making agent development more accessible and cost-effective
- + Automatically synthesizes planning trajectories without requiring human annotation or manual trajectory creation
- + Implements division-of-labor strategy with specialized sub-agents for improved task decomposition and completion
Cons
- - Primarily focused on question answering tasks, which may limit applicability to other agent use cases
- - Requires an existing tool library to function effectively, adding setup complexity
- - Performance depends heavily on the capabilities of the underlying open-source language model
Use Cases
- • Building cost-effective QA agents for organizations without access to expensive closed-source language models
- • Creating reproducible agent systems in research environments with limited annotated training data
- • Developing multi-agent systems that require automatic task decomposition and specialized sub-agent coordination