code-act

Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji.

open-sourceagent-frameworks

Visit Website View on GitHub

1.6k

Stars

+15

Stars/month

Releases (6m)

Star Growth

+2 (0.1%)

Overview

CodeAct是ICML 2024论文《Executable Code Actions Elicit Better LLM Agents》的官方实现，提出了一种革命性的LLM智能体架构。该框架通过将智能体的动作统一为可执行的Python代码，解决了传统基于文本或JSON格式动作的局限性。CodeAct集成Python解释器，使智能体能够执行代码、动态修改先前动作，并根据执行结果进行多轮交互优化。该项目包含CodeActInstruct数据集（7000多轮交互数据）和CodeActAgent模型（基于Mistral-7b），在API-Bank和M³ToolEval基准测试中相比传统方法实现了高达20%的成功率提升。CodeAct的核心创新在于将复杂的工具调用、API交互和推理过程转化为可解释、可调试的代码形式，提供了更强的可控性和透明度。该框架支持Kubernetes部署、llama.cpp推理，并提供了在线聊天界面体验。

Deep Analysis

Key Differentiator

vs ReAct/text-based agents: executable Python code as unified action space with containerized execution, achieving 20% higher success rate than JSON/text actions

⚡ Capabilities

• Unified code-based action space for LLM agents
• Up to 20% higher success rate vs text/JSON actions
• Dynamic revision of prior actions based on execution feedback
• Multi-turn interaction with iterative refinement
• Containerized Jupyter Kernel for safe code execution
• Two model variants: Mistral-7b (32k context) and Llama-7b (4k context)

🔗 Integrations

vLLMLLama.cppJupyter Kernel GatewayMongoDBDockerKubernetes

✓ Best For

✓ Research on code-based agent action spaces
✓ Building agents that execute Python code as their primary action mechanism

✗ Not Ideal For

✗ Production agent systems needing large frontier models
✗ Non-code task automation

Languages

Python

Deployment

Kubernetes (single-command)Dockerlocal via LLama.cpp (macOS M2 tested)

⚠ Known Limitations

⚠ Evaluated primarily on API-Bank and M3ToolEval benchmarks
⚠ vLLM requires significant GPU resources
⚠ LLama.cpp alternative is slower
⚠ 7B parameter models have limited reasoning capacity

Pros

+ 统一动作空间设计显著提升了智能体在复杂任务上的成功率，相比传统Text/JSON方法提升高达20%
+ 集成Python解释器支持代码执行和动态修正，提供了强大的自我纠错和迭代改进能力
+ 提供完整的开源生态系统，包括训练数据集、预训练模型和部署工具，支持研究和生产应用

Cons

- 需要Python环境和代码执行权限，在受限环境下部署存在安全性考虑
- 模型推理和代码执行的双重开销可能增加延迟和计算成本
- 对代码生成质量依赖较高，错误的代码可能导致任务失败或系统异常

Use Cases

• 自动化API集成和数据处理任务，智能体可以动态调用各种API并处理响应数据
• 复杂的多步骤问题解决，如数据分析、文件操作和系统管理任务
• 教育和研究场景中的交互式编程助手，能够执行代码并根据结果调整解决方案

Getting Started

1. 从Hugging Face下载CodeActAgent-Mistral-7b-v0.1模型或使用ollama安装 2. 配置Python解释器环境并安装必要的依赖包 3. 运行聊天界面或集成到现有应用中开始与CodeActAgent交互

Compare code-act

code-act vs claude-code code-act vs llama.cpp code-act vs dify code-act vs OpenHands code-act vs OpenHands code-act vs langgraph