SWE-agent

SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]

Stars: 18.9k
Stars/month: +240
Releases (6m): 0

Star Growth

+38 (0.2%)
[Chart: star count from Mar 27 to Apr 1, axis range 18.5k to 19.3k]

Overview

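SWE-agent is an open-source tool developed by researchers at Princeton University and Stanford University. It uses large language models (such as GPT-4o and Claude Sonnet 4) to autonomously fix real issues in GitHub repositories. It achieves state-of-the-art results among open-source projects on the SWE-bench benchmark, is configured through a single YAML file, and gives the language model maximal autonomy in using tools to solve programming problems. Beyond automated code fixes, SWE-agent can also be used for cybersecurity vulnerability discovery and competitive programming challenges. The project is designed for research, with a simple architecture that is easy to modify. Note that the development team now focuses mainly on mini-swe-agent, a simpler successor that keeps the same performance while greatly simplifying the implementation. SWE-agent demonstrates the potential of AI in software engineering automation, particularly in code repair and vulnerability detection, and gives researchers and developers a powerful tool for exploring the limits of autonomous coding agents.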

Deep Analysis

Key Differentiator

Princeton/Stanford research project achieving SoTA on SWE-bench — the most rigorous benchmark for automated software engineering — with a simple, hackable design that leaves maximal agency to the LLM

Capabilities

  • Autonomous software engineering agent for GitHub issues
  • State-of-the-art on SWE-bench benchmarks
  • Configurable via single YAML file
  • Offensive cybersecurity (CTF) capabilities (EnIGMA)
  • Custom task support beyond code fixing
  • Support for multiple LLMs (GPT-4o, Claude, etc.)

Integrations

GitHub, OpenAI GPT-4o, Anthropic Claude, Docker, GitHub Codespaces

Best For

  • Automated bug fixing and issue resolution in GitHub repos
  • Research on AI-driven software engineering capabilities

Not Ideal For

  • General-purpose AI assistants (not designed for chat)
  • Teams wanting actively maintained tooling (consider mini-SWE-agent)

Languages

Python

Deployment

pip install, Docker, GitHub Codespaces

Pricing Detail

Free: Fully open source (MIT)
Paid: N/A — free (LLM API costs apply)

Known Limitations

  • Now in maintenance mode — mini-SWE-agent is the successor
  • Requires powerful LLM (GPT-4o/Claude) for good results — API costs can be significant
  • Limited to code-related tasks — not a general-purpose agent
  • Complex setup for custom environments
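To make the API-cost point concrete, here is a rough back-of-the-envelope sketch. The token counts and per-million-token prices are hypothetical placeholders, not measurements of SWE-agent or quotes from any provider; substitute the figures from your own runs and your provider's price list.

```python
def estimate_run_cost(input_tokens, output_tokens,
                      usd_per_million_input, usd_per_million_output):
    """Estimate the API cost in USD of one agent run from its token usage."""
    total = (input_tokens * usd_per_million_input
             + output_tokens * usd_per_million_output)
    return total / 1_000_000


def estimate_batch_cost(per_issue_cost, n_issues):
    """Scale a per-issue estimate to a batch of issues."""
    return per_issue_cost * n_issues


if __name__ == "__main__":
    # Hypothetical numbers: an agent re-sends its growing context on every
    # step, so input tokens usually dominate output tokens.
    per_issue = estimate_run_cost(
        input_tokens=1_000_000,
        output_tokens=20_000,
        usd_per_million_input=2.50,    # placeholder price
        usd_per_million_output=10.00,  # placeholder price
    )
    print(f"per issue: ${per_issue:.2f}")
    print(f"300 issues: ${estimate_batch_cost(per_issue, 300):.2f}")
```

Even a few dollars per issue adds up quickly across a benchmark-sized batch, which is why a per-run cost cap is worth setting.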

Pros

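  • + Achieves state-of-the-art performance among open-source projects on the SWE-bench benchmark
  • + Supports multiple mainstream large language models (GPT-4o, Claude Sonnet 4, etc.) with flexible configuration
  • + Designed for research, with a simple architecture and thorough documentation; easy to customize and extend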

Cons

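  • - Development focus has shifted to mini-swe-agent, so maintenance of the original project may suffer
  • - Aimed mainly at research use; stability and reliability in production may fall short of commercial solutions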

Use Cases

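  • Automatically fixing code issues and bugs in GitHub repositories
  • Vulnerability discovery and penetration testing in cybersecurity
  • Automated solving of competitive programming and algorithm challenges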

Getting Started

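1. Clone the SWE-agent repository to your local environment.
2. Edit the YAML configuration file to specify the target language model and API key.
3. Run the tool with a GitHub issue URL to start the automated fix.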
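The final step can be sketched as a small helper that assembles the command line for one issue. The `sweagent run` subcommand and the dotted option names are written from memory of the project's documented CLI and should be treated as assumptions; check them against `sweagent run --help` for your installed version. `build_run_command` is a hypothetical helper, and the `OWNER/REPO` URLs are placeholders. Nothing is executed; the script only prints the command.

```python
import shlex


def build_run_command(model_name, repo_url, issue_url, cost_limit=None):
    """Assemble an argv list for a single-issue run (hypothetical helper).

    Option names are assumed from the SWE-agent docs; verify them locally.
    """
    argv = [
        "sweagent", "run",
        f"--agent.model.name={model_name}",
        f"--env.repo.github_url={repo_url}",
        f"--problem_statement.github_url={issue_url}",
    ]
    if cost_limit is not None:
        # Optional per-issue spending cap in USD (assumed option name).
        argv.append(f"--agent.model.per_instance_cost_limit={cost_limit:.2f}")
    return argv


if __name__ == "__main__":
    cmd = build_run_command(
        "gpt-4o",
        "https://github.com/OWNER/REPO",           # placeholder repository
        "https://github.com/OWNER/REPO/issues/1",  # placeholder issue
        cost_limit=2.0,
    )
    # Print a shell-safe command for review instead of running it.
    print(shlex.join(cmd))
```

Building the arguments as a list rather than a single string keeps quoting correct if the list is later passed to `subprocess.run`.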
