autoresearch

AI agents running research on single-GPU nanochat training automatically

Stars: 62.2k
Stars/month: +29,588
Releases (6m): 0

Star Growth

+4.8k (8.2%)
[Chart: star count over time; y-axis ticks 57.1k, 60.7k, 64.3k; x-axis ticks Mar 27, Apr 1]

Overview

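autoresearch is an autonomous AI research tool that lets AI agents run LLM training experiments automatically on a single GPU. Built on a simplified nanochat implementation, it lets an agent work unattended overnight: modify the training code, train for 5 minutes, check whether the result improved, keep or discard the change, and repeat. You wake up to a full experiment log and (hopefully) a better model. The core idea is to steer the agent by writing the `program.md` file rather than editing the Python code directly. The agent is free to change every element of training, including the architecture, hyperparameters, optimizer, and batch size. A fixed 5-minute time budget and the val_bpb metric (validation bits per byte) keep comparisons between different architectural changes fair. The project presents this as a new paradigm for autonomous AI research, in which research no longer needs continuous human supervision.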
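The modify, train, evaluate, keep-or-discard cycle described above can be sketched as a simple greedy loop. This is a minimal illustration under stated assumptions, not the project's actual code: `evaluate`, `propose`, and `run_loop` are hypothetical names, the "training run" is a toy formula instead of a real 5-minute run, and the "agent edit" is a random rescaling of a learning rate instead of a change to train.py.

```python
import math
import random


def evaluate(config: dict) -> float:
    """Toy stand-in for a 5-minute training run; returns a fake val_bpb.

    Hypothetical: the score is lowest when lr is 1e-2. Lower is better.
    """
    return 1.0 + abs(math.log10(config["lr"]) + 2.0)


def propose(config: dict, rng: random.Random) -> dict:
    """Toy stand-in for the agent's edit: rescale the learning rate."""
    candidate = dict(config)
    candidate["lr"] = config["lr"] * rng.uniform(0.5, 2.0)
    return candidate


def run_loop(baseline: dict, n_experiments: int, rng: random.Random):
    """Keep a change only if it lowers the metric; otherwise discard it."""
    best_config = dict(baseline)
    best_bpb = evaluate(best_config)
    history = []  # one (experiment index, val_bpb, kept?) entry per run
    for i in range(n_experiments):
        candidate = propose(best_config, rng)
        bpb = evaluate(candidate)
        kept = bpb < best_bpb
        if kept:
            best_config, best_bpb = candidate, bpb
        history.append((i, bpb, kept))
    return best_config, best_bpb, history


best_config, best_bpb, history = run_loop({"lr": 1e-4}, 50, random.Random(0))
print(f"best lr={best_config['lr']:.4g}, val_bpb={best_bpb:.4f}")
print(f"kept {sum(k for _, _, k in history)} of {len(history)} changes")
```

Because a change is kept only when the metric strictly improves, the best score never gets worse over the night, and the history doubles as the experiment log.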

Deep Analysis

Key Differentiator

Karpathy's concept of AI agents autonomously running ML experiments overnight, in contrast to traditional hyperparameter search tools, which do not modify architecture or code

Capabilities

  • Autonomous AI-driven LLM training experiments
  • Automated code modification, training, evaluation loop
  • Fixed 5-minute time-budgeted experiments for fair comparison
  • Experiment logging with val_bpb metric tracking
  • Agent-editable single-file training setup (train.py)
  • Markdown-based agent programming (program.md)
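For readers unfamiliar with the val_bpb metric listed above, bits per byte is commonly defined as the total cross-entropy on the validation text, converted from nats to bits and divided by the number of bytes of that text. The sketch below shows that general definition; the function names are hypothetical, and it is an assumption that the repository computes its metric in exactly this way.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (in nats) to bits per byte."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_nll_nats / (math.log(2) * total_bytes)


def bpb_from_token_loss(mean_loss_per_token: float, n_tokens: int, n_bytes: int) -> float:
    """Same metric, starting from a mean per-token loss in nats."""
    return bits_per_byte(mean_loss_per_token * n_tokens, n_bytes)


# A model that guesses uniformly over 256 byte values scores about 8 bits per byte.
uniform_bpb = bits_per_byte(1000 * math.log(256), 1000)
print(round(uniform_bpb, 6))

# Two hypothetical tokenizers over the same 1000 bytes of text:
# 250 tokens at 2.0 nats each, or 500 tokens at 1.0 nats each.
coarse = bpb_from_token_loss(2.0, 250, 1000)
fine = bpb_from_token_loss(1.0, 500, 1000)
print(coarse == fine)  # the per-byte score is the same for both
```

Normalizing by bytes rather than tokens is what lets runs with different tokenizers or vocabulary sizes be compared on one scale.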

Integrations

Claude Code, Codex, PyTorch, NVIDIA CUDA

Best For

  • Researchers exploring autonomous ML experiment iteration
  • Learning about AI-driven research automation
  • Overnight autonomous hyperparameter/architecture search

Not Ideal For

  • Production model training or deployment
  • Teams without GPU access

Languages

Python

Deployment

Local single-GPU setup, uv package manager

Pricing Detail

Free: Open source MIT, free to run locally
Paid: N/A (GPU hardware and AI agent API costs still apply)

Known Limitations

  • Requires an NVIDIA GPU (tested on H100; smaller GPUs work via forks)
  • Single-GPU only, no distributed training
  • Narrow scope: only LLM training optimization experiments
  • Results not comparable across different hardware
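The last limitation follows from the fixed wall-clock budget: a faster GPU completes more optimizer steps in the same 5 minutes, so the same code yields a different result. The sketch below illustrates this with a simulated clock. `train_with_budget`, `FakeClock`, and `simulate` are hypothetical names, and the per-step times are made-up values for illustration, not measurements.

```python
import time
from typing import Callable


def train_with_budget(
    step_fn: Callable[[], None],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run step_fn until the time budget is used up; return the step count."""
    start = clock()
    steps = 0
    while clock() - start < budget_seconds:
        step_fn()
        steps += 1
    return steps


class FakeClock:
    """Simulated clock so the example runs instantly and deterministically."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


def simulate(seconds_per_step: float, budget_seconds: float = 300.0) -> int:
    """Count the steps that fit in the budget at a given per-step cost."""
    clock = FakeClock()
    return train_with_budget(
        lambda: clock.advance(seconds_per_step), budget_seconds, clock
    )


fast_steps = simulate(0.25)  # hypothetical faster GPU
slow_steps = simulate(1.0)   # hypothetical slower GPU
print(fast_steps, slow_steps)  # 1200 300
```

On a single machine the budget is a fair yardstick, since every experiment gets the same compute; across machines the step counts differ, so scores should only be compared within one hardware setup.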

Pros

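  • + Fully autonomous overnight experimentation: hundreds of training iterations without human intervention
  • + A simple three-file design that reduces complexity while keeping experiments flexible
  • + A fixed time budget that keeps comparison and evaluation fair across experiment configurations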

Cons

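  • - Limited to a single GPU; cannot scale to large distributed training
  • - The fixed 5-minute training window may not be enough to train complex models or large datasets adequately
  • - Requires NVIDIA GPU hardware, which raises the barrier to entry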

Use Cases

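  • Automated hyperparameter tuning: the AI agent explores learning rate, batch size, and optimizer settings
  • Neural architecture search: the agent tries different model designs and layer configurations on its own
  • Unattended overnight research runs that keep compute busy with continuous optimization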

Getting Started

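1. Prepare an environment with a single NVIDIA GPU and clone the repository.
2. Edit the program.md file to set the AI agent's research instructions and goals.
3. Start the autonomous agent loop and let it begin experimenting and iterating on training automatically.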
