Autoresearch is an innovative project designed to streamline AI hyperparameter tuning. With a unique background researcher that operates with low process priority, it allows for seamless optimization on any hardware, without the need for specialized GPUs. Unleash the power of local research with simple commands and enjoy interactive demos after finding optimal settings.
Universal CPU Edition (Matti A. Pöysti Fork)
This project presents an optimized version of the autoresearch framework, designed to operate seamlessly on a wide range of hardware, including standard CPUs, Apple Silicon, and NVIDIA GPUs. It eliminates the dependency on Flash Attention 3 and H100 GPUs, making it ideal for conducting local AI research on everyday computers.
Autonomous "Folding Mode"
The system features an innovative "Always-On" research loop, inspired by successful initiatives like Folding@home. This background process runs with low process priority, preserving the performance of your computer while it diligently searches for optimal AI hyperparameters.
To initiate the background researcher, execute the following command:
```bash
python agent.py
```
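The exact mechanism `agent.py` uses to lower its own priority isn't shown here, but the Folding@home-style politeness described above can be sketched as follows; the function name and niceness value are illustrative, not the project's actual code:

```python
import os
import subprocess
import sys

def launch_low_priority(cmd):
    """Launch a command as a low-priority child process (POSIX).

    Raising the child's niceness before it execs makes the OS
    schedule it behind interactive work, so a long-running
    background researcher stays out of the user's way.
    """
    return subprocess.Popen(cmd, preexec_fn=lambda: os.nice(10))
```

On Windows, passing `creationflags=subprocess.BELOW_NORMAL_PRIORITY_CLASS` to `Popen` achieves a similar effect, since `preexec_fn` is POSIX-only.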
This process will:
- Consult a local Ollama model (Qwen 2.5 0.5b) for enhancements.
- Automatically modify the `train.py` file.
- Conduct a 5-minute training experiment.
- Log every result in `results.tsv`.
- Execute an automatic `git commit` for any improvements made.
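The logging and commit-on-improvement steps of this loop might look like the sketch below; the column names, file paths, and function names are assumptions for illustration, not the real `agent.py` internals:

```python
import csv
import subprocess
import time
from pathlib import Path

def log_result(results_path, change, val_bpb):
    """Append one experiment's outcome to a tab-separated results file,
    writing a header row on first use."""
    path = Path(results_path)
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        if is_new:
            writer.writerow(["timestamp", "change", "val_bpb"])
        writer.writerow([time.strftime("%Y-%m-%d %H:%M:%S"),
                         change, f"{val_bpb:.4f}"])

def commit_if_improved(val_bpb, best_so_far, message):
    """Keep the edit only when val_bpb improved; otherwise revert
    train.py so the next iteration starts from the best known state."""
    if val_bpb < best_so_far:
        subprocess.run(["git", "commit", "-am", message], check=False)
        return val_bpb
    subprocess.run(["git", "checkout", "--", "train.py"], check=False)
    return best_so_far
```

Because `results.tsv` is append-only and every improvement is a separate commit, the morning review reduces to reading one file and one `git log`.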
Interactive Chat Demos
Once optimal training settings have been established for your CPU, you can try out the resulting models through interactive demos:
- Small (0.8M Params): Fast training with simple patterns.

  ```bash
  uv run chat_demo.py
  ```

- Medium (10M Params): Enhanced language understanding, requiring approximately 20 minutes of training.

  ```bash
  uv run chat_demo_medium.py
  ```
Project Overview
The foundational principle behind this repository is to empower an AI agent with a modest but functional LLM training setup, allowing it to autonomously experiment overnight. Each iteration involves code modification, a 5-minute training session, performance evaluation, and logging of results. Upon waking, one can review the logged experiments to discover potential improvements in the model.
The training code is a streamlined, single-GPU implementation of nanochat. Researchers steer the system through the `program.md` file, which holds the instructions the AI agent follows and thus organizes the autonomous research. The default `program.md` starts as a minimal baseline, designed to be developed iteratively over time as you direct the research toward different parameters and configurations.
Technical Details
The repository structure is intentionally kept concise, featuring three primary files:
- `prepare.py`: Handles constant values, one-time data preparation (such as downloading training data and training a BPE tokenizer), and runtime utilities (data loader, evaluation). This file is not modified during operation.
- `train.py`: The central file, containing the full GPT model, optimizer (Muon + AdamW), and training loop. This file is editable by the AI agent.
- `program.md`: Serves as the guideline for the agent, providing a framework for autonomous experimentation. This file is human-editable, allowing for personalized research direction.
Training sessions are constrained to a fixed 5-minute time budget (measured excluding startup and compilation time), ensuring a uniform basis for comparison across configurations. The success metric is `val_bpb` (validation bits per byte), where lower is better; unlike a per-token loss, it remains comparable across different vocabulary sizes and architectural modifications.
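The reason `val_bpb` is tokenizer-independent is that it normalizes by bytes of raw text rather than by tokens. Assuming the training loop reports summed cross-entropy in nats (as PyTorch's loss does), the conversion is a small calculation; the function name is illustrative:

```python
import math

def val_bpb(total_nats, total_bytes):
    """Convert summed cross-entropy over a validation set (in nats)
    into bits per byte of the underlying raw text.

    Dividing by ln(2) converts nats to bits; dividing by byte count
    (not token count) makes the metric comparable across tokenizers
    with different vocabulary sizes.
    """
    return total_nats / math.log(2) / total_bytes
```

As a sanity check, a model that assigns probability 1/256 to every byte incurs ln(256) nats per byte, which works out to exactly 8 bits per byte, i.e. no compression at all.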