🤖 Karpathy: AI Agents Iterating on the nanochat Code on Their Own
Andrej Karpathy announced that nanochat now trains a GPT-2-capability model in just 2 hours on a single node with 8 H100s, down from about 3 hours a month ago. But the most interesting part of the announcement is not the speed: it is that he has left AI agents iterating on the repository automatically.

In roughly 12 hours, the agents made 110 changes, lowering the validation loss of a d12 model from 0.862 to 0.858 at no additional wall-clock cost. The agent works on a feature branch, tries out ideas, merges them when they work, and iterates. Karpathy admits that over the last 2 weeks he almost feels he has iterated more on the "meta-setup" of how the agents work than on the nanochat code itself.

As he puts it in the post: "I'll just leave this running for a while, go relax a bit and enjoy the feeling of post-agi." Karpathy being Karpathy: announcing that AI agents write better code than he does and then going off to rest. The future of engineering is materializing in real time.
nanochat now trains GPT-2 capability model in just 2 hours on a single 8XH100 node (down from ~3 hours 1 month ago). Getting a lot closer to ~interactive! A bunch of tuning and features (fp8) went in but the biggest difference was a switch of the dataset from FineWeb-edu to NVIDIA ClimbMix (nice work NVIDIA!). I had tried Olmo, FineWeb, DCLM which all led to regressions, ClimbMix worked really well out of the box (to the point that I am slightly suspicious about goodharting, though reading the paper it seems ~ok).

In other news, after trying a few approaches for how to set things up, I now have AI Agents iterating on nanochat automatically, so I'll just leave this running for a while, go relax a bit and enjoy the feeling of post-agi :). Visualized here as an example: 110 changes made over the last ~12 hours, bringing the validation loss so far from 0.862415 down to 0.858039 for a d12 model, at no cost to wall clock time. The agent works on a feature branch, tries out ideas, merges them when they work and iterates. Amusingly, over the last ~2 weeks I almost feel like I've iterated more on the "meta-setup" where I optimize and tune the agent flows even more than the nanochat repo directly.
— @karpathy, on X
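
The loop described in the post (work on a feature branch, try an idea, merge it if it works, repeat) resembles a greedy accept/reject search. Below is a minimal Python sketch of that idea. It is not taken from the nanochat repository: `Trial`, `evaluate_candidate`, and `run_agent_loop` are hypothetical names, the "training run" is simulated with random noise, and the merge rule (lower validation loss without exceeding the original time budget) is an assumption based on the phrase "at no cost to wall clock time".

```python
import random
from dataclasses import dataclass


@dataclass
class Trial:
    """Measured outcome of one candidate change (simulated in this sketch)."""
    val_loss: float
    train_seconds: float


def evaluate_candidate(rng: random.Random, current: Trial) -> Trial:
    """Hypothetical stand-in for 'train on a feature branch and measure'.

    A real setup would run training and read back the metrics; here we
    perturb the current numbers with random noise instead.
    """
    return Trial(
        val_loss=current.val_loss + rng.gauss(0.0, 0.0005),
        train_seconds=current.train_seconds * rng.uniform(0.98, 1.02),
    )


def run_agent_loop(baseline: Trial, n_ideas: int, seed: int = 0):
    """Greedy loop: keep a change only if it beats the best result so far."""
    rng = random.Random(seed)
    best = baseline
    merged = 0
    for _ in range(n_ideas):
        trial = evaluate_candidate(rng, best)
        improves_loss = trial.val_loss < best.val_loss
        within_budget = trial.train_seconds <= baseline.train_seconds
        if improves_loss and within_budget:
            best = trial      # "merge" the feature branch
            merged += 1
        # otherwise the branch is discarded and the next idea starts from `best`
    return best, merged


if __name__ == "__main__":
    # Starting loss taken from the post; the 2-hour budget is expressed in seconds.
    start = Trial(val_loss=0.862415, train_seconds=2 * 3600.0)
    best, merged = run_agent_loop(start, n_ideas=50)
    print(f"merged {merged} of 50 ideas, val_loss {best.val_loss:.6f}")
```

Because a change is accepted only when it improves on the current best, the tracked validation loss can never get worse in this sketch; a real setup would also have to deal with noisy measurements, which this simulation ignores.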
