From Connect4 to multi-agent RL and Artificial Life
[This is a quick un-edited jot-down.]
Since February:
Ran LLM experiments on deception and realized that LLMs still have some pretty big limitations, prompt sensitivity among them.
Running these experiments requires very little technical skill and honestly feels quite superficial: lots of prompt engineering/design, and then you'd really need mech interp to extract anything useful.
Joined the BlueDot AGI Strategy and Technical AI Safety courses. At first I was a bit bamboozled and felt the AGI strategy discussions were too superficial, mostly "AI bad / AI companies bad, regulate the frontier labs". But I've started enjoying the AGI strategy cohort. My group has quite a few cool people. I still feel the discussions are a bit too opinionated and not grounded enough, but the resources and the overall course plan from BlueDot are quite good, and I'm somehow enjoying them. The first Technical AI Safety session was horrendous; felt like I was in a group full of noobs. Switched groups. My group leader is now an ex-OpenAI researcher who works at UK AISI, and the members are all very smart, lots of PhDs in relevant fields. Everyone is quite knowledgeable and we can actually go quite technical in the discussions. Very much enjoying these.
I also did a lot of technical work. To gain a deeper understanding of LLMs and NNs / ML in general, I built a lot of things from scratch: linear regression, MLPs, autograd, attention, microgpt. I continued with RL and built a model via self-play and PPO that is very good at Connect4 (iywey.github.io/connect4-web). This got me thinking more about NN architectures. Went through Stanford's CS 231n. Started working on a second model (the first one was CNN-based, this one a transformer). Looking to investigate what the optimal NN architecture is for this problem, or whether that framing even makes sense. The idea is an optimal "artificial brain" structure that encodes a given problem space (in this case Connect4), like mapping the essence of the game to weights. Part of this would be diving into some mech interp work and seeing what's encoded in the neurons etc. Started training the second model. The first model was pure RL via self-play without search; I liked keeping search out of it. This one is supervised learning based on an alpha-beta-pruning-solved version of the game. Trained it for a bit, but it would require further work. Realized the net wasn't deep enough, so the idea of minimal layers fell apart a bit, in the sense that it's probably better to start with a more abundant NN and then subtract or knowledge-distill. The overall conclusion from this experiment is that the stuff remains super interesting and I think I can get there with more data and more training. But somehow I'm not super motivated to continue training this model or doing the mech interp work. May come back to it in the future though. I guess one of the problems is that I find Connect4 boring. On the other hand, the people I showed this to found it really fun to play against the model.
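The supervised labels for the second model come from an alpha-beta-solved version of the game. As a toy sketch of what generating such labels could look like, here's a depth-limited negamax with alpha-beta cutoffs; the board representation, depth, and function names are all my own illustration (a real solver searches to the end of the game and orders moves, this one just shows the shape of the idea):

```python
# Toy alpha-beta labeler for Connect4 (illustrative, not the actual solver).
# Board: list of 7 columns, each a bottom-up list of pieces (1 or -1).

ROWS, COLS = 6, 7

def legal_moves(board):
    return [c for c in range(COLS) if len(board[c]) < ROWS]

def drop(board, col, player):
    new = [list(c) for c in board]
    new[col].append(player)
    return new

def is_win(board, player):
    # Dense grid, 0 = empty, then check all 4-in-a-row directions.
    g = [[board[c][r] if r < len(board[c]) else 0 for r in range(ROWS)]
         for c in range(COLS)]
    for c in range(COLS):
        for r in range(ROWS):
            for dc, dr in ((1, 0), (0, 1), (1, 1), (1, -1)):
                cells = [(c + i * dc, r + i * dr) for i in range(4)]
                if all(0 <= x < COLS and 0 <= y < ROWS and g[x][y] == player
                       for x, y in cells):
                    return True
    return False

def negamax(board, player, depth, alpha=-2, beta=2):
    """Score in {-1, 0, 1} from `player`'s perspective, depth-limited."""
    if is_win(board, -player):
        return -1                      # the previous move won the game
    moves = legal_moves(board)
    if depth == 0 or not moves:
        return 0                       # horizon/draw: call it neutral
    best = -2
    for col in moves:
        score = -negamax(drop(board, col, player), -player, depth - 1,
                         -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break                      # alpha-beta cutoff
    return best

def best_move(board, player, depth=6):
    """The move with the highest negamax score: a supervised label."""
    return max(legal_moves(board),
               key=lambda c: -negamax(drop(board, c, player), -player,
                                      depth - 1))
```

With three in a row on the bottom, `best_move` picks the winning (or blocking) fourth column, which is exactly the kind of (position, move) pair the transformer would be trained on.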
After Connect4 I went into physics sims, MuJoCo specifically. Looked into running RL to get a humanoid to stand. The idea is to do RL via self-play on MMA / Muay Thai. A further idea is to get a humanoid that represents my body, perhaps even a body suit to track my movements, train a model that fights like me, then let it fight against itself to see how an optimal Wey might fight. The idea could be valuable for professional fighters and may generalize to other sports, but it would be even cooler to go into robotics and build a robot that I can spar with. The whole thing is quite ambitious. Currently on hold. For a long time I've wanted to write C, especially after hearing some great programmers say that the one language everyone should learn/write is C. Usually there's something deep behind these statements, and I've been wanting to find out for a while. I got an intro to C in the first semester of my CS bachelor's, but have forgotten it since. So one reason I started this physics-sim arc is that I wanted to write C, and part of the reason it's on hold is that I want to go deeper into C. (The C isn't very deep here, just a little bit; the training is all in Python.)
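For the "get a humanoid to stand" step, the usual approach is a shaped reward. A minimal sketch of what that could look like, assuming we can read a torso height, an uprightness measure, and the control vector out of the sim (the target height, weights, and function name here are made up for illustration; in MuJoCo these quantities come out of `qpos`/`qvel` and the actuator controls):

```python
# Illustrative standing reward: stay tall + stay upright - control effort.
# All constants are assumptions, not tuned values from my actual setup.

def standing_reward(torso_height, torso_upright, ctrl, target_height=1.3):
    """torso_upright is the z-component of the torso's up vector (1 = vertical)."""
    height_bonus = max(0.0, 1.0 - abs(torso_height - target_height))  # 1 at target
    upright_bonus = max(0.0, torso_upright)          # 0 when horizontal/fallen
    effort_cost = 0.001 * sum(u * u for u in ctrl)   # discourage flailing
    return height_bonus + upright_bonus - effort_cost
```

A perfectly standing, motionless humanoid gets roughly 2.0 per step; a fallen one gets roughly 0, which is the gradient the policy climbs before self-play ever enters the picture.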
Interlude-wise, I also got an Arduino starter kit and started playing around with some electrical engineering stuff. I find the idea of building a small robot interesting: one that has vision and face recognition and perhaps runs (or calls) a local LLM, drives around my home, recognizes friends, chats them up. Local LLMs in general are something I find quite interesting. I also had this idea of building a voice AI loop, i.e. I can always talk to my PC and tell it to write notes / summarize / search notes and write directly to my Obsidian. Feels like there are lots of small apps / QoL improvements where local LLMs especially could shine.
For a while I've wanted to watch the Performance-Aware Programming series by Casey Muratori. Decided to finally do so. Loved the intro. It definitely matches the way I generally like to learn: from first principles, from the deep fundamentals. This course is a great fit and I really enjoy his philosophy that a great programmer should understand the machine you're actually programming and the actual instructions you're writing. This motivated me to write a game engine in C. I now have a ticking game server running that you can send API requests to via HTTP. Wrote everything in C. Brainstorming on where to take it next led me to multi-agent RL: I'm thinking about creating a virtual world in which I'll birth/train some AI agents to inhabit. The general idea behind the game so far was to base it on sound economic principles, with trade, scarce resources, and no external inflation, and run this experiment of an emergent economy. In the game you can fish, cut wood, and do all kinds of things that produce goods. You can trade them. You need to survive, etc. Initially I wanted to just get some friends on there and see what emerges, but now I like the multi-agent AI version. I also found this research area called "artificial life". Might connect.
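The heart of a ticking game server is a fixed-timestep loop: advance the world at a constant rate regardless of how long each update takes. A minimal sketch of that shape, in Python for brevity (my actual server is C, and the tick rate and callback here are illustrative, not taken from it):

```python
# Fixed-timestep tick loop: the world advances in discrete, evenly spaced ticks.
# ticks_per_second and the update callback are assumptions for illustration.
import time

def run_server(update, ticks_per_second=20, max_ticks=None):
    dt = 1.0 / ticks_per_second
    tick = 0
    next_tick = time.monotonic()
    while max_ticks is None or tick < max_ticks:
        update(tick, dt)                      # advance the world one step
        tick += 1
        next_tick += dt
        sleep_for = next_tick - time.monotonic()
        if sleep_for > 0:
            time.sleep(sleep_for)             # don't busy-wait; if we're late,
                                              # skip sleeping and catch up
    return tick
```

Scheduling against an absolute `next_tick` instead of sleeping a fixed `dt` keeps the tick rate stable even when an `update` occasionally runs long, which matters once agents (human or RL) depend on consistent world time.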
So right now I'll continue with my BlueDot courses, and I'll continue building my technical skills with projects that really interest me. My commitment to AI safety in the last post was still too early: there is stuff about the area that I find important and interesting, but there are also things I already don't like about the community. Ultimately I'm still in an exploration phase. The only thing I do know is that I really enjoy building technical competence.