What Is BabyAGI? What You Need To Know About the Autonomous Agent

Is BabyAGI about to change everything? In this interview, Yohei Nakajima, the creator of BabyAGI, discusses the origins of the project and the inspiration behind the name. He also delves into the framework used to build BabyAGI and the potential future of app development. The interview touches on the three agents used in the framework: the execution agent, the task creation agent, and the prioritization agent. Nakajima also discusses some of the cool things people have built using BabyAGI, including web-based versions and NPCs for RPG game engines. Overall, the interview provides insight into the approach behind BabyAGI and the potential for its framework to be used in a variety of applications. This talk was originally delivered at Arize:Observe, a conference on the intersection of large language models, generative AI, and AI observability in the era of LLMOps.

Introduction

"Hello, welcome back to another amazing session here at Arize AI. My name is Aparna Dhinakaran, one of the founders."

Development of BabyAGI [1:07]

"Yeah, so, when it comes to building with AI, I probably started building on top of GPT-3 around August last year. We've built a lot of tools just as experiments. Some of the tools we've built we use internally at Untapped, some we've exposed to our portfolio founders and our LPs, and some are more general open tools for founders. This most recent one was probably experiment number 70 when it started. I was looking at the HustleGPT movement, which is really cool, where people are saying, 'I have a hundred dollars,' and using ChatGPT as a co-founder. I wanted to participate but just didn't have time, and that got me thinking: could I just build an AI founder to do this without me being part of it?"

Usage and Applications of BabyAGI [2:55]

"I feel like I've been seeing all these demos on Twitter of people trying out BabyAGI and using it for their own use cases. What are some cool things you've seen people build with BabyAGI?"

"Some of the demos I've tried that are really cool are the web-based versions, sometimes with a human in the loop. God Mode is pretty cool to play with: instead of running fully autonomously, it suggests the tasks and you approve and push things through. There are a couple of others, AgentGPT and a few more, that have been pretty interesting to play with. There's another one where someone's using a modified BabyAGI to build NPCs that walk around an actual RPG game engine."

Inspiration Behind the Name 'BabyAGI' [3:47]

"It was my friend Jenny, Jenny AI on Twitter. She was the first person to be like, 'Bro, did you just build a baby AGI?' in response to me sharing my AI founder, and it just kind of stuck. After I showed the demo and it became popular, I needed a research paper, so I threw the code into a research paper and named it the task-driven autonomous agent. A lot of people asked if I was going to open source it, which I wanted to do, but 'task-driven autonomous agent' was a mouthful, so I picked the first nickname, the one Jenny gave it: BabyAGI."

Building BabyAGI: Framework and Codebase [4:48]

"The task-driven autonomous agent had LangChain, and I even got Zapier NLA working in the execution agent, which was really cool because it could do things like search the web and add to files. But for me, BabyAGI was really an introduction to a framework. I stripped all of that out when I released the code base and tried to pare it down as much as I could. It was 150 lines, but only 105 lines of code including prompts. The whole point was to introduce the framework, which is why it mostly used prompts instead of any other actions."

Components of BabyAGI [6:01]

"The prioritization agent initially was actually just a deduping agent, because I saw it making duplicate tasks. But then I realized that if it's going to dedupe anyway, I might as well ask it to reprioritize and decide which task to do first. I've seen some people do tasks and subtasks, either creating subtasks within a task to break it down or creating relationships where tasks have to be finished one after another."

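The loop described in this interview (execute a task, create new tasks from the result, then dedupe and reprioritize the list) can be sketched in a few lines of Python. This is only a minimal illustration, not the BabyAGI code base: the function names `run_loop`, `dedupe`, `fake_execute`, `fake_create`, and `fake_prioritize` are hypothetical, and the three "agents" are plain functions standing in for LLM prompts so the sketch runs offline.

```python
from collections import deque
from typing import Callable, List, Tuple


def dedupe(tasks: List[str]) -> List[str]:
    """Drop blank and repeated tasks (case-insensitive), keeping first occurrences."""
    seen = set()
    unique = []
    for task in tasks:
        cleaned = task.strip()
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


def run_loop(
    objective: str,
    first_task: str,
    execute: Callable[[str, str], str],
    create_tasks: Callable[[str, str, str, List[str]], List[str]],
    prioritize: Callable[[str, List[str]], List[str]],
    max_steps: int = 5,
) -> List[Tuple[str, str]]:
    """Run the execute -> create -> prioritize cycle for at most max_steps tasks."""
    queue = deque(dedupe([first_task]))
    completed: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        if not queue:
            break
        task = queue.popleft()
        result = execute(objective, task)  # execution agent
        completed.append((task, result))
        # Task creation agent: propose follow-up tasks from the latest result.
        new_tasks = create_tasks(objective, task, result, list(queue))
        # Dedupe, skip anything already done, then let the prioritizer reorder.
        done = {t.lower() for t, _ in completed}
        pending = [t for t in dedupe(list(queue) + new_tasks) if t.lower() not in done]
        queue = deque(prioritize(objective, pending))  # prioritization agent
    return completed


# Hypothetical stand-ins for the three LLM prompts (assumptions, not real agents).
def fake_execute(objective: str, task: str) -> str:
    return f"done: {task}"


def fake_create(objective: str, task: str, result: str, pending: List[str]) -> List[str]:
    if task == "Draft a plan":
        return ["Write summary", "Research competitors", "research competitors"]
    return []


def fake_prioritize(objective: str, tasks: List[str]) -> List[str]:
    return sorted(tasks)  # a real agent would ask the model to reorder the list


history = run_loop("Launch a newsletter", "Draft a plan",
                   fake_execute, fake_create, fake_prioritize)
print([task for task, _ in history])
```

The `max_steps` cap is a choice made for this sketch so the example terminates; an agent that "never stops" would simply keep looping while the queue is non-empty.
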
Evolution and Impact of BabyAGI [7:46]

"There are a lot of different opinionated ways in which you can evolve a simple framework like BabyAGI, which is what we're seeing. One of the things I'm most interested in is the conversation around the different approaches people are taking. So we'll continue to grow the BabyAGI code base, but I also want to make sure we're tracking and following all the different iterations of it, so that we can talk to each other about why you did it that way and how it worked, and people can learn from each other instead of building in a silo."

Innovations in Prioritization with BabyAGI [8:21]

"The thing that, to be honest, was really novel to me as I was going through it was the prioritization agent. In the past, with agents, you really had to give them step by step what you wanted them to do and the order you wanted the execution in. It was really novel to think: why not give it the ability to prioritize, some kind of flexibility there, so it can reprioritize and be smart about it? That was really novel for me."

Memory and Reflection in BabyAGI [9:02]

"Any examples you've seen of users implementing long-term memory and reflection in the prioritization agent, as referenced in that generative agents paper that came out two weeks ago, the one with the simulation, the Westworld simulation that's called Civilization?"

"We do have memory in BabyAGI. We store past tasks and we retrieve the most relevant ones to create new tasks, to hopefully minimize repetitiveness to some extent, which is absolutely not solved. I have heard of people wanting to implement reflection, although I'm not sure exactly where that goes. When we use LangChain in the execution agent, they have the reflection agent built in, so that was pretty interesting. I think there are various ways to add reflection, which I'm definitely seeing, especially when it comes to code: reviewing the code before executing it, or reviewing the errors."

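The memory idea mentioned here, storing past tasks and retrieving the most relevant ones when creating new tasks, might look roughly like the sketch below. It is an illustration under stated assumptions rather than BabyAGI's implementation: the `TaskMemory` class and its methods are hypothetical, and a simple bag-of-words cosine similarity stands in for whatever embedding model and vector store a real system would use.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def vectorize(text: str) -> Counter:
    """Turn text into word counts (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TaskMemory:
    """Hypothetical store of completed tasks and their results."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, str, Counter]] = []

    def add(self, task: str, result: str) -> None:
        self._entries.append((task, result, vectorize(f"{task} {result}")))

    def most_relevant(self, query: str, k: int = 2) -> List[str]:
        """Return up to k past tasks most similar to the query, best first."""
        q = vectorize(query)
        scored = [(cosine(q, vec), task) for task, _, vec in self._entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [task for score, task in scored[:k] if score > 0]


memory = TaskMemory()
memory.add("Research competitor pricing", "Found three pricing tiers")
memory.add("Book a flight", "Flight booked for Monday")
memory.add("Summarize pricing research", "Pricing summary written")

# Context a task creation prompt could be given before proposing new tasks.
print(memory.most_relevant("pricing strategy"))
```

Feeding the retrieved tasks into the task creation prompt gives the model a chance to notice work that has already been done, which is one way to reduce (though, as noted in the interview, not eliminate) repeated tasks.
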
Iteration and Development of BabyAGI [11:04]

"How often have you iterated on the prompts for the execution agent, the task creation agent, and the prioritization agent? Do you see people still continuing to iterate on them?"

"I haven't iterated on them much. There was some iteration at the beginning when I created it, but I think I kept it pretty straightforward and simple. I did zero-shot on all of them, again because what I really wanted was to give people something they could build on top of, and I wanted it to be as unopinionated as possible when I gave it to them. I think that's what inspired a lot of people. I had a lot of people reach out who hadn't coded in a long time, saying, 'Your BabyAGI got me to code.' I think it was partially that when people saw that code, they immediately thought of things they would want to do with it, and I think that's what was really exciting about BabyAGI's code."

Examples of Implementation in BabyAGI [12:10]

"Can you give me some examples of what people have put in the execution agent, the task creation agent, and the prioritization agent?"

"Full transparency, I haven't reviewed the code bases of a lot of the tools that are being built. The first one I can think of is LangChain. LangChain actually went in because my LangChain integration in the execution agent was kind of sloppy, but I shared the code with Harrison and I think he used that as a base. LangChain published a demo version of BabyAGI where you have LangChain as the execution agent. That's pretty powerful. Also, if you just stick Zapier NLA in there, you can access anything that Zapier has access to, so that's pretty powerful too. I know a few people have put the ability to write and execute code in the execution function, so that's pretty interesting."

Community Growth and Future of BabyAGI [13:25]

"How has it been for you to see how it's taken off, and the communities around it?"

"It's been really fun to watch. It's moving really fast. It definitely wasn't expected when I started, because it was just another experiment, but it's been a really great way to meet a lot of founders and a lot of VCs. I actually took a 10-day family trip during that. I tried to stay off Twitter, but it was impossible; there was just too much happening."

The Future of Agent Technologies [14:05]

"Where do you see the future of agents? There's some stuff people are building today, and then there's some far-fetched stuff, like the Westworld paper and maybe gaming applications. Where do you see real-world use cases or applications for agents going?"

"Well, I think it's helpful to clarify that agents aren't new; LangChain has been using agents. I think BabyAGI was interesting just because it never stopped. We have a couple of agent-type tools we use internally, like one that drafts an investment memo for us based on a URL. It scrapes the website and pings the Product Hunt API, Crunchbase data, and BuiltWith. As for where it goes, I think initially we'll see a lot of specific, tailored agents that people might use internally or for their own purposes, because it's easier to build something for yourself than for other people. Then, as those start working, some people will start building tools for other people to use. Again, I think they'll start specialized, because it's easier to fine-tune the prompts and easier to pick the databases they should look at. Then in parallel, as we learn how to build these well, you'll see a couple of people building generic ones that will probably do simple tasks, and over time, maybe learning from the specializations, the general agent will become better and better. It's going to take a little while until you have an agent that can just do anything."

Adoption and Development of Autonomous Agents [15:53]

"We're at the beginning of the hype cycle for autonomous agents, where everyone's excited. It's already almost at the peak, and then there's going to be a little winter before really useful tools come out. It's just that with AI, I don't know how long the winter is going to be. It could be three weeks for all I know. This is the winter we're living in right now. Right now, people are already disappointed that BabyAGI can't do everything for them. It was not supposed to; it's something to build on."

Observability in Autonomous Agents [16:32]

"Seeing people adopt agents has got me thinking: now you have agents acting autonomously. How do you think about observability, your visibility into what these autonomous agents are doing today or could do in the future?"

"It's important, obviously. With any technology, any automation, and I'm a huge Zapier guy, I've had tens of thousands of Zaps running, any of them could technically do damage. But with autonomous decisions, where you're letting the computer decide what to do, the risk is obviously higher, especially if you start giving it capabilities like sending money. It's important to understand the risks of the specific use case you're using it for and understand where it might all go wrong. Make sure you have a human in the loop as you're testing it, and only remove the human in the loop to let it run on its own when you have a certain confidence level, with monitors set up to make sure you catch any potential errors, and hopefully you've thought of all the potential errors it could run into. There's probably no place where observability for AI becomes more important than with agents acting autonomously."

Monitoring and Evaluating Autonomous Agents [18:48]

"Have you been following the LLM evaluation or LLM-assisted evaluation space, and any thoughts there?"

"I've followed it with interest, but mostly from an 'is there a better model I should be playing with' perspective. I do know it's important. I've seen a couple of tools that build monitoring layers on top of these APIs, which I think are interesting. I think they're initially starting more from a cost-monitoring perspective, which is definitely important for autonomous-agent-type stuff. God Mode, which I mentioned, I was actually playing around with earlier today, so it's fresh in my mind. I really like that they give me the tasks they come up with and I get to choose which tasks to run, and then before it executes, it asks me if I want to execute. I can approve the plan or I can add my own comments. I can see human in the loop as a really good way to make sure the agent is doing what I want, but also as training data to make sure it behaves similarly in the future."

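The human-in-the-loop pattern described above, where the agent proposes tasks and a person approves, rejects, or comments before anything executes, can be sketched as a simple approval gate. This is a hedged illustration, not how God Mode or any specific tool works: `run_with_approval`, `scripted_review`, and the log format are all hypothetical, and the reviewer is scripted here so the example runs without user input (an interactive version could wrap `input()` instead).

```python
from typing import Callable, Dict, List, Tuple

Decision = Tuple[bool, str]  # (approved?, optional reviewer comment)


def run_with_approval(
    proposed_tasks: List[str],
    execute: Callable[[str], str],
    review: Callable[[str], Decision],
) -> Tuple[List[str], List[Dict[str, object]]]:
    """Execute only approved tasks and log every decision for later analysis."""
    executed: List[str] = []
    feedback_log: List[Dict[str, object]] = []
    for task in proposed_tasks:
        approved, comment = review(task)
        # Every decision is recorded, including rejections; the log doubles as
        # an audit trail and as examples for tuning future behavior.
        feedback_log.append({"task": task, "approved": approved, "comment": comment})
        if approved:
            executed.append(execute(task))
    return executed, feedback_log


def scripted_review(task: str) -> Decision:
    """Hypothetical reviewer: blocks anything that moves money, approves the rest."""
    if "send money" in task.lower():
        return False, "needs manual sign-off"
    return True, ""


results, log = run_with_approval(
    ["Summarize inbox", "Send money to vendor"],
    lambda task: f"ran: {task}",
    scripted_review,
)
print(results)
print(log)
```

Keeping the review step as a pluggable function makes it straightforward to start with a human approving every task and later swap in an automated policy for low-risk task types once there is enough confidence in the agent.
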
The Popularity of AutoGPT and BabyAGI [20:09]

"AutoGPT has over 100k GitHub stars. Are you suspicious that there's an agent created to drive the GitHub stars? How is this so popular? It's already surpassed PyTorch. What do you think is behind the popularity?"

"I think the idea of autonomous agents really caught people's imagination. AutoGPT, he'd been working on it before BabyAGI. It was slowly reaching like 40 stars, and then after I released the task-driven autonomous agent research paper, it shot up. I think people just really liked the idea. With BabyAGI, as much popularity as it got, people quickly realized it can't do much, right? People wanted a full-blown solution, so they were looking for something more complete, and what he's built is impressive."

Future Developments in BabyAGI [21:39]

"What's next for BabyAGI? Where are you planning on investing?"

"We don't know, right? We're exploring. I recently brought on Fraser, the former head of product at OpenAI, to help us look at BabyAGI. There's a community behind it, so we're going to see where it takes us."

Closing Remarks and Following BabyAGI [22:24]

"Thank you so much for being a part of Observe and sharing what's happening on the ground with autonomous agents. We're all very excited about the space. If anyone else wants to follow along or see what Yohei is up to, where's the best place to follow you?"

"Follow BabyAGI on Twitter, I think it's @babyAGI_, and we'll soon be opening up our Discord. We're just cleaning it up right now, so you'll probably see it on Twitter first."

"Awesome, we'll follow along and keep posted. Thank you so much, thank you."
