Exploring Pseudo-AGI: The Emergence of Advanced AutoGPT AI Bots
Autonomous AI Agents unlock big potential
We’ve seen ChatGPT do some pretty impressive things, but one clear limitation of such a system is its inability to make and execute a plan.[1] With a top speed of one prompt at a time, the capabilities of these systems are significantly limited.
However, some forward-thinking devs have realized that it doesn’t take much to create the glue necessary to enable and empower these products far beyond the walls to which they were relegated.
The familiar “Call and Response” (AKA prompt and completion) of tools like GPT-4 is already impressive, but a big weakness is their inability to coordinate and execute a series of complex tasks. Although it is easy to expect that LLM creators may one day expand on these models, enterprising individuals have come up with shortcuts to extend these systems… and have integrated them only weeks after GPT-4’s release. Adding introspection and the ability to work tasks to completion is novel, and utilities are empowering these systems to interact with the real world.
Stitching systems together provides a lightweight solution to a heavyweight problem. In theory, there’s not much to it:
Ask “System A” to come up with a set of the tasks necessary to achieve a goal
Use a governance wrapper to delegate those tasks to “System B”
Iterate through all the things until the work is done, potentially delegating work to different systems
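The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not code from any of the projects discussed here: `run_agent`, `toy_planner`, and `toy_worker` are hypothetical names, and the two toy functions stand in for “System A” and “System B”, which in a real setup would be calls to an LLM or another service.

```python
from collections import deque
from typing import Callable, Deque, List


def run_agent(goal: str,
              planner: Callable[[str], List[str]],
              worker: Callable[[str], str],
              max_steps: int = 10) -> List[str]:
    """Ask the planner for tasks, then delegate each one to the worker."""
    # Step 1: "System A" turns the goal into a list of tasks.
    tasks: Deque[str] = deque(planner(goal))
    results: List[str] = []
    # Steps 2 and 3: the wrapper hands tasks to "System B" until none remain.
    # max_steps is a safety cap so a runaway plan cannot loop forever.
    while tasks and len(results) < max_steps:
        task = tasks.popleft()
        results.append(worker(task))
    return results


# Hypothetical stand-ins; a real planner and worker would call a model.
def toy_planner(goal: str) -> List[str]:
    return [f"research {goal}", f"draft {goal}", f"review {goal}"]


def toy_worker(task: str) -> str:
    return f"done: {task}"


if __name__ == "__main__":
    for line in run_agent("a blog post", toy_planner, toy_worker):
        print(line)
```

The step cap is the one design choice worth noting: once a loop like this is wired to a paid API, an upper bound on iterations is a cheap guard against a plan that never finishes.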
I envisioned that people would be slapping together one-off solutions like this to accomplish highly targeted goals. I didn’t consider how people would come up with similar approaches for a generalized framework. It seems as obvious as Post-It Notes now, and watching these systems in action gives a hint as to what we might see as AGI is released into the wild. The magic is not having to think about architectural aspects of bringing these things together, because the machines do it for us on the fly. Sure, there is some very minor setup and tweaking, but considering where we are today it is easy to imagine we will be light years ahead in the coming months.
A quick look at trending GitHub projects tells us that there is a LOT happening in this space. Let’s look at a few of the efforts that are gaining momentum:
BabyAGI by @yoheinakajima
This Open Source GitHub Project is an AI-powered task management system which uses OpenAI and Pinecone[2] APIs to create, prioritize, and execute tasks.
BabyAGI is based on the wonderfully simple Task Driven Autonomous Agent (which itself was released only a few weeks ago). The project has a great website which also highlights a few similar types of projects not covered in this article.
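To make the create, prioritize, and execute cycle concrete, here is a minimal sketch of such a loop. It is not BabyAGI’s actual code and it calls neither OpenAI nor Pinecone: `run_task_loop` and the three toy functions are hypothetical, and a plain in-memory priority queue stands in for whatever storage a real agent would use.

```python
import heapq
from typing import Callable, List, Tuple


def run_task_loop(first_task: str,
                  execute: Callable[[str], str],
                  create: Callable[[str, str], List[str]],
                  prioritize: Callable[[str], int],
                  max_iterations: int = 5) -> List[Tuple[str, str]]:
    """Execute the most urgent task, create follow-ups, re-queue, repeat."""
    queue: List[Tuple[int, int, str]] = []
    counter = 0  # tie-breaker: equal priorities run in insertion order
    heapq.heappush(queue, (prioritize(first_task), counter, first_task))
    history: List[Tuple[str, str]] = []
    while queue and len(history) < max_iterations:
        _, _, task = heapq.heappop(queue)       # lowest number runs first
        result = execute(task)
        history.append((task, result))
        for new_task in create(task, result):   # follow-up tasks, if any
            counter += 1
            heapq.heappush(queue, (prioritize(new_task), counter, new_task))
    return history


# Hypothetical stand-ins for what would be model calls in a real agent.
def toy_execute(task: str) -> str:
    return f"result of {task}"


def toy_create(task: str, result: str) -> List[str]:
    # Only the first task spawns follow-ups, so the loop terminates.
    if task == "plan report":
        return ["write summary", "collect sources"]
    return []


def toy_prioritize(task: str) -> int:
    return 0 if task.startswith("collect") else 1


if __name__ == "__main__":
    for task, result in run_task_loop("plan report", toy_execute,
                                      toy_create, toy_prioritize):
        print(task, "->", result)
```

In this toy run, “collect sources” is created after “write summary” but executes before it, which is the whole point of the prioritization step.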
AutoGPT by @SigGravitas
Andrej Karpathy calls Auto-GPT the “next frontier of prompt engineering,” and it’s easy to see why. This Open Source GitHub project comes with some key features including:
🌐 Internet access for searches and information gathering
💾 Long-Term and Short-Term memory management
🧠 GPT-4 instances for text generation
🔗 Access to popular websites and platforms
🗃️ File storage and summarization with GPT-3.5
Get involved and become a part of the conversation on the Auto-GPT Discord.
JARVIS by @Microsoft
Another Open Source GitHub project, JARVIS works with Hugging Face to serve as an interface for LLMs to connect numerous AI models for solving complicated AI tasks in 4 stages:
Task Planning: ChatGPT analyzes user requests to understand their intent and breaks them down into solvable tasks.
Model Selection: To solve the planned tasks, ChatGPT selects expert models hosted on Hugging Face based on their descriptions.
Task Execution: Each selected model is invoked and executed, and the results are returned to ChatGPT.
Response Generation: Finally, ChatGPT integrates the predictions of all models and generates a response.
One nice differentiating aspect of JARVIS is its ability to select an appropriate model for the task at hand.
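The four stages, and model selection in particular, can be illustrated with a toy pipeline. Everything here is an assumption made for illustration: the `MODELS` registry, its two fake “models”, and the word-overlap scoring are hypothetical stand-ins for the ChatGPT and Hugging Face calls that JARVIS itself makes.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: model name -> (description, callable).
MODELS: Dict[str, Tuple[str, Callable[[str], str]]] = {
    "captioner": ("describe image caption", lambda task: "a cat on a sofa"),
    "translator": ("translate text french", lambda task: "un chat sur un sofa"),
}


def plan_tasks(request: str) -> List[str]:
    # Stage 1 stand-in: split the request on " then " instead of asking an LLM.
    return [part.strip() for part in request.split(" then ")]


def select_model(task: str) -> str:
    # Stage 2 stand-in: pick the model whose description shares the most
    # words with the task, in place of an LLM reading the descriptions.
    words = set(task.lower().split())
    return max(MODELS, key=lambda name: len(words & set(MODELS[name][0].split())))


def execute(task: str, model_name: str) -> str:
    # Stage 3: invoke the selected model on the task.
    return MODELS[model_name][1](task)


def respond(results: List[Tuple[str, str]]) -> str:
    # Stage 4 stand-in: join the outputs instead of asking an LLM to summarize.
    return "; ".join(f"{task} -> {output}" for task, output in results)


def handle(request: str) -> str:
    results = []
    for task in plan_tasks(request):
        results.append((task, execute(task, select_model(task))))
    return respond(results)


if __name__ == "__main__":
    print(handle("describe the image then translate the caption to french"))
```

Because selection works from descriptions alone, adding a new capability in this sketch means adding one registry entry rather than touching the pipeline.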
AgentGPT by @AsimShrestha
Are you less technical and still want in on the action? You might find some joy with AgentGPT, which empowers you to “Assemble, configure, and deploy autonomous AI Agents in your browser.”
Let me know if you have any questions about this or related topics. I’m happy to curate my content for my readers!
The Takeaway
It is crucial to keep an eye on this space as these projects continue to evolve. I agree with Mckay Wrigley that we are witnessing the construction of an "AI App Superhighway."
The rapid development of innovative solutions to overcome the limitations of GPT models like ChatGPT demonstrates the potential for amplifying the impacts of less refined AI systems. By even loosely bridging these systems together and creating a framework for AI to interact with the real world, such tools can enable mass adoption and significantly change the AI landscape.
While there may still be challenges in terms of scalability and production-ready applications, these solutions showcase the immense potential of AI and the exciting opportunities that lie ahead in the field.
[1] In a previous article I referenced work highlighting that the biggest weakness of GPT-4 is perhaps the inability to plan.
[2] Pinecone is a vector DB frequently used with embeddings. It allows for long-term storage of information that can be referenced by these models without the need for training. https://www.pinecone.io/