AI-Powered Autonomous Agents: The workforce of the future?
Overview
Our research team is working on a new technology called AI-Powered Autonomous Agents, which leverages Large Language Models (LLMs) as the agent’s brain.
Such an agent can autonomously solve problems that require multiple steps, addressing a key limitation of current LLMs: on their own, they struggle to handle complex, multi-step tasks.
Proof-of-concept projects such as AutoGPT, GPT-Engineer, and BabyAGI already serve as inspiring examples of the technology’s potential.
Key Concepts
The Autonomous Agent system functions by using a Large Language Model (LLM) as the core controller, complemented by three key capabilities (sketched in the example after this list):
- Planning: Breaking down complex tasks into smaller subgoals with self-reflection, allowing the agent to learn from past actions and improve future results
- Memory: The ability to retain context and remember prior interactions
- Completing Tasks: Calling external APIs to access additional information or execute code
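To make these concepts concrete, the sketch below shows how the three capabilities fit around an LLM controller. It is a minimal illustration only: the `call_llm` function, the prompts, and the `tools` dictionary are hypothetical stand-ins, not code from AutoGPT or any of the projects mentioned above.

```python
# A minimal, illustrative agent loop combining the three capabilities above.
# `call_llm` and the tools are hypothetical stand-ins, not AutoGPT's actual code.

def call_llm(prompt: str) -> str:
    """Stand-in for a call to any LLM completion API."""
    return "FINISH: (stubbed response)"  # replace with a real model call

tools = {
    "search": lambda query: f"(search results for: {query})",  # stand-in for an external API
    "run_code": lambda code: "(code output)",                   # stand-in for code execution
}

def run_agent(goal: str, max_steps: int = 5) -> None:
    memory: list[str] = []  # Memory: a running log of prior steps and results

    # Planning: ask the LLM to break the goal into smaller subgoals.
    memory.append("PLAN:\n" + call_llm(f"Break this goal into numbered subgoals:\n{goal}"))

    for step in range(max_steps):
        # The LLM controller decides the next action given the goal and memory so far.
        decision = call_llm(
            f"Goal: {goal}\nHistory:\n" + "\n".join(memory)
            + "\nReply as '<tool>: <input>' or 'FINISH: <answer>'."
        )
        if decision.startswith("FINISH"):
            print(decision)
            return
        # Completing tasks: call the chosen external tool and remember the result.
        tool_name, _, tool_input = decision.partition(":")
        result = tools.get(tool_name.strip(), lambda x: "unknown tool")(tool_input.strip())
        memory.append(f"STEP {step}: {decision} -> {result}")
```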
Deep Dive
Planning consists of two areas: Task Decomposition, which uses Chain-of-Thought reasoning, and Self-Reflection, which draws on techniques such as ReAct, Reflexion, and Chain of Hindsight.
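As a rough illustration of how these planning techniques translate into prompts, the snippet below sketches a Chain-of-Thought style decomposition prompt and a Reflexion-style self-reflection prompt. The wording is our own assumption, not quoted from the ReAct, Reflexion, or Chain of Hindsight papers.

```python
# Illustrative prompting patterns for the two planning areas; the exact
# wording is an assumption, not quoted from the papers referenced above.

decompose_prompt = (
    "Task: build a one-page website for a bakery.\n"
    "Think step by step and list the subtasks needed to complete this task."
)  # Chain-of-Thought style task decomposition

reflect_prompt = (
    "Here is the plan you followed and the errors you encountered:\n{trajectory}\n"
    "What would you do differently next time? Produce a revised plan."
)  # Reflexion-style self-reflection over a past trajectory
```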
The Memory component emulates the memory types found in the human brain: short-term memory and long-term memory, with long-term memory typically backed by a vector database.
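The sketch below shows what such long-term memory can look like in practice: past observations are embedded as vectors and the most similar ones are recalled later. The `embed` function is a pseudo-random stand-in for a real embedding model, and the cosine-similarity search stands in for a proper vector database.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for an embedding model call; returns a pseudo-random vector per text."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=384)

class VectorMemory:
    """Long-term memory: store past observations and retrieve the most similar ones."""

    def __init__(self) -> None:
        self.texts: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, text: str) -> None:
        self.texts.append(text)
        self.vectors.append(embed(text))

    def recall(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        # Cosine similarity between the query and every stored memory.
        scores = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v))) for v in self.vectors]
        top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
        return [self.texts[i] for i in top]
```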
The Task Completion component significantly extends the model’s capabilities by connecting it to external tools and services, such as third-party APIs and code execution.
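To illustrate, a tool call can be as simple as parsing structured model output and dispatching it to a registered function. The JSON format and the `get_weather` tool below are assumptions made for this sketch, not a specific framework’s interface.

```python
import json

# Illustrative tool dispatch: the model's output is assumed to be JSON of the form
# {"tool": "...", "arguments": {...}} -- a convention chosen for this sketch only.

def get_weather(city: str) -> str:
    return f"(weather report for {city})"  # stand-in for a real external API call

TOOLS = {"get_weather": get_weather}

def execute_tool_call(model_output: str) -> str:
    call = json.loads(model_output)
    tool = TOOLS[call["tool"]]
    return tool(**call["arguments"])

print(execute_tool_call('{"tool": "get_weather", "arguments": {"city": "Melbourne"}}'))
```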
Setup & Testing
We ran AutoGPT on Gitpod, with a straightforward installation process that does not impact the local machine.
AutoGPT can automatically decompose and execute tasks according to user-defined goals, pausing during the process so that users can authorise the next step or provide feedback on its actions.
We tasked AutoGPT with generating an HTML file for a website. AutoGPT generated functional HTML that met all its objectives.
It wasn’t beautiful, but it worked without errors!
Hitting The Limits
The limitations commonly observed in LLM-powered autonomous agents include finite context length, which restricts how much historical information and detailed instruction can be incorporated. Although techniques like vector stores and retrieval can provide access to a larger knowledge pool, their representation power is not as strong as full attention.
Long-term planning and task decomposition pose challenges for LLMs, as they struggle to adjust plans in the face of unexpected errors.
Additionally, the reliability of natural language interfaces is questionable, as LLMs may make formatting errors or exhibit rebellious behaviour, requiring a significant focus on parsing model output in agent demo code.
Overall, these limitations highlight the need for further advancements in overcoming context limitations, improving planning capabilities, and enhancing the reliability of LLM-based systems.
What’s Next?
We will closely monitor emerging research and strive to build autonomous agents that meet our specific needs by leveraging the core idea of this technology.
What’s the verdict?
LLM-Powered Autonomous Agents hold significant potential, with vast possible applications across the field of artificial intelligence.
To address ethical and technical concerns, it is crucial to develop this technology under careful human supervision and with humans in the loop.
Thanks for checking out our business articles.
If you want to learn more, feel free to reach out to Red Marble AI.
You can click on the "Let's Talk" button on our website or email Dave, our AI expert, at d.timm@redmarble.ai. We appreciate your interest and look forward to sharing more with you!