Technical Architecture and Flow Diagrams
The Society of Automotive Engineers (SAE) introduced a classification system to describe levels of autonomous driving. It ranges from level 0 to level 5, with level 5 being completely autonomous. The same framework can be applied to understand the current and future state of solvers. Much of Web3 today operates at roughly level 2 of autonomy: smart contract systems as they exist now provide some automation but require human attention at all times. Smart contracts provide a degree of autonomy within the context of their own base chain, and sometimes other chains, but with limited degrees of freedom. More concretely, smart contracts are limited to the specific execution logic that governs them. In reality, a user’s objective might be better served by other smart contracts or by a novel combination of multiple independent smart contracts. Solvers, combined with recent advancements in artificial intelligence, provide a path to levels 3-5 of autonomy, giving blockchain users a new type of smart contract system that can self-drive toward the specific goal they are trying to achieve. The reference architecture below provides a basis for a solver network that achieves level 3 autonomy: solvers that can orchestrate complex workflows leveraging existing smart contract functionality with occasional human intervention. The proposed architecture also lays the groundwork for level 4-5 autonomy by introducing the concept of an agent creator economy.
Agent Creator Economy
OpenAgents AI gives creators the ability to publish novel AI-powered functionality to a broad network of users on the platform (similar to Custom GPTs). A simple example: a user finds a new protocol that has not yet been added to the OpenAgents AI platform. He or she could wrap the protocol's functionality using the multi-agent framework so that other users on the platform can interface with the new protocol using natural language, as sketched below. There are obvious security considerations with this new paradigm, but those are out of scope for this document.
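A minimal sketch of what wrapping a protocol as a publishable agent could look like, assuming a hypothetical manifest/agent interface; the class names, fields, and addresses below are illustrative stand-ins, not the actual OpenAgents AI SDK:

```python
# Sketch: wrapping a third-party protocol as a publishable agent.
# The manifest shape and execute() contract are hypothetical illustrations.
from dataclasses import dataclass


@dataclass
class AgentManifest:
    name: str
    description: str        # natural-language summary used for intent matching
    chain: str              # chain the wrapped contracts live on
    contract_address: str   # entry-point contract of the protocol


class NewProtocolAgent:
    """Wraps a protocol's swap function so users can reach it via natural language."""

    manifest = AgentManifest(
        name="example-dex-agent",
        description="Swap tokens on ExampleDEX",
        chain="ethereum",
        contract_address="0x0000000000000000000000000000000000000000",  # placeholder
    )

    def execute(self, intent: dict) -> dict:
        # In a real agent this would build and submit the on-chain transaction
        # that fulfills the parsed intent (e.g. a token swap).
        return {"status": "simulated", "intent": intent}
```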
Front end and other interfaces
Users can interact with OpenAgents AI in a variety of ways through front ends that are available in both web and native formats. Developers can interact with OpenAgents AI via an API or SDK.
User proxy agent
The user proxy agent consists of a container registry and a Kubernetes cluster that auto-scales the number of pods and containers based on demand. User proxy agents will be hosted on decentralized physical infrastructure providers (e.g. IO.NET). If additional capacity is needed, Kubernetes clusters can be spun up in public cloud regions in a geographically distributed manner, as in the autoscaling sketch below.
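As an illustration of the autoscaling piece, the sketch below uses the official Kubernetes Python client to attach a horizontal pod autoscaler to a hypothetical user-proxy-agent deployment; the deployment name, namespace, and thresholds are assumptions for illustration only:

```python
# Sketch: autoscale a hypothetical "user-proxy-agent" deployment on CPU load.
# Requires the `kubernetes` package and a configured kubeconfig.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="user-proxy-agent-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="user-proxy-agent"
        ),
        min_replicas=2,
        max_replicas=50,                       # illustrative ceiling
        target_cpu_utilization_percentage=70,  # scale out above 70% CPU
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="openagents", body=hpa  # hypothetical namespace
)
```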
The purpose of the user proxy agent is to act as the control plane for OpenAgents AI. The user proxy agent can also be thought of as the solver that is executing the intent with the help of multiple different subsystems. This involves managing the authentication/authorization and identity layers. It also includes coordinating with the various models used for inference as well as instantiating and/or executing the different smart contracts needed to complete the user’s objective.
The user proxy agent receives the user intent and deploys the corresponding Docker container (referenced by its container id) to handle the request. After receiving the request, the user proxy agent is responsible for contextualizing and packaging the user’s intent in a way that produces the desired output from the LLM. A number of other components are leveraged for input enrichment and prompt construction. Real-time text embeddings are incorporated for scenarios where retrieval augmented generation is necessary, for example to help with code generation when creating new agents.
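A minimal sketch of the enrichment step, assuming embeddings from the open-source sentence-transformers library and an in-memory document store; the documents and prompt template are illustrative:

```python
# Sketch: retrieve context for a user intent and build the LLM prompt.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Illustrative knowledge snippets (in practice: agent docs, protocol ABIs, etc.)
documents = [
    "ExampleDEX exposes swapExactTokensForTokens(amountIn, amountOutMin, path).",
    "The bridge agent moves USDC between Ethereum and Solana.",
]
doc_vectors = embedder.encode(documents)


def build_prompt(intent: str, top_k: int = 1) -> str:
    query = embedder.encode([intent])[0]
    # Cosine similarity against the document store
    scores = doc_vectors @ query / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query)
    )
    context = "\n".join(documents[i] for i in np.argsort(scores)[::-1][:top_k])
    return f"Context:\n{context}\n\nUser intent:\n{intent}\n\nPlan the execution steps."


print(build_prompt("Swap 100 USDC for SOL on Solana"))
```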
After submitting the user’s prompt and relevant context, the user proxy agent receives output from the LLM describing the instructions needed to execute the intent. Oftentimes this will be a direct reference to an existing agent, or set of agents, that can best accomplish the task at hand. If no such agent exists, a separate workflow is triggered and the appropriate agent is created and published on-chain.
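A sketch of that routing decision; the LLM response shape, the agent library lookup, and the creation helper are all assumptions used only for illustration:

```python
# Sketch: route the LLM's plan either to an existing agent or to agent creation.
def route_plan(plan: dict, agent_library: dict):
    """`plan` is an assumed LLM output such as
    {"agent": "example-dex-agent", "steps": [...]}."""
    agent_name = plan.get("agent")
    if agent_name in agent_library:
        return agent_library[agent_name]   # reuse an existing agent
    # No suitable agent: trigger the creation workflow and publish on-chain.
    return create_and_publish_agent(plan)


def create_and_publish_agent(plan: dict):
    # Placeholder for the code-generation + on-chain publishing workflow
    # described later in this document.
    raise NotImplementedError
```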
When retrieving an agent to complete a task, the user proxy agent references the agent library (see the agent library section below). After retrieving the specified agent’s smart contract functionality, the user proxy agent injects any core dependencies and executes it, along the lines of the sketch below.
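A sketch of that execution step using web3.py; the RPC endpoint, ABI, contract address, and function name are placeholders that would come from the agent library in practice:

```python
# Sketch: execute agent-provided contract functionality with web3.py.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://rpc.example.org"))  # placeholder RPC endpoint


def execute_agent(agent_record: dict, wallet_address: str, private_key: str):
    contract = w3.eth.contract(
        address=agent_record["contract_address"],  # from the agent library
        abi=agent_record["abi"],                   # fetched from decentralized storage
    )
    fn = getattr(contract.functions, agent_record["function"])
    tx = fn(*agent_record["args"]).build_transaction({
        "from": wallet_address,
        "nonce": w3.eth.get_transaction_count(wallet_address),
    })
    signed = w3.eth.account.sign_transaction(tx, private_key)
    # Attribute is raw_transaction in web3.py v7, rawTransaction in v6.
    return w3.eth.send_raw_transaction(signed.rawTransaction)
```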
The user proxy agent serves as a single interface for self-custodial wallets across different chains. To provide chain abstraction, OpenAgents AI creates self-custodial wallets for intents that span across multiple non-interoperable blockchains.
If the user's preference is to explicitly sign transaction instructions from their existing self-custodial wallets, the user proxy agent will prompt the user whenever consent is needed. However, the recommended way to interact with OpenAgents AI is to use the distributed key storage and signing that is built-in. This gives users the ability to outsource the execution of the intent to OpenAgents AI while maintaining total ownership of the wallets created by the protocol.
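A simplified sketch of that branch; both signer callables are hypothetical stand-ins for the connected-wallet prompt and the distributed key-share signer:

```python
# Sketch: choose between explicit user signing and delegated signing.
from typing import Callable


def sign_transaction(
    tx: dict,
    explicit_signing: bool,
    request_user_signature: Callable[[dict], bytes],  # prompts the connected wallet
    delegated_sign: Callable[[dict], bytes],          # distributed key-share signer
) -> bytes:
    if explicit_signing:
        # The user approves and signs from their own self-custodial wallet.
        return request_user_signature(tx)
    # Recommended path: the built-in distributed signing executes the intent
    # while the user retains ownership of the protocol-created wallet.
    return delegated_sign(tx)
```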
As referenced earlier, the user proxy agent pulls from a Docker container registry with unique identifiers that denote a specific user. When a user interacts with OpenAgents AI for the first time, a unique Docker container id is created for all future interactions with that user. In addition, a telemetry smart contract is deployed on a per-user basis to log user-specific metadata.
When the user intent is completed, the user proxy agent appends new data to the user’s telemetry smart contract, as in the sketch below.
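A sketch of that append using web3.py, assuming a hypothetical telemetry contract exposing a logExecution(string) function; the ABI fragment, endpoint, and addresses are placeholders:

```python
# Sketch: append intent-execution metadata to a per-user telemetry contract.
import json
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://rpc.example.org"))  # placeholder RPC endpoint

# Hypothetical single-function ABI for the telemetry contract.
TELEMETRY_ABI = [{
    "name": "logExecution", "type": "function", "stateMutability": "nonpayable",
    "inputs": [{"name": "record", "type": "string"}], "outputs": [],
}]


def log_intent_result(telemetry_address: str, record: dict, sender: str, key: str):
    contract = w3.eth.contract(address=telemetry_address, abi=TELEMETRY_ABI)
    tx = contract.functions.logExecution(json.dumps(record)).build_transaction({
        "from": sender,
        "nonce": w3.eth.get_transaction_count(sender),
    })
    signed = w3.eth.account.sign_transaction(tx, key)
    # Attribute is raw_transaction in web3.py v7, rawTransaction in v6.
    return w3.eth.send_raw_transaction(signed.rawTransaction)
```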
LLM hosting and inference
The large language models used by the OpenAgents AI multi-agent solver network can be hosted in a variety of ways depending on the user’s preference. Models used for inference will incorporate decentralized physical infrastructure network providers (e.g. Akash Network). Other hosting options include fully managed generative AI services (e.g. GroqCloud).
If a user so chooses, he or she can self-host models on their mobile devices or laptop/desktop computers. This is largely possible due to quantization and new low-rank adaptation (LoRA) frameworks. Quantization has made it possible to compress models with billions of parameters to the point that they can be run at the edge using significantly less memory. In addition, new LoRA frameworks allow thousands of fine-tuned models to be served on a single GPU. Loading adapters into memory just-in-time is an efficient way to run multi-agent workflows without sacrificing throughput or latency.
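A sketch of that pattern using the Hugging Face transformers, bitsandbytes, and peft libraries; the base model and adapter names are placeholders:

```python
# Sketch: load a 4-bit quantized base model and attach a LoRA adapter on demand.
# Requires transformers, bitsandbytes, peft, and a CUDA-capable GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL = "meta-llama/Llama-2-7b-hf"      # placeholder base model
ADAPTER_PATH = "creator/example-agent-lora"  # placeholder fine-tuned adapter

quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=quant_config, device_map="auto"
)

# Attach the task-specific adapter just-in-time; many adapters can share one base.
model = PeftModel.from_pretrained(base, ADAPTER_PATH)

inputs = tokenizer("Plan a USDC-to-SOL swap.", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```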
LLMs ingest user intents that are forwarded to the appropriate model based on the request. Inference requests are orchestrated by the user proxy agent and are called through an API interface.
The purpose of the LLM in this context is to act as a base model endpoint for reasoning and decision making when executing an intent. Put simply, the LLM is the conductor of a symphony of complementary subsystems. LLM hosts receive input data from the user proxy agent and send model responses as part of the solver workflow. More concretely, OpenAgents AI uses LLMs to determine the methodology used to fill and execute the intent.
The LLM acts as a state manager with the ability to recover if an agent fails during intent execution. State changes, including whether the user’s goal or objective was achieved, are communicated to the user through the LLM.
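A minimal sketch of that state-management loop; the ask_llm call, the step format, and the decision vocabulary are assumptions standing in for the hosted model endpoint:

```python
# Sketch: the LLM evaluates each step's result and decides what happens next.
from typing import Callable


def run_intent(steps: list, execute_step: Callable[[dict], dict],
               ask_llm: Callable[[str], str]) -> str:
    for step in steps:
        result = execute_step(step)
        if result.get("status") == "success":
            continue
        # On failure, ask the model whether to retry, substitute an agent, or abort.
        decision = ask_llm(f"Step {step} failed with {result}. retry / substitute / abort?")
        if decision == "retry":
            result = execute_step(step)
        elif decision == "substitute":
            result = execute_step({**step, "agent": result.get("fallback_agent")})
        if result.get("status") != "success":
            return f"Intent failed at step {step}"
    return "Intent completed successfully"
```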
Agent library
The agent library is a smart contract in its own right, with a schema that stores metadata and pointer references to the agent’s smart contract functionality, which is stored on Filecoin or other decentralized data storage. As part of creating a new agent on the OpenAgents AI platform, a smart contract is deployed that stores metadata pertaining to that agent. This ensures that agent usage data is properly recorded.
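A sketch of what one agent library record might contain; the field names are illustrative, not the canonical on-chain schema:

```python
# Sketch: illustrative shape of an agent library entry.
from dataclasses import dataclass


@dataclass
class AgentLibraryEntry:
    agent_id: str          # unique identifier of the agent
    creator: str           # wallet address of the agent creator
    description: str       # natural-language summary used for intent matching
    source_cid: str        # pointer (e.g. Filecoin/IPFS CID) to the agent's source code
    contract_address: str  # deployed metadata/usage-tracking contract for this agent
    usage_count: int = 0   # incremented as the agent is used, for reward distribution
```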
Decentralized storage
The purpose of the decentralized storage facility is to store the source code of agents that are created on the OpenAgents AI platform. Filecoin is one of many providers in this problem space.
Decentralized key management
Wallets created by OpenAgents AI require a decentralized key management system. When OpenAgents AI creates a new wallet for a user, it uses multi-party computation and threshold signature schemes, together with envelope encryption, to securely reference the private key at a later time. Decentralized key management systems like Lit provide a framework for creating asymmetric keys that can be encrypted again as a best practice. The encrypted asymmetric key is stored in the user’s telemetry smart contract, and a soulbound NFT is issued to their OpenAgents AI wallet to govern access to the key-share network that decrypts the asymmetric key. OpenAgents AI delegates key signing to Lit, which uses distributed serverless functions to securely sign transaction instructions with the wallet private key. This can be done at scale for wallets provisioned by OpenAgents AI on any supported network, enabling seamless execution of user intents that span multiple blockchain networks.
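As a generic illustration of the envelope-encryption portion (not Lit's actual API), the sketch below wraps a key share with a data-encryption key that is itself encrypted by a key-encryption key, using the Python cryptography package:

```python
# Sketch: envelope encryption of a wallet key share (generic, not Lit-specific).
from cryptography.fernet import Fernet


def envelope_encrypt(key_share: bytes, kek: bytes) -> dict:
    """Encrypt a key share with a fresh data-encryption key (DEK),
    then wrap the DEK with the key-encryption key (KEK)."""
    dek = Fernet.generate_key()
    ciphertext = Fernet(dek).encrypt(key_share)
    wrapped_dek = Fernet(kek).encrypt(dek)
    # Both values could then be stored (e.g. referenced from the telemetry
    # contract), with unwrapping governed by the key-share network.
    return {"ciphertext": ciphertext, "wrapped_dek": wrapped_dek}


def envelope_decrypt(blob: dict, kek: bytes) -> bytes:
    dek = Fernet(kek).decrypt(blob["wrapped_dek"])
    return Fernet(dek).decrypt(blob["ciphertext"])


# Usage: kek = Fernet.generate_key(); blob = envelope_encrypt(b"share", kek)
```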
Level 4-5 autonomy
Iterative versions of this reference architecture will prepare for level 4-5 autonomy with solver networks that consist of multi-agent collaboration with shared memory. In this model, agents can call other agents recursively during the intent execution process. The reference architecture describing level 4-5 autonomy is better left for a later document.
Flow Diagram 1: Chain Abstraction is Built-in
This will be possible through a parent and child wallet relationship powered by a distributed key signer protocol. It can also be achieved with recent technologies such as embedded wallets and wallet-as-a-service (WaaS) providers via an authorization architecture.
In the diagram above, the user connects their wallet (e.g. MetaMask) and is given authority over the set of wallets that their agent controls. The user can authorize cross-chain transactions proposed by the agents, such as swapping 100 USDC for SOL on Solana, while remaining connected to their EVM-based wallet.
In the diagram above, the user would simply use a WaaS provider to enable access to all of the wallets across numerous blockchains. This is a fairly traditional model that has been explored for the past few years, but here the user proxy agent is the one interacting with and managing multiple wallets at once.
Flow Diagram 2: Supporting Scenarios Where Connected Wallets or No AA is Preferred
OpenAgents AI will support users that prefer direct control over their wallets and signing of transactions. Alternatively, users can delegate complete authority to the user proxy agent. Think of this as level 3 autonomy for driverless cars.
Flow Diagram 3: Technical Tradeoffs
In the blockchain space, it doesn’t take long to discover that things that work routinely can suddenly stop working: a bridge, a swap, or any other network-based transaction. Re-executing large on-chain workflows with multiple dependencies is a problem on multiple fronts for most developers, largely because blockchain-based systems do not provide fallback guarantees, leaving many potential points of failure. Knowing that transactions can be completed reliably is a massive leap forward for developers and the ecosystem at large. We therefore make the tradeoff of increased latency in exchange for additional flexibility and fault tolerance. Reliability and consistency are especially important when achieving chain abstraction.
There is an option to reduce latency by skipping the relay between the user proxy and the AI agent and placing the AI agent in the catch block, but this comes at the expense of flexibility.
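A simplified sketch of the fallback pattern behind this tradeoff, with the relayed path, the direct path, and the retry policy all treated as illustrative assumptions:

```python
# Sketch: retry/fallback wrapper around a flaky on-chain step.
import time
from typing import Callable


def execute_with_fallback(step: Callable[[], str], fallback: Callable[[], str],
                          retries: int = 2, backoff_s: float = 2.0) -> str:
    for attempt in range(retries):
        try:
            return step()  # e.g. bridge or swap via the relayed path
        except Exception:
            time.sleep(backoff_s * (attempt + 1))  # accept extra latency
    # After exhausting retries, hand off to the lower-latency but less
    # flexible direct path (the "AI agent in the catch block" option).
    return fallback()
```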
Flow Diagram 4: AI Agent Library
The system works by leveraging an AI Agent Library. This library contains agents that are responsible for protocols, APIs, and SDKs. All of this is backed by on-chain contracts that track usage and distribute rewards accordingly.
Flow Diagram 5: Agent Creator/Code Generation Agents
To enhance our library of agents and reduce the burden on the overall developer ecosystem of producing agents that wrap smart contract functionality, we created an Agent Creator tool that helps with research and code generation.
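A sketch of how the code-generation step might be driven, assuming an OpenAI-compatible inference endpoint (such as the hosted options mentioned earlier); the base URL, model name, and prompt are illustrative:

```python
# Sketch: ask a hosted LLM to draft an agent wrapper for a new protocol.
from openai import OpenAI

client = OpenAI(base_url="https://api.example-inference.io/v1", api_key="...")  # placeholders


def draft_agent_code(protocol_docs: str, abi: str) -> str:
    response = client.chat.completions.create(
        model="llama-3.1-70b",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Write a Python agent class that wraps the given contract ABI."},
            {"role": "user",
             "content": f"Protocol docs:\n{protocol_docs}\n\nABI:\n{abi}"},
        ],
    )
    # Generated code would still go through review and testing before being
    # published to the agent library.
    return response.choices[0].message.content
```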
Hypothetical Example: Purchase of an NFT worth 100 ABC tokens
Let’s explore a traditional example of how things work today, without OpenAgents AI. In this example, I want to buy an NFT. Assuming I have adequate wallet balances, the user journey would look something like this: