I recently came across the concept of Idea Legos while reading Packy McCormick's 'Not Boring' newsletter. Packy talks about how well-defined functional pieces of open software can be thought of as legos/building blocks. These legos can be combined with other legos to build functionality that wouldn't be possible using any individual lego by itself.

Composability is a $10 word used in system design that describes this process. It is a useful framework for discussing how primitives from multiple technologies can be combined to create new functional blocks that aren't easily achieved using a single technology.

In this post, I examine Public Blockchains and AI using the framework of Idea Legos. I first define the primitives (Idea Legos) provided by Public Blockchains and AI. I then analyze existing protocols at the intersection of these two domains. The analysis reveals that protocols with seemingly different objectives and modes of operation can be seen as arising out of the interaction of a common set of simple Public Blockchain and AI primitives. Finally, I use the same primitives to synthesize a small set of possible future applications in the Web3XAI space. I hope you have as much fun reading this post as I had playing with Web3 and AI Idea Legos. Let's dive in.

Blockchain Primitives

To begin with, let's list out some of the main primitives afforded by Public Blockchains that underpin Web3. One way of thinking about Public Blockchains is as massively multiclient databases where everyone has root access (Primitive 1). Users transact largely anonymously and trustlessly via public key cryptography (Primitive 2). Transactions are secured and validated by a decentralized consensus mechanism (Primitive 3).

Once transactions are validated and included in the decentralized ledger, they are irrevocable (Primitive 4), provided that the blockchain's security is continuously upheld. Another way of looking at this primitive is that legitimate transactions on the blockchain can be verifiably guaranteed by other legitimate actors on the blockchain.
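To make the immutability primitive concrete, here is a minimal sketch of a hash-linked ledger. It is a toy under stated assumptions: the block layout and the helper names (`block_hash`, `build_chain`, `is_valid`) are hypothetical, and there is no consensus or networking. It only shows why editing an already-recorded transaction is detectable by anyone who re-checks the chain.

```python
import hashlib
import json


def block_hash(index, transactions, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "transactions": transactions, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Link each batch of transactions to the block before it."""
    chain = []
    prev_hash = "0" * 64  # placeholder hash for the first block
    for index, transactions in enumerate(batches):
        digest = block_hash(index, transactions, prev_hash)
        chain.append(
            {
                "index": index,
                "transactions": transactions,
                "prev_hash": prev_hash,
                "hash": digest,
            }
        )
        prev_hash = digest
    return chain


def is_valid(chain):
    """Recompute every hash; any edit to recorded history breaks a link."""
    prev_hash = "0" * 64
    for block in chain:
        expected = block_hash(block["index"], block["transactions"], prev_hash)
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain([["alice pays bob 5"], ["bob pays carol 2"]])
print(is_valid(chain))  # the untouched chain verifies

# Rewrite history in the first block, then verify again.
chain[0]["transactions"][0] = "alice pays bob 500"
print(is_valid(chain))  # the tampered chain fails verification
```

In a real Public Blockchain, the consensus mechanism is what makes it prohibitively expensive to recompute and republish all later blocks after such an edit.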

Most Blockchains also provide the ability to program transactions with different constraints (Primitive 5). Broadly speaking, these programs are referred to as smart contracts. Some Blockchains (like Ethereum) provide general purpose programmability, with the ability to write complicated programs in high-level languages. Others (like Bitcoin) have a limited instruction set and basic programmability.

Finally, most Blockchains allow arbitrary forms of value (Art, Music, Money, Debt, Data, AI models, model predictions,...) to be represented in monetary units so small (10^-8 USD, for instance) that they would be impractical with fiat money (Primitive 6).
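A small sketch of how such tiny units are typically handled: amounts are stored as whole numbers of a smallest base unit, so there are no rounding errors. The constant below mirrors Bitcoin, where 1 bitcoin = 100,000,000 satoshis; the helper names (`to_base_units`, `to_coins`) are hypothetical and chosen for illustration only.

```python
from decimal import Decimal

UNITS_PER_COIN = 10**8  # e.g. 1 bitcoin = 100,000,000 satoshis


def to_base_units(amount):
    """Convert a decimal coin amount (given as a string) to integer base units."""
    units = Decimal(amount) * UNITS_PER_COIN
    if units != units.to_integral_value():
        raise ValueError("amount is finer than the smallest unit")
    return int(units)


def to_coins(units):
    """Convert integer base units back to a decimal coin amount."""
    return Decimal(units) / UNITS_PER_COIN


price_per_prediction = to_base_units("0.00000001")  # a single base unit
total = price_per_prediction * 1_000_000  # a million tiny payments
print(price_per_prediction, total, to_coins(total))
```

Because every amount is an integer count of base units, a million one-unit payments add up exactly, which is what makes per-item pricing at this scale workable.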

To summarize, these are the primitives (as I see them) that a Public Blockchain provides:

  1. Root-access to value / Permission-less
  2. Censorship resistance / Anonymity
  3. Decentralized consensus for security and validation
  4. Verifiable guarantees on state / Immutability
  5. Programmability of transactions
  6. User-defined value can be transacted in infinitesimally small units.

Strictly speaking, the primitives listed above are not independent of each other. For instance, transaction immutability is only guaranteed if the decentralized security model is robust enough to prevent Sybil attacks. From the standpoint of understanding how this technology can be used, however, it may be useful to consider these primitives as independent.

AI primitives

Defining AI primitives is tricky. AI is a catch-all term that abstracts away major differences between individual implementations. Colloquial usage of the term Artificial Intelligence imagines non-human agents that respond appropriately to stimuli (with competence equal to or better than that of humans) without requiring human intervention, while continuously learning from their environment. While this is a more accurate description of trained machine learning (ML) models, it will do for the sake of this post. So, performance equal to or better than that of humans on certain tasks - with 24x7 availability, indefatigability, consistency and speed - seems to be one primitive afforded by AI as colloquially understood (Primitive 1). In practice, of course, no AI is completely autonomous. Presently, AI models require continuous retraining and re-deployment as their operating environment changes.

In Prediction Machines, the authors cast AI models as economic agents that significantly lower the cost of prediction (Primitive 2). The essential idea here is that when the cost of an item drops to a fraction of what it used to be, it starts getting used in places where it would previously have been completely uneconomical. Consider compute power and Moore's Law as a case in point. Clocks, televisions, radios, cameras, communication and money were previously completely analog. These are now almost entirely digital because going digital affords significantly improved economies of scale.

In the same way, the cost of making predictions (at the same level of accuracy) has gone down dramatically in recent times with the advent of GPU-based ML models. This drop in cost now enables predictions to be used in places where they were previously either cheap but inaccurate, or accurate but uneconomical.

Modern (post-2012) ML is characterized by another important feature - the use of data to infer rules of inference and improve performance - rather than being given explicit rules to transform inputs to outputs. The highly performant ML models that capture public imagination (GPT3, BERT, AlphaZero, ...) rely on massive treasure troves of data to inform their decision making rules rather than having humans explicitly encode these rules (Primitive 3). Does 'The Unreasonable Effectiveness of Data' in this regard warrant a separate primitive? Perhaps. There are two supporting reasons for its inclusion as a primitive. The first is that such behavior is not a characteristic of other technologies in general. The second is that this feature of modern ML creates fundamentally new ways of doing business.

Figure: Model performance vs dataset size.

As modern AI marches inexorably towards Artificial General Intelligence (AGI), an important primitive it provides along the way - Domain Independent Learning - is similar in some aspects to the general purpose programmability and transaction of arbitrary value afforded by (some) Public Blockchains (Primitive 4).

To summarize, here are some of the important primitives that AI (as colloquially used) provides.

  1. Equal or better than human performance on certain tasks
  2. Lowered cost of prediction
  3. Performance improves with size of (good) data used for training.
  4. Domain Independent Learning

Web3XAI application analysis using primitives

The general purpose programmability (Blockchain Primitive) of blockchains such as Ethereum, Solana and others enables smart contracts that store and settle transactions involving value that participants mutually agree upon. This in turn allows the creation of a new primitive - the fungible token - which is a special type of smart contract that uses the base layer of the blockchain while itself representing arbitrary types of value (Blockchain Primitive). In the case of AI models that depend on access to increasing amounts of high quality data for their performance, the combination of these two primitives allows the creation of a new ecosystem - Data De-Fi. Here, data asset holders (corporations, individuals) monetize their assets by staking and/or selling their data using highly programmable ERC20 data tokens. Data buyers (Corporations, Data Scientists, AI algorithms) buy access to this data and improve performance in their respective domains (AI primitive). This combination of blockchain and AI primitives is the key idea behind protocols such as Ocean and Swash.
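As a rough illustration of the data token idea, here is a toy in-memory ledger in which holding one token buys one access to a dataset. This is a sketch under my own assumptions, not the actual contract interface of Ocean, Swash or the ERC20 standard; the class `DataToken` and its methods are hypothetical.

```python
class DataToken:
    """Toy fungible token: one token can be redeemed for one dataset access."""

    def __init__(self, publisher, supply):
        self.publisher = publisher
        self.balances = {publisher: supply}

    def transfer(self, sender, recipient, amount):
        """Move tokens between accounts, rejecting unfunded transfers."""
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("invalid or unfunded transfer")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount

    def redeem_access(self, holder):
        """Spend one token back to the publisher in exchange for data access."""
        self.transfer(holder, self.publisher, 1)
        return True


token = DataToken(publisher="data_owner", supply=100)
token.transfer("data_owner", "data_scientist", 2)  # buyer acquires two tokens
token.redeem_access("data_scientist")  # one token buys one access
print(token.balances)
```

On a real chain, the transfer rules would be enforced by a smart contract rather than by a Python object that everyone has to trust.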

Viewing protocols through the lens of primitives enables us to see remarkable similarities in the essential working of Web3 applications that may not seem similar at first glance. For instance, the same set of blockchain primitives that enable Ocean and Swash (see figure below) can be cast in a slightly different manner to enable applications such as Numerai. The general purpose programmability primitive is used to create NMR tokens, while the primitive that enables value assignment to mutually agreed upon assets allows the predictions of AI models to be staked upon. The end result is a large number of models that can be ensembled into a meta-model that aims to be superior to human performance (AI Primitive). Access to data is the central focus of Ocean, whereas Numerai makes data freely available and focuses on models and their predictions.
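One hypothetical way to turn staked predictions into a meta-model is to average them, weighting each model by the amount staked on it. The sketch below is an illustrative assumption of mine, not a description of how Numerai actually builds its meta-model; the function name and the sample numbers are made up.

```python
def stake_weighted_ensemble(predictions, stakes):
    """Average model predictions, weighting each model by its stake.

    predictions: dict of model name -> list of predictions (equal lengths)
    stakes: dict of model name -> amount staked on that model
    """
    total_stake = sum(stakes[name] for name in predictions)
    if total_stake <= 0:
        raise ValueError("at least one model must have a positive stake")
    n_targets = len(next(iter(predictions.values())))
    meta = [0.0] * n_targets
    for name, preds in predictions.items():
        weight = stakes[name] / total_stake
        for i, p in enumerate(preds):
            meta[i] += weight * p
    return meta


predictions = {"model_a": [0.2, 0.8], "model_b": [0.6, 0.4]}
stakes = {"model_a": 30, "model_b": 10}  # model_a carries 75% of the weight
print(stake_weighted_ensemble(predictions, stakes))
```

The design intuition is that a stake is a costly signal of confidence, so models whose owners risk more get more say in the combined prediction.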

Better than human performance and cheap prediction costs (AI Primitives) at multiple tasks such as logistics, robotics and investment can be used to create Autonomous Agents. When combined with blockchain primitives such as Verifiable Guarantees, General Purpose Programmability, Anonymity and the ability to transact mutually agreed upon value, the agents create building blocks for Fetch.ai. Here, AI agents act in their own economic interests (or the interests of their investors and owners) and co-operate with other AI agents in a network governed by the rules of the underlying blockchain protocol.

The same set of primitives can be cast slightly differently to create AI DAOs.

Application synthesis using Web3XAI Idea Legos

Running the above procedure in reverse, i.e. combining primitives to predict applications, is a fun way to explore the universe of possible Web3XAI applications. These applications span a continuum of implementation difficulty, but using Idea Legos allows us to dream them up and leave those pesky difficulties aside!

Cheap predictions in lieu of calculations (AI Primitive) can be combined with micropayments and final settlement (Blockchain Primitive), enabled by layer 2 blockchain applications such as the Lightning Network, to serve model predictions on an on-demand, per-prediction basis instead of using bulk pricing or subscription-based services.
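The pay-per-prediction idea can be sketched with a prepaid balance that is debited on every call. This is a local simulation only: it does not implement the Lightning Network protocol, and the classes `PrepaidChannel` and `MeteredModel`, the price, and the stand-in model are all hypothetical.

```python
class PrepaidChannel:
    """Stand-in for a payment channel: a deposit that is drawn down per call."""

    def __init__(self, deposit):
        self.balance = deposit  # in integer base units

    def pay(self, amount):
        if amount > self.balance:
            raise RuntimeError("channel exhausted")
        self.balance -= amount


class MeteredModel:
    """Wraps a prediction function and charges a fixed price per prediction."""

    def __init__(self, predict_fn, price_per_call):
        self.predict_fn = predict_fn
        self.price_per_call = price_per_call
        self.earned = 0

    def predict(self, channel, features):
        channel.pay(self.price_per_call)  # payment must succeed first
        self.earned += self.price_per_call
        return self.predict_fn(features)


def mean_model(features):
    """Stand-in 'model': predicts the mean of its inputs."""
    return sum(features) / len(features)


model = MeteredModel(mean_model, price_per_call=3)
channel = PrepaidChannel(deposit=10)
for features in ([1, 2, 3], [4, 5, 6], [7, 8, 9]):
    print(model.predict(channel, features), channel.balance)
```

With a deposit of 10 units and a price of 3, three predictions succeed and a fourth is refused, so the buyer pays only for what is actually consumed.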

We might also be able to turn this set of primitives around and make training data available to AIs on an on-demand basis using micropayments + final settlement to improve their performance.

Bitcoin critics have often derided Proof-Of-Work as dangerously wasteful, using enormous amounts of energy to solve useless random math problems. Even if we were to grant them this objection, it could in principle be addressed by designing a proof-of-work scheme in which, instead of computing hashes, we used the computational power to do "useful" work. Some proof-of-useful-work schemes with applications in AI have already been proposed in the literature. While researching applications during the writing of this post, I came across a relatively new protocol - Flux - that (cl)aims to do just this. Other proofs of useful work can be dreamt up just as easily. From the field of computational physics, for instance, we could design a proof-of-work scheme in which we compute the ground state of systems of atoms and then store their eigenvectors and eigenvalues (proof-of-storage) in exchange for block rewards. The computation creates valuable large scale data sets that can then be sold to participants training AI-based materials design models, while securing the network at scale.
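For reference, this is roughly what the "useless" hash puzzle looks like, in a deliberately simplified form. Bitcoin's real scheme differs in its details (it uses double SHA-256 over a block header and a numeric target), and the function names `mine` and `verify` are my own. The point is the shape of the problem: expensive to solve, cheap to verify. A proof-of-useful-work scheme would need to keep that asymmetry while swapping the hash search for a computation whose output has value.

```python
import hashlib


def mine(block_data, difficulty):
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data, nonce, difficulty):
    """Checking a claimed solution takes a single hash."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
    return digest.startswith("0" * difficulty)


nonce, digest = mine("alice pays bob 5", difficulty=3)
print(nonce, digest)
print(verify("alice pays bob 5", nonce, difficulty=3))
```

Each extra zero digit of difficulty multiplies the expected search effort by 16, while verification stays at one hash.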

It's easy and a lot of fun to come up with any number of applications at the intersection of these two fields by combining their primitives. This makes no claims about their viability as businesses, of course :)

There are a number of domains, such as DeFi Lending, Decentralized Insurance, Logistics and Gaming, where Web3XAI Idea Legos can be very useful for generating application ideas.

Summary

In this post, I defined some essential Idea Legos/primitives of Public Blockchains and modern AI. When viewed through the lens of primitives, seemingly dissimilar protocols at the intersection of Web3 and AI appear fundamentally similar. Combining primitives from these two technologies is a fun and productive way of generating new application ideas. Are there new applications that you can dream up using Web3XAI Idea Legos? Let me know. I'm @antaraxia_kk on Twitter.