Today, there is no open-source AI. You cannot fork the LLaMA training code. You cannot see or modify the LLaMA training dataset. You cannot participate in the development of the next LLaMA model. And even if you could do all these things, you would need millions of dollars to actually carry out a new training run. Substitute any other leading 'open-source' model for LLaMA and it's the same story. Open-source AI, today, depends on large model weights being released for free by well-resourced corporates. The key properties that define open source - the ability for anyone to participate, innovate, and build on others' work - do not exist at the foundation model layer. I believe this is a bad thing, and I founded Pluralis Research to change it. We are developing a new approach, called Protocol Learning, that will enable truly open-source AI.
A common refrain here is 'it's all going to be OK - foundation models are commoditizing'. I'm not convinced that's true. But even if it is, think for a second what that actually means. When Standard Oil started operations, kerosene cost 26c per gallon and quality was irregular. Thirty years later, kerosene quality was effectively uniform, it cost 5c a gallon, and Standard Oil had a complete monopoly. Commoditization doesn't mean rich competition; usually it means the exact opposite. Why is this an appropriate analogy to the current AI situation? Huge fixed costs, significant marginal costs, control of distribution is critical, and the product is essential and ubiquitous. Standard Oil entrenched its monopoly through cheap distribution via kickbacks to railroads. In a commoditized LLM scenario, distribution is also the decisive factor: if capability is roughly equivalent, the LLMs that are 'just there', integrated into existing products, will absorb most use. So, in this case, the situation is even worse - large tech IS the distribution. They don't need to influence it; they already completely control it.
This urgency is heightened because AI has the potential to render vast amounts of human 'knowledge work' redundant. If foundation models are controlled by a few entities, extreme power asymmetries will emerge. We've seen echoes of this moment before. In the early days of the internet, innovation flourished not because of capital or power, but because it was open. Ideas flowed freely into reality, through anyone passionate and driven enough to make them happen. My entire life has been spent watching corporations gradually close that openness, leaving us with an internet increasingly controlled, narrow, and disconnected from the promise it once held. Now, these same corporations stand ready to dominate AI in a similar way, from day one. And this time, I think it will be worse.
Introducing Protocol Learning
Protocol Learning offers a chance to avoid repeating history. It allows expertise - not financial strength - to drive innovation at the foundation model layer. It trustlessly pools compute from many participants, training and hosting models collaboratively within protocols. However, this alone is not enough: volunteer computation will never reach the scale required to train cutting-edge models. Training participants must have a way to profit from their contribution; there must be economic rationality in the open model development process. Protocol Learning resolves this seeming contradiction (how can a model be open, but also monetizable?) by training and deploying models in such a way that no participant can ever obtain a full weight set. This enables programmatic value flow from downstream use of the model, which in turn lets individuals - and collaborations of individuals - develop and monetize foundation models without upfront capital cost, since contribution can be incentivized through partial model ownership. This key property means that someone with nothing other than a credible, good idea can attract sufficient compute to train at frontier scale, create a useful model, and make money by doing so, alongside the training contributors. If this is realized, Pluralis will have succeeded.
At a technical level, Protocol Learning is low-bandwidth, heterogeneous, multi-participant, model-parallel training and inference. It combines the economic sustainability of closed model releases (one of their only positives) with the benefits of open-source.
The benefits may seem obvious: open collaboration, permissionless participation, sustainable incentives, transparency, and universal, guaranteed access to frontier-scale models. However, there is a fundamental reason it hasn't been implemented: low-bandwidth model parallelism remains an unsolved research challenge. Currently, all decentralized training efforts are Data Parallel (DP) approaches, where identical copies of the model exist across all participants. It remains an extremely non-consensus position that low-bandwidth model-parallel training is possible. The reason is that when a large model is split across multiple devices, training requires transferring activation and gradient tensors (tens to hundreds of megabytes) between nodes during each forward and backward pass. When devices communicate only over ordinary internet connections, training is bottlenecked by this communication and slows to a crawl. The core thesis of Pluralis is that this is solvable.
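To see why the bottleneck bites, a rough back-of-envelope sketch helps. All the numbers below (model dimensions, link speeds) are illustrative assumptions, not measurements from any real run:

```python
# Back-of-envelope: why naive model parallelism stalls over consumer
# internet links. All figures here are illustrative assumptions.

def transfer_seconds(tensor_mb: float, bandwidth_mbps: float) -> float:
    """Time to move one activation/gradient tensor over a link."""
    return tensor_mb * 8 / bandwidth_mbps  # MB -> megabits / Mbps

# Hypothetical 7B-class model: hidden size 4096, micro-batch 8,
# sequence length 2048, fp16 (2 bytes) activations at one stage boundary.
hidden, batch, seq, bytes_per = 4096, 8, 2048, 2
tensor_mb = hidden * batch * seq * bytes_per / 1e6  # ~134 MB per hop

datacenter = transfer_seconds(tensor_mb, 400_000)  # 400 Gbps interconnect
internet = transfer_seconds(tensor_mb, 100)        # 100 Mbps home uplink

print(f"tensor size: {tensor_mb:.0f} MB")
print(f"datacenter link: {datacenter * 1000:.2f} ms")
print(f"internet link:  {internet:.1f} s")
```

A transfer that takes a few milliseconds inside a datacenter takes on the order of ten seconds over a home connection, and it happens at every pipeline boundary on every forward and backward pass. That several-thousand-fold gap is the research problem.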
Why I started Pluralis
When Pluralis began in December 2023, every single researcher I talked to (with one exception) disagreed with me. Later, I found out that a very senior engineer/manager/researcher at a frontier lab - who I had discussed this with - told a friend: "What Alexander is proposing would break 'training physics' - it makes no sense". Others would commonly respond with: "Oh, you mean Federated Learning?" (Federated Learning does not split the model over devices, and is primarily concerned with data privacy rather than protecting model weights). In fact, 'Decentralized Training' was not even a term. The concept of splitting the weight set itself over participants was largely met with: "But why would you want to do that?". In hindsight, the decision to pursue it was irrational. My fundamental justification was the significance of the potential benefits, which, at the very least, justified an attempt. Protocol Learning is a very rare situation where solving a core research problem can directly result in large-scale changes to the world, the economy, and geopolitical power structures, with a correspondingly significant cultural impact. A problem like that - even if you think the probability of solving it is low - is hard to forget about.
The most unique, impactful idea in Protocol Learning is the concept of 'unmaterializable models'. As soon as you have model-parallel, multi-party runs, this becomes possible. No single participant has the full weight set - everyone holds only shards. No one can take the model and stand up inference somewhere else; no one can make a local copy and use it internally. If you can enforce this property, the full model never 'exists' at any one point. Extraction is not possible - no one has a full model checkpoint to share. Unmaterializability is what allows programmatic value flow. Without it, a model can always be more heavily quantized, deployed in a centralized setting, simply used internally/locally, or served more cheaply, undermining the ability of trainers to monetize their contribution. The value of a model is in its weight set. The fact that this is possible while still allowing a model to be trained and deployed is very specific to neural networks. I can't think of any other useful entity that has this property.
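The structural idea can be sketched in a few lines. This is a toy illustration of pipeline-style sharding - not Pluralis's actual protocol, and with none of the enforcement machinery that makes the property hold adversarially - showing how inference can run while each party keeps its weights private:

```python
# Toy sketch (NOT the Pluralis protocol): a model split into pipeline
# stages so that no single party ever holds the full weight set.
import numpy as np

rng = np.random.default_rng(0)

class Shard:
    """One participant's stage: a private weight matrix it never shares."""
    def __init__(self, d_in: int, d_out: int):
        self._w = rng.standard_normal((d_in, d_out)) * 0.1  # stays local

    def forward(self, x: np.ndarray) -> np.ndarray:
        # Only the activation tensor crosses the network boundary;
        # the weights themselves never leave this participant.
        return np.tanh(x @ self._w)

# Three participants, each holding one stage of a 3-stage model.
stages = [Shard(16, 32), Shard(32, 32), Shard(32, 4)]

x = rng.standard_normal((1, 16))
for stage in stages:
    x = stage.forward(x)  # each hop ships an activation, not weights

print(x.shape)
```

The full function computed here only exists as the composition of the three private stages; any one participant's shard is useless on its own, which is the intuition behind unmaterializability.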
This single compromise - the inability to ever see a full weight set - gives you every benefit of open-source while allowing the entire process to be economically sustainable. In my view, this approach offers far stronger auditability, interoperability, and transparency than merely sharing final weights.
When I developed the initial ideas for Pluralis, I felt they were too obvious - that I might only have a month or so before this became a major field. Somehow, almost a year on, that still has not happened. This has given the core team six months of uninterrupted, focused research on this problem. The fundamental plan has not changed since the first document I wrote in January 2024. There is a hard research problem at the core of all of this, and the way to tackle that is to do research. Pluralis is currently staffed solely with ML PhDs. We're a deeply technical team that has already worked together for multiple years. We're completely aligned on the vision, laser-focused on the problem, and we are seeing the first signs of it giving way.
Pluralis is going to:
Solve the open research problem of model-parallel low-bandwidth training.
Begin multiple large-scale runs, incentivizing contribution via ownership of the resulting model.
Allow market dynamics to result in a giant model run, exceeding current frontier scale, resulting in true open-source AI.
The Next Phase
We’ve raised a $7.6M seed round led by USV and Coinfund with participation from other outstanding investors who are aligned with our view of the situation. The goal is extremely clear: create models that outperform the closed ones, share the value among those who made it happen, and allow—for the first time—actual open innovation at this layer. We’re assembling compute partners to participate in the first open runs and are hiring across the board in both the US and Australia for people who understand what is at stake and want to build something that actually deserves to be called open-source.
No technical problem today has greater impact.