If you want a scalable Ethereum blockchain, then P=NP

The Ethereum project has an unfortunate tendency to propose six impossible things before breakfast — ideas that would win a Fields Medal for mathematical genius if they could possibly work.

If you want a horizontally scalable blockchain … then P=NP. If you want transaction sharding to work, then P=NP.

You could bodge either of these, of course. Do you feel lucky?

But maybe it doesn’t matter — and maybe it was never meant to matter.



NP-completeness and P=NP

NP-complete problems are one particular class of mathematical problem where it’s easy to check a result is correct, but hard to calculate the result in the first place.

NP-completeness is a big deal, because a lot of NP-complete problems would be near-magically useful to solve. And all NP-complete problems are mathematically equivalent — so if you can analytically solve one, you’ve  solved them all.

We don’t know if NP-complete problems can be solved. We do know that if we could solve them, it would turn the world upside down. The shorthand for this is “P=NP” — that NP-complete problems could be solved in polynomial time.

Problems of interest to the cryptocurrency field that P=NP could solve include quickly factoring integers to break public-key cryptography, or reversing hashes.

The Golden Ticket: P, NP, and the Search for the Impossible by Lance Fortnow is a good non-technical introduction to just how many amazing things you could do if P=NP. [Amazon]

But we’ve never found how to do any of those amazing things. So we presume that if a problem is NP-complete, there isn’t an analytical solution.

There might be a solution! But the burden of proof is firmly on the claimant.

Mathematicians have gotten pretty good at spotting NP-completeness in exciting new ideas in the past few decades. It’s not a good sign.

Data Finnovation: let’s construct some proofs

Let’s look at some requirements for a scalable blockchain. Scalability like this was first posited by Ethereum, then the promises were copied into the white papers for other blockchain projects. If the fabulous future promises come true, your blockchain will need to do these:

  • A scalable blockchain will process more transactions per unit time than a single node on the network.
  • Balances can never go negative in the system.
  • We need to reconcile multiple lists of valid transactions, with a requirement of no negative balance, into a single merged list, such that the maximum number of possible transactions go through.  This must be achievable efficiently.

Here’s a post from the Data Finnovation blog that aims to do the following: [Data Finnovation]

  1. Describe these three requirements.
  2. Show that the problem of making a blockchain that does all of these things is NP-complete.
  3. Show how, if you have such a blockchain, you can right now break hash functions and public-key cryptography, and construct a proof that P=NP.

For an encore, the author shows that if transaction sharding — calculating transactions and running smart contract programs in parallel — could work as advertised, then that too would be NP-complete. [Data Finnovation]

I won’t copy their work here — this is a post about the Ethereum project. But do enjoy working through the maths.

Worse is better

In real-world programming, when you hit an NP-complete problem, you can usually get away with an approximation to the right answer.

Ethereum is likely to bodge a solution to scaling and sharding, because so many things in Ethereum are bodges. Ethereum is the epitome of “Worse is Better” — a product that does the job inelegantly but exists now is better than one that’s more elegant, but doesn’t exist yet.

The risk is that we already know how creative blockchain hackers are in seeking out self-service bug bounties — especially when a bodge breaks.

This is why Ethereum keeps promising fabulous new scaling intiatives — then doesn’t implement them for years, or abandons them when they realise the only way is to bodge it.

The current Ethereum proof-of-stake test network generates less CO2 — but it’s still a single-threaded application, that can’t run faster than a single standalone node.

To some extent, the Ethereum project has just given up on scaling the main blockchain. “For Ethereum to scale and keep up with demand, it has required rollups” — do the work somewhere else and send back the result. The blockchain is only usable if you work around actually using it. [ethereum.org]

Stay in school, kids

Data Finnovation notes: “It’s worth stating that part #2 — the only part of this blog that is technical — would fly as a homework assignment in any ‘Intro to the Theory of Computation’ class.”

Vitalik Buterin skipped getting a degree in computer science because Peter Thiel paid him not to go to college. This may not have been the greatest idea. The hazard of being an autodidact is that you don’t know what you don’t know.

Buterin had a propensity for proposing the profoundly unlikely back when he was still into Bitcoin — such as calculating Bitcoin hashes faster by using a quantum computer. He didn’t have a quantum computer — but he believed a maths crank friend who told him you could simulate a quantum computer usably fast on an ordinary computer.

When The DAO was hacked in 2016, the Ethereum project proposed a soft fork: they would blacklist transactions whose result interacted with the “dark DAO” the attacker had poured the funds into. This would have been a vector for a fairly obvious denial-of-service attack: flood Ethereum with costly computations that end at the dark DAO.

This approach could only have worked by first solving the halting problem — you would need to be able to determine the outcome of any possible Ethereum program without actually running it and observing the result. [Hacking Distributed, 2016]

I don’t have a computer science degree (or any degree) either, but I did take care to run my ideas past some computer science professors before I posted. They stressed that bodging your NP-complete problem is absolutely standard, because in almost all cases, the 99% solution will do. The trouble is, I’m pretty sure there’s a lot of self-service bug bounties in the other 1%.

The purpose of a system is what it does

The Ethereum blockchain is clogged to near-unusability. Transaction fees are large and unpredictable. Applications are deployed to Ethereum because it’s popular — then they’re all but unusable by normal humans.

Ethereum claims to be decentralised, but this is a marketing lie. The Ethereum blockchain burns a country’s worth of electricity for its proof-of-work mining, but no more than three entities control the majority of the mining. Megatons of CO2 are generated for no useful reason except to back legal claims of untouchability. [Digiconomist]

Infura, an API provided by ConsenSys, is the only way to use the Ethereum network at any speed. Infura recently blocked a swathe of countries in response to US sanctions pressure — and took out a pile of common applications in those countries, including MetaMask and OpenSea.

The present Ethereum system does one job well: it makes a great deal of money for the centralised entities who have operational control of the system. Ethereum’s cryptocurrency, ether, is almost as easily traded for actual money as Bitcoin is.

Ethereum makes more sense if you first assume it never mattered if the fancy promises ever worked out — even if Buterin believed they were supposed to.

“Maybe investing billions in dollars & man hours into a white paper from a teenager who had no previous experience in … well almost everything, wasn’t the best of ideas. Unless the idea was not about whether it could work, and instead how much $ you could extract from the naive.” — Alan Graham [Twitter]


Become a Patron!

Your subscriptions keep this site going. Sign up today!

5 Comments on “If you want a scalable Ethereum blockchain, then P=NP”

  1. I’ve been reading a book on computational complexity and NP=P by Avi Wigdersen. He points out that we don’t know if NP=P but we do know lots and lots of problems that if we solved we would prove it. Given the size of the attack surface and the years spent on all the problems from domains across math and science it seems very very unlikely that NP=P. I have a link to his book on my website http://www.agman.com.

  2. Nitpick from an ancient computer science student: P and NP stand for polynomial and non polynomial, we *do* know how to solve many of these problems but the time required grows exponentially with the size of the input for all known solutions – i.e., the equation for the complexity class has at least one exponential term, and is therefore non polynomial, hence the name.

  3. As the CS professors said, 99% solutions with a bodge will do; one common bodge is to keep the input very small. So, if one only has a handful of nodes and a handful of transactions…
    /insert the sound of bells ringing

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.