While we finish up StellarX, I wanted to share a research project on Ethereum we did earlier in the year. It’s an open secret that it’s simply the wrong platform for most of the stuff being built on it—it’s not, of itself, a bad technology, but entrepreneurs have applied Ethereum to the wrong uses.
A reckoning is coming, something like the Great Filter:
…where high costs and slow performance cull almost the entire ecosystem, and unless your project is specifically adapted to Ethereum’s strengths, it will die alongside so many others.
I’m pretty tired of people debating cryptocurrency using stuff they either made up or copied, fingers-crossed, from some untested whitepaper. So here I use actual data from actual tests that cost us an actual thirteen thousand dollars. My team and I spent that money running an “at-scale” app for a total of ten hours on the Ethereum network, when gas prices were low, so it’s a mere pittance compared to what your blockchain company will pay to learn these same lessons in a real deployment.
Ethereum’s Strengths and Weaknesses
Ethereum is a distributed programming platform — scripting software for autonomous organizations and ownerless apps. Vitalik himself put it like this, in his original announcement on bitcointalk:
Ethereum is a modular, stateful, Turing-complete contract scripting system…our goal is to provide a platform for decentralized applications.
Today, as in 2014, if you’re building a distributed computer program, something with no sole owner and no centralized decision-making apparatus, Ethereum is a great choice. For the perfect use case, think something like Augur — they needed a fully automated, ownerless way to make truth-decisions about real world events…so they could circumvent regulatory enforcement for what is basically a gambling platform.
But most blockchain companies don’t need smart contracts to execute their core business logic or want to dodge some legal or jurisdictional problem. They just want to issue digital assets and process transactions. That’s exactly where Ethereum will let you down. If you’re building something where:
- you issue a token
- that users will trade back and forth, and
- you want those trades to happen cheaply and in real-time
Ethereum is the wrong choice. It’s slow and it’s really f’ing expensive, and it fails to act like you want in both the “one account doing a lot” and the “many accounts doing a little” cases.
Let’s get to the data. Our testing was based on a third-party load test, designed by Kik, and we tweaked the spec wherever possible to make Ethereum perform better. As you can see, it stubbornly defied our efforts. Our results and methodology are in our GitHub, and we encourage you to check our work.
Problem 1: Your most enthusiastic users will have the worst experience.
Ethereum queues transactions on a per-account basis, and yet miners don’t prioritize transactions by wait time. In fact, among transactions with equal gas prices, miners pick at random. So an active account builds up a transaction queue, and the network has no mechanism to clear it. The result, for high-volume accounts, is an ever-increasing transaction lag.
Ethereum processes transactions using two numbers, a transaction nonce (what we’ll call the “nonce”) and an account nonce that, for clarity, we’ll sometimes call the “count”. The transaction nonce puts an account’s transactions in order; the account nonce counts whenever one of them is mined. When a new transaction, with its nonce, is submitted, Ethereum compares that nonce to the current count to decide what to do. If the transaction’s nonce is lower than the count, the transaction is ignored. If it’s higher, the transaction is delayed. Only if/when the nonce matches the count can the transaction move into a block. Here’s a simplified diagram of how it works:
This is actually very similar to the “please take a number” systems you see at a deli or at a government office like the DMV, and it’s a fairly common way to prevent replay attacks. Lots of other chains do something similar. However, Ethereum’s transaction-to-block algorithm (or, really, lack thereof) adds the wrinkle that the people working your DMV window here — the miners — aren’t necessarily accountable to the next number in line.
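To make the take-a-number rule concrete, here is a minimal Python sketch of the comparison described above. The names `Outcome` and `classify` are made up for illustration; this is not code from any Ethereum client, just the three-way rule restated.

```python
from enum import Enum


class Outcome(Enum):
    IGNORED = "ignored"    # nonce is below the count: that number was already served
    DELAYED = "delayed"    # nonce is above the count: earlier numbers must go first
    ELIGIBLE = "eligible"  # nonce equals the count: may move into a block


def classify(tx_nonce: int, account_count: int) -> Outcome:
    """Compare a transaction's nonce with the account's count (hypothetical helper)."""
    if tx_nonce < account_count:
        return Outcome.IGNORED
    if tx_nonce > account_count:
        return Outcome.DELAYED
    return Outcome.ELIGIBLE


# An account whose count is 5 submits transactions with nonces 4, 5 and 7.
for nonce in (4, 5, 7):
    print(nonce, classify(nonce, account_count=5).value)
```

Note that "eligible" only means the transaction is allowed into a block. Whether a miner actually picks it up is a separate question.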
Miners often have their own criteria for the transactions they’ll accept. Many only accept high-gas-price transactions. Some only accept their own transactions. Miners like these will let block space go unused before filling it with something from your queue. So now imagine a DMV where certain windows are telling people “sorry can’t help you,” while more people file into the waiting room every second and you have all these jokers in front of you who have to get helped before you can even talk to someone — and, voila, you have some idea of how Ethereum handles transactions.
We weren’t aware it worked like this until after we tried to implement Kik’s load spec: 480 accounts each submitting 1 txn/minute on average for 3 hours. That’s 86,400 total transactions, an average of 8 per second.
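If you want to check the load figures yourself, the arithmetic is short. The constant names below are mine; the numbers are the ones from the spec as quoted above.

```python
ACCOUNTS = 480                  # accounts in the load spec
TX_PER_ACCOUNT_PER_MINUTE = 1   # average submission rate per account
DURATION_HOURS = 3              # how long transactions are submitted

# Total transactions over the whole run, and the network-wide rate per second.
total_tx = ACCOUNTS * TX_PER_ACCOUNT_PER_MINUTE * 60 * DURATION_HOURS
tx_per_second = total_tx / (DURATION_HOURS * 3600)

print(total_tx, tx_per_second)
```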
We spun up the test, using ETH Gas Station’s “standard” estimates for gas, expecting a median confirmation time of about 30 seconds, and, lo and behold, 13 hours later, more than half of our transactions still hadn’t made it into a block. We stopped the test at 13h and 50m, and 50.1% of our transactions were missing. (Reminder: the raw data is in our GitHub, if you want to check our work.) We thought we’d messed up somehow, but, no. We had just created a bunch of long lines, and some jabroni transactions had stood there all day doing nothing.
When you read elsewhere about “Ethereum transaction times”, the posted numbers almost always assume a single, one-off event. They don’t hold under application-level load. We ran the Kik test again just to really make sure we were doing everything right, spending another 6.9 ETH, and we got essentially the same result.
Here’s a typical experience from that run — this is just the account that happened to be first alpha-numerically. You can see the wait times grow as transactions pile up.
It’s one thing to talk about “settlement time” in the abstract. But think about the above data in terms of actual user experience. The more someone uses your Ethereum app, the slower it goes. After just three hours, their transactions are taking 8 hours to confirm.
Of course Kik’s test spec said we should submit transactions for three hours and then stop, so that’s what we did. In the real world, you can’t build in downtime to allow the count to catch up — so in theory transaction queues just get worse and worse. In practice, of course, as your Ethereum app becomes unresponsive, users will help it catch up, by leaving.
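Here is a toy model of why the lag keeps growing. It is a sketch, not a replay of our data: the submission rate and duration come from the test spec, but `CONFIRM_INTERVAL_S` is a hypothetical number I picked purely to show the shape of the curve. The only rule it encodes is that a transaction cannot confirm before the one ahead of it in the same account's queue.

```python
SUBMIT_INTERVAL_S = 60     # from the test spec: 1 txn/minute per account
CONFIRM_INTERVAL_S = 240   # hypothetical: the account gets one confirmation every 4 minutes
DURATION_S = 3 * 3600      # from the test spec: submit for 3 hours, then stop


def simulate_waits(submit_interval, confirm_interval, duration):
    """Return each transaction's wait in seconds, in nonce order (toy model)."""
    waits = []
    last_confirmed_at = 0
    for submitted_at in range(0, duration, submit_interval):
        # A transaction has to wait for everything ahead of it in the queue.
        start = max(submitted_at, last_confirmed_at)
        last_confirmed_at = start + confirm_interval
        waits.append(last_confirmed_at - submitted_at)
    return waits


waits = simulate_waits(SUBMIT_INTERVAL_S, CONFIRM_INTERVAL_S, DURATION_S)
print(f"first wait: {waits[0] / 60:.0f} min, last wait: {waits[-1] / 3600:.1f} h")
```

Whenever confirmations arrive more slowly than submissions, every new transaction waits longer than the one before it. Change the hypothetical interval and the slope changes, but the direction doesn't.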
Here’s the performance distribution from that second test. I trimmed the slowest 5% so the long tail doesn’t skew the overall picture.
For comparison, this is what Kik measured running the same spec (on Stellar).
I just grabbed this plot from their post, and I don’t have the original data, so I can’t show my results on the same chart. But using the magic of computers, I can at least overlay the curves:
Everything looks comparable until you notice the x-axes. The waits we measured on Ethereum are 3,000 times longer. That’s the queuing problem in a nutshell.
This performance issue is currently a fundamental part of Ethereum. Improvements like sharding or Casper are promising in theory, but those will be complex fixes layered over Ethereum’s almost maximal complexity. Something like Lightning can rely on Bitcoin’s inherent simplicity, whereas there’s nothing basic to fall back on here. A skyscraper is usually built on bedrock, not on top of another skyscraper, yet that is what a lot of Ethereum scaling solutions propose to do.
The only certain performance improvement is to spend more on gas and hope to plow through each account queue faster. We in fact did that in a third three-hour trial — which we ran because of our “we should do what we can to make this work” commitment.
The previous two tests had used the “standard” ETH Gas Station recommendation. We used the “fast” tier (≈4 Gwei at the time) for the third and spent 11.8 ETH on our 480 accounts.
Performance improves — to only 500 times slower than Kik’s results on Stellar — but it’s still not fast enough. The backlog builds and payments hang around with nothing to do.
Problem 2: Very High Cost of Wide Adoption
Ever thus to power users. But Ethereum is also unsuitable for the other kind of adoption, what you might see with an app like, I dunno, Etsy, where instead of a few people going deep, you have lots of people checking in every once in a while. That’s because an Ethereum app’s per-user costs go up quickly as it adds users, and that’s why you see stuff like 70x price spikes whenever anyone tries using the network across many accounts.
We captured this data incidentally, looking for a workaround to the queueing problem. To keep transactions from piling up, we refactored the Kik spec as follows: instead of a few accounts submitting a bunch of transactions, we spun up a bunch of accounts (28,800) and had each of them just do a single transaction. To stick to the original test’s guideline of 8 total txn/s we submitted the transactions over the course of an hour.
Curiously, this didn’t actually help performance very much. The median confirmation time was 23 minutes — actually slower than the “fast” test above. Even weirder, some of the first transactions we submitted were the last to confirm:
We knew account queues couldn’t be the issue. It turned out that as soon as our transactions started hitting the network, miners’ fees soared. So our earliest transactions, submitted with pre-test “standard” pricing, were quickly priced-out. They lingered in low priority for hours.
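The pricing-out effect is easy to see in a toy model. The sketch below assumes miners always fill a block with the highest bids first, which is a simplification (it also breaks ties by submission order rather than at random, and it treats the whole pool as pending at once). The function name, transaction ids and prices are all invented for illustration.

```python
import heapq


def confirmation_order(pending, block_capacity):
    """Yield (block_number, tx_id), assuming the highest gas price always wins."""
    # heapq is a min-heap, so negate the price; submission order breaks ties.
    heap = [(-price, order, tx_id) for order, (tx_id, price) in enumerate(pending)]
    heapq.heapify(heap)
    block = 0
    while heap:
        block += 1
        for _ in range(min(block_capacity, len(heap))):
            _, _, tx_id = heapq.heappop(heap)
            yield block, tx_id


# Hypothetical pool: early transactions bid the pre-test price, later ones bid more.
pending = [("early-1", 1.0), ("early-2", 1.0), ("late-1", 3.0), ("late-2", 6.0)]
for block, tx_id in confirmation_order(pending, block_capacity=2):
    print(block, tx_id)
```

The transactions submitted first end up in the last block, because by the time they compete for space everyone behind them is bidding more.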
We had discovered another of Ethereum’s vicious cycles. Adding users immediately raises costs. In the real world, increasing the number of units implies lower per-unit costs. Basically the whole private sector is built on this idea: “economy of scale”. But here, each incremental user immediately increases the per-user cost. It’s like bizarro economics.
You can see prices climbing ≈6x over the short 1-hour run of our test.
Again, the built-in time limit of the test makes everything look more sustainable than it actually is. Extrapolate this chart out and drop the needle somewhere in the middle. What do your per-user costs look like after two weeks of steady usage? Two years?
The above test cost us $1,445 for a single hour. It ran when gas prices were low, just ≈1 Gwei for standard speed, and it puttered along at just 8 transactions a second. Run that basic test around the clock for a year and you’re at roughly $12.7m.
If you apply that cost structure to a real business, you see that Ethereum’s fees are already unsustainably high. For example, PayPal does about 240 transactions a second. Put aside the performance fixes it would take to make that happen and put aside the rising-price dynamic I just documented. If PayPal had been built on Ethereum and paid our observed rate, they would’ve laid out $380m in network fees last year. That would’ve been 21% of their net income, and, again, that’s pretending that you could somehow freeze prices.
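For anyone who wants to rerun the extrapolation, here it is as a few lines of Python. The inputs are the figures quoted in this post; the only assumptions are the ones already stated, namely that the hourly cost stays frozen and that cost scales linearly with transaction rate.

```python
HOURLY_COST_USD = 1_445       # what the one-hour test cost us
TEST_TX_PER_SECOND = 8        # transaction rate during the test
PAYPAL_TX_PER_SECOND = 240    # PayPal's approximate rate, as quoted above
HOURS_PER_YEAR = 24 * 365

# Assumption: prices stay frozen at the test's level and cost scales linearly.
annual_test_cost = HOURLY_COST_USD * HOURS_PER_YEAR
cost_per_tx = HOURLY_COST_USD / (TEST_TX_PER_SECOND * 3600)
annual_paypal_cost = annual_test_cost * PAYPAL_TX_PER_SECOND / TEST_TX_PER_SECOND

print(f"per transaction: ${cost_per_tx:.3f}")
print(f"test rate, one year: ${annual_test_cost / 1e6:.2f}m")
print(f"PayPal rate, one year: ${annual_paypal_cost / 1e6:.0f}m")
```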
An idealized version of Ethereum won’t work for one of the most profitable transactional businesses in the world. How is the real version going to work for the rest of us?
“If you want to build a decentralized Uber and Lyft on top of an unscalable Ethereum, you are screwed. Full stop.” — Vitalik Buterin
I encourage you to watch the full panel this quote came from — it shows four of the most important members of the Ethereum team saying a lot of what I’ve been saying here. When today’s high-profile ICO becomes tomorrow’s cautionary tale, it will only be bad for everyone in the ecosystem, and they know it.
There’s no doubt that the Ethereum community is the strongest in blockchain, and there very likely wouldn’t be token economies at all without Vitalik’s vision. It’s not Ethereum’s fault that developers are asking the tech for what it was never meant to deliver. The fault lies with the people chasing last year’s ICO dollars, regardless of what’s actually the right tool. Ethereum’s problems all start with misguided entrepreneurs. Don’t become another one of them.
If you’re building a transactional app, the protocol will not support the behavior your users will expect. I have immense respect for the ambition and the complexity of Ethereum, but I’ve come to see it as blockchain haute couture. Beautiful, intricate, high-concept, high-minded. And not what you wanna wear to work.
If you want trustless, distributed computation, and that’s really what you’re building, definitely use it. If you never plan to actually launch anything, Ethereum’s great for that, too: just ask the more than 50% of ICO projects that disappear after their token sales, if you can find them. They’ve already hit the Great Filter. And likely some beach in Puerto Rico.
But if you want to build a business that sticks around — if you plan a typical user-to-user service and don’t need to tie up your business logic in a smart contract — if you plan to issue a digital asset and you plan to transact at high volumes as a core part of your strategy, pick a platform that is optimized for that. Do what we did, and build on Stellar.
Tomer Weller designed and ran the load tests and provided technical guidance for this post.