The fallacy of proof of work as an information value signal

There has been much discussion over the last few months about the use of proof of work to indicate levels of importance of articles, emails and more. The argument is that if someone pays for proof of work then they must believe that a piece of information is important, and therefore you should too.

The whole idea springs from the belief that proof of work is a costly signal paid for by miners in order to prove that they are better at what they do than other miners. The fact that they could invest that much money into hash machinery must mean that their node is bigger and faster than the rest… right?

Well… no. I believe it’s actually quite the opposite, and in this article I will outline why.

What most people fail to understand about hash power is that anyone can own it, whether or not they control a node, and that its owners can connect it to or disconnect it from any node they feel best represents their interests. These hash operators are free agents in a market for their labour and will work for the nodes that make them the most money.

In other words, the hash power is not the costly signal, but the reward. Attracting as many free agents as possible to work on their block templates is the reason that node operators will create and display a whole array of costly signals.

To delve deeper, let’s consider things from the perspective of a hash operating free agent. We have access to reasonably cheap energy and a physical location to store the machines so we set up a small system. Say 5,000 ASIC miners. We don’t want to run a node, so our next best option is to connect to someone else’s.

The question for us is then: Which node is the best node?

For us to maximise the potential return of our investment, we must choose the node which we think will make the most of the work we can apply to its block template. There are a wide variety of nodes who will pay us for our attention and it’s our job to understand what the performance characteristics are that will guide us to the best node.

Let’s go through a few possible answers…

The node that finds the most blocks

The first thought that springs to mind is often ‘The node that finds the most blocks.’ However, it’s fairly easy to argue that this is not necessarily the case. The node that finds the most blocks might be the best node, but it might also be a poor node whose operator is covering for its bad performance by expending huge sums of money on directly controlled hash power. In this context, the node may be operated without efficiency or strategy and the hash power is applied like a blunt force weapon in a battle of wits. Sending our work their way just plays into their weakness, possibly costing us lost energy due to their inability to propagate their blocks properly.

So if it’s not the node that finds the most blocks, maybe it’s:

The node that wins the richest blocks

There has been a lot of discussion around the wavelike patterns in hash application that are expected to emerge as the block reward transitions away from the subsidy and onto fees as the primary source of revenue. Would it not be true then to say that ‘the node that wins the richest blocks’ is also the one that is wisest at applying its hash power? I would argue no, this is not the case. But first let me define ‘the richest blocks’…

As the fee market emerges as the main generator of income, nodes must seek the means to find blocks which are profitable. Remember, to achieve a 10-minute running average, for every block that takes 1 minute to discover there must be another block which took 19 minutes to find. Consider, then, the state of a node’s block template one minute after a block has been found. The rate at which transactions arrive for validation should remain reasonably constant within a 10-minute timeframe (although it will fluctuate on a 24-hour and even weekly or seasonal basis), and so a block template that has only 1 minute’s worth of transactions in it would contain only 1/19th of the transaction fees of a template that was 19 minutes old. Importantly, just as much proof of work is necessary to find the 1-minute block as the 19-minute block.

For the node operator who uses hash as a blunt weapon, this may be something they ignore, running their hash power all day regardless of the fee content as their only goal is to win the most blocks. This strategy works today, while the block subsidy is the vast majority of the block reward, but over time the operator must evolve or the blunt force weapon will find itself swinging against nothing, losing energy to enemies who are faster, lighter and who have a strategy.
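To make the 1-minute versus 19-minute comparison concrete, here is a minimal sketch. It assumes fees arrive at a constant rate, as described above. The function names and the sample numbers are illustrative only and are not taken from any real node software.

```python
def block_reward(age_minutes: float, fee_rate_per_minute: float,
                 subsidy: float = 0.0) -> float:
    """Reward of a template: fixed subsidy plus fees accrued since the last block.

    Assumes fees accumulate linearly with template age (a simplification).
    """
    return subsidy + age_minutes * fee_rate_per_minute


def young_to_old_ratio(fee_rate_per_minute: float, subsidy: float = 0.0,
                       young_minutes: float = 1.0,
                       old_minutes: float = 19.0) -> float:
    """How much a young template pays relative to an old one."""
    young = block_reward(young_minutes, fee_rate_per_minute, subsidy)
    old = block_reward(old_minutes, fee_rate_per_minute, subsidy)
    return young / old


# Fees only: the 1-minute template pays 1/19th of the 19-minute one.
fees_only = young_to_old_ratio(fee_rate_per_minute=1.0)

# Subsidy-dominated (hypothetical numbers): template age barely matters,
# which is why the blunt-force strategy still works today.
subsidy_heavy = young_to_old_ratio(fee_rate_per_minute=1.0, subsidy=1000.0)
```

With a large subsidy the ratio stays close to 1, so hashing continuously costs little. With fees alone it drops to 1/19, and the timing of hash application starts to matter.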

In future, nodes will publish each iteration of their own block template to a pool. From that pool, hash operators will pick up the header and check the reward against the difficulty and determine whether that block is profitable for them to mine. The key thing here is that every hash machine operates in different circumstances. They cost different amounts to purchase and install. They use different amounts of energy to process hashes and the energy itself costs a different amount, and may even fluctuate at very short (< 5 minute) intervals.

There is more to this, but that’s just a small example of the considerations. In this way it becomes up to every individual hash machine operator to evaluate for themselves whether or not the latest block template would be profitable for them to mine. The equation is quite simple. The block contains a reward and a difficulty. This means that the reward per hash is reward over difficulty and if this amount is greater than their instantaneous cost per hash, the template is profitable. Turn the machine on and go! Easy right? Not so fast.
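The profitability test just described can be sketched as follows. The text's "reward over difficulty" is treated here as shorthand: the sketch assumes the usual Bitcoin convention that a block at difficulty D takes about D × 2^32 hashes on average. The machine parameters (power draw, hash rate, energy price) are hypothetical inputs that each operator would supply for themselves.

```python
HASHES_PER_DIFFICULTY_UNIT = 2 ** 32  # assumed: standard Bitcoin convention
JOULES_PER_KWH = 3.6e6


def expected_reward_per_hash(reward_sats: float, difficulty: float) -> float:
    """Expected satoshis earned per hash attempted on this template."""
    return reward_sats / (difficulty * HASHES_PER_DIFFICULTY_UNIT)


def cost_per_hash(power_watts: float, hashrate_hps: float,
                  energy_price_sats_per_kwh: float) -> float:
    """Instantaneous energy cost (in satoshis) of one hash on this machine."""
    joules_per_hash = power_watts / hashrate_hps
    return joules_per_hash / JOULES_PER_KWH * energy_price_sats_per_kwh


def is_profitable(reward_sats: float, difficulty: float, power_watts: float,
                  hashrate_hps: float, energy_price_sats_per_kwh: float) -> bool:
    """True when the template pays more per hash than the machine costs to run."""
    earned = expected_reward_per_hash(reward_sats, difficulty)
    spent = cost_per_hash(power_watts, hashrate_hps, energy_price_sats_per_kwh)
    return earned > spent
```

Because the energy price is an input, the same template can be profitable for one operator and unprofitable for another at the same instant. Hardware purchase and installation costs are deliberately left out of this sketch.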

While the node that wins the richest blocks is probably making profit for its hash partners, it is not really the node itself which is determining the richness of the blocks it competes for, but the hash operators who have determined that this node’s template represented, at that time, their most profitable option per unit of hash power applied. To this end, I foresee that a hash power operator might receive templates from a multitude of competing nodes and will be able to evaluate each one on a series of criteria from the block profitability, payment per share, node connectivity and more, and will choose where to point their workers on an instantaneous basis.

At this point it could then be argued that the node that wins the richest blocks is simply a poorly connected node whose owner has a focus on low cost operation and which aggregates the transactions nobody else wants to touch. This node probably gets increased amounts of hash power when they have managed to collect huge numbers of low-cost transactions and their block reward is too much of a prize to ignore even if there is a very high chance of blocks being orphaned. These higher orphan rates would skew the node’s results towards the blocks that are larger and more profitable to hash against but show the astute hash operator that their mid-priced blocks are potentially a risky proposition.

As hash operators build tools that reflect their incentives to dynamically route their effort across the network, they will be the ones who determine which block templates are rich enough to mine, meaning the node with the richest blocks is possibly not a good yardstick for us to measure the best node on the network.
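One way such a routing tool might weigh competing templates is sketched below. This is an illustration under stated assumptions, not a description of any existing pool software. The `TemplateOffer` fields, the idea of discounting the reward by an estimated orphan rate, and the 2^32 hashes-per-difficulty factor are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TemplateOffer:
    """A block template advertised by a node (hypothetical structure)."""
    node: str
    reward_sats: float
    orphan_rate: float  # operator's own estimate, 0.0 to 1.0


def risk_adjusted_reward(offer: TemplateOffer) -> float:
    """Reward discounted by the chance the block is lost to an orphan race."""
    return offer.reward_sats * (1.0 - offer.orphan_rate)


def pick_template(offers: Sequence[TemplateOffer], difficulty: float,
                  cost_per_hash_sats: float) -> Optional[TemplateOffer]:
    """Return the best profitable offer, or None if nothing is worth mining."""
    expected_hashes = difficulty * 2 ** 32  # assumed Bitcoin convention
    profitable = [
        o for o in offers
        if risk_adjusted_reward(o) / expected_hashes > cost_per_hash_sats
    ]
    return max(profitable, key=risk_adjusted_reward, default=None)


offers = [
    TemplateOffer("rich-but-flaky", reward_sats=100.0, orphan_rate=0.50),
    TemplateOffer("modest-but-reliable", reward_sats=80.0, orphan_rate=0.05),
]
best = pick_template(offers, difficulty=1.0, cost_per_hash_sats=1e-9)
```

In this toy case the richer template loses to the more reliable one once orphan risk is priced in. A real tool would also weigh payment per share and connectivity, as noted above.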

This brings us to what I believe represents the true ‘best node’ and lays out the development agenda for node operators who want to compete in the space long term.

The node that has the fewest blocks orphaned

Orphan blocks are a natural outcome of block discovery on the Bitcoin network and occur when two or more nodes find competing block solutions at or around the same time.

When this happens, we get what is called an ‘Orphan Race’ which is the competition between the competing nodes to push their block across the network and have it validated first by the largest percentage of the other nodes. If the majority of the network judges it to be their ‘first seen’ block it will have the best chance of having the next block in the chain built upon it which is the first requirement in becoming an immutable part of the longest chain of proof of work.

The best way to lose the fewest orphan races is to be involved in the fewest possible orphan races. This is achieved through a two-fold strategy:

  1. Have the least possible energy spent finding solutions to blocks that compete with your own
  2. Spend the least possible energy finding solutions that compete with blocks found by others

To make sure the least possible energy is spent finding a competing block to one they have discovered, a node has to ensure that their block is seen and validated by the largest possible number of nodes. For this the node operator must have excellent connectivity to the other nodes in the network. This doesn’t just mean fat pipes with high speed and low latency, although they are part of the solution.

The node must work to ensure that as many of the other nodes as possible are aware of all of the transactions in their block. For every transaction another node doesn’t know about, there is a delay while that node requests the transaction, gets a response and performs a validation.

If the node has hundreds or even thousands of transactions that other nodes do not, this can have a disastrous impact on the node’s ability to win blocks due to the long delays involved in processing their blocks whenever they are found. This means choosing transactions that other nodes will validate and rejecting those they won’t, since the latter add delays to the block being accepted by the majority of the network.
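The cost of unknown transactions can be illustrated with a small sketch. It assumes a simplified model in which a receiving node makes one request round trip for everything it is missing and then validates each missing transaction in turn. The timing parameters are hypothetical and would differ widely between real nodes.

```python
from typing import Iterable, List, Set


def missing_transactions(block_txids: Iterable[str],
                         known_txids: Set[str]) -> List[str]:
    """Transactions in the block that the receiving node has never seen."""
    return [txid for txid in block_txids if txid not in known_txids]


def estimated_acceptance_delay(block_txids: Iterable[str],
                               known_txids: Set[str],
                               round_trip_s: float,
                               validate_s_per_tx: float) -> float:
    """Extra seconds before the receiver can accept the block (toy model)."""
    missing = missing_transactions(block_txids, known_txids)
    if not missing:
        return 0.0  # everything was already seen and validated
    return round_trip_s + len(missing) * validate_s_per_tx


block = ["a", "b", "c"]
delay_partial = estimated_acceptance_delay(block, {"a"}, 0.1, 0.001)
delay_full = estimated_acceptance_delay(block, {"a", "b", "c"}, 0.1, 0.001)
```

The delay grows with the number of transactions the peer has not seen, which is the window in which a competing block can overtake.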

For a real-world example of this, look no further than Mempool’s experiments with WeatherSV transactions. Initially, Mempool was mining the weather transactions for 0.2 satoshis per byte and propagating those transactions across the network. However, every other node was ignoring them, causing huge delays in validation every time Mempool found a block.

The situation was so bad initially that Mempool went from being a reliable 10% miner, finding between 12 and 17 blocks per day, to finding zero blocks in the first 24 hours of the trial. After several attempts to finesse the situation, Mempool landed on the strategy of limiting the WeatherSV transactions to 4MB per block. Eventually, when the rest of the network began accepting and propagating transactions at 0.25 satoshis per byte, WeatherSV simply paid the premium to remove the friction their low fee deal was causing Mempool.

The second part of having the fewest competing blocks discovered after your own block is found is making sure that the most possible nodes are aware of the contents and structure of your block as you build it such that their validation process is as fast as possible. While it isn’t something many (or maybe any) nodes do yet, streaming block templates are a planned feature of the upcoming Teranode system, and one which I think will generate some of the most visible costly signalling on the network.

Nodes will have the ability to directly inform other nodes of not just which transactions are in their block, but the order in which they have been placed into the Merkle tree as well. This means that when a nonce is found that solves the difficulty problem, other nodes which already know 99.9% of the information needed to validate the block only need to process the remaining 0.1%. Against a node who does not use this system and who must propagate not just their solution, but the complete order of all the transactions in their block (and any whole transactions that other miners don’t have) and then wait while those nodes build their block to make sure it is valid, the connected node which is using the streaming block templates will almost always prevail.
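No wire format for streamed templates is given here, so the sketch below shows only why transaction order matters. It uses Bitcoin's Merkle construction (double SHA-256, with the last hash duplicated on odd levels). The helper `header_matches_template` is a hypothetical name for the check a peer could run once a solution is announced.

```python
import hashlib
from typing import Sequence


def dsha(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids: Sequence[bytes]) -> bytes:
    """Merkle root over txids in the given order (order changes the root)."""
    if not txids:
        raise ValueError("a block needs at least one transaction")
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:            # odd count: duplicate the last hash
            level.append(level[-1])
        level = [dsha(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def header_matches_template(announced_root: bytes,
                            streamed_txids: Sequence[bytes]) -> bool:
    """Check an announced block against a template received ahead of time."""
    return merkle_root(streamed_txids) == announced_root


a, b, c = dsha(b"tx-a"), dsha(b"tx-b"), dsha(b"tx-c")
root_abc = merkle_root([a, b, c])
```

A peer that already holds the ordered list can compute the root before the nonce is found. When the solution arrives, what remains is a comparison and a header check rather than a full rebuild.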

Imagine a node which finds a solution to a 1 billion transaction block that takes other nodes around 30 seconds to validate because they don’t have the template. At any moment during those 30 seconds another node whose template the receiver was already aware of could step in and steal its thunder. With a 10-minute average block interval, that’s roughly a 5% chance of being orphaned every single time a block is found. If that node controls 10% of the network hash, or about 14 blocks per day, that represents one lost block almost every day versus a node which is telling everyone exactly what they are doing.
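The 5% figure is 30 seconds divided by the 600-second block interval. A slightly more careful estimate, sketched below, assumes blocks arrive as a Poisson process and that any competing block found inside the validation window wins the race. Both are simplifying assumptions, and the function names are illustrative.

```python
import math


def orphan_risk(validation_delay_s: float,
                block_interval_s: float = 600.0) -> float:
    """Chance a competing block appears while peers are still validating.

    Assumes exponentially distributed block times (Poisson arrivals).
    """
    return 1.0 - math.exp(-validation_delay_s / block_interval_s)


def expected_lost_blocks_per_day(hash_share: float,
                                 validation_delay_s: float,
                                 block_interval_s: float = 600.0) -> float:
    """Blocks per day a node would expect to lose to orphan races."""
    blocks_per_day = 86_400.0 / block_interval_s
    blocks_found = hash_share * blocks_per_day
    return blocks_found * orphan_risk(validation_delay_s, block_interval_s)


risk = orphan_risk(30.0)                          # a 30-second window
lost = expected_lost_blocks_per_day(0.10, 30.0)   # node with 10% of the hash
```

Under these assumptions the risk comes out just under 5% per block and the loss at around 0.7 blocks per day, in line with "almost every day".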

Finally, in order to have the fastest possible awareness of other blocks being found, the node must also have the most knowledge of what the other nodes’ block templates are. Like the node who pushes their block template out to the most possible peers, the node who retains the most knowledge of its peers has the ability to validate almost any solution they are presented with in a very short period of time, giving them the ability to update their block template to one that excludes any of the transactions in the block that was just found, ensuring that the free agents providing them hash power have the lowest possible chance of hashing against a block template that has been superseded.

So, one can see that in a world where nodes rely on fees and blocks are very large, proof of work is not the costly signal that nodes display but the reward a node is given for being the most well connected and having the best knowledge of other nodes on the network. A node that wins a lot of blocks might only win them because they pay for the hash themselves, and might lose a lot of orphan races with better connected nodes.

Similarly, a node that wins fewer blocks but which are very rich in fees might not only be badly connected but also unable to pay for its own hash to mask the fact. They still win blocks because their strategy is to generate the largest possible fee pools and win hash for the big payouts.

It is the node whose infrastructure is best connected, and which has the best knowledge of what its competitors are doing at any given moment, that will receive my vote, because it is the node that loses the fewest blocks, and it is in the lack of losses that its hash providers find the most gains. Because of the lower risk, its hash providers will determine that its block is profitable to mine earlier than they would for the node that seeks outsized rewards. It will therefore naturally find fewer of the blocks carrying huge rewards and more blocks that carry reasonable rewards but a very low chance of being lost to an orphan race.

So it is the node’s connectivity and ability to compute the activity of others in real-time which become the costly signals to display. Hash power is the reward a node gets for having the best fitness, and the least wasted proof of work is the prize the hash operators get for choosing the node with the best array of costly signals.

Streaming block templates will be computationally intensive and expensive to run. For example, if a node shares information with 10 other nodes, and each block has an average of 1,000,000,000 transactions in it, that node would need a Merkle tree of around 64GB for each node they are indexing, which they would need to be updating and managing constantly.
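The 64GB figure can be checked with a short calculation. The sketch assumes 32-byte SHA-256 hashes and counts every stored node in a binary Merkle tree, leaves included. It ignores any indexing overhead a real implementation would add.

```python
HASH_BYTES = 32  # SHA-256 digest size


def merkle_node_count(leaf_count: int) -> int:
    """Total stored hashes in a binary Merkle tree with this many leaves."""
    if leaf_count <= 0:
        return 0
    total = 0
    level = leaf_count
    while level > 1:
        total += level
        level = (level + 1) // 2   # each parent covers up to two children
    return total + 1               # plus the root


def merkle_tree_bytes(leaf_count: int) -> int:
    """Raw storage for all hashes in the tree."""
    return merkle_node_count(leaf_count) * HASH_BYTES


per_peer = merkle_tree_bytes(1_000_000_000)   # roughly 64 GB
ten_peers = 10 * per_peer                     # roughly 640 GB
```

A billion leaves give about two billion hashes, hence roughly 64GB per tracked peer. For the 10 peers in the example that is on the order of 640GB of constantly changing state.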

Streaming block templates may prove one of the most interesting and challenging computing problems to be solved in Bitcoin, but the node operators who master it will be rewarded with less wasted work and the gravitation of a larger proportion of the free agent hash rate thanks to the costly signals they are demonstrating.

Yeah but what does this have to do with other applications of Proof of Work?

Well I would say this has everything to do with other applications of proof of work, as it shows that they have been designed with an incomplete understanding of what costly signals are in the block competition as it operates in Bitcoin.

Their whole basis is a misunderstanding of proof of work as a costly signal rather than as the reward received for displaying costly signals, and they carry this misunderstanding forward into these other applications of proof of work.

Proof of work works in the process of building blocks because it is cumulative. Every single block added to the longest valid chain is a demonstration that the node which built it deemed the previous block to have been the first valid extension it saw on the previous solution. It implies that the node agrees that the previous block was mathematically correct, legally just and had its own proof of work in-line with the requirements of the network at the time.

While the block subsidy is in play, the network will continue to be defined by the race to gather the most hash power. As the race transitions to larger blocks, the incentives will transition to the construction of faster networks, storage and computing systems to enable the real-time systems needed. These are the costly signals on display.

Proof of work comes as a reward from outside the node and its application is dynamic and cut-throat. A node whose blocks are orphaned too often due to poor connectivity will lose hash power quickly. A node whose blocks are rejected due to bad assembly or illegal transactions will lose hash power even faster. These outcomes incentivise nodes to treat their hash providers with respect and to work hard to prove that they are the most worthy node in the network.

The application of proof of work to random information has none of this intricacy. It is a blunt weapon being used in an information war where knowledge of user preferences and strong relationships are the order of the day. Why would I care about a piece of information that ranked highly because someone had spent $10,000 building a big number against it? What can I build upon that big number? What does this big number secure me against?

The Bitcoin block chain is secured through the accumulation of proof of work which makes it very hard to attack the chain. It works because it is used to validate a singular data structure: the chain of blocks itself. If I apply it to a picture or a meme, what exactly am I signalling? If you show me a meme that has had proof of work applied to it why should I care? Not only does this proof of work not build on other work, but there is no way to build upon it. So what purpose does it serve? I may be a contrarian in this space, but as far as I can tell, all it does is line the pockets of the miners who convince people it’s important while giving everyone else a confusing indication of who was willing to burn the most money for attention.

I, for one, don’t care to watch a money fire. Where in all of this is the reward for me? When I see information, I want it to be ranked based on MY preferences. Based on what I want to see and where I want to put my attention. I don’t care about magic numbers and I don’t see where they add value to my life. If you want my attention, deliver information I can use, not magic numbers you paid for.
