BSV as a virtual machine

BSV as a virtual machine

This week, we explore some more technical aspects behind Bitcoin BSV, and the lessor known under-the-hood components that are more of an interest to students of computer science. We will talk about compilers, byte code and BSV as a Computer. So if this is your cup of tea, suit up, grab your shield and sword, valiant knight, for we be going to slay dragons today!

When I was in university, there was a special room in the Mathematics building, where the computer science dept was located, with an ominous sign above the door that read: “Abandon all hope all ye who enter here!”

This was, of course, the home of the computer science club… a recluse group of geeks who have an unhealthy penchant for model train sets. In fact, one of the rooms featured a giant model set, which was part of a fourth-year course on real-time operating systems. At the time, I harbored a general aversion towards the denizens who frequented that room, partly because they often spoke in a language that I could not comprehend, though mostly because of the ungodly smell that permeated from its dank interiors.

What I didn’t know at the time, was that the mystical arts being practiced within, would be something that I would become vastly engrossed in, some 25 years later in life… namely, that dark art, of computer language compilers. Why are compilers important? Because BSV is a computer. More specifically, it is a virtual machine. Back in the early days of computing, every CPU had a different instruction set, and compilers had the difficult task of translating high level computer code into a different form of machine code which was specific to the computer hardware on which you wanted the program to run. This was largely the reason that it was such a ‘dark art’ and the expert practitioners in the field were often deemed ‘wizards.’ (And appropriately as such, the low earthly standards of common personal hygiene no longer seemed to concern them).

In recent years, the concept of a virtual machine, that is, a simulated machine, with a common byte code language has been popularized and vastly simplified the task of language compilation. For no longer did compilers need to translate languages to each individual machine code, but just to an interim byte code, which was designed universal enough such that specific virtual machines implementations which processed this universal byte code could be run on top of any specific hardware, eliminating the one too many translational task that compilers needed to support.

The most common success case for the use of a virtual machine can be seen in Java, still the most popular language for enterprise application development. Most recently with projects like LLVM and WebAssembly, the idea that languages should build to an intermediate representation (or IR), which can then be separately compiled to specific hardware or executed directly on a VM has really highlighted the power of this model.

So how does Bitcoin fit into all of this? Well, as mentioned, BSV is a virtual machine itself. The nodes that validate transactions on the blockchain all support an instruction set which is Turing complete—so long as there are no theoretical limits on the script sizes—which means that you can program the network to calculate anything computable. This bitcoin script can be thought of as BSV’s byte code. And programming to it is a simple task of developing compilers which can turn high-level languages into BSVs byte code. Some projects like Scrypt are already making great headway in this space.

However, compared to standard FORTH, BSV VM byte code lacks several key features (OP codes):

  • OP_CALL or the ability to pass execution to subroutines
  • OP_LOOP or the ability to re-execute given blocks of code
  • Identifiers
  • Direct memory allocation/access

Without these seemingly essential features in the language, how can the machine be Turing Complete? Well, suffice it to say, these are questions best posed to experts in computer science, who no doubt would be able to more succinctly answer these questions in terms that would satisfy the most discerning experts. I am not one such person. However, in simple layman terms, each one of these features is but a mere convenience tool rather than an essential requirement for the operation of a general computer. For instance: being able to call to externally defined subroutines (OP_CALL) can be side-stepped by replacing the call with the subroutine code itself inline.

The need for looping (OP_LOOP) can be replaced by just iterating the loop code ad verbatim in the program, with appropriate iteration variables put onto the stack, indeed this is how much recursive looping is implemented under the covers anyway. Without the need to call subroutines, identifiers are not needed, given that all variables can be replaced by their run time values, something veteran FORTH programmers already are accustomed to, as locals are rarely if ever named.

Finally, the need for direct memory access can be avoided if all temporary storage needs can be accommodated by the use of the main and alternate program stack. In fact, of the three essential features of structured programming: Sequencing, Selection, and Repetition, BSV supports the first two natively, and can mimic the third.

Of all the restrictions the last one restricting random memory access would seem to be the most restrictive, as this restricts programs to work within their own temporary sandbox, unable to take advantage of the memory and storage resources of the hardware platform that they may happen to be running on, but this actually is an advantage, if you consider that you want the programs to be run universally across a heterogeneous network of machines from high-powered servers in a data center to lowly Raspberry Pi’s, or even low-powered embedded devices.

Forcing this restriction on a program, means that any and all information that a program needs to access must be supplied to it in the transaction that the program’s UTXO resides in, the transaction which is ‘spending’ it, which in this case would be more appropriate to say ‘executing’ it, which is made possible with smart contracting protocols on BSV such as STAS which allow for reflection, introspection and spending conditions imposed the transaction’s own outputs. As we have seen the success of this computational model, over the last few decades with browsers and Javascript, the next iteration may be with BSV and Bitcoin compute nodes.

As people start to realize that Bitcoin is just a new kind of virtual machine, inevitably the clever ones among us will start to work out how to efficiently program it to do useful things. One of the most promising evolutions in computation models in recent times is that of MapReduce, or what is sometimes called distributed data oriented programming. Basically, it involves a given function or process which is ‘mapped’ over a large dataset, with a way to collect/aggregate the results from multiple parallel executors. The Bitcoin network, with its dynamically adjusting number of nodes and executors, seems well posed to be leveraged as a giant pool of compute task workers all ready and willing to execute code over public datasets, producing results written to the blockchain—for a price, of course.

New to blockchain? Check out CoinGeek’s Blockchain for Beginners section, the ultimate resource guide to learn more about blockchain technology.