Introduction

In this tutorial we will look at the scripting language used by Bitcoin. Bitcoin Script is a simple, Forth-like, stack-based language, which in simple terms means that it operates on a stack data structure following the first-in-last-out (FILO) principle.

Background

Bitcoin is a mammoth project made up of many concepts. Breaking these down into smaller chunks, a separation-of-concerns approach, gives us a better understanding of how the internals work without becoming overwhelmed.

Let's get started  

For simplicity, we will evaluate our script puzzles using a tool called btcdeb, or the Bitcoin Script Debugger as its author kallewoof refers to it. We are currently working on adding btcdeb to our online command line interface sandboxing environment; in the meantime, you will have to install it yourself.

By using btcdeb, we are able to separate ourselves from having to think about all the other components related to bitcoin and focus directly on the fundamental concepts behind learning bitcoin script. We will cover a more advanced version of using btcdeb for debugging more complicated examples in a future tutorial.

So let's execute our first bitcoin script!

gr0kchain:~ $ btcdeb ['OP_2 OP_1 OP_ADD']
btcdeb -- type `btcdeb -h` for start up options
valid script
3 op script loaded. type `help` for usage information
script  |  stack
--------+--------
2       |
1       |
OP_ADD  |
#0000 2
btcdeb> step
		<> PUSH stack 02
script  |  stack
--------+--------
1       |      02
OP_ADD  |
#0001 1
btcdeb> step
		<> PUSH stack 01
script  |  stack
--------+--------
OP_ADD  |      01
        |      02
#0002 OP_ADD
btcdeb> step
		<> POP  stack
		<> POP  stack
		<> PUSH stack 03
script  |  stack
--------+--------
        |      03
btcdeb>

In this example, we invoke the btcdeb command from the command line interface, passing the following script as its first argument, and then use the step command to execute it one operation at a time.

OP_2 OP_1 OP_ADD

This is a fairly straightforward arithmetic operation which adds 2 and 1 together and evaluates to 3. The first thing you might notice is the strange order in which it is written, with the operator coming after its operands. This should become clearer once we dive into how Bitcoin Script is interpreted. Let's visualise what this means.

Imagine a stack of books, one placed on top of another as follows.

As we can see here, stacking these on top of one another gives us what we call a first-in-last-out stack, meaning that books are removed from the stack in the reverse order to which they were added. Adding a book is typically referred to as pushing an item onto the stack. Removing a book takes the top book (the last one added) first, so the last item to be removed is the book at the bottom of the stack. We call this process popping items from the stack.
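The push and pop behaviour described above can be sketched with a plain Python list. This is only an illustration of the data structure, and the book colours are placeholder values: `append` pushes onto the top of the stack and `pop` removes the most recently added item.

```python
# A Python list used as a stack: the end of the list is the top.
books = []
books.append("green")   # push: first book, ends up at the bottom
books.append("blue")    # push
books.append("purple")  # push: last book, ends up on top

# Popping three times removes the books in reverse order.
removed = [books.pop() for _ in range(3)]
print(removed)  # ['purple', 'blue', 'green']
```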

Another thing to observe when looking at our original script is the values prefixed with the term OP_. These are what we refer to as opcodes, or operation codes, in Bitcoin Script. In the context of our stack of books, operation codes can be described by assigning a meaning to each colour of book in our stack. Let's imagine we assign values to our books as follows:

Green = 1

Blue = 2

Purple = addition (+)

We now have a simple vocabulary for doing some simple addition using a stack! When we stack our books in the following order we end up with something resembling the following.

Green
Blue
Purple

When we translate this using our vocabulary we have the following stack.

1
2
+

For us to evaluate this however, we'll need an additional stack (another stack of books) which can be used to execute this expression step by step. We'll call these our script stack and execution stack respectively.

Script Stack | Execution Stack
-------------+----------------
1            |
2            |
+            | <Empty>

We can now start moving items from one stack to the other.

First we pop an item from the Script Stack and then push it onto our Execution Stack. So as the first step, we pop the value 1 from our script stack and push it onto the execution stack as follows.

Script Stack | Execution Stack
-------------+----------------
2            |
+            | 1

We then pop the value 2 from our script stack and push this onto the execution stack.

Script Stack | Execution Stack
-------------+----------------
             | 2
+            | 1

And finally we pop the value + from our script stack and push it onto our execution stack.

Script Stack | Execution Stack
-------------+----------------
             | +
             | 2
<Empty>      | 1

Neat! We now have an inverse of our original stack! This is the basic principle behind stacks, or stack-based data structures. Let's take this one step further, however, and separate our vocabulary into operational and numeral types.

Green = 1 (numeral) - When encountered, pop it from the script stack and push it onto the execution stack.

Blue = 2 (numeral) - When encountered, pop it from the script stack and push it onto the execution stack.

Purple = addition (+, operational) - When encountered, pop two items from the execution stack, add them together, then push the result back onto the execution stack.

As before, we can now repeat the previous process as follows.

First we pop an item from the script stack and then push it onto our Execution Stack. So as the first item, we pop the value 1 from our script stack and push it onto the execution stack as follows.

Script Stack | Execution Stack
-------------+----------------
2            |
+            | 1

We then pop the value 2 from our script stack and push this onto the execution stack.

Script Stack | Execution Stack
-------------+----------------
             | 2
+            | 1

Based on the new rules we added to our vocabulary, whenever we encounter the + (Purple) book, we now need to pop two items from the execution stack, add them together, and push the result back onto the execution stack. This would result in the following.

First we pop the + from the script stack. Our rules then indicate that we pop the top two elements from the execution stack, add them together, and push the result back onto the execution stack.

Script Stack | Execution Stack
-------------+----------------
<Empty>      | 3

It's that simple :D
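The book rules above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not how Bitcoin itself is implemented: the function name `evaluate` is hypothetical, numerals are plain integers, and the string "+" stands in for the Purple book.

```python
def evaluate(script):
    """Evaluate tokens where integers are numerals and '+' is addition."""
    # Reverse the script so that the first token sits on top of the
    # script stack (the end of the list is the top).
    script_stack = list(reversed(script))
    execution_stack = []
    while script_stack:
        item = script_stack.pop()
        if item == "+":
            # Operational: pop two items, add them, push the result.
            right = execution_stack.pop()
            left = execution_stack.pop()
            execution_stack.append(left + right)
        else:
            # Numeral: move it straight onto the execution stack.
            execution_stack.append(item)
    return execution_stack

print(evaluate([1, 2, "+"]))  # [3]
```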

Now let's go back to our original bitcoin script.

OP_2 OP_1 OP_ADD

Here our vocabulary of operations consists of OP_2, OP_1 and OP_ADD, which map onto our previous book example as follows.

Green = OP_1

Blue = OP_2

Purple = OP_ADD (addition)

Using the same stack-based operations we covered before, we can use this to do some simple arithmetic. Bitcoin Script defines a list of opcodes for more advanced operations, categorised into constants, flow control, stack, splice, bitwise logic, arithmetic, crypto, locktime, pseudo-words and reserved words, each with their own rules. For a list of these, check out the Bitcoin wiki.

Back to btcdeb

In the introduction of this tutorial we said that Bitcoin Script is a simple, Forth-like, stack-based language that operates using a first-in-last-out (FILO) principle, which hopefully makes sense now based on our previous examples.

Now that we have the basics covered, let's use btcdeb to explore some more examples.

OP_6 OP_2 OP_SUB OP_4 OP_EQUAL

Here we will be subtracting 2 from 6, then testing our result to see if it equals 4. To run this in btcdeb, we pass the script we would like to execute as the first argument.

gr0kchain:~ $ btcdeb ['OP_6 OP_2 OP_SUB OP_4 OP_EQUAL']
Note:
You can also load the script using the exec command after invoking btcdeb.

You should be presented with the following output.

btcdeb -- type `btcdeb -h` for start up options
valid script
5 op script loaded. type `help` for usage information
script   |  stack
---------+--------
6        |
2        |
OP_SUB   |
4        |
OP_EQUAL |
#0000 6
btcdeb>

Here we can see that two stacks have been created for us, namely our script stack and our (execution) stack. Our Bitcoin script was pushed onto the script stack in the reverse of the order in which it was written: OP_EQUAL is added first, followed by OP_4, OP_SUB, OP_2 and finally OP_6.

To start evaluating our script, we use the step command in btcdeb.

```console
btcdeb> step
		<> PUSH stack 06
script   |  stack
---------+--------
2        |      06
OP_SUB   |
4        |
OP_EQUAL |
#0001 2
btcdeb>
```
Hint:
You can use the rewind command to step backwards.

Here we see the rules of our opcodes kicking in, where the top element of our script column is being popped, and then pushed onto the stack column. Let's continue by executing the next step in the process.

```console
btcdeb> step
		<> PUSH stack 02
script   |  stack
---------+--------
OP_SUB   |      02
4        |      06
OP_EQUAL |
#0002 OP_SUB
btcdeb>
```

As before, OP_2 was popped from the script stack and pushed onto the stack.

```console
btcdeb> step
		<> POP  stack
		<> POP  stack
		<> PUSH stack 04
script   |  stack
---------+--------
4        |      04
OP_EQUAL |
#0003 4
btcdeb>
```

Here OP_SUB was popped from the script stack, and the values 2 and 6 were popped from the stack, subtracted as 6 - 2, and the result pushed back onto our stack as 04.

In the next operation, we pop the value 4 from the script stack and push it onto the stack.
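The operand order matters for subtraction: the value on top of the stack is subtracted from the value beneath it. A minimal Python sketch of this single step, using a plain list as the stack (the variable names are ours, for illustration only):

```python
stack = [6, 2]        # 6 was pushed first, so 2 is on top

top = stack.pop()     # 2, the value pushed last
below = stack.pop()   # 6, the value pushed first
stack.append(below - top)

print(stack)  # [4]
```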

```console
btcdeb> step
		<> PUSH stack 04
script   |  stack
---------+--------
OP_EQUAL |      04
         |      04
#0004 OP_EQUAL
btcdeb>
```

And finally, we check whether or not our arithmetic was correct by comparing the top two items on our stack using the OP_EQUAL opcode. A Bitcoin script is considered to have succeeded if the top element on the stack is true (non-zero) once execution finishes. We will look into this in future tutorials, but for now, consider this as having met the conditions for a valid transaction!

```console
btcdeb> step
		<> POP  stack
		<> POP  stack
		<> PUSH stack 01
script   |  stack
---------+--------
         |      01
btcdeb>
```
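To tie the walkthrough together, here is a small Python sketch that replays the same script and records the stack after every step. It is a simplified model under stated assumptions, not btcdeb's or Bitcoin's implementation: the function name `run` is hypothetical, values are plain integers rather than byte arrays, and only the opcodes used in this tutorial are supported.

```python
def run(script):
    """Run a space-separated script; return (opcode, stack) after each step."""
    stack = []    # the end of the list is the top of the stack
    history = []
    for op in script.split():
        if op in ("OP_ADD", "OP_SUB", "OP_EQUAL"):
            b = stack.pop()   # top item
            a = stack.pop()   # item beneath it
            if op == "OP_ADD":
                stack.append(a + b)
            elif op == "OP_SUB":
                stack.append(a - b)
            else:
                stack.append(1 if a == b else 0)
        elif op[3:].isdigit() and 1 <= int(op[3:]) <= 16:
            stack.append(int(op[3:]))   # OP_1 .. OP_16 push their number
        else:
            raise ValueError(f"unsupported opcode: {op}")
        history.append((op, list(stack)))
    return history

history = run("OP_6 OP_2 OP_SUB OP_4 OP_EQUAL")
for op, stack in history:
    print(f"{op:<9} {stack}")

# The script succeeds when the top stack item is non-zero at the end.
succeeded = history[-1][1][-1] != 0
print(succeeded)  # True
```

Each printed line corresponds to one step command in the btcdeb session above, with the top of the stack shown as the last element of the list.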

Now that you have the basics, try playing around with some additional opcodes to familiarise yourself with the concepts behind them. Also take a look at some of the interesting script puzzles out there, like those Bitcoin Core developer Peter Todd has published.

For more information on using btcdeb, check out the GitHub repo, or run it with the help option from the command line.

gr0kchain:~ $ btcdeb --help
syntax: btcdeb [-q|--quiet] [--tx=[amount1,amount2,..:]<hex> [--txin=<hex>] [--modify-flags=<flags>|-f<flags>] [--select=<index>|-s<index>] [<script> [<stack bottom item> [... [<stack top item>]]]]]
if executed with no arguments, an empty script and empty stack is provided
to debug transaction signatures, you need to provide the transaction hex (the WHOLE hex, not just the txid) as well as (SegWit only) every amount for the inputs
e.g. if a SegWit transaction abc123... has 2 inputs of 0.1 btc and 0.002 btc, you would do tx=0.1,0.002:abc123...
you do not need the amounts for non-SegWit transactions
by providing a txin as well as a tx and no script or stack, btcdeb will attempt to set up a debug session for the verification of the given input by pulling the appropriate values out of the respective transactions. you do not need amounts for --tx in this case
you can modify verification flags using the --modify-flags command. separate flags using comma (,). prefix with + to enable, - to disable. e.g. --modify-flags="-NULLDUMMY,-MINIMALIF"
the standard (enabled by default) flags are:
・ P2SH
・ STRICTENC
・ DERSIG
・ LOW_S
・ NULLDUMMY
・ MINIMALDATA
・ DISCOURAGE_UPGRADABLE_NOPS
・ CLEANSTACK
・ CHECKLOCKTIMEVERIFY
・ CHECKSEQUENCEVERIFY
・ WITNESS
・ DISCOURAGE_UPGRADABLE_WITNESS_PROGRAM
・ MINIMALIF
・ NULLFAIL
・ WITNESS_PUBKEYTYPE

Help is also available from within the interactive btcdeb command line interface (CLI).

gr0kchain:~ $ btcdeb 
btcdeb -- type `btcdeb -h` for start up options
0 op script loaded. type `help` for usage information
script  |  stack
--------+--------
btcdeb> help
step     Execute one instruction and iterate in the script.
rewind   Go back in time one instruction.
stack    Print stack content.
altstack Print altstack content.
vfexec   Print vfexec content.
exec     Execute command.
tf       Transform a value using a given function.
print    Print script.
help     Show help information.
btcdeb>

And some other useful utility functions!

gr0kchain:~ $ btcdeb
btcdeb -- type `btcdeb -h` for start up options
0 op script loaded. type `help` for usage information
script  |  stack
--------+--------
btcdeb> tf -h
echo             [*]       show as-is serialized value
hex              [*]       convert into a hex string
int              [arg]     convert into an integer
reverse          [arg]     reverse the value according to the type
sha256           [message] perform SHA256
ripemd160        [message] perform RIPEMD160
hash256          [message] perform HASH256 (SHA256(SHA256(message))
hash160          [message] perform HASH160 (RIPEMD160(SHA256(message))
base58chk-encode [pubkey]  encode [pubkey] using base58 encoding (with checksum)
base58chk-decode [string]  decode [string] into a pubkey using base58 encoding (with checksum)
bech32-encode    [pubkey]  encode [pubkey] using bech32 encoding
bech32-decode    [string]  decode [string] into a pubkey using bech32 encoding
verify-sig       [sighash] [pubkey] [signature] verify the given signature for the given sighash and pubkey
combine-pubkeys  [pubkey1] [pubkey2] combine the two pubkeys into one pubkey
tweak-pubkey     [value] [pubkey] multiply the pubkey with the given 32 byte value
addr-to-scriptpubkey [address] convert a base58 encoded address into its corresponding scriptPubKey
scriptpubkey-to-addr [script]  convert a scriptPubKey into its corresponding base58 encoded address
add              [value1] [value2] add two values together
sub              [value1] [value2] subtract value2 from value1
btcdeb>
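The tf help text above describes hash256 as SHA256 applied twice. As a rough illustration of that definition only, here is a Python sketch using the standard hashlib module. It does not reproduce how btcdeb parses its arguments, and the function name `hash256` is simply borrowed from the help text.

```python
import hashlib

def hash256(message: bytes) -> bytes:
    """Return SHA256(SHA256(message)), as described in the tf help text."""
    first_round = hashlib.sha256(message).digest()
    return hashlib.sha256(first_round).digest()

digest = hash256(b"hello")
print(digest.hex())
```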

For convenience, I've also put together a reference sheet of opcodes. Note that not all of these are enabled or supported in btcdeb.

Dec    Hex  Opcode
--------------------------
   0   0x0  OP_0 
  76  0x4c  OP_PUSHDATA1 
  77  0x4d  OP_PUSHDATA2 
  78  0x4e  OP_PUSHDATA4 
  79  0x4f  OP_1NEGATE 
  80  0x50  OP_RESERVED 
  81  0x51  OP_1 
  82  0x52  OP_2 
  83  0x53  OP_3 
  84  0x54  OP_4 
  85  0x55  OP_5 
  86  0x56  OP_6 
  87  0x57  OP_7 
  88  0x58  OP_8 
  89  0x59  OP_9 
  90  0x5a  OP_10 
  91  0x5b  OP_11 
  92  0x5c  OP_12 
  93  0x5d  OP_13 
  94  0x5e  OP_14 
  95  0x5f  OP_15 
  96  0x60  OP_16 
  97  0x61  OP_NOP 
  98  0x62  OP_VER 
  99  0x63  OP_IF 
 100  0x64  OP_NOTIF 
 101  0x65  OP_VERIF 
 102  0x66  OP_VERNOTIF 
 103  0x67  OP_ELSE 
 104  0x68  OP_ENDIF 
 105  0x69  OP_VERIFY 
 106  0x6a  OP_RETURN 
 107  0x6b  OP_TOALTSTACK 
 108  0x6c  OP_FROMALTSTACK 
 109  0x6d  OP_2DROP 
 110  0x6e  OP_2DUP 
 111  0x6f  OP_3DUP 
 112  0x70  OP_2OVER 
 113  0x71  OP_2ROT 
 114  0x72  OP_2SWAP 
 115  0x73  OP_IFDUP 
 116  0x74  OP_DEPTH 
 117  0x75  OP_DROP 
 118  0x76  OP_DUP 
 119  0x77  OP_NIP 
 120  0x78  OP_OVER 
 121  0x79  OP_PICK 
 122  0x7a  OP_ROLL 
 123  0x7b  OP_ROT 
 124  0x7c  OP_SWAP 
 125  0x7d  OP_TUCK 
 126  0x7e  OP_CAT - Disabled
 127  0x7f  OP_SUBSTR - Disabled
 128  0x80  OP_LEFT - Disabled
 129  0x81  OP_RIGHT - Disabled
 130  0x82  OP_SIZE 
 131  0x83  OP_INVERT - Disabled 
 132  0x84  OP_AND - Disabled
 133  0x85  OP_OR - Disabled
 134  0x86  OP_XOR - Disabled
 135  0x87  OP_EQUAL 
 136  0x88  OP_EQUALVERIFY 
 137  0x89  OP_RESERVED1 
 138  0x8a  OP_RESERVED2 
 139  0x8b  OP_1ADD 
 140  0x8c  OP_1SUB 
 141  0x8d  OP_2MUL - Disabled
 142  0x8e  OP_2DIV - Disabled
 143  0x8f  OP_NEGATE 
 144  0x90  OP_ABS 
 145  0x91  OP_NOT 
 146  0x92  OP_0NOTEQUAL 
 147  0x93  OP_ADD 
 148  0x94  OP_SUB 
 149  0x95  OP_MUL - Disabled
 150  0x96  OP_DIV - Disabled
 151  0x97  OP_MOD - Disabled
 152  0x98  OP_LSHIFT - Disabled
 153  0x99  OP_RSHIFT - Disabled
 154  0x9a  OP_BOOLAND 
 155  0x9b  OP_BOOLOR 
 156  0x9c  OP_NUMEQUAL 
 157  0x9d  OP_NUMEQUALVERIFY 
 158  0x9e  OP_NUMNOTEQUAL 
 159  0x9f  OP_LESSTHAN 
 160  0xa0  OP_GREATERTHAN 
 161  0xa1  OP_LESSTHANOREQUAL 
 162  0xa2  OP_GREATERTHANOREQUAL 
 163  0xa3  OP_MIN 
 164  0xa4  OP_MAX 
 165  0xa5  OP_WITHIN 
 166  0xa6  OP_RIPEMD160 
 167  0xa7  OP_SHA1 
 168  0xa8  OP_SHA256 
 169  0xa9  OP_HASH160 
 170  0xaa  OP_HASH256 
 171  0xab  OP_CODESEPARATOR 
 172  0xac  OP_CHECKSIG 
 173  0xad  OP_CHECKSIGVERIFY 
 174  0xae  OP_CHECKMULTISIG 
 175  0xaf  OP_CHECKMULTISIGVERIFY 
 176  0xb0  OP_NOP1 
 177  0xb1  OP_CHECKLOCKTIMEVERIFY 
 178  0xb2  OP_CHECKSEQUENCEVERIFY 
 179  0xb3  OP_NOP4 
 180  0xb4  OP_NOP5 
 181  0xb5  OP_NOP6 
 182  0xb6  OP_NOP7 
 183  0xb7  OP_NOP8 
 184  0xb8  OP_NOP9 
 185  0xb9  OP_NOP10 
 255  0xff  OP_INVALIDOPCODE 
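The table shows a regular pattern for the small-integer opcodes: OP_1 to OP_16 occupy 0x51 to 0x60, so the opcode for n is 0x50 + n. The following Python sketch uses that pattern to turn our first script into raw bytes. The helper name `small_int_opcode` is hypothetical and exists only for this illustration.

```python
def small_int_opcode(n: int) -> int:
    """Return the opcode byte that pushes the small integer n (-1 to 16)."""
    if n == 0:
        return 0x00          # OP_0
    if n == -1:
        return 0x4F          # OP_1NEGATE
    if 1 <= n <= 16:
        return 0x50 + n      # OP_1 (0x51) through OP_16 (0x60)
    raise ValueError("no single-byte opcode pushes this value")

OP_ADD = 0x93  # taken from the table above

# OP_2 OP_1 OP_ADD as raw script bytes
script = bytes([small_int_opcode(2), small_int_opcode(1), OP_ADD])
print(script.hex())  # 525193
```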

Happy hacking fellow bitcoiner!

Conclusion

In this tutorial we had a look at the fundamental concepts underpinning Bitcoin's scripting language. We introduced operation codes (opcodes) and saw how they are evaluated using a stack-based data structure.