Introduction
In this tutorial we will be looking into the scripting language used by bitcoin. Bitcoin script is a simple Forth-like, stack based language, which in simple terms means that it operates on a stack: a first-in-last-out (FILO) data structure.
Background
Bitcoin is a mammoth project consisting of various concepts. By breaking these down into smaller chunks, a separation of concerns approach, we get a better understanding of how the internals work without getting overwhelmed.
Let's get started
For the purpose of simplicity, we will be evaluating our script puzzles using a tool called btcdeb, or the Bitcoin Script Debugger as its author, kallewoof, refers to it. We are currently working on adding btcdeb to our online command line interface sandboxing environment, so in the meantime you will have to install it yourself.
By using btcdeb, we can set aside all the other components related to bitcoin and focus directly on the fundamental concepts behind bitcoin script. We will cover more advanced uses of btcdeb for debugging more complicated examples in a future tutorial.
So let's execute our first bitcoin script!
gr0kchain:~ $ btcdeb ['OP_2 OP_1 OP_ADD']
btcdeb -- type `btcdeb -h` for start up options
valid script
3 op script loaded. type `help` for usage information
script | stack
--------+--------
2 |
1 |
OP_ADD |
#0000 2
btcdeb> step
<> PUSH stack 02
script | stack
--------+--------
1 | 02
OP_ADD |
#0001 1
btcdeb> step
<> PUSH stack 01
script | stack
--------+--------
OP_ADD | 01
| 02
#0002 OP_ADD
btcdeb> step
<> POP stack
<> POP stack
<> PUSH stack 03
script | stack
--------+--------
| 03
btcdeb>
In this example, we invoke the btcdeb command from the command line interface and pass it the following script as its first argument.
OP_2 OP_1 OP_ADD
This is a fairly straightforward arithmetic operation which adds 1 and 2 together, which evaluates to 3. The first thing you might notice is the strange order in which this is written, with the operands first and the operator last. This should become clearer once we dive into the internals of how bitcoin script is interpreted. Let's visualise what this means.
Imagine a stack of books, one placed on top of another as follows.
Stacking these on top of one another gives us what we call a first-in-last-out stack, meaning that books are removed from the stack in the reverse order to how they were added. Adding a book is typically referred to as pushing an item onto the stack. Removing a book takes the top book (the last one added) first, so the last item to be removed will be the book at the bottom of the stack. We call this process popping items from the stack.
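To make pushing and popping concrete, here is a minimal Python sketch that uses a plain list as the stack of books. The book names are placeholders chosen for this illustration only.

```python
# A Python list used as a stack: append() pushes, pop() removes the top item.
books = []
for book in ["first book", "second book", "third book"]:
    books.append(book)  # push: each book lands on top of the previous one

# pop: always takes the book that was added most recently
removed = [books.pop() for _ in range(3)]
print(removed)  # ['third book', 'second book', 'first book']
```

The book that went in first is the last one to come out, which is all that first-in-last-out means.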
Another thing to observe when looking at our original script is the values prefixed with the term OP_. These are what we refer to as opcodes, or operation codes, in bitcoin script. In the context of our stack of books, operation codes can be described by giving each colour of book in our stack a meaning. Let's imagine we assign values to our books as follows:
Green = 1
Blue = 2
Purple = addition (+)
We now have a simple vocabulary for doing some simple addition using a stack! Let's stack our books in the following order.
Green
Blue
Purple
When we translate this using our vocabulary we have the following stack.
1
2
+
To evaluate this, however, we'll need an additional stack (another stack of books) which can be used to execute this expression step by step. We'll call these our script stack and execution stack respectively.
Script Stack | Execution Stack |
1 | |
2 | |
+ | <Empty> |
We can now start moving items from one stack to the other.
First we pop an item from the script stack and then push it onto our execution stack. So as the first step, we pop the value 1 from our script stack and push it onto the execution stack as follows.
Script Stack | Execution Stack |
2 | |
+ | 1 |
We then pop the value 2 from our script stack and push this onto the execution stack.
Script Stack | Execution Stack |
 | 2 |
+ | 1 |
And finally we pop the value + from our script stack and push it onto our execution stack.
Script Stack | Execution Stack |
 | + |
 | 2 |
<Empty> | 1 |
Neat! We now have an inverse of our original stack! This is the basic principle when we refer to stacks, or stack based data structures. Let's take this one step further, however, and separate our vocabulary into operational and numeral types.
Green = 1 (numeral) - When encountered, pop from the script stack and push onto the execution stack.
Blue = 2 (numeral) - When encountered, pop from the script stack and push onto the execution stack.
Purple = addition (+, operational) - When encountered, pop two items from the execution stack, add them together, then push the result back onto the execution stack.
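The three rules above can be sketched in a few lines of Python. This is only an illustration of the book analogy: the names NUMERALS and evaluate are made up for this example and have nothing to do with how bitcoin itself is implemented.

```python
# Hypothetical encoding of the book vocabulary: numerals push a value,
# the Purple book pops two values and pushes their sum.
NUMERALS = {"Green": 1, "Blue": 2}

def evaluate(books):
    # Reverse the list so that pop() hands us the first book first.
    script_stack = list(reversed(books))
    execution_stack = []
    while script_stack:
        book = script_stack.pop()
        if book in NUMERALS:
            execution_stack.append(NUMERALS[book])
        elif book == "Purple":
            b = execution_stack.pop()
            a = execution_stack.pop()
            execution_stack.append(a + b)
        else:
            raise ValueError(f"unknown book colour: {book}")
    return execution_stack

result = evaluate(["Green", "Blue", "Purple"])
print(result)  # [3]
```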
As before, we can now repeat the previous process as follows.
First we pop an item from the script stack and then push it onto our execution stack. So as the first item, we pop the value 1 from our script stack and push it onto the execution stack as follows.
Script Stack | Execution Stack |
2 | |
+ | 1 |
We then pop the value 2 from our script stack and push this onto the execution stack.
Script Stack | Execution Stack |
 | 2 |
+ | 1 |
Based on the new rules we added to our vocabulary, whenever we encounter the + or Purple book, we now need to pop two items from the execution stack, add them together, and push the result back onto the execution stack. This would result in the following.
First we pop the + from the script stack. Our rules then indicate that we pop the top two elements from the execution stack, add them together, and push the result back onto the execution stack.
Script Stack | Execution Stack |
<EMPTY> | 3 |
It's that simple :D
Now let's go back to our original bitcoin script.
OP_2 OP_1 OP_ADD
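Here our vocabulary of operations consists of OP_2, OP_1 and OP_ADD, which can be translated to our previous book example as follows. Note that this time the first book stands for 2 rather than 1. Since we are adding, the result is the same either way.
Green = OP_2
Blue = OP_1
Purple = OP_ADD (addition)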
Using the same stack based operations we covered before, we can use this to do some simple arithmetic. Bitcoin script defines a list of opcodes for more advanced operations, categorised into constants, flow control, stack, splice, bitwise logic, arithmetic, crypto, locktime, pseudo-words and reserved words, each with their own rules. For a list of these, check out the bitcoin wiki.
Back to btcdeb
In the introduction of this tutorial we said that bitcoin script is a simple Forth-like stack based language, basically meaning that it operates using a first-in-last-out (FILO) principle, which hopefully makes sense now based on our previous examples.
Now that we have the basics covered, let's use btcdeb to explore some more examples.
OP_6 OP_2 OP_SUB OP_4 OP_EQUAL
Here we will be subtracting 2 from 6, then testing our result to see if it equals 4. To execute this using btcdeb, we run the following, passing in the script we would like to execute as the first argument.
gr0kchain:~ $ btcdeb ['OP_6 OP_2 OP_SUB OP_4 OP_EQUAL']
Note:
You can also load the script using the exec command after invoking btcdeb.
You should be presented with the following output.
```console
btcdeb -- type `btcdeb -h` for start up options
valid script
5 op script loaded. type `help` for usage information
script | stack
---------+--------
6 |
2 |
OP_SUB |
4 |
OP_EQUAL |
#0000 6
btcdeb>
```
Here we can see that two stacks have been created for us, namely our script and stack (execution). Our bitcoin script was pushed onto the script stack in the reverse of the order in which it was written: OP_EQUAL is added first, followed by OP_4, OP_SUB, OP_2 and finally OP_6, which therefore ends up on top.
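To see why loading the script in reverse means OP_6 comes off first, here is a small Python sketch. The plain list standing in for the script column is an assumption made for illustration, not a description of btcdeb's internals.

```python
script = "OP_6 OP_2 OP_SUB OP_4 OP_EQUAL".split()

# Push the opcodes in reverse: OP_EQUAL goes in first, OP_6 goes in last.
script_stack = []
for op in reversed(script):
    script_stack.append(op)

# Popping now yields the opcodes in the order they were written.
popped = [script_stack.pop() for _ in range(len(script))]
print(popped)  # ['OP_6', 'OP_2', 'OP_SUB', 'OP_4', 'OP_EQUAL']
```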
To start evaluating our script, we use the step command in btcdeb.
```console
btcdeb> step
<> PUSH stack 06
script | stack
---------+--------
2 | 06
OP_SUB |
4 |
OP_EQUAL |
#0001 2
btcdeb>
```
Hint:
You can use the rewind command to step backwards.
Here we see the rules of our opcodes kicking in, where the top element of our script column is popped and then pushed onto the stack column. Let's continue by executing the next step in the process.
```console
btcdeb> step
<> PUSH stack 02
script | stack
---------+--------
OP_SUB | 02
4 | 06
OP_EQUAL |
#0002 OP_SUB
btcdeb>
```
As before, OP_2 was popped from the script stack and pushed onto the stack.
```console
btcdeb> step
<> POP stack
<> POP stack
<> PUSH stack 04
script | stack
---------+--------
4 | 04
OP_EQUAL |
#0003 4
btcdeb>
```
Here OP_SUB was popped from the script stack, and the values 2 and 6 were popped from the stack, subtracted as 6 - 2, and the result pushed back onto our stack as 04, the same value OP_4 pushes.
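Unlike addition, subtraction depends on the order of its operands, so it is worth spelling out which popped value is which. A minimal Python sketch, using plain integers in place of the byte values btcdeb displays:

```python
stack = [6, 2]        # 6 was pushed first, so 2 is on top

b = stack.pop()       # top item: 2
a = stack.pop()       # item below it: 6
stack.append(a - b)   # OP_SUB subtracts the top item from the one below it

print(stack)  # [4]
```

Writing the script as OP_2 OP_6 OP_SUB instead would compute 2 - 6.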
In the next operation, we pop the value 4 from the script stack and push it onto the stack.
```console
btcdeb> step
<> PUSH stack 04
script | stack
---------+--------
OP_EQUAL | 04
| 04
#0004 OP_EQUAL
btcdeb>
```
And finally, we check whether or not our arithmetic was correct by comparing the top two items on our stack using the OP_EQUAL opcode. A bitcoin script is considered to have succeeded if the top element on the stack is true once execution finishes. We will look into this in future tutorials, but for now, consider this as having met the conditions for a valid transaction!
```console
btcdeb> step
<> POP stack
<> POP stack
<> PUSH stack 01
script | stack
---------+--------
| 01
btcdeb>
```
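The 01 left on the stack is what counts as true here. As a rough sketch of how a stack item can be read as a boolean, the Python below assumes the rule used by Bitcoin Core's script interpreter: an item is true if any byte is non-zero, except that a lone sign bit (0x80) in the last byte is treated as negative zero. The function name cast_to_bool is our own.

```python
def cast_to_bool(item: bytes) -> bool:
    """Interpret a stack item as true or false (simplified sketch)."""
    for i, byte in enumerate(item):
        if byte != 0:
            # Negative zero: all zero bytes followed by a final 0x80.
            if i == len(item) - 1 and byte == 0x80:
                return False
            return True
    return False

print(cast_to_bool(bytes([0x01])))  # True
print(cast_to_bool(b""))            # False
```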
Now that you have the basics, try playing around with some additional opcodes to familiarise yourself with the concepts behind them. Also take a look at some of the interesting script puzzles out there, like those bitcoin core developer Peter Todd has published.
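If you would like to experiment away from btcdeb, the walkthrough above can be approximated with a toy interpreter. This is a heavily simplified sketch and not how bitcoin evaluates scripts: it supports only OP_0 to OP_16, OP_ADD, OP_SUB and OP_EQUAL, and it stores plain integers where the real interpreter works with byte arrays. The names run_script and BINARY_OPS are invented for this example.

```python
# Two-operand opcodes: each pops b (top) and a (below), then pushes one result.
BINARY_OPS = {
    "OP_ADD": lambda a, b: a + b,
    "OP_SUB": lambda a, b: a - b,
    "OP_EQUAL": lambda a, b: int(a == b),  # 1 for true, 0 for false
}

def run_script(script):
    """Evaluate a space-separated script and return the final stack."""
    stack = []
    for op in script.split():
        if op.startswith("OP_") and op[3:].isdigit():
            value = int(op[3:])
            if not 0 <= value <= 16:
                raise ValueError(f"no such small-number opcode: {op}")
            stack.append(value)
        elif op in BINARY_OPS:
            b = stack.pop()
            a = stack.pop()
            stack.append(BINARY_OPS[op](a, b))
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op}")
    return stack

print(run_script("OP_2 OP_1 OP_ADD"))                # [3]
print(run_script("OP_6 OP_2 OP_SUB OP_4 OP_EQUAL"))  # [1]
```

Both results match what btcdeb left on the stack in the sessions above (03 and 01).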
For more information on using btcdeb, check out the GitHub repo, or execute help from the command line.
gr0kchain:~ $ btcdeb --help
syntax: btcdeb [-q|--quiet] [--tx=[amount1,amount2,..:]<hex> [--txin=<hex>] [--modify-flags=<flags>|-f<flags>] [--select=<index>|-s<index>] [<script> [<stack bottom item> [... [<stack top item>]]]]]
if executed with no arguments, an empty script and empty stack is provided
to debug transaction signatures, you need to provide the transaction hex (the WHOLE hex, not just the txid) as well as (SegWit only) every amount for the inputs
e.g. if a SegWit transaction abc123... has 2 inputs of 0.1 btc and 0.002 btc, you would do tx=0.1,0.002:abc123...
you do not need the amounts for non-SegWit transactions
by providing a txin as well as a tx and no script or stack, btcdeb will attempt to set up a debug session for the verification of the given input by pulling the appropriate values out of the respective transactions. you do not need amounts for --tx in this case
you can modify verification flags using the --modify-flags command. separate flags using comma (,). prefix with + to enable, - to disable. e.g. --modify-flags="-NULLDUMMY,-MINIMALIF"
the standard (enabled by default) flags are:
・ P2SH
・ STRICTENC
・ DERSIG
・ LOW_S
・ NULLDUMMY
・ MINIMALDATA
・ DISCOURAGE_UPGRADABLE_NOPS
・ CLEANSTACK
・ CHECKLOCKTIMEVERIFY
・ CHECKSEQUENCEVERIFY
・ WITNESS
・ DISCOURAGE_UPGRADABLE_WITNESS_PROGRAM
・ MINIMALIF
・ NULLFAIL
・ WITNESS_PUBKEYTYPE
Help is also available from within the interactive btcdeb command line interface (cli).
gr0kchain:~ $ btcdeb
btcdeb -- type `btcdeb -h` for start up options
0 op script loaded. type `help` for usage information
script | stack
--------+--------
btcdeb> help
step Execute one instruction and iterate in the script.
rewind Go back in time one instruction.
stack Print stack content.
altstack Print altstack content.
vfexec Print vfexec content.
exec Execute command.
tf Transform a value using a given function.
print Print script.
help Show help information.
btcdeb>
And some other useful utility functions!
gr0kchain:~ $ btcdeb
btcdeb -- type `btcdeb -h` for start up options
0 op script loaded. type `help` for usage information
script | stack
--------+--------
btcdeb> tf -h
echo [*] show as-is serialized value
hex [*] convert into a hex string
int [arg] convert into an integer
reverse [arg] reverse the value according to the type
sha256 [message] perform SHA256
ripemd160 [message] perform RIPEMD160
hash256 [message] perform HASH256 (SHA256(SHA256(message))
hash160 [message] perform HASH160 (RIPEMD160(SHA256(message))
base58chk-encode [pubkey] encode [pubkey] using base58 encoding (with checksum)
base58chk-decode [string] decode [string] into a pubkey using base58 encoding (with checksum)
bech32-encode [pubkey] encode [pubkey] using bech32 encoding
bech32-decode [string] decode [string] into a pubkey using bech32 encoding
verify-sig [sighash] [pubkey] [signature] verify the given signature for the given sighash and pubkey
combine-pubkeys [pubkey1] [pubkey2] combine the two pubkeys into one pubkey
tweak-pubkey [value] [pubkey] multiply the pubkey with the given 32 byte value
addr-to-scriptpubkey [address] convert a base58 encoded address into its corresponding scriptPubKey
scriptpubkey-to-addr [script] convert a scriptPubKey into its corresponding base58 encoded address
add [value1] [value2] add two values together
sub [value1] [value2] subtract value2 from value1
btcdeb>
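The help text above describes hash256 as SHA256 applied twice. If you want to reproduce that outside btcdeb, a minimal Python sketch using the standard hashlib module could look like this. The function name simply mirrors the tf helper, and the input message is an arbitrary example.

```python
import hashlib

def hash256(message: bytes) -> bytes:
    """Double SHA-256, i.e. SHA256(SHA256(message))."""
    return hashlib.sha256(hashlib.sha256(message).digest()).digest()

digest = hash256(b"hello")
print(digest.hex())  # 64 hex characters (32 bytes)
```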
For convenience, I've also put together a reference sheet of opcodes. Note that not all of these are enabled or supported in btcdeb.
Dec Hex Opcode
--------------------------
0 0x00 OP_0
76 0x4c OP_PUSHDATA1
77 0x4d OP_PUSHDATA2
78 0x4e OP_PUSHDATA4
79 0x4f OP_1NEGATE
80 0x50 OP_RESERVED
81 0x51 OP_1
82 0x52 OP_2
83 0x53 OP_3
84 0x54 OP_4
85 0x55 OP_5
86 0x56 OP_6
87 0x57 OP_7
88 0x58 OP_8
89 0x59 OP_9
90 0x5a OP_10
91 0x5b OP_11
92 0x5c OP_12
93 0x5d OP_13
94 0x5e OP_14
95 0x5f OP_15
96 0x60 OP_16
97 0x61 OP_NOP
98 0x62 OP_VER
99 0x63 OP_IF
100 0x64 OP_NOTIF
101 0x65 OP_VERIF
102 0x66 OP_VERNOTIF
103 0x67 OP_ELSE
104 0x68 OP_ENDIF
105 0x69 OP_VERIFY
106 0x6a OP_RETURN
107 0x6b OP_TOALTSTACK
108 0x6c OP_FROMALTSTACK
109 0x6d OP_2DROP
110 0x6e OP_2DUP
111 0x6f OP_3DUP
112 0x70 OP_2OVER
113 0x71 OP_2ROT
114 0x72 OP_2SWAP
115 0x73 OP_IFDUP
116 0x74 OP_DEPTH
117 0x75 OP_DROP
118 0x76 OP_DUP
119 0x77 OP_NIP
120 0x78 OP_OVER
121 0x79 OP_PICK
122 0x7a OP_ROLL
123 0x7b OP_ROT
124 0x7c OP_SWAP
125 0x7d OP_TUCK
126 0x7e OP_CAT - Disabled
127 0x7f OP_SUBSTR - Disabled
128 0x80 OP_LEFT - Disabled
129 0x81 OP_RIGHT - Disabled
130 0x82 OP_SIZE
131 0x83 OP_INVERT - Disabled
132 0x84 OP_AND - Disabled
133 0x85 OP_OR - Disabled
134 0x86 OP_XOR - Disabled
135 0x87 OP_EQUAL
136 0x88 OP_EQUALVERIFY
137 0x89 OP_RESERVED1
138 0x8a OP_RESERVED2
139 0x8b OP_1ADD
140 0x8c OP_1SUB
141 0x8d OP_2MUL - Disabled
142 0x8e OP_2DIV - Disabled
143 0x8f OP_NEGATE
144 0x90 OP_ABS
145 0x91 OP_NOT
146 0x92 OP_0NOTEQUAL
147 0x93 OP_ADD
148 0x94 OP_SUB
149 0x95 OP_MUL - Disabled
150 0x96 OP_DIV - Disabled
151 0x97 OP_MOD - Disabled
152 0x98 OP_LSHIFT - Disabled
153 0x99 OP_RSHIFT - Disabled
154 0x9a OP_BOOLAND
155 0x9b OP_BOOLOR
156 0x9c OP_NUMEQUAL
157 0x9d OP_NUMEQUALVERIFY
158 0x9e OP_NUMNOTEQUAL
159 0x9f OP_LESSTHAN
160 0xa0 OP_GREATERTHAN
161 0xa1 OP_LESSTHANOREQUAL
162 0xa2 OP_GREATERTHANOREQUAL
163 0xa3 OP_MIN
164 0xa4 OP_MAX
165 0xa5 OP_WITHIN
166 0xa6 OP_RIPEMD160
167 0xa7 OP_SHA1
168 0xa8 OP_SHA256
169 0xa9 OP_HASH160
170 0xaa OP_HASH256
171 0xab OP_CODESEPARATOR
172 0xac OP_CHECKSIG
173 0xad OP_CHECKSIGVERIFY
174 0xae OP_CHECKMULTISIG
175 0xaf OP_CHECKMULTISIGVERIFY
176 0xb0 OP_NOP1
177 0xb1 OP_CHECKLOCKTIMEVERIFY
178 0xb2 OP_CHECKSEQUENCEVERIFY
179 0xb3 OP_NOP4
180 0xb4 OP_NOP5
181 0xb5 OP_NOP6
182 0xb6 OP_NOP7
183 0xb7 OP_NOP8
184 0xb8 OP_NOP9
185 0xb9 OP_NOP10
255 0xff OP_INVALIDOPCODE
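To show how the reference sheet relates to the scripts we ran earlier, here is a small Python sketch that turns opcode names into their byte values. It covers only a handful of entries copied from the table above, and the names OPCODES and assemble are made up for this example.

```python
# A small subset of the reference sheet: name -> byte value.
OPCODES = {"OP_ADD": 0x93, "OP_SUB": 0x94, "OP_EQUAL": 0x87}
# OP_1 .. OP_16 occupy 0x51 .. 0x60 in the table, i.e. 0x50 + n.
OPCODES.update({f"OP_{n}": 0x50 + n for n in range(1, 17)})

def assemble(script):
    """Return the hex encoding of a script made only of known opcodes."""
    return bytes(OPCODES[op] for op in script.split()).hex()

encoded = assemble("OP_2 OP_1 OP_ADD")
print(encoded)  # 525193
```

Each opcode here is a single byte, so our three-opcode script becomes three bytes: 52, 51 and 93.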
Happy hacking fellow bitcoiner!
Conclusion
In this tutorial we had a look at the fundamental concepts underpinning bitcoin's scripting language. We introduced operation codes (opcodes) and saw how they are evaluated using a stack based data structure.