Introduction
Ever wonder how bitcoin nodes talk to each other? In this tutorial we'll cover the raw details of the TCP-based bitcoin wire protocol, not to be confused with the more popular user-facing RPC interface.
Background
Like many other internet servers and clients, bitcoin has its own communication protocol. Contrary to popular belief, no encryption is currently required when sparking up a conversation between nodes, so we can listen in on and talk to them without too much hassle.
Let's get started
Before writing any code, let's start by probing a node from the command line to see if we can get it to spill the beans.
Here's an example:
grokchain:~ bitcoindev$ echo -en "fabfb5da76657273696f6e000000000064000000358d493262ea0000010000000000000011b2d05000000000010000000000000000000000000000000000ffff000000000000000000000000000000000000000000000000ffff0000000000003b2eb35d8ce617650f2f5361746f7368693a302e372e322fc03e0300" | xxd -r -p | nc localhost 18444 -i 2 | xxd
00000000: fabf b5da 7665 7273 696f 6e00 0000 0000 ....version.....
00000010: 6600 0000 db1c 38a4 7c11 0100 0500 0000 f.....8.|.......
00000020: 0000 0000 3f96 585c 0000 0000 0100 0000 ....?.X\........
00000030: 0000 0000 0000 0000 0000 0000 0000 ffff ................
00000040: 0000 0000 0000 0500 0000 0000 0000 0000 ................
00000050: 0000 0000 0000 0000 ffff 0000 0000 480c ..............H.
00000060: a1cd ff81 f434 8eac 102f 5361 746f 7368 .....4.../Satosh
00000070: 693a 302e 3132 2e31 2f00 0000 0001 fabf i:0.12.1/.......
00000080: b5da 7665 7261 636b 0000 0000 0000 0000 ..verack........
00000090: 0000 5df6 e0e2 fabf b5da 7069 6e67 0000 ..].......ping..
000000a0: 0000 0000 0000 0800 0000 f74e 9ecb 6b09 ...........N..k.
000000b0: b381 cb9f f83e fabf b5da 6765 7468 6561 .....>....gethea
000000c0: 6465 7273 0000 4500 0000 31ae 7582 62ea ders..E...1.u.b.
000000d0: 0000 0106 226e 4611 1a0b 59ca af12 6043 ...."nF...Y...`C
000000e0: eb5b bf28 c34f 3a5e 332a 1fc7 b2b7 3cf1 .[.(.O:^3*....<.
000000f0: 8891 0f00 0000 0000 0000 0000 0000 0000 ................
00000100: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000110: 0000 00 ...
Wow, how's that for a conversation starter!? Don't worry if none of this makes sense just yet; by the end of this tutorial it will!
Note
Don't currently have a bitcoin development environment set up? No problem, we have your back! We've set up a web-based mechanism which provisions your very own private session that includes these tools and comes preconfigured with a bitcoin node in testing mode. https://bitcoindev.network/bitcoin-cli-sandbox/
Alternatively, we have also provided a simple docker container configured in regtest mode that you can install for testing purposes.
gr0kchain:~ $ docker volume create --name=bitcoind-data
gr0kchain:~ $ docker run -v bitcoind-data:/bitcoin --name=bitcoind-node -d \
    -p 18444:18444 \
    -p 127.0.0.1:18332:18332 \
    bitcoindevelopernetwork/bitcoind-regtest
Let's unpack what's happening above. The bitcoin wire protocol consists of about 27 different message types at the time of writing (0.17.0). Each message is made up of fields expressed as a binary structure, which can be sent over the network to any public node that exposes itself on the internet. For simplicity, we will start talking to a node using two useful utilities, namely:
- netcat - a simple unix utility which reads and writes data across network connections, using the TCP or UDP protocol; and
- xxd - which helps create a hex dump of a given file or standard input, or converts a hex dump back to its original binary form
We use netcat to establish the TCP connection between you and the bitcoin node, and xxd to encode and decode the messages that are sent to and from the node. We will later look at how we can accomplish this using code.
For now, let's take a closer look at the data we are sending to the node.
Saying "Hello"
In our example above, we pushed the following string to the node. It is shown hex encoded because the bitcoin wire protocol is binary.
fabfb5da76657273696f6e000000000064000000358d493262ea0000010000000000000011b2d05000000000010000000000000000000000000000000000ffff000000000000000000000000000000000000000000000000ffff0000000000003b2eb35d8ce617650f2f5361746f7368693a302e372e322fc03e0300
All messages communicated between nodes take the following general structure, where the payload differs based on the command we are sending.
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | magic | uint32_t | Magic value indicating message origin network, and used to seek to next message when stream state is unknown |
12 | command | char[12] | ASCII string identifying the packet content, NULL padded (non-NULL padding results in packet rejected) |
4 | length | uint32_t | Length of payload in number of bytes |
4 | checksum | uint32_t | First 4 bytes of sha256(sha256(payload)) |
? | payload | uchar[] | The actual data |
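The header layout in the table can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the helper names `checksum` and `build_message` are made up for this example, and the regtest magic is simply the value used in the example above.

```python
import hashlib
import struct

# Regtest magic bytes, taken from the example message above.
REGTEST_MAGIC = bytes.fromhex("fabfb5da")


def checksum(payload: bytes) -> bytes:
    """Return the first 4 bytes of sha256(sha256(payload))."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def build_message(command: str, payload: bytes = b"", magic: bytes = REGTEST_MAGIC) -> bytes:
    """Prefix a payload with the 24-byte header: magic, command, length, checksum."""
    name = command.encode("ascii")
    if len(name) > 12:
        raise ValueError("command must fit in 12 bytes")
    # "12s" pads the command with NULL bytes; "<I" is a little-endian uint32.
    header = struct.pack("<4s12sI4s", magic, name, len(payload), checksum(payload))
    return header + payload


if __name__ == "__main__":
    # A verack has an empty payload, so the whole message is just the header.
    print(build_message("verack").hex())
```

With an empty payload the checksum comes out as 5df6e0e2, which is the value carried by the verack message in the node's reply shown earlier.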
And of course, our payload for the version command:
Field Size | Description | Data type | Comments |
---|---|---|---|
4 | version | int32_t | Identifies protocol version being used by the node |
8 | services | uint64_t | bitfield of features to be enabled for this connection |
8 | timestamp | int64_t | standard UNIX timestamp in seconds |
26 | addr_recv | net_addr | The network address of the node receiving this message |
Fields below require version ≥ 106 | | | |
26 | addr_from | net_addr | The network address of the node emitting this message |
8 | nonce | uint64_t | Node random nonce, randomly generated every time a version packet is sent. This nonce is used to detect connections to self. |
? | user_agent | var_str | User Agent (0x00 if string is 0 bytes long) |
4 | start_height | int32_t | The last block received by the emitting node |
Fields below require version ≥ 70001 | | | |
1 | relay | bool | Whether the remote peer should announce relayed transactions or not, see BIP 0037 |
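As a rough sketch, the version payload from the table could be assembled in Python like this. The function names are hypothetical. The 26-byte `net_addr` layout (8-byte services, 16-byte IPv4-mapped address, 2-byte big-endian port) is an assumption taken from the Network Address section of the Bitcoin Wiki protocol documentation, not from the table itself, and `var_str` only handles strings shorter than 253 bytes.

```python
import socket
import struct


def net_addr(services: int, ip: str = "0.0.0.0", port: int = 0) -> bytes:
    """26-byte network address as it appears inside a version message (assumed layout)."""
    mapped_ip = b"\x00" * 10 + b"\xff\xff" + socket.inet_aton(ip)  # IPv4-mapped IPv6
    return struct.pack("<Q", services) + mapped_ip + struct.pack(">H", port)


def var_str(text: str) -> bytes:
    """Length-prefixed string; this sketch only supports a single-byte length."""
    data = text.encode("ascii")
    if len(data) >= 0xFD:
        raise ValueError("this sketch only handles strings shorter than 253 bytes")
    return bytes([len(data)]) + data


def version_payload(version, services, timestamp, addr_recv, addr_from,
                    nonce, user_agent, start_height, relay=None) -> bytes:
    """Pack the fields in table order; integers are little endian."""
    payload = struct.pack("<iQq", version, services, timestamp)
    payload += addr_recv + addr_from
    payload += struct.pack("<Q", nonce)
    payload += var_str(user_agent)
    payload += struct.pack("<i", start_height)
    if version >= 70001 and relay is not None:
        payload += struct.pack("<?", relay)  # relay flag only exists from 70001 on
    return payload


# Values copied from the example message in this tutorial.
payload = version_payload(
    version=60002,
    services=1,
    timestamp=1355854353,
    addr_recv=net_addr(services=1),
    addr_from=net_addr(services=0),
    nonce=int.from_bytes(bytes.fromhex("3b2eb35d8ce61765"), "little"),
    user_agent="/Satoshi:0.7.2/",
    start_height=212672,
)
print(len(payload), payload.hex())
```

The resulting payload is 100 bytes long, matching the length field of the example header.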
We are able to decode our hex string by breaking it down as follows:
0000 fa bf b5 da 76 65 72 73 69 6f 6e 00 00 00 00 00 ....version.....
0010 64 00 00 00 35 8d 49 32 62 ea 00 00 01 00 00 00 d...5.I2b.......
0020 00 00 00 00 11 b2 d0 50 00 00 00 00 01 00 00 00 .......P........
0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff ................
0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0050 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 00 00 ................
0060 3b 2e b3 5d 8c e6 17 65 0f 2f 53 61 74 6f 73 68 ;..]...e./Satosh
0070 69 3a 30 2e 37 2e 32 2f c0 3e 03 00 i:0.7.2/.>..
Message Header:
FA BF B5 DA - Regtest network magic bytes
76 65 72 73 69 6F 6E 00 00 00 00 00 - "version" command
64 00 00 00 - Payload is 100 bytes long
35 8d 49 32 - payload checksum (little endian)
Version message:
62 EA 00 00 - 60002 (protocol version 60002)
01 00 00 00 00 00 00 00 - 1 (NODE_NETWORK services)
11 B2 D0 50 00 00 00 00 - Tue Dec 18 10:12:33 PST 2012
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF FF 00 00 00 00 00 00 - Recipient address info - see Network Address
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF FF 00 00 00 00 00 00 - Sender address info - see Network Address
3B 2E B3 5D 8C E6 17 65 - Node ID (the random nonce)
0F 2F 53 61 74 6F 73 68 69 3A 30 2E 37 2E 32 2F - "/Satoshi:0.7.2/" sub-version string (string is 15 bytes long)
C0 3E 03 00 - Last block sending node has is block #212672
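The same breakdown can be reproduced programmatically. The sketch below decodes the example hex string with Python's `struct` module. The function names are illustrative only, the two 26-byte address fields are skipped rather than decoded, and the user agent length is assumed to fit in a single byte.

```python
import struct

EXAMPLE_HEX = (
    "fabfb5da76657273696f6e000000000064000000358d493262ea0000010000000000000011b2d05000000000010000000000000000000000000000000000ffff000000000000000000000000000000000000000000000000ffff0000000000003b2eb35d8ce617650f2f5361746f7368693a302e372e322fc03e0300"
)


def parse_header(raw: bytes) -> dict:
    """Decode the 24-byte message header."""
    magic, command, length, checksum = struct.unpack("<4s12sI4s", raw[:24])
    return {
        "magic": magic.hex(),
        "command": command.rstrip(b"\x00").decode("ascii"),
        "length": length,
        "checksum": checksum.hex(),
    }


def parse_version(payload: bytes) -> dict:
    """Decode the scalar fields of a version payload (addresses are skipped)."""
    version, services, timestamp = struct.unpack_from("<iQq", payload, 0)
    offset = 20 + 26 + 26  # fixed fields, then addr_recv and addr_from
    (nonce,) = struct.unpack_from("<Q", payload, offset)
    offset += 8
    ua_len = payload[offset]  # assumes a single-byte var_str length
    user_agent = payload[offset + 1:offset + 1 + ua_len].decode("ascii")
    offset += 1 + ua_len
    (start_height,) = struct.unpack_from("<i", payload, offset)
    return {
        "version": version,
        "services": services,
        "timestamp": timestamp,
        "nonce": nonce,
        "user_agent": user_agent,
        "start_height": start_height,
    }


raw = bytes.fromhex(EXAMPLE_HEX)
header = parse_header(raw)
fields = parse_version(raw[24:24 + header["length"]])
print(header)
print(fields)
```

Running it should report the command `version`, a 100-byte payload, protocol version 60002, the user agent `/Satoshi:0.7.2/` and a start height of 212672, in line with the manual breakdown.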
A verack packet shall be sent in response if the version packet was accepted by the node.
The verack message from the node's reply above breaks down as follows:
0000 FA BF B5 DA 76 65 72 61 63 6B 00 00 00 00 00 00 ....verack......
0010 00 00 00 00 5D F6 E0 E2 ........
Message header:
FA BF B5 DA - Regtest network magic bytes
76 65 72 61 63 6B 00 00 00 00 00 00 - "verack" command
00 00 00 00 - Payload is 0 bytes long
5D F6 E0 E2 - Checksum (little endian)
In the above example, we have used the Magic value of fabfb5da. This is one of the predefined values which indicate which network the current node is running on, which in this case is a node in regtest mode.
Network | Default Port | Start String | Max nBits |
---|---|---|---|
Mainnet | 8333 | 0xf9beb4d9 | 0x1d00ffff |
Testnet | 18333 | 0x0b110907 | 0x1d00ffff |
Regtest | 18444 | 0xfabfb5da | 0x207fffff |
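To illustrate how the start string might be used in code, here is a small Python sketch that maps the first four bytes of a message to a network from the table, and scans a byte stream for the next occurrence of the magic value. The names `NETWORKS`, `identify_network` and `next_message_offset` are invented for this example.

```python
# Start strings and default ports copied from the table above.
NETWORKS = {
    bytes.fromhex("f9beb4d9"): ("mainnet", 8333),
    bytes.fromhex("0b110907"): ("testnet", 18333),
    bytes.fromhex("fabfb5da"): ("regtest", 18444),
}


def identify_network(data: bytes):
    """Return (name, default_port) for the magic at the start of data, or None."""
    return NETWORKS.get(bytes(data[:4]))


def next_message_offset(stream: bytes, magic: bytes, start: int = 0) -> int:
    """Return the index of the next magic value at or after start, or -1 if absent."""
    return stream.find(magic, start)


if __name__ == "__main__":
    verack = bytes.fromhex("fabfb5da76657261636b000000000000000000005df6e0e2")
    print(identify_network(verack))
    # Two leading junk bytes: seeking to the magic recovers the message start.
    print(next_message_offset(b"\x00\x01" + verack, verack[:4]))
```

Seeking to the magic value is only a heuristic for resynchronising; a real parser would still validate the length and checksum of whatever follows.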
Etiquette
When a node creates an outgoing connection, it will immediately advertise its version. The remote node will respond with its version. No further communication is possible until both peers have exchanged their version.
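The exchange described here could look roughly like the following Python sketch. It is an illustration under several assumptions: all helper names are made up, the version fields reuse the values from the example message (protocol version 60002, empty addresses), and any message other than version and verack is read and ignored.

```python
import hashlib
import os
import socket
import struct
import time

MAGIC = bytes.fromhex("fabfb5da")  # regtest, as in the example above


def checksum(payload: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def build_message(command: str, payload: bytes = b"") -> bytes:
    header = struct.pack("<4s12sI4s", MAGIC, command.encode("ascii"),
                         len(payload), checksum(payload))
    return header + payload


def version_payload(version: int = 60002, services: int = 1, start_height: int = 0) -> bytes:
    user_agent = b"/Satoshi:0.7.2/"  # copied from the example message
    # 26-byte address: services, IPv4-mapped 0.0.0.0, port 0 (assumed layout)
    empty_addr = bytes(8) + bytes(10) + b"\xff\xff" + bytes(4) + bytes(2)
    return (
        struct.pack("<iQq", version, services, int(time.time()))
        + empty_addr + empty_addr
        + os.urandom(8)  # random nonce
        + bytes([len(user_agent)]) + user_agent
        + struct.pack("<i", start_height)
    )


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly size bytes, since recv may return fewer."""
    buf = b""
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def read_message(sock: socket.socket):
    """Read one message and return (command, payload) after checking it."""
    magic, command, length, expected = struct.unpack("<4s12sI4s", recv_exact(sock, 24))
    payload = recv_exact(sock, length)
    if magic != MAGIC or checksum(payload) != expected:
        raise ValueError("bad magic or checksum")
    return command.rstrip(b"\x00").decode("ascii"), payload


def handshake(sock: socket.socket) -> None:
    """Send our version, then wait for the peer's version and verack."""
    sock.sendall(build_message("version", version_payload()))
    got_version = got_verack = False
    while not (got_version and got_verack):
        command, _payload = read_message(sock)
        if command == "version":
            got_version = True
            sock.sendall(build_message("verack"))  # acknowledge the peer's version
        elif command == "verack":
            got_verack = True
```

Against the regtest container described earlier, this might be driven with `socket.create_connection(("localhost", 18444))` and a call to `handshake` on the resulting socket; whether a given node version accepts these exact field values is not something this sketch guarantees.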
Now that we've broken the ice...
Having initiated the conversation, we can go on to explore some of the other topics these nodes like talking about.
- version
- verack
- addr
- inv
- getdata
- notfound
- getblocks
- getheaders
- tx
- block
- headers
- getaddr
- mempool
- checkorder
- submitorder
- reply
- ping
- pong
- reject
- filterload, filteradd, filterclear, merkleblock
- alert
- sendheaders
- feefilter
- sendcmpct
- cmpctblock
- getblocktxn
- blocktxn
Conclusion
In this tutorial we demystified how bitcoin nodes communicate with each other. We initiated a conversation with an existing bitcoin node and explored some of the message types nodes use.
References
Bitcoin Wiki Protocol documentation
https://en.bitcoin.it/wiki/Protocol_documentation
Bitcoin Core Client P2P network documentation
https://bitcoin.org/en/developer-reference#p2p-network