Introduction

Ever wonder how bitcoin nodes talk to each other? In this tutorial we'll cover the raw details of the TCP-based bitcoin wire protocol, not to be confused with the more popular, user-facing RPC interface.

Background

Like many other internet servers and clients, bitcoin has its own communication protocol. Contrary to popular belief, no encryption is currently required when striking up a conversation between nodes, so we can listen in on them and talk to them without too much hassle.

Let's get started

Before writing any code, let's start by probing a node from the command line to see if we can get it to spill the beans.

Here's an example:

grokchain:~ bitcoindev$ echo -en "fabfb5da76657273696f6e000000000064000000358d493262ea0000010000000000000011b2d05000000000010000000000000000000000000000000000ffff000000000000000000000000000000000000000000000000ffff0000000000003b2eb35d8ce617650f2f5361746f7368693a302e372e322fc03e0300" | xxd -r -p | nc localhost 18444 -i 2 | xxd
00000000: fabf b5da 7665 7273 696f 6e00 0000 0000  ....version.....
00000010: 6600 0000 db1c 38a4 7c11 0100 0500 0000  f.....8.|.......
00000020: 0000 0000 3f96 585c 0000 0000 0100 0000  ....?.X\........
00000030: 0000 0000 0000 0000 0000 0000 0000 ffff  ................
00000040: 0000 0000 0000 0500 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 ffff 0000 0000 480c  ..............H.
00000060: a1cd ff81 f434 8eac 102f 5361 746f 7368  .....4.../Satosh
00000070: 693a 302e 3132 2e31 2f00 0000 0001 fabf  i:0.12.1/.......
00000080: b5da 7665 7261 636b 0000 0000 0000 0000  ..verack........
00000090: 0000 5df6 e0e2 fabf b5da 7069 6e67 0000  ..].......ping..
000000a0: 0000 0000 0000 0800 0000 f74e 9ecb 6b09  ...........N..k.
000000b0: b381 cb9f f83e fabf b5da 6765 7468 6561  .....>....gethea
000000c0: 6465 7273 0000 4500 0000 31ae 7582 62ea  ders..E...1.u.b.
000000d0: 0000 0106 226e 4611 1a0b 59ca af12 6043  ...."nF...Y...`C
000000e0: eb5b bf28 c34f 3a5e 332a 1fc7 b2b7 3cf1  .[.(.O:^3*....<.
000000f0: 8891 0f00 0000 0000 0000 0000 0000 0000  ................
00000100: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000110: 0000 00                                  ...

Wow, how's that for a conversation starter!? Don't worry if none of this makes sense just yet. By the end of this tutorial it will!

Note
Don't have a bitcoin development environment set up yet? No problem, we have your back! We've set up a web-based sandbox that provisions your very own private session, which includes these tools and comes preconfigured with a bitcoin node in testing mode. https://bitcoindev.network/bitcoin-cli-sandbox/

Alternatively, we also provide a simple Docker container configured in regtest mode that you can run for testing purposes.
gr0kchain:~ $ docker volume create --name=bitcoind-data
gr0kchain:~ $ docker run -v bitcoind-data:/bitcoin --name=bitcoind-node -d \
     -p 18444:18444 \
     -p 127.0.0.1:18332:18332 \
     bitcoindevelopernetwork/bitcoind-regtest

Let's unpack what's happening above. The bitcoin wire protocol consists of about 27 different message types at the time of writing (0.17.0). Each message has various fields, expressed as a binary structure that can be sent over the network to any node that exposes itself publicly on the internet. For simplicity, we will start talking to a node using two useful utilities, namely:

- netcat - a simple unix utility which reads and writes data across network connections, using the TCP or UDP protocol; and

- xxd - which creates a hex dump of a given file or standard input, or converts a hex dump back to its original binary form.

We use netcat to establish the TCP connection between you and the bitcoin node, and xxd to encode and decode the messages that are sent to and from the node. We will later look at how we can accomplish this using code.
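As a rough illustration of what xxd contributes to the pipeline, here is a minimal Python sketch of the same two conversions: hex text to raw bytes, and raw bytes back to a readable dump. The function names `hex_to_bytes` and `hexdump` are made up for this example, and the dump layout only loosely imitates xxd's output.

```python
def hex_to_bytes(hex_string: str) -> bytes:
    """Decode a hex string into raw bytes, similar in spirit to `xxd -r -p`."""
    return bytes.fromhex(hex_string)


def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex and ASCII columns, loosely modelled on xxd."""
    pad = width * 3 - 1  # characters needed for a full row of hex pairs
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Non-printable bytes are shown as dots, as in the dumps above.
        ascii_part = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append(f"{offset:08x}: {hex_part:<{pad}}  {ascii_part}")
    return "\n".join(lines)


# First 12 bytes of the message we piped into netcat:
# the 4 magic bytes followed by the start of the command field.
raw = hex_to_bytes("fabfb5da76657273696f6e00")
print(hexdump(raw))
```

The point is simply that the long hex string on the command line is nothing more than a text rendering of the binary message the node actually receives.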

For now, let's take a closer look at the data we are sending to the node.

Saying "Hello"

In our example above, we pushed the following string at the node. This is hex encoded since the bitcoin wire protocol is binary.

fabfb5da76657273696f6e000000000064000000358d493262ea0000010000000000000011b2d05000000000010000000000000000000000000000000000ffff000000000000000000000000000000000000000000000000ffff0000000000003b2eb35d8ce617650f2f5361746f7368693a302e372e322fc03e0300

All messages communicated between nodes take the following general structure, where the payload differs based on the command we are sending.

Field Size  Description  Data type  Comments
4           magic        uint32_t   Magic value indicating message origin network, and used to seek to next message when stream state is unknown
12          command      char[12]   ASCII string identifying the packet content, NULL padded (non-NULL padding results in packet rejected)
4           length       uint32_t   Length of payload in number of bytes
4           checksum     uint32_t   First 4 bytes of sha256(sha256(payload))
?           payload      uchar[]    The actual data
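The header layout above maps naturally onto Python's `struct` module. The following is a minimal sketch, not a full parser: `HEADER_FORMAT` and `parse_header` are names chosen for this example, and the magic and checksum are kept as raw bytes so they print in the same order as they appear on the wire.

```python
import struct

# magic (4 bytes), command (12 bytes), length (uint32, little endian), checksum (4 bytes)
HEADER_FORMAT = "<4s12sI4s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def parse_header(data: bytes) -> dict:
    """Split the first 24 bytes of a message into the four header fields."""
    magic, command, length, checksum = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    return {
        "magic": magic.hex(),
        # The command is NULL padded to 12 bytes, so strip the padding.
        "command": command.rstrip(b"\x00").decode("ascii"),
        "length": length,
        "checksum": checksum.hex(),
    }


# Header of the version message we sent to the node.
header = parse_header(bytes.fromhex(
    "fabfb5da"                  # magic
    "76657273696f6e0000000000"  # "version", NULL padded
    "64000000"                  # payload length
    "358d4932"                  # checksum
))
print(header)
```

Everything after these 24 bytes is the payload, whose layout depends on the command.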

And of course, here is the payload for the version command:

Field Size  Description   Data type  Comments
4           version       int32_t    Identifies protocol version being used by the node
8           services      uint64_t   bitfield of features to be enabled for this connection
8           timestamp     int64_t    standard UNIX timestamp in seconds
26          addr_recv     net_addr   The network address of the node receiving this message
Fields below require version ≥ 106
26          addr_from     net_addr   The network address of the node emitting this message
8           nonce         uint64_t   Node random nonce, randomly generated every time a version packet is sent. This nonce is used to detect connections to self.
?           user_agent    var_str    User Agent (0x00 if string is 0 bytes long)
4           start_height  int32_t    The last block received by the emitting node
Fields below require version ≥ 70001
1           relay         bool       Whether the remote peer should announce relayed transactions or not, see BIP 0037
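To make the field layout concrete, here is a hedged sketch that packs a version payload field by field. The helper names (`net_addr`, `var_str`, `build_version_payload`) are invented for this example, the addresses are hard-coded to the all-zero values used in our sample message, and `var_str` only supports short strings with a one-byte length prefix.

```python
import struct

# 16-byte IPv4-mapped IPv6 form of 0.0.0.0, as used in the example message.
UNSPECIFIED_IP = bytes(10) + b"\xff\xff" + bytes(4)


def net_addr(services: int, ip: bytes = UNSPECIFIED_IP, port: int = 0) -> bytes:
    """26-byte address: services (little endian), 16-byte IP, port (big endian)."""
    return struct.pack("<Q", services) + ip + struct.pack(">H", port)


def var_str(text: str) -> bytes:
    """Length-prefixed string; this sketch only handles lengths below 0xfd."""
    raw = text.encode("ascii")
    if len(raw) >= 0xFD:
        raise ValueError("longer strings need a multi-byte length prefix")
    return bytes([len(raw)]) + raw


def build_version_payload(version, services, timestamp, nonce, user_agent, start_height):
    """Pack the fields of a pre-70001 version message (no relay flag)."""
    return (
        struct.pack("<iQq", version, services, timestamp)
        + net_addr(services=1)   # addr_recv, as in the sample message
        + net_addr(services=0)   # addr_from, as in the sample message
        + struct.pack("<Q", nonce)
        + var_str(user_agent)
        + struct.pack("<i", start_height)
    )


payload = build_version_payload(
    version=60002, services=1, timestamp=1355854353,
    nonce=0x6517E68C5DB32E3B, user_agent="/Satoshi:0.7.2/", start_height=212672,
)
print(len(payload), payload.hex())
```

With these inputs the result is 100 bytes long, matching the length field in the header of the message we sent.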

We are able to decode our hex string by breaking it down as follows:

0000   fa bf b5 da 76 65 72 73 69 6f 6e 00 00 00 00 00  ....version.....
0010   64 00 00 00 35 8d 49 32 62 ea 00 00 01 00 00 00  d...5.I2b.......
0020   00 00 00 00 11 b2 d0 50 00 00 00 00 01 00 00 00  .......P........
0030   00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff  ................
0040   00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0050   00 00 00 00 00 00 00 00 ff ff 00 00 00 00 00 00  ................
0060   3b 2e b3 5d 8c e6 17 65 0f 2f 53 61 74 6f 73 68  ;..]...e./Satosh
0070   69 3a 30 2e 37 2e 32 2f c0 3e 03 00              i:0.7.2/.>..

Message Header:
 FA BF B5 DA                                                                   - Regtest network magic bytes
 76 65 72 73 69 6F 6E 00 00 00 00 00                                           - "version" command
 64 00 00 00                                                                   - Payload is 100 bytes long
 35 8d 49 32                                                                   - payload checksum (little endian)

Version message:
 62 EA 00 00                                                                   - 60002 (protocol version 60002)
 01 00 00 00 00 00 00 00                                                       - 1 (NODE_NETWORK services)
 11 B2 D0 50 00 00 00 00                                                       - Tue Dec 18 10:12:33 PST 2012
 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF FF 00 00 00 00 00 00 - Recipient address info - see Network Address
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF FF 00 00 00 00 00 00 - Sender address info - see Network Address
 3B 2E B3 5D 8C E6 17 65                                                       - Node ID
 0F 2F 53 61 74 6F 73 68 69 3A 30 2E 37 2E 32 2F                               - "/Satoshi:0.7.2/" sub-version string (string is 15 bytes long)
 C0 3E 03 00                                                                   - Last block sending node has is block #212672
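The manual breakdown above can also be reproduced in code. This is a sketch under a few assumptions: the function names are hypothetical, the user agent length is assumed to fit in a single byte, and the payload is rebuilt from the bytes shown in the hex dump rather than read from a socket.

```python
import ipaddress
import struct
from datetime import datetime, timezone


def read_net_addr(data: bytes, offset: int):
    """Read a 26-byte net_addr and return (fields, new_offset)."""
    (services,) = struct.unpack_from("<Q", data, offset)
    ip = ipaddress.IPv6Address(data[offset + 8:offset + 24])
    (port,) = struct.unpack_from(">H", data, offset + 24)  # port is big endian
    mapped = ip.ipv4_mapped
    shown = str(mapped if mapped is not None else ip)
    return {"services": services, "ip": shown, "port": port}, offset + 26


def read_var_str(data: bytes, offset: int):
    """Read a var_str whose length fits in one byte (enough for this example)."""
    length = data[offset]
    if length >= 0xFD:
        raise ValueError("multi-byte length prefixes are not handled in this sketch")
    start = offset + 1
    return data[start:start + length].decode("ascii"), start + length


def parse_version(payload: bytes) -> dict:
    version, services, timestamp = struct.unpack_from("<iQq", payload, 0)
    fields = {"version": version, "services": services, "timestamp": timestamp}
    fields["addr_recv"], offset = read_net_addr(payload, 20)
    if version >= 106:
        fields["addr_from"], offset = read_net_addr(payload, offset)
        (fields["nonce"],) = struct.unpack_from("<Q", payload, offset)
        fields["user_agent"], offset = read_var_str(payload, offset + 8)
        (fields["start_height"],) = struct.unpack_from("<i", payload, offset)
        offset += 4
    if version >= 70001 and offset < len(payload):
        fields["relay"] = bool(payload[offset])
    return fields


# The 100 payload bytes from the hex dump above (everything after the 24-byte header).
PAYLOAD = bytes.fromhex("".join([
    "62ea0000",                              # version
    "01" + "00" * 7,                         # services
    "11b2d050" + "00" * 4,                   # timestamp
    "01" + "00" * 17 + "ffff" + "00" * 6,    # addr_recv
    "00" * 18 + "ffff" + "00" * 6,           # addr_from
    "3b2eb35d8ce61765",                      # nonce
    "0f2f5361746f7368693a302e372e322f",      # user agent
    "c03e0300",                              # start height
]))

decoded = parse_version(PAYLOAD)
when = datetime.fromtimestamp(decoded["timestamp"], tz=timezone.utc)
print(decoded["version"], decoded["user_agent"], decoded["start_height"], when.isoformat())
```

Note that the timestamp prints in UTC here, which is eight hours ahead of the PST time shown in the breakdown.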

A verack packet shall be sent in response if the version packet was accepted by the node.

Decoded, the verack message returned by our node looks like this:

0000   FA BF B5 DA 76 65 72 61  63 6B 00 00 00 00 00 00   ....verack......
0010   00 00 00 00 5D F6 E0 E2                            ........

Message header:
 FA BF B5 DA                          - Regtest network magic bytes
 76 65 72 61  63 6B 00 00 00 00 00 00 - "verack" command
 00 00 00 00                          - Payload is 0 bytes long
 5D F6 E0 E2                          - Checksum (little endian)
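Since verack carries no payload, its checksum is simply the double SHA-256 of an empty byte string, which is why the same four bytes show up in every verack. A minimal sketch of the calculation, with `checksum` being a name chosen for this example:

```python
import hashlib


def checksum(payload: bytes) -> bytes:
    """First 4 bytes of sha256(sha256(payload)), as described in the header table."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


# verack has an empty payload
print(checksum(b"").hex())  # 5df6e0e2
```

The same function can be applied to any other payload before it is placed behind a header.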

In the above example, we have used the Magic value of fabfb5da. This is one of the predefined values which indicate which network the current node is running on, which in this case is a node in regtest mode.

Network  Default Port  Start String  Max nBits
Mainnet  8333          0xf9beb4d9    0x1d00ffff
Testnet  18333         0x0b110907    0x1d00ffff
Regtest  18444         0xfabfb5da    0x207fffff
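In code, the table above boils down to a simple lookup on the first four bytes of a message. The sketch below assumes the start strings are compared byte for byte in the order they appear on the wire; `NETWORKS` and `identify_network` are illustrative names only.

```python
# Start string (magic) -> (network name, default port), taken from the table above.
NETWORKS = {
    bytes.fromhex("f9beb4d9"): ("mainnet", 8333),
    bytes.fromhex("0b110907"): ("testnet", 18333),
    bytes.fromhex("fabfb5da"): ("regtest", 18444),
}


def identify_network(message: bytes):
    """Return (name, default_port) for a message's magic bytes, or None if unknown."""
    return NETWORKS.get(message[:4])


# The first bytes of the verack our node sent back.
print(identify_network(bytes.fromhex("fabfb5da76657261636b")))
```

A node that receives a message with a magic value for a different network can use exactly this kind of check to decide to ignore it.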

Etiquette

When a node creates an outgoing connection, it will immediately advertise its version. The remote node will respond with its version. No further communication is possible until both peers have exchanged their version.
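The exchange described above could be sketched in Python roughly as follows. This is an illustration under assumptions, not a reference implementation: the function names are invented, the host and port default to the regtest node from our Docker example, received checksums are not verified, and error handling is minimal. Nothing is sent until `handshake` is called with a version payload.

```python
import hashlib
import socket
import struct

REGTEST_MAGIC = bytes.fromhex("fabfb5da")
HEADER_FORMAT = "<4s12sI4s"  # magic, command, payload length, checksum


def build_message(command: str, payload: bytes = b"", magic: bytes = REGTEST_MAGIC) -> bytes:
    """Prefix a payload with the 24-byte message header."""
    check = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    # struct pads the command with NULL bytes up to 12 characters.
    header = struct.pack(HEADER_FORMAT, magic, command.encode("ascii"), len(payload), check)
    return header + payload


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Keep reading until exactly `count` bytes have arrived."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        data += chunk
    return data


def read_message(sock: socket.socket):
    """Read one message and return (command, payload); the checksum is not checked."""
    _magic, command, length, _check = struct.unpack(HEADER_FORMAT, recv_exact(sock, 24))
    return command.rstrip(b"\x00").decode("ascii"), recv_exact(sock, length)


def handshake(version_payload: bytes, host: str = "localhost", port: int = 18444) -> set:
    """Send our version, acknowledge the peer's version, and wait for its verack."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_message("version", version_payload))
        seen = set()
        while not {"version", "verack"} <= seen:
            command, _payload = read_message(sock)
            seen.add(command)
            if command == "version":
                sock.sendall(build_message("verack"))
        return seen


print(build_message("verack").hex())
```

The printed verack is byte for byte the same message our node sent back in the first example, which is a handy sanity check for the header packing.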

Now that we've broken the ice...

Having initiated the conversation, we can continue to explore some of the other interesting topics these nodes like talking about.

Conclusion

In this tutorial we looked into demystifying how bitcoin nodes go about communicating with each other. We were able to initiate a conversation with an existing bitcoin node, and explore some of the various message types used by nodes.

References

Bitcoin Wiki Protocol documentation

https://en.bitcoin.it/wiki/Protocol_documentation

Bitcoin Core Client P2P network documentation

https://bitcoin.org/en/developer-reference#p2p-network