The Merkle proof standard in a nutshell Merkle proofs enable Bitcoin SV businesses and wallets to implement Simplified Payment Verification (SPV), as described in section 8 of the Bitcoin whitepaper. Thanks to the properties of Merkle proofs, SPV allows a transaction recipient to prove that the sender has control of the source funds of the payment they are offering without downloading the entire blockchain. Merchants can choose to use SPV rather than waiting for a transaction to be confirmed at least six times before concluding the deal. Where the value of small transactions poses a relatively small risk to the merchant, it is more cost-effective for merchants to accept the SPV than to run their own Bitcoin node. Because the block header chain that underpins SPV is so small* and grows at such a nominal rate (about 4MB per year), a business can use simple hardware to generate it at a low cost instead of running their own network node. * A wallet can store all necessary block headers in around 50MB – this covers the entire Blockchain (as of January 2020, with 80 bytes per block and approximately 620,000 blocks in the chain). The total grows linearly at about 4MB per year (i.e., it increases by 80 bytes with each block mined, regardless of the size of that block). Contrast this with the hundreds of gigabytes required to store the entire chain if SPV were not used. Read more |
TS 2020.010-31
Standard fact sheetMerkle proof standardised format
Background
The Merkle proof is fundamental to the Simplified Payment Verification (SPV) model that underpins bitcoin scaling. For SPV wallets to make use of them, they need to be provided to wallets by miners or other actors who download full blocks. As there is likely to be a wide variety of implementations transmitting and receiving Merkle proofs for various reasons, it simplifies implementations if a standard format is adopted.
Problem Statement
Current prior art is embodied in FilteredBlock Peer to Peer (p2p) messages which make use of a bitvector and a list of hashes. This arrangement is unnecessarily confusing, requiring validation of the proof in reverse, and does not provide an easy mechanism to calculate the positional index of a transaction in the Merkle tree. Additionally, this p2p message is likely to be deprecated in favour of other mechanisms due to the inherent scalability problems of using bloom filters for SPV clients.
To determine the validity of the proof, we will provide a description of the following:
- a binary and JSON form of the proposed data structure
- the algorithm
Methods and Concepts
Composite proofs
It is possible to provide multiple proofs in a consolidated form using forward-compatible extensions. In the case where two transactions are very close to each other in a block, many of the provided hash nodes are likely to be duplicated. In some cases, it may be more efficient to provide the base layer of a Merkle sub-tree with a branch connecting the root of that sub-tree to the root of the complete tree. It is arguable that a more efficient consolidated format should be provided that enables the normalization of this duplicate data.
We note the following points with respect to this design decision:
- The algorithm for validating composite proofs is significantly more complex.
- It is expected that most of the use cases will be single proofs.
- The current expectation is that most transfers will occur in the context of JSON over REST where it is expected that GZIP compression will be enabled by default.
- Tests on heavily duplicated proofs have shown negligible differences in sizes after compression for de-duplicated composite data structures and structure where multiple simple proofs are included.
GZIP tests
In this example of a Merkle proof for a 1 terabyte block with an expected 4 billion transactions and an approximate tree depth of 32 layers, we show the resultant size for two proofs where 16 layers overlap.
hashes: 32 * 32 bytes hashes as JSONArray, hashesDup: hashes + hashes
original bytes len - hashes: 1024, hashesDup: 2048
raw bytes len - hashes: 2178, hashesDup: 4354 // JSON stringified
gzip bytes len - hashes: 1241, hashesDup: 1274 // JSON stringified then gzipped
As we can see, the difference between the duplicate hashes after GZIP, compared to a single list of non-duplicated hashes, is negligible.
Scope
Simple proofs only
When we take the above into account and note that even in some cases where a small number of proofs are required together that multiple instances of the simple proof are negligibly larger than a single consolidated proof, this specification only addresses the simple case. It does, however, make some design decisions with a view to being able to accommodate a composite proof using the same format with extensions.
Note: The details of the composite proof will be provided as a separate standards document.
Conventions
In this document, we will describe Merkle trees in the context that they are used in Bitcoin, namely, as a tree formed from the list of transaction IDs included in a Bitcoin block, with a Merkle root for that tree included in the block header.
We refer to the layers of the Merkle tree from the bottom up, with the layer containing transaction IDs as layer 0.
We index transactions in the tree starting from layer 0, which includes the coinbase transaction. In the example below, the element is the 5th in the tree with an index of 4.
By convention, we designate the nodes c
and p
with a layer index, i.e., c[n]
such that c[0]
is the calculated node in layer 0
. Nodes denoted by o
represent nodes that are unnecessary or that can be excluded from the proof.
c = calculated
p = provided in proof
root is both calculated and provided from block header
tx hash in layer 0 is calculated from transaction
4: root
/ \
3: c p
/ \ / \
2: p c o o
/ \ / \ / \ / \
1: o o c p o o o o
/\ /\ /\ /\ /\ /\ /\ /\
0: o o o o c p o o o o o o o o o o
^
|
tx
Execution algorithm
The result of the algorithm proves that a given transaction resides in a given block — specified by its block header. We do not address the validity of the block header itself in this standard but check whether the proof of inclusion is valid for a given header.
By convention we use:
n
to designate the layer of the tree beginning atn == 0
for the bottom layer containing transaction IDs.i
to designate the index of a node in its layer such that the first element in layern
is ati == 0
.
Beginning with the candidate transaction:
- Calculate the transaction ID:
c[0] = sha256d(transaction)
. - If no pair is provided, we assume that
c[0]
is the Merkle root and compare it to the root provided in the block header. - If a pair (
p[0]
) is provided, we must concatenate and hashc[0]
andp[0]
together to obtain the parent node (c[1]
) in the next layer up.- The order is determined by the positional index of
c[0]
. - If
indexOf(c[0]) mod 2 == 0
it is an even indexed node in the layer and is a left hand node.- Thus:
c[1] = sha256d(concat(c[0], p[0]))
.
- Thus:
- Otherwise, it is a right-hand node.
- Thus:
c[1] = sha256d(concat(p[0], c[0])).
- Thus:
- The order is determined by the positional index of
- Continue this process for each layer (
n
) of the Merkle tree:- If
indexOf(c[n]) mod 2 == 0
it is an even indexed node in the layer and is a left hand node.- Thus:
c[n + 1] = sha256d(concat(c[n], p[n]))
.
- Thus:
- Otherwise, it is a right-hand node.
- Thus:
c[n + 1] = sha256d(concat(p[n], c[n]))
.
- Thus:
- If
- Upon reaching the final layer, we have calculated the Merkle root and compared it to ensure that it is equal to the provided Merkle root, usually obtained from a Bitcoin block header.
- If the calculated Merkle root matches the expected Merkle root, we have proven that the transaction ID
c[0]
is included in the Merkle tree designated by the provided Merkle root. By extension, if the Merkle root is contained in a block header with a valid proof of work for a chain of blocks we accept as the longest, then we have proven that the transaction is included in that blockchain.
Note that in some cases, the provided pair will be identical to the calculated pair. This is expected behaviour when the transaction exists near the right-hand edge of the Merkle tree and a corresponding right-hand element does not exist. However, if the elements are equal, it should be confirmed that the calculated element c[n]
is a left hand node, i.e., indexOf(c[n]) mod 2 == 0
.
The alternative condition cannot happen in a Bitcoin Merkle tree. This check guards against the case where the node is near the right-hand edge of an incomplete tree where the Merkle proof for a left and right-hand node can be the same. This check ensures that the proof fails if the index is not correctly given.
Depth attacks
We note that step 1
is a vital part of the proof process. Without validating that the SHA256d (double SHA256) hash is actually derived from a bitcoin transaction, it is possible to send a partial Merkle proof for any node higher up in the tree that will validate correctly.
Required data
We can see from the above description that the following data is required to calculate the Merkle proof:
- The transaction that is being proven is included.
- The Merkle root of the tree that it is included in, such that it can be compared to the calculated root.
- A list of pairs
[p[0], ... , p[n]]
to the calculated nodes. - Whether each pair is a left or right-hand node.
- Note that this can be explicitly provided for each pair, or calculated from the index of the transaction ID in the Merkle tree.
Extension to proof of positional index
We note that if providing the positional index of c[0]
is the mechanism used to determine the left/right ordering that if the proof is successful it will also hold as proof of the positional index in the tree with the following additional check, as long as this check never fails:
if (indexOf(c[n]) mod 2 == 1) then c[n] != p[n]
Then we can assume that the provided index is correct.
Extension to proof of last element
Extending with an additional check, we can also confirm if c[0]
is the last element of the Merkle tree by exploiting the fact that where no right-hand pair exists, a copy of the left-hand pair is used in its place. The following check will always hold true if the element is last in the tree but will always fail at least once if it is not:
if (indexOf(c[n]) mod 2 == 0) then c[n] == p[n]
;
This effectively serves as a proof of the number of transactions in the block where indexOf(c[0]) == len(layer[0])
Merkle proof data structure
The proposed structure itself generally consists of a flags byte, followed by the positional index of the transaction in the Merkle tree, a list of 32 bytes hashes, and the transaction (or the transaction ID/transaction hash digest). Optionally, special encoding can be used for hashes when they are a copy of the matching calculated node. The structure’s head starts at its bottom and adds elements while traversing up each layer toward the Merkle root. Each parent element of the layer above represents the counterpart pair to the parent element of the layer below. This process is repeated until the Merkle root of the structure is reached.
The Merkle proof data can be structured in Binary or in JSON formats as shown below:
Binary form
varint is defined as a Bitcoin varint which is compact for cases where values in the range of 0 – 253 are commonly expected.
type: byte, // 0 == hash (byte[32]), 1 == duplicate (byte[0]), 2 == index (varint)
value: byte[32] or varint or byte[0] //determined by type
Note: Type 2 is reserved for use in an extension format. The duplicate type could have been omitted, allowing us to eliminate the type byte entirely from this standard. However, its inclusion does have some potential space benefits whilst making the proposed extension standard fully backwards compatible.
flags: byte,
index: varint,
txLength: varint, //omitted if flag bit 0 == 0 as it's a fixed length transaction ID
txOrId: byte[32 or variable length],
target: byte[32 or 80], //determined by flag bits 1 and 2
nodeCount: varint,
nodes: node[]
Endianness
Please note that to stay consistent with the bitcoin node and established patterns, the binary representation is the reverse endian of the hex string representation. Take the block hash representation, for instance. where the JSON form of the block header is shown below:
{
"hash": "62ea2ebe6586c3c8f4b0a17806be932fe73816cd84c0d3ce9fe0976739e6cd46",
"confirmations": 7,
"height": 287,
"version": 536870912,
"versionHex": "20000000",
"merkleroot": "1a1e779cd7dfc59f603b4e88842121001af822b2dc5d3b167ae66152e586a6b0",
"num_tx": 8,
"time": 1614196393,
"mediantime": 1614194799,
"nonce": 1,
"bits": "207fffff",
"difficulty": 4.656542373906925e-10,
"chainwork": "0000000000000000000000000000000000000000000000000000000000000240",
"previousblockhash": "4542f59d5fdcfb4861a5455a75abe133aad5bfb17f9aacd3ba60d0a757fa3c9e",
"nextblockhash": "54c597510e7f9c0844cd40b73cac5f4c9169156001ac92736698ed81f71e0c92"
}
This block header in binary form is represented as:
000000209e3cfa57a7d060bad3ac9a7fb1bfd5aa33e1ab755a45a56148fbdc5f9df54245b0a686e55261e67a163b5ddcb222f81a00212184884e3b609fc5dfd79c771e1aa9ae3660ffff7f2001000000
Remember that the block header decoding is done as follows: first 4 bytes for the version, next 32 bytes for the previous block hash, the next 32 bytes for the Merkle Root, the next 4 bytes for the time, the next 4 bytes for the nBits, and the final 4 bytes for the nonce. Note the difference between the 32-byte slice below and the representation in the JSON above.
00000020
9e3cfa57a7d060bad3ac9a7fb1bfd5aa33e1ab755a45a56148fbdc5f9df54245
b0a686e55261e67a163b5ddcb222f81a00212184884e3b609fc5dfd79c771e1a
a9ae3660
ffff7f20
01000000
Flags
There are several choices to be made in a Merkle proof format.
- Transaction of TxId: Whether to include the original transaction given that in most cases it will already be known, or just the transaction ID.
- Target type: Whether the final element should be the Merkle root, the entire block header, the block hash of the block it contains or is omitted completely. In some cases, the Merkle proof may be provided in the context of an envelope data structure (for example. an SPV envelope) that already contains the missing elements.
- Proof type: Whether it is a branch or a tree. The tree option is only used in an extension to this standard but the flag bit is reserved for that purpose.
- Composite proof: Whether this proof is part of a set of proofs and contains references to other proofs in the set. This option is only used in an extension to this standard but the flag bit is reserved for that purpose.
The following bits are reserved for each purpose, counted from the rightmost (least significant) bit. Although some of the following combinations may not have real use cases, for future-proofing, these values are reserved.
Transaction or TxId (bit 0)
Corresponds to: (flags & 0x01)
Where value equals:
0: The `txOrId` field contains a transaction ID
1: The `txOrId` field contains a full transaction
Target type (bits 1 and 2):
Corresponds to: (flags & (0x04 | 0x02)) >> 1
Where value equals:
0: The `target` field contains a block hash
2: The `target` field contains a block header
4: the `target` field contains a merkle root
6: not valid
Proof type (bit 3):
Corresponds to: (flags & 0x08) >> 3
Where value equals:
0: Merkle branch (or sub-branch in the extension format)
1: Merkle tree (or sub-tree in the extension format)
Composite proof (bit 4):
Corresponds to: (flags & 0x10) >> 4
Where value equals:
0: Single proof
1: Composite proof
Defaults
Note that default values have been selected in such a way that a flag value of 0 means:
- The
txOrId
field contains a transaction ID. - The
target
field contains a block hash. - The proof is a Merkle branch.
- The proof is a single proof that can be validated using the simplest algorithm.
JSON form
In the JSON form of this data structure, the flags byte is omitted for simplicity and uses type fields (such as the targetType
or the proofType
) instead for types than cannot be easily inferred like the txOrId
which will be exactly 64 characters for a 32 byte txId and strictly longer than that for the full transaction hex string. The node elements are also broken up into an array of hex-string values (without a 0x
prefix).
// simple version where only a single path is specified
{
// mandatory fields:
"index": 4,
"txOrId": "bytes", // hex bytes, 32 if ID, otherwise interpret as full transaction
"target": "merkleroot_or_blockhash_or_blockheader", // which one is specified by targetType
// contains the provided pair nodes to the calculated nodes in the proof.
// "hash" is a simple hash node
// integer is an index to a subtree or subpath - only used in extension
// "*" is a copy of the calculated node e.g. parent = hash (left | left)
"nodes": ["hash", "hash", "*", ... "hash"],
// optional fields:
// can be one of: 'hash' or 'header' or 'merkleRoot'
"targetType": "header", // defaults to 'hash' if not specified
// can be one of: 'branch' or 'tree'
"proofType": "branch", // defaults to 'branch' if not specified
// can be one of: true or false
"composite": false // defaults to false if not specified
}
Nodes and duplicated indexes
The standard specifies that hashes and potentially block headers are encoded as hex strings. We use the special string "*"
as a marker to indicate that hash is a duplicated node where p[n]
hash should be assumed to be a copy of c[n]
. We choose this special marker string rather than null for the following reasons:
- The encoding is shorter than encoding
null
but slightly longer than encoding an empty string. - To avoid ambiguity in error cases, it is common for software bugs to produce unexpected
null
or empty string results. By makingnull
an unexpected result, these errors are easier to detect.
Index and left/right determination
A complete Merkle proof requires not only the transaction you are proving, but also the transaction ID of its pair. If the transaction index in layer 0 of the tree is i
then where i mod 2 == 0
the pair will be at index i + 1
. Where i mod 2 == 1
the pair will be at index i - 1
. This ordering is required to be maintained when hashing the left/right nodes to calculate the parent. It could optionally also be provided as a flag for each provided node in the tree.
We have chosen to use the positional index rather than left/right flags for the following reasons:
- It simplifies the data structure to include only a single integer.
- The index in the block may be useful in some cases.
- The index in the block can be proven as part of the Merkle proof.
- Calculation of the leftness or rightness of a node along with the indexes of parent nodes during proof execution is trivially simple.
- Whilst it is possible to calculate the positional index using left/right flags, the calculation is more complex and even in the best cases (assuming 1TB blocks) requires just as much data to be included as is required for the positional index.
Nodes
The nodes
field is an array data structure that contains all of the nodes that must be provided as pairs to the calculated nodes. It does NOT include the transaction ID being proven (assumed to be calculated from the transaction) OR the Merkle root (calculated as part of the proof).
Both the binary and the JSON formats specify a mechanism for designating elements of the path that are duplicates, i.e., where the provided element p[n]
should be assumed to be a copy of the calculated element c[n]
for that layer of the tree.
Merkle proof validation
Example code for Merkle Proof validation in both JSON as well as binary formats has been written according to this specification. The JavaScript code is available here:https://github.com/bitcoin-sv-specs/merkle-proof-standard-example. The Golang code is available here: https://github.com/libsv/go-bc/blob/master/verifymerkleproof.go.
Example Merkle Proof
Binary Format
000cef65a4611570303539143dabd6aa64dbd0f41ed89074406dc0e7cd251cf1efff69f17b44cfe9c2a23285168fe05084e1254daa5305311ed8cd95b19ea6b0ed7505008e66d81026ddb2dae0bd88082632790fc6921b299ca798088bef5325a607efb9004d104f378654a25e35dbd6a539505a1e3ddbba7f92420414387bb5b12fc1c10f00472581a20a043cee55edee1c65dd6677e09903f22992062d8fd4b8d55de7b060006fcc978b3f999a3dbb85a6ae55edc06dd9a30855a030b450206c3646dadbd8c000423ab0273c2572880cdc0030034c72ec300ec9dd7bbc7d3f948a9d41b3621e39
JSON Format
{
"index": 12,
"txOrId": "ffeff11c25cde7c06d407490d81ef4d0db64aad6ab3d14393530701561a465ef",
"target": "75edb0a69eb195cdd81e310553aa4d25e18450e08f168532a2c2e9cf447bf169",
"nodes": [
"b9ef07a62553ef8b0898a79c291b92c60f7932260888bde0dab2dd2610d8668e",
"0fc1c12fb1b57b38140442927fbadb3d1e5a5039a5d6db355ea25486374f104d",
"60b0e75dd5b8d48f2d069229f20399e07766dd651ceeed55ee3c040aa2812547",
"c0d8dbda46366c2050b430a05508a3d96dc0ed55aea685bb3d9a993f8b97cc6f",
"391e62b3419d8a943f7dbc7bddc90e30ec724c033000dc0c8872253c27b03a42"
]
}
Relationships
Relationship | Description |
---|---|
IP licences and dependencies | nChain Patent Pledge for Merkle Proof Standardized Format Specification |
Copyright | © 2021 Bitcoin Association for BSV. All rights reserved. Unless otherwise specified, or required in the context of its implementation on BSV Blockchain, no part of this standard may be reproduced or utilised otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission of Bitcoin Association. |
Previous versions | n/a |
Extends (existing standards extended by this standard) | n/a |
Modifies (existing standards changed by this standard) | n/a |
Deprecates (existing standards obsoleted by this standard) | n/a |
Depends on (existing standards that are foundational or prerequisite to this standard) | n/a |
Prior Art | n/a |
Existing Solution | n/a |
References | n/a |
Supplementary Material
Summary of comments received during both internal and public review
Generally, there was a weight of opinion that the common use case should not be unduly burdened with complexity that is only required for the corner cases. In order to address this concern changes have been made, primarily to the JSON use case, such that using default options results in a much simpler JSON object. This retains the possibility of using the alternate options without breaking the standard, whilst hiding the optional typing fields in the default case.
Should the API calls also be defined? If so, what would be the capability ID to include the API in well-known?
Use of the data structure in API’s is out of scope for this standard as it is specifically about the data structure itself. Introducing API’s to the scope would become necessarily proscriptive as we’d have make decisions about REST vs JSON-RPC vs GRPC. This can and should be addressed in separate standards that reference this one.
What benefits in supporting rawtx and txid, as well as multi formats of merkleroot? Will developers be struggling deciding which format to use?
Txid would be usually, however it offers the option to combine transaction and merkle proof in a single data structure. This is not the specified default however it is allowed in order to avoid an entirely new standard being required in the edge case.
Reference Implementations
Bitcoin SV Node software
https://www.bitcoinsv.io/node
Electrum SV
https://electrumsv.io/
Electrum X
https://electrumx.readthedocs.io/en/latest/
To record an implementation of the standard, please register below.