The Envelope specification standard in a nutshell
One of the most valuable features of Bitcoin is the ability to immutably record data on a time-stamped public ledger. This provides the ability to audit and prove the existence of data of all types, from financial transaction records to book manuscripts, mathematical formulae and even digital works of art. The problem is that data files come in many different formats, each requiring compatible software to decode, open and process. This means that if you were trying to find a specific type of data on the blockchain, you would have to go through the process of accessing all of the irrelevant data files to find the ones you were looking for. The TSC envelope specification proposes that each data record is wrapped in a standard ‘envelope’ or ‘wrapper’ protocol that allows software to easily determine how the data is formatted and how to either process it or display it to a user.
TS 2020.011-30
Envelope Specification
Standard fact sheet
Attribute | Description |
Version | 1.0 |
Authors | Curtis Ellis |
Reviewers | Attila Aros, Jonathan Aird, Jack Davies, Dylan Murray, Jaime Salom Viñado, Mathias Wulff, Roger Taylor |
Acknowledgements | The Authors would particularly like to thank the following people for their contributions to and comments on the standard document: Derek Moore, Steve Shadders |
Tags and Categories | On-chain data, Data and token interoperability |
Publication Date | September 3, 2021 |
Valid Until | |
Copyright | © 2021 Bitcoin Association for BSV. All rights reserved. |
IP Generation | No intellectual property has been sought for this standard |
Known Implementations | Tokenized Envelope System |
Applies To | Wallets will need to use this standard to read/write transactions containing it. Data services will need to use this standard to filter transactions based on data within it. |
BRFC ID | |
Status | Published |
Visibility | Public |
Background
Currently, data stored within a Bitcoin SV transaction typically utilizes a standard unspendable output that begins with OP_FALSE OP_RETURN. The first push data after that is generally the protocol identifier which specifies the format/encoding of the subsequent data in the output script.
This standard aims to improve the way data is stored within Bitcoin transactions by creating a general purpose data envelope that allows for efficient identification, structuring, and interoperability (or layering) of 1 or more data protocols. We will refer to these other protocols as sub-protocols for the purposes of this document.
Problem Statement
When a software application sees an output script that begins with OP_FALSE OP_RETURN, it knows that it is provably unspendable and likely contains data that has meaning outside of the Bitcoin protocol. Currently, it is difficult for blockchain indexing software to determine the encoding and formatting of the data because there are many different protocols already in use, and there are no constraints/control over what, or how many protocols will be stored within a Bitcoin transaction. In other words, interacting software is unable to quickly determine what it knows or does not know about the data. This uncertainty can add systemic costs to IT infrastructure that is interested in some or all of the data.
Objectives
This standard aims to provide a framework for specifying and combining different protocols embedded in Bitcoin transactions. It also aims to be agnostic to the way the data is stored within Script, and equally supportive of all types of protocols and protocol identifiers.
The protocol needs to be as simple and lightweight as possible to enable easy integration with various services. It also needs to be efficient in processing speed, to support high throughput systems, and in storage footprint, to keep mining fees as low as possible.
The Envelope protocol also aims to provide a framework to allow for the interoperability of sub-protocols. For example, if you define a data format protocol, but want to support encryption or Metanet, you can simply use this specification to combine those protocols on top of your protocol without your protocol having to know anything about Metanet or encryption. This helps on both sides. It helps the protocol developers to concentrate on the specific functionality they want to provide, and it also helps software developers to support more functionality by not having to implement specific support for combinations of functionality like Metanet, encryption, compression, and many other features for each protocol.
Scope
The Envelope protocol is focused on everything required to provide a common system for identifying all protocol(s) used to encode on-chain data embedded in scripts, as well as providing a framework to allow for interoperability between these protocols.
Out of scope
Protocols used within an Envelope and their meaning are out of scope. The Envelope protocol simply defines the framework that allows for the use, interoperability, and layering of sub-protocols.
Methods and Concepts
Data Location Within Bitcoin Scripts
Envelope data can be provided in any part of the script that is not meant to be executed. This protocol provides some recommended scripts, although it is agnostic to the exact script used.
Recommended
- An unspendable locking script that starts with OP_FALSE OP_RETURN followed by the envelope data. This form is the most easily identifiable as unspendable by a parser.
- A spendable locking script with OP_RETURN after it, followed by the envelope data. This form is also fairly easily identified by a parser, though it requires parsing through spendable scripts to look for OP_RETURNs. Note that common scripts that end with an OP_VERIFY variant of an op code, like OP_CHECKSIGVERIFY, will not work properly, because OP_VERIFY consumes the top stack item and exits the script if it fails. If it doesn’t fail, the script continues. An OP_RETURN following it will then fail if there is nothing on the stack, or it will proceed on the basis of the next stack item (not the result of the previous operation, which was consumed by OP_VERIFY), which probably isn’t the one you expected. This can cause a script that should have succeeded to fail. For example, if the last op code is OP_CHECKSIGVERIFY and the signature check passes, the script will exit successfully as that’s the last op code. If you add OP_RETURN after it, the script will fail even with a correct signature. This problem is solved by replacing OP_CHECKSIGVERIFY with OP_CHECKSIG, such that OP_RETURN will exit the script and succeed or fail based on the output of the OP_CHECKSIG.
Push data
Push data in this document refers to Bitcoin script push operations that, when in executable code, would cause a value to be pushed to the stack. They consist of an op code that specifies the size of the data or the method of specifying the size of the data, followed by the data.
For data sizes up to 75 bytes (hex 0x4b), a single byte op code is provided that is a byte containing the size. So, for a push data containing 32 bytes of data, you write the byte containing the value 32 (0x20) followed by the 32 bytes of data.
For longer data sizes, there are:
- OP_PUSHDATA1 (0x4c)
- OP_PUSHDATA2 (0x4d)
- OP_PUSHDATA4 (0x4e)
Each of those op codes is followed by the specified number of bytes (1, 2, or 4) containing a little endian integer representing the length of the data, followed by the data.
Encapsulating data in push data elements is particularly important in spendable outputs. Even unexecuted parts of a script must pass the grammar check (i.e. OP_IF-like op codes must have a matching OP_ENDIF). It is possible to put any bytes after an OP_RETURN in an unspendable output, but only because that output is intended to be unspendable. In a spendable output, if those bytes happen to contain 0x63 (OP_IF) that is not followed by 0x68 (OP_ENDIF), the script would fail the grammar check and be unspendable even if the spending script is executed successfully.
Examples
Single byte push op (up to 75 bytes)
Script: 0x080102030405060708
The first byte is an op code that tells the bitcoin script interpreter to push the next 8 bytes of the script onto the stack.
OP_PUSHDATA1
Script: 0x4c640102030405…
The first byte, 0x4c, is the value for OP_PUSHDATA1, which tells the interpreter that the next byte specifies the number of bytes to read from the script and push onto the stack. 0x64 is the value 100, telling the interpreter to push the next 100 bytes of the script onto the stack.
OP_PUSHDATA2
Script: 0x4d15320102030405…
The first byte, 0x4d, is the value for OP_PUSHDATA2, which tells the interpreter that the next 2 bytes, using little endian, specify the number of bytes to read from the script and push onto the stack. 0x1532 is the little endian representation of the number 12,821, telling the interpreter to push the next 12,821 bytes of the script onto the stack.
OP_PUSHDATA4
Script: 0x4e153247000102030405…
The first byte, 0x4e, is the value for OP_PUSHDATA4, which tells the interpreter that the next 4 bytes, using little endian, specify the number of bytes to read from the script and push onto the stack. 0x15324700 is the little endian representation of the number 4,665,877, telling the interpreter to push the next 4,665,877 bytes of the script onto the stack.
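The push data rules above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name encode_push_data is an assumption and is not defined by the standard. It picks the smallest push operation that fits the data.

```python
import struct


def encode_push_data(data: bytes) -> bytes:
    """Wrap raw bytes in the smallest Bitcoin script push operation.

    Hypothetical helper for illustration; not part of the standard.
    """
    size = len(data)
    if size <= 0x4b:
        # Single byte op code whose value is the data length (0x00..0x4b).
        return bytes([size]) + data
    if size <= 0xff:
        # OP_PUSHDATA1: one length byte follows the op code.
        return b"\x4c" + struct.pack("<B", size) + data
    if size <= 0xffff:
        # OP_PUSHDATA2: two little endian length bytes follow the op code.
        return b"\x4d" + struct.pack("<H", size) + data
    if size <= 0xffffffff:
        # OP_PUSHDATA4: four little endian length bytes follow the op code.
        return b"\x4e" + struct.pack("<I", size) + data
    raise ValueError("data too large for a single push operation")


# Reproduces the prefixes used in the examples above.
print(encode_push_data(bytes(range(1, 9))).hex())  # 080102030405060708
print(encode_push_data(bytes(100))[:2].hex())      # 4c64
print(encode_push_data(bytes(12821))[:3].hex())    # 4d1532
```

Only the length prefix differs between the four forms; the data bytes are always copied unchanged after it.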
Numbers
Numbers used to specify the number of protocol identifiers and the number of push datas will use standard Bitcoin “script” format.
For numbers 1 through 16, use op codes OP_1 through OP_16 (hex byte values 0x51 through 0x60). A push data is used for larger numbers. The push data’s data is a little endian number, where the first byte is least significant and the last byte is the most significant. The highest bit of the most significant byte (the last byte) is a sign bit: if it is set then the number is negative, and the remaining bits hold the magnitude. To represent a positive number whose highest bit would otherwise be set, like 128 (0x80), a zero byte is added as the most significant byte (the last byte). For the purposes of this protocol, only positive numbers are valid, since they are counts of protocol identifiers or push datas, and so the only special case is adding a zero byte when the last byte has its most significant bit set. Otherwise, trailing zero bytes are removed when encoding.
Examples
Less than or equal to 16
To specify 3 protocol IDs or push datas, simply use the specific op code OP_3.
Script: OP_3
Greater than 16
To specify 25 protocol IDs or push datas, a push data must be used to push the value 25 onto the stack.
Script: 0x0119
0x01 tells the interpreter to push 1 byte to the stack. The hex value 0x19 equals 25 (in decimal).
Highest bit set
To specify 128 protocol IDs or push datas, a push data and an extra zero byte are needed to distinguish it from a negative number. This is because Bitcoin script numbers are signed, and the highest bit of the last byte is the sign bit, so that bit is set only when the value is negative.
Script: 0x028000
0x02 tells the interpreter to push 2 bytes to the stack. The byte 0x80 would represent the value 128 if it were an unsigned integer, but as a one-byte script number its only set bit is the sign bit, so it would be read as negative zero rather than 128. To prevent this, a zero byte 0x00 must be added to the end. To clarify, the interpreter would not interpret 0x0180 as 128.
Values higher than 8 bits
To specify values over 255, more than 1 value byte using little endian is required.
Script: 0x021532
0x02 tells the interpreter to push 2 bytes to the stack. The bytes 0x1532 in little endian represent the value 12,821.
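The number rules above can be checked with a small Python sketch. The names encode_count and decode_script_number are assumptions made for illustration, and rejecting values below 1 is a choice of this sketch based on the statement that only positive numbers are valid.

```python
def encode_count(n: int) -> bytes:
    """Encode a positive count as script bytes (hypothetical helper)."""
    if n < 1:
        raise ValueError("counts must be positive")
    if n <= 16:
        return bytes([0x50 + n])  # OP_1 (0x51) through OP_16 (0x60)
    # Minimal little endian magnitude, least significant byte first.
    magnitude = n.to_bytes((n.bit_length() + 7) // 8, "little")
    if magnitude[-1] & 0x80:
        # Highest bit is the sign bit, so pad with a zero byte to stay positive.
        magnitude += b"\x00"
    if len(magnitude) > 0x4b:
        raise ValueError("count too large for this sketch")
    return bytes([len(magnitude)]) + magnitude


def decode_script_number(payload: bytes) -> int:
    """Decode the data of a push as a signed script number."""
    if not payload:
        return 0
    value = int.from_bytes(payload, "little")
    sign_bit = 0x80 << (8 * (len(payload) - 1))
    if value & sign_bit:
        return -(value & ~sign_bit)  # clear the sign bit, negate the rest
    return value


print(encode_count(3).hex())      # 53 (OP_3)
print(encode_count(25).hex())     # 0119
print(encode_count(128).hex())    # 028000
print(encode_count(12821).hex())  # 021532
```

Decoding the two-byte data 0x8000 gives 128, while the single byte 0x80 decodes to zero, which is why the padding byte is required.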
Non-executable Data
Data within a Bitcoin script that is provably not executed by the interpreter:
- After an OP_RETURN.
- Within an OP_FALSE OP_IF … OP_ENDIF. Note that the script within here must be grammatically correct or the script won’t be spendable, and it must not contain any mismatched IF op codes or the OP_ENDIF will not be recognized. There should only be push datas within the IF op codes.
- Removed by an OP_DROP after being pushed.
Sub-protocol
Sub-protocol is used to refer to protocols that are used within the Envelope protocol and referenced by the Envelope protocol.
Specification
The data identified as non-executable by the methods above is in the following format:
- It starts with a push data that contains the value 0xbd01 to identify the Envelope protocol version 1. The first byte is the envelope protocol and the second is the version. A push data op code to push 2 bytes is 0x02, so the first piece of data should be 0x02bd01.
This is followed by 1 or more self-contained sections of data as shown below:
Section
Envelope data is divided into sections to allow for multiple sets of sub-protocols to be included. Each section of data starts with an Envelope section header that provides information about the data in that section. The Envelope section header is as follows:
- A number that specifies the number of protocol identifiers (e.g. OP_1).
- A push data for each protocol identifier being used. A protocol identifier can be any unique data wrapped in a push data. The order specifies the order used to decode, and the reverse should be used to encode. For example, the first might be an encryption protocol, and the next, a data format protocol. When the data is written, it is first formatted according to the data format and then encrypted. When read, it is first decrypted and then read in the data format.
- A number that specifies the number of push datas (or op codes) that are encoded using the specified protocols.
- After the specified number of push data, if the non-executable part of the script still has data remaining, then it is assumed that a new envelope section will start with a new section header including a new set of protocol identifiers.
Name | Type | Note |
Protocol Count | Script Number (OP_1, …) | |
Protocol Identifiers | push data | Repeats “Protocol Count” times |
Push Data Count | Script Number (OP_1, …) | |
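The section header fields listed in the table above can be assembled mechanically. The following Python sketch is one possible way to do it. All helper names (push, count, build_section, build_envelope) are assumptions for illustration, and it assumes the recommended OP_FALSE OP_RETURN data location and payloads of at most 65,535 bytes.

```python
OP_FALSE, OP_RETURN = 0x00, 0x6a
ENVELOPE_ID = bytes.fromhex("bd01")  # envelope protocol 0xbd, version 1


def push(data: bytes) -> bytes:
    """Wrap bytes in a push operation (sketch: up to 65,535 bytes)."""
    size = len(data)
    if size <= 0x4b:
        return bytes([size]) + data
    if size <= 0xff:
        return b"\x4c" + bytes([size]) + data
    if size <= 0xffff:
        return b"\x4d" + size.to_bytes(2, "little") + data
    raise ValueError("payload too large for this sketch")


def count(n: int) -> bytes:
    """Encode a positive count as a script number."""
    if n < 1:
        raise ValueError("counts must be positive")
    if n <= 16:
        return bytes([0x50 + n])  # OP_1 .. OP_16
    raw = n.to_bytes((n.bit_length() + 7) // 8, "little")
    if raw[-1] & 0x80:
        raw += b"\x00"  # keep the number positive
    return push(raw)


def build_section(protocol_ids, payloads) -> bytes:
    """Protocol Count, Protocol Identifiers, Push Data Count, then the data."""
    out = count(len(protocol_ids))
    for protocol_id in protocol_ids:
        out += push(protocol_id)
    out += count(len(payloads))
    for payload in payloads:
        out += push(payload)
    return out


def build_envelope(sections) -> bytes:
    """Prefix the sections with OP_FALSE OP_RETURN and the envelope ID."""
    script = bytes([OP_FALSE, OP_RETURN]) + push(ENVELOPE_ID)
    for protocol_ids, payloads in sections:
        script += build_section(protocol_ids, payloads)
    return script


script = build_envelope([([b"proto1"], [b"some proto1 data"])])
print(script[:7].hex())  # 006a02bd015106
```

The protocol identifier proto1 is a hypothetical placeholder. A second tuple in the list passed to build_envelope would be appended as a further section with its own header.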
Sub-protocols are processed in a specific order so that it is clear which push data applies to each. A sub-protocol processes input data either directly from the bitcoin script or from the output of the previous sub-protocol. A sub-protocol can either output data to the next sub-protocol or output context relevant information like an image or document.
For example, if the first protocol specified is for encryption then its input is the bitcoin script data. The first push data can be a header that defines how the data following it is to be decrypted. Then the next protocol can take the decrypted output from the first sub-protocol and create an image or document from it.
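The ordering rule (decode in the listed order, encode in the reverse order) can be illustrated with a toy Python sketch. The identifiers ZLIB and JSON are hypothetical sub-protocols invented for this example, with compression standing in for a transforming layer such as encryption. The sketch also simplifies by giving each layer a single push data, whereas a real sub-protocol may use its own header push data.

```python
import json
import zlib


def json_encode(value) -> bytes:
    return json.dumps(value).encode("utf-8")


def json_decode(raw: bytes):
    return json.loads(raw.decode("utf-8"))


# Hypothetical registry: protocol identifier -> (encode, decode).
LAYERS = {
    b"ZLIB": (zlib.compress, zlib.decompress),
    b"JSON": (json_encode, json_decode),
}


def encode_layers(protocol_ids, value) -> bytes:
    """Apply the sub-protocols in reverse order when writing."""
    for protocol_id in reversed(protocol_ids):
        value = LAYERS[protocol_id][0](value)
    return value


def decode_layers(protocol_ids, payload: bytes):
    """Apply the sub-protocols in the listed order when reading."""
    value = payload
    for protocol_id in protocol_ids:
        value = LAYERS[protocol_id][1](value)
    return value


ids = [b"ZLIB", b"JSON"]  # order as it would appear in the section header
payload = encode_layers(ids, {"kind": "note", "n": 1})
print(decode_layers(ids, payload))  # {'kind': 'note', 'n': 1}
```

Because the transforming layer is listed first, a reader that does not know the JSON identifier can still recognise the section and undo the first layer, which is the interoperability benefit the envelope aims for.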
Examples
The following are possible sub-protocols (for example purposes only).
Simple Single Protocol
OP_FALSE OP_RETURN 0x02bd01 OP_1 0x06 “proto1” OP_1 0x10 “some proto1 data”
Data | Description |
OP_FALSE OP_RETURN | Specifies that the script is unspendable and contains data. |
0x02bd01 | This is the push data containing the Envelope protocol ID that specifies that the following data is in accordance with the Envelope protocol. |
OP_1 | Only 1 protocol is used for the following data. |
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify a hypothetical data protocol. |
OP_1 | Only 1 push data is used for the proto1 protocol data. |
0x10 “some proto1 data” | This push data is the data according to the proto1 protocol. |
Encrypted Data Protocol
OP_FALSE OP_RETURN 0x02bd01 OP_2 0x05 “CRYPT” 0x06 “proto1” OP_2 0x1b0000000100... 0x…
Data | Description |
OP_FALSE OP_RETURN | Specifies that the script is unspendable and contains data. |
0x02bd01 | The push data containing the Envelope protocol ID that specifies the following data is in accordance with the Envelope protocol. |
OP_2 | 2 protocols are used for the following data. |
0x05 “CRYPT” | The 0x05 specifies a push data containing 5 bytes. The bytes are ASCII “CRYPT” and specify the hypothetical encryption protocol. |
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify the hypothetical data protocol. |
OP_2 | There are 2 push datas used by the preceding protocols. |
0x1b0000000100... | A push data containing the header data for the hypothetical “CRYPT” protocol. The specifics are irrelevant for this discussion. |
0x… | A push data containing the proto1 data that is encrypted with CRYPT. |
Encrypted And Unencrypted Data Protocol
OP_FALSE OP_RETURN 0x02bd01 OP_2 0x05 “CRYPT” 0x06 “proto1” OP_2 0x1b0000000100... 0x... OP_1 0x06 “proto1” OP_1 0x10 “some proto1 data”
Data | Description |
OP_FALSE OP_RETURN | Specifies that the script is unspendable and contains data. |
0x02bd01 | This is the push data containing the Envelope protocol ID which specifies that the following data is in accordance with the Envelope protocol. |
OP_2 | 2 protocols are used for the related data. |
0x05 “CRYPT” | The 0x05 specifies a push data containing 5 bytes. The bytes are ASCII “CRYPT” and specify the encryption protocol. |
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify a hypothetical data protocol. |
OP_2 | There are 2 push datas used by the preceding protocols. |
0x1b0000000100... | A push data containing the header data for the hypothetical “CRYPT” protocol. The specifics are irrelevant for this discussion. |
0x… | A push data containing the proto1 data that is encrypted with CRYPT. |
OP_1 | This is the start of a new section since the 2 push datas specified in the previous section header were consumed. OP_1 specifies that only 1 protocol is used for the following data. |
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify a hypothetical data protocol. |
OP_1 | Only one push data is used for the proto1 protocol data. |
0x10 “some proto1 data” | This push data is the data according to the proto1 protocol. |
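Reading an envelope back is the mirror image of the tables above. The Python sketch below walks a script that starts with OP_FALSE OP_RETURN and splits it into sections. The function names are assumptions, it only handles data elements that are push datas (not other op codes), and the CRYPT header and ciphertext bytes in the usage example are made-up placeholders.

```python
def read_push(script: bytes, pos: int):
    """Return (data, next_pos) for the push operation at pos."""
    op = script[pos]
    pos += 1
    if op <= 0x4b:
        size = op
    elif op in (0x4c, 0x4d, 0x4e):
        width = {0x4c: 1, 0x4d: 2, 0x4e: 4}[op]
        size = int.from_bytes(script[pos:pos + width], "little")
        pos += width
    else:
        raise ValueError(f"not a push op code: {op:#04x}")
    end = pos + size
    if end > len(script):
        raise ValueError("truncated push data")
    return script[pos:end], end


def read_count(script: bytes, pos: int):
    """Return (count, next_pos) for OP_1..OP_16 or a pushed script number."""
    op = script[pos]
    if 0x51 <= op <= 0x60:
        return op - 0x50, pos + 1
    data, pos = read_push(script, pos)
    if data and data[-1] & 0x80:
        raise ValueError("negative counts are not valid")
    return int.from_bytes(data, "little"), pos


def parse_envelope(script: bytes):
    """Split an OP_FALSE OP_RETURN envelope into (ids, push datas) sections."""
    if script[:2] != b"\x00\x6a":
        raise ValueError("expected OP_FALSE OP_RETURN")
    envelope_id, pos = read_push(script, 2)
    if envelope_id != b"\xbd\x01":
        raise ValueError("not an Envelope version 1 script")
    sections = []
    while pos < len(script):  # remaining data starts a new section
        id_count, pos = read_count(script, pos)
        ids = []
        for _ in range(id_count):
            protocol_id, pos = read_push(script, pos)
            ids.append(protocol_id)
        push_count, pos = read_count(script, pos)
        pushes = []
        for _ in range(push_count):
            data, pos = read_push(script, pos)
            pushes.append(data)
        sections.append((ids, pushes))
    return sections


script = (
    bytes.fromhex("006a02bd01")
    + b"\x52" + bytes([5]) + b"CRYPT" + bytes([6]) + b"proto1"
    + b"\x52" + bytes([3]) + b"hdr" + bytes([4]) + bytes.fromhex("deadbeef")
    + b"\x51" + bytes([6]) + b"proto1"
    + b"\x51" + bytes([16]) + b"some proto1 data"
)
for ids, pushes in parse_envelope(script):
    print(ids, pushes)
```

An indexer built along these lines can inspect only the protocol identifiers of each section and skip sections it does not understand, without decoding their push datas.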
Layering Diagram
This diagram shows how multiple sub-protocols can be combined via appending or layering depending on the specifics of the sub-protocol. This is an example and the actual data can be in a different place in the script and include different sub-protocols. This Envelope consists of 2 Envelope sections. The first uses the MNET sub-protocol and the second uses the CRYPT and proto1 sub-protocols.
History
Log | Description |
Errata | |
Previous versions | |
Change Log | Protobuf was determined to be too complex for this purpose and so was replaced with simpler bitcoin like binary data encoding. Extensions were determined to be too complex and not agnostic enough for a top level protocol and so Envelope was changed to be completely agnostic but to enable combining of sub-protocols to allow MetaNet, encryption, and other standards to be combined with data formatting protocols. Standard Bitcoin script numbers were determined to be better than a simpler custom number format. |
Decision Log | Extensions (Between versions 0.1 and 0.2): The original protocol contained optional “extensions” for Metanet and encryption. It was decided that the base protocol should be completely agnostic and provide the ability to combine multiple protocols in different ways to enable that type of functionality more dynamically. This was based on feedback from Jaime Salom Viñado, Jonathan Aird, and Jack Davies. Encoding Change (Between versions 0.1 and 0.2): The original protocol was a pushdata for the Envelope protocol ID followed by a pushdata containing Protobuf data. It was decided that a simpler encoding is more appropriate since there should be no extensions, so very few fields and very little data needs to be at the Envelope protocol level. Simpler is better. This was based on feedback from Roger Taylor and Jonathan Aird. Numbers (Between versions 0.2 and 0.3): Version 0.2 had a simplified version of bitcoin script numbers used for protocol id and push data counts because negative numbers are not needed. It was decided to use standard bitcoin script numbers instead so it is more standardized even though they require special rules to handle negative numbers. This was based on feedback from Jonathan Aird and Roger Taylor. |
Relationships
Relationship | Description |
IP licences and dependencies | This standard was created as an independent work under auspices of the Bitcoin SV Technical Standards Committee. Whilst best efforts have been made to ensure that this standard and its implementations do not infringe intellectual property rights of any third party, Bitcoin Association can offer no guarantee relating to third party intellectual property rights. |
Copyright | © 2021 Bitcoin Association for BSV. All rights reserved. Unless otherwise specified, or required in the context of its implementation on BSV Blockchain, no part of this standard may be reproduced or utilised otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission of Bitcoin Association. |
Extends | |
Modifies | |
Deprecates | |
Depends On | |
Prior Art | https://github.com/bitcoin-sv-specs/op_return https://bitcom.bitdb.network/#/ |
Existing Solution | |
References |
Supplementary Material
Summary of comments received during both internal and public review
During review and discussion several points of consensus were reached.
- The protocol should be as simple as possible since it may have many implementations.
- The protocol should be more agnostic and rather than supporting extensions it should be possible to “layer” sub-protocols to enable more advanced usage. This enables sub-protocols to enable data transformation and other functionality without specific protocol support.
- The protocol should not use the Protocol Buffers protocol for encoding protocol level data, but use a more Bitcoin script friendly encoding. This was partially to remove any implementation dependencies and also to make it more Bitcoin developer friendly.
Reference Implementations
Tokenized Envelope System
https://github.com/tokenized/envelope