Envelope Specification
by Curtis Ellis on 03 September 2021
Expiry date
Licence
No intellectual property has been sought for this standard
Applies to
Wallets will need to use this standard to read/write transactions containing it. Data services will need to use this standard to filter transactions based on data within it.
Standard Stage
In Consideration Draft Internal Review Public Review Published Recommended
The Envelope specification standard in a nutshell

One of the most valuable features of Bitcoin is the ability to immutably record data on a time-stamped public ledger. This provides the ability to audit and prove the existence of data of all types, from financial transaction records to book manuscripts, mathematical formulae and even digital works of art. The problem is that data files come in many different formats, each requiring compatible software to decode, open and process. This means that if you were trying to find a specific type of data on the blockchain, you would have to go through the process of accessing all of the irrelevant data files to find the ones you were looking for. The TSC envelope specification proposes that each data record is wrapped in a standard ‘envelope’ or ‘wrapper’ protocol that allows software to easily determine how the data is formatted and how to either process it or display it to a user.

TS 2020.011-30

Envelope Specification

Standard fact sheet
Attribute | Description
Version | 1.0
Authors | Curtis Ellis
Reviewers | Attila Aros, Jonathan Aird, Jack Davies, Dylan Murray, Jaime Salom Viñado, Mathias Wulff, Roger Taylor
Acknowledgements | The Authors would particularly like to thank the following people for their contributions to and comments on the standard document: Derek Moore, Steve Shadders
Tags and Categories | On-chain data, Data and token interoperability
Publication Date | September 3, 2021
Valid Until |
Copyright | © 2021 Bitcoin Association for BSV. All rights reserved.
IP Generation | No intellectual property has been sought for this standard
Known Implementations | Tokenized Envelope System. Notify the TSC if your company has implemented the Standard by using the form in the side column.
Applies To | Wallets will need to use this standard to read/write transactions containing it. Data services will need to use this standard to filter transactions based on data within it.
BRFC ID |
Status | Published
Visibility | Public

Background

Currently, data stored within a Bitcoin SV transaction typically utilizes a standard unspendable output that begins with OP_FALSE OP_RETURN. The first push data after that is generally the protocol identifier which specifies the format/encoding of the subsequent data in the output script.

This standard aims to improve the way data is stored within Bitcoin transactions by creating a general purpose data envelope that allows for efficient identification, structuring, and interoperability (or layering) of 1 or more data protocols. We will refer to these other protocols as sub-protocols for the purposes of this document.

Problem Statement

When a software application sees an output script that begins with OP_FALSE OP_RETURN, it knows that it is provably unspendable and likely contains data that has meaning outside of the Bitcoin protocol. Currently, it is difficult for blockchain indexing software to determine the encoding and formatting of the data because there are many different protocols already in use, and there are no constraints/control over what, or how many protocols will be stored within a Bitcoin transaction. In other words, interacting software is unable to quickly determine what it knows or does not know about the data. This uncertainty can add systemic costs to IT infrastructure that is interested in some or all of the data.

Objectives

This standard aims to provide a framework for specifying and combining different protocols embedded in Bitcoin transactions. It also aims to be agnostic to the way the data is stored within Script, and equally supportive of all types of protocols and protocol identifiers.

The protocol needs to be as simple and lightweight as possible, to enable easy integration with various services. It also needs to be efficient in processing speed, to support high throughput systems, and in storage footprint, to keep mining fees as low as possible.

The Envelope protocol also aims to provide a framework to allow for the interoperability of sub-protocols. For example, if you define a data format protocol, but want to support encryption or Metanet, you can simply use this specification to combine those protocols on top of your protocol without your protocol having to know anything about Metanet or encryption. This helps on both sides. It helps the protocol developers to concentrate on the specific functionality they want to provide, and it also helps software developers to support more functionality by not having to implement specific support for combinations of functionality like Metanet, encryption, compression, and many other features for each protocol.

Scope

The Envelope protocol is focused on everything required to provide a common system for identifying all protocol(s) used to encode on-chain data embedded in scripts, as well as providing a framework to allow for interoperability between these protocols.

Out of scope

Protocols used within an Envelope and their meaning are out of scope. The Envelope protocol simply defines the framework that allows for the use, interoperability, and layering of sub-protocols.

Methods and Concepts

Data Location Within Bitcoin Scripts

Envelope data can be provided in any part of the script that is not meant to be executed. This protocol provides some recommended scripts, although it is agnostic to the exact script used.

  • An unspendable locking script that starts with OP_FALSE OP_RETURN followed by the envelope data is the most easily identifiable as unspendable by a parser.
  • A spendable locking script with OP_RETURN after it, followed by the envelope data, is also fairly easily identified by a parser, though it requires parsing through spendable scripts to look for OP_RETURNs. Note that common scripts that end with an OP_VERIFY variant of an op code, like OP_CHECKSIGVERIFY, will not work properly, because OP_VERIFY consumes the top stack item and exits the script if it fails. If it doesn’t fail, the script continues. An OP_RETURN following it will then either fail, if there is nothing on the stack, or consume the next stack item (not the result of the previous operation, which was eaten by OP_VERIFY) and proceed on the basis of a stack item that probably isn’t the one you expected, potentially causing a script that should have succeeded to fail. For example, if the last op code is OP_CHECKSIGVERIFY and the signature check passes, the script will exit successfully as that’s the last op code. If you add OP_RETURN after it, the script will fail even with a correct signature. This problem is solved by replacing OP_CHECKSIGVERIFY with OP_CHECKSIG, so that OP_RETURN exits the script and it succeeds or fails based on the output of the OP_CHECKSIG.
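As a minimal sketch of the first recommended script form, a parser could recognise an unspendable data output by checking its first two bytes. The function names are hypothetical and not part of the standard; the byte values 0x00 (OP_FALSE) and 0x6a (OP_RETURN) are the usual Bitcoin script op code values.

```python
OP_FALSE = 0x00   # also known as OP_0
OP_RETURN = 0x6A


def is_unspendable_data_output(script: bytes) -> bool:
    """Return True if the locking script starts with OP_FALSE OP_RETURN."""
    return len(script) >= 2 and script[0] == OP_FALSE and script[1] == OP_RETURN


def data_after_prefix(script: bytes) -> bytes:
    """Return the bytes that follow OP_FALSE OP_RETURN (hypothetical helper)."""
    if not is_unspendable_data_output(script):
        raise ValueError("script does not start with OP_FALSE OP_RETURN")
    return script[2:]
```

The spendable form needs a full script parser to locate the OP_RETURN, so it is not covered by this sketch.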

Push data

Push data in this document refers to Bitcoin script push operations that, when in executable code, would cause a value to be pushed to the stack. They consist of an op code that specifies the size of the data or the method of specifying the size of the data, followed by the data.

For data sizes up to 75 bytes (hex 0x4b), a single-byte op code is used whose value is the size itself. So, for a push data containing 32 bytes of data, you write the byte containing the value 32 (0x20) followed by the 32 bytes of data.

For longer data sizes, there are:

  • OP_PUSHDATA1 (0x4c)
  • OP_PUSHDATA2 (0x4d)
  • OP_PUSHDATA4 (0x4e)

Each of those op codes is followed by the specified number of bytes (1, 2, or 4) containing a little endian integer representing the length of the data, followed by the data itself.

Encapsulating data in push data elements is particularly important in spendable outputs. Even unexecuted parts of a script must pass the grammar check (i.e. OP_IF-like op codes must have a matching OP_ENDIF). Whilst it is possible to put any bytes after an OP_RETURN in an unspendable output, this is only because the output is intended to be unspendable. In a spendable output, if those bytes happen to contain 0x63 (OP_IF) without a following 0x68 (OP_ENDIF), the script would fail the grammar check and be unspendable even if the spending script executes successfully.
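The push data rules above can be sketched as a small encoder. This is an illustration under stated assumptions rather than a reference implementation: the function name encode_push_data is hypothetical, and the encoder always picks the smallest op code that can hold the length.

```python
OP_PUSHDATA1 = 0x4C
OP_PUSHDATA2 = 0x4D
OP_PUSHDATA4 = 0x4E


def encode_push_data(data: bytes) -> bytes:
    """Prefix data with the smallest push op code that can hold its length."""
    size = len(data)
    if size <= 75:
        # Single-byte op code whose value is the length itself.
        return bytes([size]) + data
    if size <= 0xFF:
        return bytes([OP_PUSHDATA1]) + size.to_bytes(1, "little") + data
    if size <= 0xFFFF:
        return bytes([OP_PUSHDATA2]) + size.to_bytes(2, "little") + data
    if size <= 0xFFFFFFFF:
        return bytes([OP_PUSHDATA4]) + size.to_bytes(4, "little") + data
    raise ValueError("data is too large for a single push data")


# Example: 8 bytes of data get a one-byte length prefix.
print(encode_push_data(bytes(range(1, 9))).hex())
```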

Examples

Single byte push op (up to 75 bytes)

Script: 0x080102030405060708

The first byte is an op code that tells the bitcoin script interpreter to push the next 8 bytes of the script onto the stack.

OP_PUSHDATA1

Script: 0x4c640102030405…

The first byte 0x4c is the value for OP_PUSHDATA1 that tells the interpreter that the next byte specifies the number of bytes to read from the script and push onto the stack. 0x64 is the value 100 telling the interpreter to push the next 100 bytes of the script onto the stack.

OP_PUSHDATA2

Script: 0x4d15320102030405…

The first byte 0x4d is the value for OP_PUSHDATA2 that tells the interpreter that the next 2 bytes, using little endian, specify the number of bytes to read from the script and push onto the stack. 0x1532 is the little endian representation for the number 12,821 telling the interpreter to push the next 12,821 bytes of the script onto the stack.

OP_PUSHDATA4

Script: 0x4e153247000102030405…

The first byte 0x4e is the value for OP_PUSHDATA4 that tells the interpreter that the next 4 bytes, using little endian, specify the number of bytes to read from the script and push onto the stack. 0x15324700 is the little endian representation for the number 4,665,877 telling the interpreter to push the next 4,665,877 bytes of the script onto the stack.
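The four examples above can be read back with a small decoder. The sketch below is illustrative only: read_push_data is a hypothetical name, and it assumes the caller already knows that a push data starts at the given offset.

```python
from typing import Tuple

# Number of length bytes that follow each OP_PUSHDATAn op code.
LENGTH_WIDTH = {0x4C: 1, 0x4D: 2, 0x4E: 4}


def read_push_data(script: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Read one push data at offset; return (data, offset of the next item)."""
    if offset >= len(script):
        raise ValueError("no op code at this offset")
    opcode = script[offset]
    offset += 1
    if opcode <= 75:
        size = opcode  # the op code is the length
    elif opcode in LENGTH_WIDTH:
        width = LENGTH_WIDTH[opcode]
        if offset + width > len(script):
            raise ValueError("truncated length field")
        size = int.from_bytes(script[offset:offset + width], "little")
        offset += width
    else:
        raise ValueError("op code is not a push data")
    end = offset + size
    if end > len(script):
        raise ValueError("truncated push data")
    return script[offset:end], end


data, next_offset = read_push_data(bytes.fromhex("080102030405060708"))
print(data.hex(), next_offset)
```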

Numbers

Numbers used to specify the number of protocol identifiers and the number of push datas will use standard Bitcoin “script” format.

For numbers 1 through 16, use op codes OP_1 through OP_16 (hex byte values 0x51 through 0x60). A push data is used for larger numbers. The push data’s data is a little endian number, where the first byte is least significant and the last byte is the most significant. The highest bit of the most significant byte (the last byte) is a sign bit: if it is set then the number is negative, and the remaining bits hold the magnitude. If that byte is exactly 0x80 then it carries only the sign and is not used as part of the value. To represent a positive number whose highest bit would otherwise be set, like 128 (0x80), a zero byte is added as the most significant byte (the last byte). For the purposes of this protocol, only positive numbers are valid, since they are counts of protocol identifiers or push datas, so the only special case is adding a zero byte when the last byte has its most significant bit set. Otherwise, trailing zero bytes are removed when encoding.

Examples

Less than or equal to 16

To specify 3 protocol IDs or push datas simply use the specific op code OP_3.

Script: OP_3

Greater than 16

To specify 25 protocol IDs or push datas, a push data must be used to push the value 25 onto the stack.

Script: 0x0119

0x01 tells the interpreter to push 1 byte to the stack. The hex value 0x19 equals 25 (in decimal).

Highest bit set

To specify 128 protocol IDs or push datas, a push data and a special zero byte are needed to distinguish it from a negative number. This is because script numbers use the highest bit of the most significant byte as a sign bit, so a value with that bit set would otherwise be read as negative.

Script: 0x028000

0x02 tells the interpreter to push 2 bytes to the stack. The hex value 0x80 represents the value 128 if it is read as an unsigned integer, but in a script number the highest bit of the last byte is the sign bit, so a lone 0x80 byte is read as negative zero rather than 128. To prevent this, a zero byte 0x00 must be added to the end. To clarify, the interpreter would interpret 0x0180 as negative zero, whereas 0x028000 is 128.

Values higher than 8 bits

To specify values over 255, more than 1 value byte using little endian is required.

Script: 0x021532

0x02 tells the interpreter to push 2 bytes to the stack. The bytes 0x1532 in little endian represent the value 12,821.
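The number rules and examples above can be sketched as an encoder and decoder for counts. The names encode_count and decode_count are hypothetical, and the sketch assumes that a count is always at least 1, since the text only describes positive numbers.

```python
from typing import Tuple


def encode_count(n: int) -> bytes:
    """Encode a positive count as OP_1..OP_16 or as a pushed script number."""
    if n < 1:
        raise ValueError("counts must be positive (assumption of this sketch)")
    if n <= 16:
        return bytes([0x50 + n])  # OP_1 is 0x51, OP_16 is 0x60
    body = n.to_bytes((n.bit_length() + 7) // 8, "little")
    if body[-1] & 0x80:
        body += b"\x00"  # keep the sign bit clear so the value stays positive
    if len(body) > 75:
        raise ValueError("count is too large for this sketch")
    return bytes([len(body)]) + body


def decode_count(script: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode a count at offset; return (value, offset of the next item)."""
    opcode = script[offset]
    if 0x51 <= opcode <= 0x60:
        return opcode - 0x50, offset + 1
    if not 1 <= opcode <= 75:
        raise ValueError("expected OP_1..OP_16 or a short push data")
    body = script[offset + 1:offset + 1 + opcode]
    if len(body) != opcode:
        raise ValueError("truncated number")
    if body[-1] & 0x80:
        raise ValueError("negative numbers are not valid counts")
    return int.from_bytes(body, "little"), offset + 1 + opcode


for value in (3, 25, 128, 12821):
    print(value, encode_count(value).hex())
```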

Non-executable Data

Non-executable data is data within a Bitcoin script that is provably not executed by the interpreter. It can be located:

  • After an OP_RETURN.
  • Within an OP_FALSE OP_IF … OP_ENDIF block. Note that the script within this block must be grammatically correct or the script won’t be spendable, and it must not contain any mismatched IF op codes or the OP_ENDIF will not be recognized. There should only be push datas within the IF op codes.
  • In a push data that is removed by an OP_DROP immediately after being pushed.

Sub-protocol

Sub-protocol is used to refer to protocols that are used within the Envelope protocol and referenced by the Envelope protocol.

Specification

The data identified as non-executable by the methods above is in the following format:

  • It starts with a push data that contains the value 0xbd01 to identify the Envelope protocol, version 1.
  • The first byte (0xbd) identifies the Envelope protocol and the second (0x01) is the version. A push data op code to push 2 bytes is 0x02, so the first piece of data should be 0x02bd01.

This is followed by 1 or more self-contained sections of data as shown below:

Section

Envelope data is divided into sections to allow for multiple sets of sub-protocols to be included. Each section of data starts with an Envelope section header that provides information about the data in that section. The Envelope section header is as follows:

  1. A number that specifies the number of protocol identifiers (e.g. OP_1).
  2. One push data for each protocol identifier being used. A protocol identifier can be any unique data wrapped in a push data. The order specifies the order used to decode, and the reverse should be used to encode. For example, the first might be an encryption protocol, and the next, a data format protocol. When the data is written, it is first formatted according to the data format and then encrypted. When read, it is first decrypted and then read in the data format.
  3. A number that specifies the number of push datas (or op codes) that are encoded using the specified protocols.
  4. After the specified number of push datas, if the non-executable part of the script still has data remaining, then it is assumed that a new envelope section will start with a new section header including a new set of protocol identifiers.

Name | Type | Note
Protocol Count | Script Number (OP_1, …) |
Protocol Identifiers | push data | Repeats “Protocol Count” times
Push Data Count | Script Number (OP_1, …) |

Sub-protocols are processed in a specific order so that it is clear which push data applies to each. A sub-protocol processes input data either directly from the bitcoin script or from the output of the previous sub-protocol. A sub-protocol can either output data to the next sub-protocol or output context relevant information like an image or document.

For example, if the first protocol specified is for encryption then its input is the bitcoin script data. The first push data can be a header that defines how the data following it is to be decrypted. Then the next protocol can take the decrypted output from the first sub-protocol and create an image or document from it.
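The ordering rule can be illustrated with a toy sketch: the listed order is used to decode and the reverse order to encode. Everything here is an assumption for illustration only. The XOR function is a stand-in for the hypothetical “CRYPT” protocol and offers no security, “proto1” is treated as plain UTF-8 text, and each sub-protocol is simplified to a single-value transform with no header push data.

```python
def xor_bytes(data: bytes) -> bytes:
    """Toy stand-in for encryption; XOR with a fixed byte is NOT secure."""
    return bytes(b ^ 0x5A for b in data)


# identifier -> (encode, decode); both entries are illustrative assumptions.
SUB_PROTOCOLS = {
    b"CRYPT": (xor_bytes, xor_bytes),
    b"proto1": (lambda text: text.encode("utf-8"),
                lambda raw: raw.decode("utf-8")),
}


def encode_layers(protocol_ids, value):
    """Writing applies the listed sub-protocols in reverse order."""
    for protocol_id in reversed(protocol_ids):
        value = SUB_PROTOCOLS[protocol_id][0](value)
    return value


def decode_layers(protocol_ids, value):
    """Reading applies the listed sub-protocols in the listed order."""
    for protocol_id in protocol_ids:
        value = SUB_PROTOCOLS[protocol_id][1](value)
    return value


ids = [b"CRYPT", b"proto1"]
stored = encode_layers(ids, "hello")      # formatted first, then "encrypted"
assert decode_layers(ids, stored) == "hello"
```

Neither transform needs to know about the other, which is the interoperability property the section header ordering is meant to provide.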

Examples

The following are possible sub-protocols (for example purposes only).

Simple Single Protocol

OP_FALSE OP_RETURN 0x02bd01 OP_1 0x06 “proto1” OP_1 0x10 “some proto1 data”

Data | Description
OP_FALSE OP_RETURN | Specifies that the script is unspendable and contains data.
0x02bd01 | This is the push data containing the Envelope protocol ID that specifies that the following data is in accordance with the Envelope protocol.
OP_1 | Only 1 protocol is used for the following data.
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify a hypothetical data protocol.
OP_1 | Only 1 push data is used for the proto1 protocol data.
0x10 “some proto1 data” | This push data is the data according to the proto1 protocol.
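The simple single protocol example can be assembled byte by byte as follows. This is a sketch under assumptions: the helper names push, small_count and build_section are hypothetical, and the helpers only cover what this example needs (push datas of up to 75 bytes and counts from 1 to 16).

```python
OP_FALSE = 0x00
OP_RETURN = 0x6A
ENVELOPE_ID = bytes.fromhex("bd01")  # Envelope protocol, version 1


def push(data: bytes) -> bytes:
    """Single-byte push op code; enough for data of up to 75 bytes."""
    if len(data) > 75:
        raise ValueError("this sketch only handles short push datas")
    return bytes([len(data)]) + data


def small_count(n: int) -> bytes:
    """OP_1..OP_16 for counts from 1 to 16."""
    if not 1 <= n <= 16:
        raise ValueError("this sketch only handles counts from 1 to 16")
    return bytes([0x50 + n])


def build_section(protocol_ids, payloads) -> bytes:
    """Section header (count, identifiers, count) followed by the push datas."""
    out = small_count(len(protocol_ids))
    for protocol_id in protocol_ids:
        out += push(protocol_id)
    out += small_count(len(payloads))
    for payload in payloads:
        out += push(payload)
    return out


script = (
    bytes([OP_FALSE, OP_RETURN])
    + push(ENVELOPE_ID)
    + build_section([b"proto1"], [b"some proto1 data"])
)
print(script.hex())
```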

Encrypted Data Protocol

OP_FALSE OP_RETURN 0x02bd01 OP_2 0x05 “CRYPT” 0x06 “proto1” OP_2 0x1b0000000100... 0x…

Data | Description
OP_FALSE OP_RETURN | Specifies that the script is unspendable and contains data.
0x02bd01 | The push data containing the Envelope protocol ID that specifies the following data is in accordance with the Envelope protocol.
OP_2 | 2 protocols are used for the following data.
0x05 “CRYPT” | The 0x05 specifies a push data containing 5 bytes. The bytes are ASCII “CRYPT” and specify the hypothetical encryption protocol.
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify the hypothetical data protocol.
OP_2 | There are 2 push datas used by the preceding protocols.
0x1b0000000100... | A push data containing the header data for the hypothetical “CRYPT” protocol. The specifics are irrelevant for this discussion.
0x… | A push data containing the proto1 data that is encrypted with CRYPT.

Encrypted And Unencrypted Data Protocol

OP_FALSE OP_RETURN 0x02bd01 OP_2 0x05 “CRYPT” 0x06 “proto1” OP_2 0x1b0000000100... 0x... OP_1 0x06 “proto1” OP_1 0x10 “some proto1 data”

Data | Description
OP_FALSE OP_RETURN | Specifies that the script is unspendable and contains data.
0x02bd01 | This is the push data containing the Envelope protocol ID which specifies that the following data is in accordance with the Envelope protocol.
OP_2 | 2 protocols are used for the related data.
0x05 “CRYPT” | The 0x05 specifies a push data containing 5 bytes. The bytes are ASCII “CRYPT” and specify the encryption protocol.
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify a hypothetical data protocol.
OP_2 | There are 2 push datas used by the preceding protocols.
0x1b0000000100... | A push data containing the header data for the hypothetical “CRYPT” protocol. The specifics are irrelevant for this discussion.
0x… | A push data containing the proto1 data that is encrypted with CRYPT.
OP_1 | This is the start of a new section since the 2 push datas specified in the previous section header were consumed. OP_1 specifies that only 1 protocol is used for the following data.
0x06 “proto1” | The 0x06 specifies a push data containing 6 bytes. The bytes are ASCII “proto1” and specify a hypothetical data protocol.
OP_1 | Only one push data is used for the proto1 protocol data.
0x10 “some proto1 data” | This push data is the data according to the proto1 protocol.
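Reading an envelope with several sections, as in the example above, could look like the following sketch. It is not the reference implementation: the function names are hypothetical, it only handles the OP_FALSE OP_RETURN form, and it assumes every counted item is a push data rather than an op code. The CRYPT header and encrypted payload in the usage example are placeholder bytes, because the original example elides them.

```python
from typing import List, Tuple

Section = Tuple[List[bytes], List[bytes]]  # (protocol identifiers, push datas)


def read_push(script: bytes, pos: int) -> Tuple[bytes, int]:
    """Read one push data; return (data, position of the next item)."""
    opcode = script[pos]
    pos += 1
    if opcode <= 75:
        size = opcode
    elif opcode in (0x4C, 0x4D, 0x4E):
        width = {0x4C: 1, 0x4D: 2, 0x4E: 4}[opcode]
        size = int.from_bytes(script[pos:pos + width], "little")
        pos += width
    else:
        raise ValueError("expected a push data")
    if pos + size > len(script):
        raise ValueError("truncated push data")
    return script[pos:pos + size], pos + size


def read_count(script: bytes, pos: int) -> Tuple[int, int]:
    """Read OP_1..OP_16 or a pushed positive script number."""
    opcode = script[pos]
    if 0x51 <= opcode <= 0x60:
        return opcode - 0x50, pos + 1
    body, pos = read_push(script, pos)
    if not body or body[-1] & 0x80:
        raise ValueError("count must be a positive number")
    return int.from_bytes(body, "little"), pos


def parse_envelope(script: bytes) -> List[Section]:
    """Split an OP_FALSE OP_RETURN envelope script into its sections."""
    if script[:2] != b"\x00\x6a":
        raise ValueError("script does not start with OP_FALSE OP_RETURN")
    envelope_id, pos = read_push(script, 2)
    if envelope_id != bytes.fromhex("bd01"):
        raise ValueError("not an Envelope version 1 script")
    sections: List[Section] = []
    while pos < len(script):  # remaining data starts a new section
        id_count, pos = read_count(script, pos)
        protocol_ids = []
        for _ in range(id_count):
            item, pos = read_push(script, pos)
            protocol_ids.append(item)
        data_count, pos = read_count(script, pos)
        push_datas = []
        for _ in range(data_count):
            item, pos = read_push(script, pos)
            push_datas.append(item)
        sections.append((protocol_ids, push_datas))
    return sections


example = (
    bytes.fromhex("006a02bd01")
    + b"\x52\x05CRYPT\x06proto1\x52"
    + b"\x03\xaa\xbb\xcc"  # placeholder CRYPT header (hypothetical bytes)
    + b"\x02\xdd\xee"      # placeholder encrypted payload (hypothetical bytes)
    + b"\x51\x06proto1\x51\x10some proto1 data"
)
for protocol_ids, push_datas in parse_envelope(example):
    print(protocol_ids, push_datas)
```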

Layering Diagram

This diagram shows how multiple sub-protocols can be combined via appending or layering depending on the specifics of the sub-protocol. This is an example and the actual data can be in a different place in the script and include different sub-protocols. This Envelope consists of 2 Envelope sections. The first uses the MNET sub-protocol and the second uses the CRYPT and proto1 sub-protocols.

[Figure: OP_FALSE OP_RETURN illustration]

History

Log | Description
Errata |
Previous versions |
Change Log | Protobuf was determined to be too complex for this purpose and so was replaced with a simpler Bitcoin-like binary data encoding. Extensions were determined to be too complex and not agnostic enough for a top level protocol, and so Envelope was changed to be completely agnostic but to enable combining of sub-protocols to allow MetaNet, encryption, and other standards to be combined with data formatting protocols. Standard Bitcoin script numbers were determined to be better than a simpler custom number format.
Decision Log | Extensions (Between versions 0.1 and 0.2)

The original protocol contained optional “extensions” for metanet and encryption. It was decided that the base protocol should be completely agnostic and provide the ability to combine multiple protocols in different ways to enable that type of functionality more dynamically.
This was based on feedback from Jaime Salom Viñado, Jonathan Aird, and Jack Davies.

Encoding Change (Between versions 0.1 and 0.2)

The original protocol was a pushdata for the Envelope protocol ID followed by a pushdata containing Protobuf data. It was decided that a simpler encoding is more appropriate since there should be no extensions, so very few fields and very little data needs to be at the Envelope protocol level. Simpler is better.
This was based on feedback from Roger Taylor and Jonathan Aird.

Numbers (Between versions 0.2 and 0.3)

Version 0.2 had a simplified version of bitcoin script numbers used for protocol id and push data counts because negative numbers are not needed. It was decided to use standard bitcoin script numbers instead so it is more standardized even though they require special rules to handle negative numbers.
This was based on feedback from Jonathan Aird and Roger Taylor.

Relationships

Relationship | Description
IP licences and dependencies | This standard was created as an independent work under the auspices of the Bitcoin SV Technical Standards Committee. Whilst best efforts have been made to ensure that this standard and its implementations do not infringe intellectual property rights of any third party, Bitcoin Association can offer no guarantee relating to third party intellectual property rights.
Copyright | © 2021 Bitcoin Association for BSV. All rights reserved. Unless otherwise specified, or required in the context of its implementation on BSV Blockchain, no part of this standard may be reproduced or utilised otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission of Bitcoin Association.
Extends |
Modifies |
Deprecates |
Depends On |
Prior Art | https://github.com/bitcoin-sv-specs/op_return https://bitcom.bitdb.network/#/
Existing Solution |
References |

Supplementary Material

Summary of comments received during both internal and public review

During review and discussion several points of consensus were reached. 

  • The protocol should be as simple as possible since it may have many implementations.
  • The protocol should be more agnostic: rather than supporting extensions, it should be possible to “layer” sub-protocols to enable more advanced usage. This lets sub-protocols provide data transformation and other functionality without specific protocol support.
  • The protocol should not use the Protocol Buffers protocol for encoding protocol level data, but use a more Bitcoin script friendly encoding. This was partially to remove any implementation dependencies and also to make it more Bitcoin developer friendly.

Reference Implementations

Tokenized Envelope System
https://github.com/tokenized/envelope
