1. What is the Schnorr Signature Scheme?
The Schnorr signature scheme is a digital signature scheme whose adoption in bitcoin is intended to increase the privacy and scalability of the network.
2. Who invented the Schnorr Signature Scheme and when?
The Schnorr signature scheme and Taproot are described in the Bitcoin Improvement Proposals BIP-340 and BIP-341. On January 21, 2020, developer Pieter Wuille opened a pull request proposing them as a soft fork.
The Schnorr signature scheme was proposed in 1991 by German cryptographer Claus-Peter Schnorr, a professor at Frankfurt University.
The scheme proposed by Schnorr is a modification of the ElGamal (1985) and Fiat-Shamir (1986) schemes with a smaller signature size; it also builds on the work of cryptographer David Chaum.
Before the scheme was published, Schnorr obtained a number of patents on it, which expired in 2008, the year Satoshi Nakamoto introduced bitcoin. Schnorr signatures were already usable at the time, but they had not been standardized and were not widely deployed.
When Nakamoto created bitcoin, he had to choose one of the existing signature schemes. He needed an easy-to-use, secure, open-source algorithm. ECDSA met these requirements. The predecessor of ECDSA, the DSA algorithm, was a hybrid of the Schnorr scheme and the ElGamal scheme, and was created to circumvent the Schnorr patents.
ECDSA in bitcoin has become faster and more efficient thanks to the work of Pieter Wuille and his colleagues, who created libsecp256k1, an optimized library for the secp256k1 elliptic curve.
ECDSA has some shortcomings, and developers have been looking for alternatives. The first discussions about implementing Schnorr signatures in the bitcoin network took place in 2014, and a few years later developer Pieter Wuille published the Schnorr BIP.
3. What are the key technical features of Schnorr signatures and their advantages over ECDSA?
Like ECDSA, Schnorr signatures rely on the discrete logarithm problem. Their advantage is that they require fewer assumptions and have a rigorous formal security proof: they are provably secure in the random oracle model, assuming that the elliptic curve discrete logarithm problem (ECDLP) is hard.
Schnorr signatures are a more transparent construction that is easier for cryptographers to analyze.
Schnorr signatures are provably non-malleable, whereas ECDSA signatures are malleable: a third party without access to the private key can alter an existing valid signature so that it remains valid, which changes the transaction ID and can be used in double-spend attempts.
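The malleability of ECDSA can be illustrated with a minimal sketch: if (r, s) is a valid signature, then (r, n - s) is valid as well. The code below is an illustration under stated assumptions, not production code: the private key, nonce, and message are arbitrary toy values, and the helper names (point_add, point_mul, ecdsa_sign, ecdsa_verify) are hypothetical.

```python
import hashlib

# secp256k1 domain parameters (the curve used by bitcoin)
P_FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P_FIELD == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], P_FIELD - 2, P_FIELD) % P_FIELD
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], P_FIELD - 2, P_FIELD) % P_FIELD
    x = (lam * lam - a[0] - b[0]) % P_FIELD
    y = (lam * (a[0] - x) - a[1]) % P_FIELD
    return (x, y)

def point_mul(pt, k):
    """Double-and-add scalar multiplication (not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result

def ecdsa_sign(d, z, k):
    r = point_mul(G, k)[0] % N
    s = pow(k, N - 2, N) * (z + r * d) % N
    return r, s

def ecdsa_verify(pub, z, sig):
    r, s = sig
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, N - 2, N)
    pt = point_add(point_mul(G, z * w % N), point_mul(pub, r * w % N))
    return pt is not None and pt[0] % N == r

d = 0x1234567890ABCDEF          # toy private key (assumption)
k = 0x0FEDCBA987654321          # toy nonce; real nonces must be secret and unique
z = int.from_bytes(hashlib.sha256(b"pay Bob 1 BTC").digest(), "big")
pub = point_mul(G, d)

r, s = ecdsa_sign(d, z, k)
flipped = (r, N - s)            # produced without knowing d
print(ecdsa_verify(pub, z, (r, s)), ecdsa_verify(pub, z, flipped))
```

Both checks succeed because negating s negates the recovered point, which leaves its x-coordinate unchanged.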
A significant advantage of Schnorr signatures is their linearity.
Schnorr signatures are linear in the sense that they can be added or subtracted. The result of such an operation is a valid signature for the corresponding sum (or difference) of the public keys. With ECDSA this does not work: adding or subtracting such signatures yields nothing meaningful.
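The linearity property can be sketched in a few lines. To keep the example short, it uses a toy multiplicative group instead of bitcoin's elliptic curve, so multiplication modulo P plays the role of point addition. The group parameters, keys, nonces, and function names are assumptions chosen for illustration, and the group is far too small to be secure.

```python
import hashlib

# Toy Schnorr group: the subgroup of prime order Q in Z_P*, generated by G.
# These tiny parameters are illustrative only and offer no security.
P, Q, G = 2039, 1019, 4

def challenge(R, pub, msg):
    """Hash the nonce commitment, public key and message into a scalar."""
    data = f"{R}|{pub}|".encode() + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def verify(pub, msg, sig):
    R, s = sig
    return pow(G, s, P) == R * pow(pub, challenge(R, pub, msg), P) % P

def combine(elements):
    """Group operation applied across a list (the analogue of adding points)."""
    result = 1
    for element in elements:
        result = result * element % P
    return result

msg = b"spend output 0"
keys = [123, 456, 789]      # private keys x_i (toy values)
nonces = [111, 222, 333]    # secret nonces k_i (toy values)

agg_pub = combine(pow(G, x, P) for x in keys)
agg_R = combine(pow(G, k, P) for k in nonces)

# Every signer uses the same challenge, so the partial signatures simply add up.
e = challenge(agg_R, agg_pub, msg)
partials = [(k + e * x) % Q for k, x in zip(nonces, keys)]
agg_sig = (agg_R, sum(partials) % Q)

print(verify(agg_pub, msg, agg_sig))  # True
```

This naive aggregation is shown only to make the algebra visible. Summing keys without further precautions is vulnerable to rogue-key attacks, which is why protocols such as MuSig add per-key coefficients.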
The linearity property of Schnorr signatures allows keys and signatures to be aggregated. Aggregation means that several public keys can be combined into one, so that all parties jointly produce a single signature. By adding together the keys of several signers, their partial signatures can be combined into one aggregated signature.
The equations below illustrate the aggregation process made possible by the linearity property of Schnorr signatures. No one but the participants knows that there are three people behind a single public key/signature.
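One common way to write the three-signer case is sketched here. The notation is an assumption rather than taken from the text: x_i are private keys, k_i secret nonces, G the generator, n the group order, H a hash function, and m the message.

```latex
\begin{aligned}
P &= P_1 + P_2 + P_3, & P_i &= x_i G \\
R &= R_1 + R_2 + R_3, & R_i &= k_i G \\
e &= H(R \,\|\, P \,\|\, m) \\
s_i &= k_i + e\,x_i \pmod{n} \\
s &= s_1 + s_2 + s_3 \pmod{n} \\
sG &= (k_1 + k_2 + k_3)\,G + e\,(x_1 + x_2 + x_3)\,G = R + eP
\end{aligned}
```

The verifier sees only (R, s) and P, which have the same form as a single-signer signature and key. Practical schemes such as MuSig additionally weight each key with a coefficient to prevent rogue-key attacks.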
In M-of-n multisignature transactions, partial signatures are known as threshold signatures. As the graph below shows, a 3-of-5 multisignature has M=3 signatures (out of n=5) among the transaction inputs.
M-of-n multisignature transactions need at least M signers, and each signature must be verified. To prove ownership of a multisignature UTXO, the unlocking script (scriptSig) must contain M ECDSA signatures. The scriptSig size grows linearly with M, which increases the size of these transactions (and their fees).
In addition, the observer will know that A, B, and C have signed the transaction, and will be able to identify the multisignature scheme used.
With Schnorr signatures, the M signatures are aggregated into a single signature. Once the public keys and threshold signatures are provided, the transaction is authorized and looks like a normal P2PKH transaction.
Schnorr signatures allow a single signature to be created for all M parties. The unlocking script then contains one signature that is an aggregate of all parties' signatures.
An observer can no longer tell whether a transaction signature belongs to one person, many people, or a threshold number of people. Although addresses and amounts in transactions are still publicly visible, Schnorr signatures make it harder to fingerprint wallets and the scripts they use.
Reducing transaction size and increasing verification speed
Schnorr signatures are about 11% smaller than the existing ECDSA signatures, which take up approximately 70-72 bytes in a transaction. Because they occupy less space on the blockchain (their size is fixed at 64 bytes), they reduce transaction size and fees.
Also, if a bitcoin transaction contains multiple inputs, each one needs a separate signature. For a bitcoin transaction controlled by many signers, each signer must provide a separate ECDSA signature. These signatures are verified individually, so the verification cost grows with the number of signatures.
With Schnorr signatures, all inputs will need only one combined signature. Including a single signature in a transaction frees up block space for other transactions. Reducing the size of multisignature transactions reduces fees. Key aggregation eliminates the need to verify each individual input and speeds up the confirmation process.
Compact multisignatures are more difficult to create with ECDSA because its signatures are DER-encoded, unlike Schnorr signatures, which use a fixed-size encoding that requires less space.
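The size difference follows from the encodings themselves. The sketch below compares a DER-encoded (r, s) pair with the fixed 64-byte layout. The function names and sample values are illustrative assumptions, and the extra sighash byte that bitcoin appends to ECDSA signatures is noted but not added.

```python
def der_int(value: int) -> bytes:
    """DER INTEGER: tag 0x02, length, big-endian value with a sign-padding byte if needed."""
    raw = value.to_bytes((value.bit_length() + 7) // 8 or 1, "big")
    if raw[0] & 0x80:               # top bit set would read as negative
        raw = b"\x00" + raw
    return b"\x02" + bytes([len(raw)]) + raw

def der_signature(r: int, s: int) -> bytes:
    """DER SEQUENCE of two integers, the layout used for ECDSA signatures."""
    body = der_int(r) + der_int(s)
    return b"\x30" + bytes([len(body)]) + body

def schnorr_signature(r_x: int, s: int) -> bytes:
    """Fixed-size layout: 32-byte x-coordinate followed by a 32-byte scalar."""
    return r_x.to_bytes(32, "big") + s.to_bytes(32, "big")

r_high = (1 << 255) | 1   # sample value with the top bit set (needs padding)
s_low = (1 << 254) | 1    # sample value with the top bit clear

ecdsa_len = len(der_signature(r_high, s_low))        # 71, plus 1 sighash byte in a transaction
schnorr_len = len(schnorr_signature(r_high, s_low))  # always 64

print(ecdsa_len, schnorr_len)
print(f"saving vs. 72 bytes: {1 - 64 / 72:.1%}")     # about 11%
```

The DER overhead consists of type and length bytes plus occasional padding, which is why ECDSA signature sizes vary while the Schnorr encoding does not.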
4. How will the Schnorr signature scheme be implemented?
BIP-340 is a standardization of Schnorr signatures that allows them to be integrated into the bitcoin protocol.
The update itself is uncontroversial among developers. They consider it the best scheme currently available because its mathematical properties allow strong correctness guarantees, it is non-malleable, and it is relatively fast in terms of transaction validation.
If Schnorr signatures are implemented, the average user won't actually notice them (except for a single character in SegWit addresses). Schnorr signatures will not replace ECDSA in bitcoin; both schemes will coexist.
5. What is Taproot (BIP-341)?
Taproot (BIP-341) is the second part of a proposal involving Schnorr/Taproot/Tapscript. While the Schnorr scheme offers a new type of signature, Taproot extends their functionality by introducing a new version of transaction output and a new way to define spending conditions.
6. Who invented Taproot and when?
Taproot was designed and proposed by Bitcoin Core developer and former Blockstream CTO Gregory Maxwell.
In April 2018, mathematician Andrew Poelstra published a mathematical proof of security (security proof). In July of the same year, Xapo engineer and Bitcoin Core developer Anthony Towns proposed a solution to increase the amount of data used by Taproot.
On May 6, 2019, Pieter Wuille published proposals to improve the bitcoin protocol, in which he presented Taproot in conjunction with Schnorr signatures and MAST. To bring the updates into bitcoin's code base, Wuille proposed a soft fork.
On January 21, 2020, Wuille included Taproot in a pull request for the soft fork.
7. What options does Taproot offer?
While Schnorr signatures allow multisignature transactions to look like standard (Pay-to-Public-Key-Hash) transactions, Taproot, in combination with Schnorr signatures, extends this capability to a wider group of transaction types that can be given the appearance of standard transactions:
- single-key spends, as with P2PKH and P2WPKH today;
- n-of-n spending with MuSig or equivalents (similar to the current use of P2SH and P2WSH 2-of-2 multisignatures);
- k-of-n (for small values of n) using the most common k signers;
- Lightning Network channel closures, atomic swaps, and other protocols in which all parties can sometimes agree on the outcome.
These four categories of usage scenarios represent the majority of bitcoin transactions today. Regardless of the complexity of the contract, Taproot allows the cooperative outcome to appear on the blockchain as a single-key spend.
The remaining scripts, which describe other contract outcomes, are not added to the blockchain, thus freeing up block space for more complex transactions.
8. How does Taproot work?
Understanding Taproot requires a prior understanding of MAST.
MAST (Merkelized Abstract Syntax Tree) was proposed in 2016 by developer Johnson Lau.
MAST proposes a new witness program and uses a Merkle tree to encode mutually exclusive branches of a script.
A Merkle tree is a data structure; the term “tree” describes the structure of its branches. The Merkle tree is usually depicted as in the graph below: the root is at the top, the leaves are at the bottom of the graph.
With MAST you can create complex contracts with many different conditions. Only the script that is actually executed is revealed, which saves space in the blockchain and allows more complex scripts/contracts to be implemented.
The Merkle tree is created through individual hashing of each script to produce a short, unique identifier. Each identifier is then combined with another identifier and hashed again, creating another short unique identifier for that pair.
This process is repeated and continues until only one identifier remains, referred to as the Merkle root (Address = Hash (1,2) in the graphic above), which uniquely identifies the entire data set in several bytes. The Merkle root can be thought of as a “safe” for coins.
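The pairwise hashing described above can be sketched as follows. This is a simplified illustration: it uses plain SHA-256 and carries an unpaired node up unchanged, both of which are assumptions and not the exact rules of any bitcoin proposal. The script contents are placeholders.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(scripts):
    """Hash each script, then hash pairs level by level until one identifier remains."""
    level = [h(script) for script in scripts]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            next_level.append(h(level[i] + level[i + 1]))
        if len(level) % 2:                 # unpaired node moves up as-is (assumption)
            next_level.append(level[-1])
        level = next_level
    return level[0]

script_1 = b"<timelock 30 days> <Alice key> CHECKSIG"    # placeholder text, not real script bytes
script_2 = b"3 <k1> <k2> <k3> <k4> <k5> 5 CHECKMULTISIG"  # placeholder text

root = merkle_root([script_1, script_2])
print(root.hex(), len(root))   # the root is always 32 bytes, however many scripts there are
```

Changing any byte of any script changes the root, which is what lets the root stand in for the whole set of conditions.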
Unlike Pay-to-Script-Hash (P2SH), MAST allows multiple spending conditions to be structured in a Merkle tree. Only the satisfied condition is revealed: using the root and a Merkle path, a verifier can confirm that the condition is part of the tree. The rest of the tree remains hidden.
For example, if we have a complex script that says a party cannot spend its coins before a month expires (timelock), or that the coins can be spent via a 3-of-5 multisignature transaction, then both conditions are revealed as soon as the coins are spent (this is how bitcoin works today).
MAST offers the following possibility: if any data in the Merkle tree is revealed, the Merkle root and some additional data (called the Merkle path) can be used to confirm that this specific data is included in the Merkle tree. The rest of the tree (and hence the other conditions) remains hashed and hidden. This means that only the condition actually used needs to be disclosed.
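A Merkle path check can be sketched like this. The example assumes plain SHA-256 and a tree with a power-of-two number of leaves; the function names and placeholder scripts are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return every level of the tree, from leaf hashes up to the root."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def merkle_path(levels, index):
    """Collect the sibling hash at each level, with a flag saying if it sits on the left."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path

def verify_path(leaf, path, root):
    """Recompute the root from one revealed leaf and its Merkle path."""
    node = h(leaf)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

scripts = [b"condition A", b"condition B", b"condition C", b"condition D"]  # placeholders
levels = build_levels(scripts)
root = levels[-1][0]

path = merkle_path(levels, 2)                    # reveal only "condition C"
print(verify_path(scripts[2], path, root))       # True
print(len(path))                                 # 2 hashes for 4 leaves
```

The verifier learns the revealed script and two hashes, but nothing about the content of the other three conditions. The path length grows with the logarithm of the number of leaves.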
Users of complex contracts can create smaller transactions, and the efficiency gains are greater in the case of more complex contracts with more subscripts. MAST, unlike any other existing mechanisms, allows for multiple additional branches, allowing for more advanced smart contracts without the added cost of burdening bitcoin nodes.
In the illustration above, Alice can even add a longer beneficiary chain to the MAST structure without changing the number of bytes used. Her fees do not increase, as she still spends her bitcoins using only 32 bytes. At the network level, blocks will be able to handle a larger number of complex transactions.
The drawback is that everyone would have to use the MAST structure by default to maintain a proper level of privacy. The top branch of the Merkle tree is always visible, and observers may realize that other spending conditions exist. In addition, it places an extra burden on the majority of transactions, which do not need an additional script, and increases their cost.
MAST has not yet been implemented in bitcoin because the changes needed to do so are too complex and could have consequences that are hard to predict. A possible solution could be the Schnorr/Taproot/Tapscript package, as it strikes a golden mean between simplicity and additional functionality.
9. How does Taproot improve MAST?
Taproot offers its own version of the Merkle tree, called the script tree. Participants can choose to spend with:
- a public key, as with a normal signature;
- a script.
The first case is the default spending path, in which single-party and multi-party public keys are indistinguishable.
In the second case, hidden scripts are not exposed until they are used for spending. Different scripts can be organized into a Merkle tree, and the outputs can also be spent by revealing one of the conditions.
If we spend an output using a primary spend script, we simply provide a Merkle proof, which consists of the primary spend script and the hash of the alternate spend script. This is enough to confirm that the primary spend script is contained in the script tree.
Taproot uses the MAST structure to hide the conditions behind the Merkle root. The Merkle root itself is hidden inside the public key, which still allows direct spending with the key. Only the single key is published on the blockchain, so no one sees that there are additional conditions.
In combination with Schnorr signatures, the MAST structure is hidden thanks to Taproot outputs. At the top of the Merkle tree there is an option to publish a single public key and signature. As a result, P2PKH and P2SH transactions look identical.
An illustration is the closing of a Lightning channel.
Lightning channels are a variation of the 2-of-2 multisignature. Instead of closing the channel with a cumbersome script, Schnorr allows the signatures to be combined and presented as a single Taproot public key and signature. When both parties agree, the result looks as if someone had spent that output with a normal signature, sending it to two addresses. The observer will not be able to tell that it is a Lightning channel.
TapBranch is the script tree (TapTree) for closing the Lightning channel
To hide the MAST structure, the TapBranch hash in the graphic above is hashed together with an aggregated public key (thanks to the Schnorr scheme, Alice and Bob can add their public keys to create the Taproot internal key).
The resulting hash is used as a private key, from which another public key is derived. This key modification, known as tweaking the key pair, commits to scripts 1 and 2.
The derived public key is then added to Taproot's internal key to create the Taproot output key. The process is illustrated below:
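The tweak can be sketched on bitcoin's curve as follows. This is a simplified illustration, not the full BIP-341 procedure: it omits the even-y normalization of x-only keys and reduces the hash modulo the group order instead of rejecting out-of-range values. The internal private key and the Merkle root are placeholder assumptions, and the helper names are hypothetical.

```python
import hashlib

# secp256k1 domain parameters
P_FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P_FIELD == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], P_FIELD - 2, P_FIELD) % P_FIELD
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], P_FIELD - 2, P_FIELD) % P_FIELD
    x = (lam * lam - a[0] - b[0]) % P_FIELD
    return (x, (lam * (a[0] - x) - a[1]) % P_FIELD)

def point_mul(pt, k):
    """Double-and-add scalar multiplication (not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result

def tagged_hash(tag: str, msg: bytes) -> bytes:
    """Tagged hash in the BIP-340 style: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_hash = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_hash + tag_hash + msg).digest()

internal_priv = 0xA11CE0B0B                    # stands in for Alice and Bob's aggregated key
internal_pub = point_mul(G, internal_priv)     # Taproot internal key
merkle_root = hashlib.sha256(b"placeholder for the script tree root").digest()

# The hash of (internal key || Merkle root) acts as a private key t ...
t = int.from_bytes(
    tagged_hash("TapTweak", internal_pub[0].to_bytes(32, "big") + merkle_root), "big") % N
# ... whose public key t*G is added to the internal key to form the output key.
output_pub = point_add(internal_pub, point_mul(G, t))

# The key holders can still sign for the output key with the tweaked private key.
output_priv = (internal_priv + t) % N
print(point_mul(G, output_priv) == output_pub)
```

Because the output key commits to the Merkle root, a script-path spender can later reveal the internal key and a Merkle path, and any verifier can recompute the same tweak.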
As mentioned, there are two spending paths. The default spending path is used when Alice and Bob agree to close the Lightning channel, and the Taproot output key ensures that the transaction looks like a standard P2PKH transaction. In other scenarios, the script used is revealed as soon as the coins are spent, while all other options remain hidden.
In the above example, if Alice and Bob agree to make a Lightning payment, they can aggregate their keys into a master public key and add their partial Schnorr signatures together to create a master signature.
Both parties make partial signatures using their individual keys, and closing the Lightning channel looks like a direct payment to the public key.
When the closure is not cooperative, only the script used is disclosed. Verifiers will be able to determine that the threshold public key has been tweaked with a Merkle root. However, all other options/scripts will remain hidden.
The graphic above shows that the script tree offers a new recovery option for regaining access to bitcoins. Taproot provides a recovery option for lost coins (for users with upgraded wallets). Ordinarily, if a single key is lost, the coins are irretrievably lost. If, however, the user loses a private key and their funds are held in a Taproot output, there can be another way to claim the coins (e.g., backup 3-of-5 keys held by the user's relatives).
Taproot increases the privacy, efficiency, and flexibility of bitcoin scripts by allowing developers to write complex scripts while minimizing the impact on the blockchain.
Sophisticated transactions save significant fees, as data-intensive scripts no longer have to pay more than the fees for a standard Pay-to-Public-Key-Hash transaction. The more complex the transaction, the greater the savings.
Since Taproot allows complicated transactions with only one signature, the number of bytes used for aggregated keys and signatures does not change with the number of signers. With Pay-to-Witness-Script-Hash (P2WSH) multisignatures, each additional public key adds 8.5 virtual bytes (vbytes), and each additional signature adds approximately 18.25 vbytes.
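The per-key and per-signature figures can be reproduced with a small calculation. The breakdown below is an assumption that matches the quoted numbers: witness data is counted at one quarter of its byte size, a public key push is 1 + 33 bytes, and a typical ECDSA signature push is 1 + 72 bytes. The function name is hypothetical.

```python
WITNESS_DISCOUNT = 4          # witness bytes count as 1/4 vbyte under SegWit
PUBKEY_PUSH = 1 + 33          # length prefix + compressed public key (assumed breakdown)
ECDSA_SIG_PUSH = 1 + 72       # length prefix + typical DER signature with sighash byte (assumed)
SCHNORR_SIG_PUSH = 1 + 64     # length prefix + fixed-size Schnorr signature

def p2wsh_multisig_extra_vbytes(extra_keys: int, extra_sigs: int) -> float:
    """Additional vbytes when a P2WSH multisig gains public keys and signatures."""
    return (extra_keys * PUBKEY_PUSH + extra_sigs * ECDSA_SIG_PUSH) / WITNESS_DISCOUNT

print(p2wsh_multisig_extra_vbytes(1, 0))   # 8.5
print(p2wsh_multisig_extra_vbytes(0, 1))   # 18.25

# Growing a 2-of-3 into a 3-of-5 adds two keys and one signature:
print(p2wsh_multisig_extra_vbytes(2, 1))   # 35.25

# A key-path spend carries one signature push regardless of the number of signers:
print(SCHNORR_SIG_PUSH / WITNESS_DISCOUNT) # 16.25
```

The comparison shows why aggregation matters most for large signer sets: the P2WSH cost scales with the number of keys and signatures, while the aggregated signature stays constant.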
From a privacy perspective, Taproot minimizes the information about spending conditions for the transaction output that is disclosed on the blockchain. With Taproot, most applications can use a spend path based on a key whose privacy is protected.
Although the Schnorr scheme allows multisignature transactions to look like regular Pay-to-Public-Key-Hash transactions, Taproot extends the range of transactions that can be disguised in this way (making Pay-to-Public-Key-Hash and Pay-to-Script-Hash indistinguishable).