Here are some design notes on signatures of validity. This is meant mostly for the Corda team - if you're just interested in writing apps you can skip this thread.
Introduction
Database integrity in all decentralised ledger systems is based on running every transaction leading up to a transaction you care about through a verification algorithm. Most blockchain systems also verify transactions you don't directly care about, in order to check system invariants like the number of coins in circulation.
Sometimes it's useful to outsource this process to a third party, because:
1. You don't have sufficient resources to verify a potentially large transaction graph yourself.
2. Your peers don't want to trust you with private data.
3. You have fallen behind the upgrade schedule and are receiving data from the future which you can't understand.
A good example of (1) is a mobile phone. A good example of (2) is a non-validating notary (the most popular kind, it turns out). A good example of (3) is when the Corda Network moved from minPlatformVersion=3 to minPlatformVersion=4, in order to introduce reference states and pinned network parameters. Everyone had to upgrade. In a world where IT departments are seen as a cost centre and have a "what's in it for me?" mentality, getting everyone to upgrade within the same timeframe can be painful, even if the timeframe is quite generous. We'd like nodes to be able to continue functioning in a degraded security mode in this case, where third parties take on the verification work for them.
But the main driving use case is the first stage of SGX/enclave integration. In phase one we support the "attestation model" described in the updated white paper, where a signature of validity from an enclave is treated as equivalent to doing the verification yourself.
Modes of operation
As we examine different needs, several "modes" come to the surface:
- Resolving locally outsourced: A node wishes to outsource verification in a way that doesn't require global consensus, and which lets it create transactions building on the verified transactions. This means it still needs the dependency graph, as today, so that it can send it onwards; it just stores the graph without processing it.
- Non-resolving locally outsourced: A node outsources verification to a verifier of their choice (no global consensus) and doesn't resolve it, but this is OK because they never build on the transaction themselves. Examples: non-validating notaries, observer nodes, regulators. Any states processed in this mode are read-only from the PoV of the node.
- Non-resolving globally outsourced: A zone operator has taken the decision that certain entities are trusted with ledger integrity and their signature is accepted in lieu of self-verification. Typically such entities will be enclaves. If enclave integrity is breached, so is ledger integrity.
This is less decentralised but can be useful in some cases, like for projects that want to use Corda mostly as a distributed workflow engine rather than a traditional blockchain. It is also useful to handle (imo, bad) regulations that say you can't send financial data across borders even if encrypted, which would otherwise make a global financial blockchain impossible.
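To compare the three modes side by side, here is a minimal Kotlin sketch. All names in it (`OutsourcedVerificationMode`, its properties, `locallyAdoptableModes`) are hypothetical and not part of any existing Corda API. The `canBuildOnStates` value for the globally outsourced mode is an assumption drawn from the description above, which doesn't state it explicitly.

```kotlin
// Hypothetical names: nothing here is an existing Corda type.
enum class OutsourcedVerificationMode(
    /** The node still downloads and stores the dependency graph (without verifying it). */
    val resolvesDependencies: Boolean,
    /** The whole zone must agree on whose signature of validity is accepted. */
    val needsGlobalConsensus: Boolean,
    /** The node can build new transactions on states received this way. */
    val canBuildOnStates: Boolean
) {
    RESOLVING_LOCALLY_OUTSOURCED(
        resolvesDependencies = true, needsGlobalConsensus = false, canBuildOnStates = true
    ),
    NON_RESOLVING_LOCALLY_OUTSOURCED(
        resolvesDependencies = false, needsGlobalConsensus = false, canBuildOnStates = false
    ),
    // Assumption: with zone-wide acceptance, peers take the signature in place of the graph.
    NON_RESOLVING_GLOBALLY_OUTSOURCED(
        resolvesDependencies = false, needsGlobalConsensus = true, canBuildOnStates = true
    )
}

/** Modes a single node can adopt on its own, without a decision by the zone operator. */
fun locallyAdoptableModes(): List<OutsourcedVerificationMode> =
    OutsourcedVerificationMode.values().filter { !it.needsGlobalConsensus }
```

Making the mode an explicit value like this is one way the API could "more precisely define the modes"; whether it ends up as an enum, a sealed class or separate config flags is an open design choice.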
Although some of these models may seem attractive, they each come with a health warning. In particular, "non-resolving globally outsourced" means any breach of enclave integrity is a breach of ledger integrity, and once doubt is introduced there's no real way to restore trust short of getting everyone to cough up their local transaction store to some central party that re-validates everything (which may be impossible).
In the intended final mode of operation for enclaves, peers pass encrypted transaction chains between themselves, so every node is still verifying locally but in a way that stops the owner of the node from seeing the historical data. In this mode - the "verification model" (vs the "attestation model") - an enclave breach simply downgrades you to the same privacy level Corda gives you today. As private data gets less interesting as it ages, this means enclave breaches self-repair over time.
Implementation approach
Igor has already posted a work in progress PoC PR for SGX attestation model verification (without the enclave part). Most of the code we need is there, but we should probably adjust it to more precisely define the modes in the API.
For semi-validating notaries (non-resolving locally outsourced), no global consensus on enclaves is needed. Theoretically every notary could require different enclaves. To use all of them the client needs all the enclaves, or at least needs access to them: the notary client doesn't need SGX hardware itself, because it can encrypt the transactions and send them to a remote enclave provided by a third party.
Instead we need a way to configure the node to say, "accept <X> as a signature of validity", probably in the config file. Peers also need some way to find this out. There's a NotaryInfo type, but we concluded above that this functionality isn't notary specific; it's general to several kinds of participant. E.g. a regulator might be comfortable outsourcing verification, and if they don't trade themselves they don't mind the part of the ledger they're watching being read-only, so they'd like to advertise support for SoVs too. That suggests it should be a part of the NodeInfo structure rather than NotaryInfo.
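As a rough illustration of advertising this per node rather than per notary, here is a Kotlin sketch. `NodeInfoSketch`, `AcceptedSignaturesOfValidity` and `acceptsSignatureOfValidity` are hypothetical names invented for this example; the real NodeInfo has different fields, and keys are modelled as plain strings only to keep the snippet self-contained.

```kotlin
// Hypothetical stand-ins: the real NodeInfo has different fields, and keys would be
// PublicKey objects rather than strings.
data class AcceptedSignaturesOfValidity(val verifierKeys: Set<String>)

data class NodeInfoSketch(
    val legalName: String,
    // null means the node verifies everything itself and advertises no SoV support.
    val acceptedSoVs: AcceptedSignaturesOfValidity? = null
)

/** True if the peer has advertised that it accepts signatures of validity from signerKey. */
fun acceptsSignatureOfValidity(peer: NodeInfoSketch, signerKey: String): Boolean =
    peer.acceptedSoVs?.verifierKeys?.contains(signerKey) ?: false
```

A sender could call something like `acceptsSignatureOfValidity(peer, key)` before deciding whether to ship the full dependency graph or only the transaction plus its signature of validity. The same shape would work for a notary, a regulator or an observer node, which is the argument for putting it in NodeInfo.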
Support for the other modes (resolving locally outsourced and non-resolving globally outsourced) can come later. For semi-validating notaries we don't need them. We should at least define the APIs and config file entries with them in mind though.
Enclave identities
When using enclaves to create signatures of validity there isn't a global keypair - every enclave instance has its own. You use remote attestation data to connect a public key with a code hash or code signing key.
Which enclaves are acceptable? This is analogous to our signature vs hash constraints. It's hard to predict how paranoid nodes will be: is any enclave signed by a particular key OK, or would they prefer to list out the hashes they accept and then reproduce the build of the enclave themselves?
Nodes need to be able to advertise constraints on what signatures of validity they will accept. This looks to me like a CompositeKey paired with a (possibly empty) set of hashes. SGX enclaves can only be signed by a single key because it's used for establishing a linear version timeline, but for "agreeing to what's acceptable" purposes we can allow identity keys to sign hashes or code signing keys.
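A minimal Kotlin sketch of such a constraint follows. Everything here is hypothetical (`EnclaveAttestation`, `EnclaveConstraint`, `accepts`), and it makes three simplifying assumptions: the CompositeKey is reduced to a flat "any one of these code signing keys" set, with no weights or thresholds; a non-empty hash set narrows the signer check rather than acting as an alternative to it; and the remote attestation data has already been checked elsewhere. Keys and hashes are plain strings to keep it self-contained.

```kotlin
// Hypothetical types for illustration only; not part of Corda.

/** What remote attestation tells us: this instance key belongs to code with this hash and signer. */
data class EnclaveAttestation(
    val instanceKey: String,  // per-instance public key that produces signatures of validity
    val codeHash: String,     // measurement of the enclave code
    val codeSigner: String    // key that signed the enclave
)

/** A node's advertised constraint on which enclaves it will accept signatures of validity from. */
data class EnclaveConstraint(
    // Simplification: stands in for a CompositeKey, modelled as "any one of these keys".
    val acceptedCodeSigners: Set<String>,
    // Possibly empty. Assumption: empty means any build by an accepted signer is fine.
    val acceptedCodeHashes: Set<String> = emptySet()
)

/** Checks an already-verified attestation against the constraint. */
fun EnclaveConstraint.accepts(attestation: EnclaveAttestation): Boolean {
    val signerOk = attestation.codeSigner in acceptedCodeSigners
    val hashOk = acceptedCodeHashes.isEmpty() || attestation.codeHash in acceptedCodeHashes
    return signerOk && hashOk
}
```

A trusting node would advertise only the signer set, much like a signature constraint, while a paranoid node would also pin the hashes of builds it has reproduced, much like a hash constraint.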
Constraining acceptable enclaves is a problem that's going to crop up in a lot of places. The data types for this should be general, and probably not a part of mainline Corda.
How enclaves fit in with the global identity hierarchy is a topic for a different email.