[EXTERNAL] Re: [corda-dev] Corda's privacy guarantees


Immad Naseer
 

Interesting. The architectural pattern suggested in the CorDapp design language video which Ryan shared looks promising but questions of integrity and transactionality have to be thoughtfully considered in such a design, as Mike pointed out as well.

 

SGX is one way to address it and the work on Conclave looks interesting.

 

The work on ZKPs sounds quite intriguing, Mike. If the generic contract validation logic has to be part of the ZKPs, then I can see how that's a fairly involved process. Looking forward to reading more about it when you share it.

 

Here's a half-baked idea. Nodes currently run the verify(tx) function, which runs on the entire tx object (all input and output states, commands etc). In certain use cases, it might make sense to have a state-specific verify function, in conjunction with a tx-wide verify function. If such a design made sense for a particular use case, the node performing the state validation could be given just enough of the transactions (through a Merkle tree) to validate the state chain it's interested in, while the rest of the transaction tree, which it's not concerned with, stays hidden. Is this approach theoretically feasible? I can understand the reasons for not following this line of thinking if it makes writing the tx validation logic clumsy and complex, but it'll be good to know if the idea is at least sound.
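
To make the Merkle tree part of the idea concrete, here is a minimal sketch of revealing one component of a transaction together with a proof that it belongs under a known root hash. The component strings, the function names and the rule of repeating the last hash on odd levels are illustrative assumptions; this is not Corda's transaction format or API.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every tree level, from the leaf hashes up to the single root hash."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # assumption: pad odd levels by repeating the last hash
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes needed to recompute the root from one leaf."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling is on the left)
        index //= 2
    return proof


def verify_proof(leaf, proof, root):
    """Recompute the root from the revealed leaf and the sibling hashes."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Hypothetical transaction components; only components[2] is revealed to the verifier.
components = [
    b"input: stock#1",
    b"input: token#7",
    b"output: stock#1 -> A",
    b"output: token#7 -> B",
]
levels = build_levels(components)
root = levels[-1][0]
proof = prove(levels, 2)
```

The verifier sees `components[2]`, two sibling hashes and the root, and learns nothing else about the other three components beyond their hashes. A production scheme would also need to separate leaf hashes from inner-node hashes and salt low-entropy components, which this sketch skips.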

 

Thanks,

Immad

 

From: corda-dev@groups.io <corda-dev@groups.io> On Behalf Of Mike Hearn via Groups.Io
Sent: Friday, January 17, 2020 4:13 AM
To: corda-dev@groups.io
Subject: [EXTERNAL] Re: [corda-dev] Corda's privacy guarantees

 

Yep, that's right Ryan, thanks for the answer.

The design challenge is combining integrity with privacy. The privacy problem you describe, Immad, can be trivially fixed by moving tokens and updating other state in two separate transactions. However, then you risk bringing back the "breaks" and reconciliation problems that people wanted blockchain to fix. So the standard approach is to combine all changes that must be atomic from a business perspective into one transaction and then either mitigate or accept the resulting privacy leaks.

Clearly this is unsatisfying, hence our research into (primarily) SGX but we've also done some research into what it'd take to provide ZKPs in Corda too. In fact I've written a paper on what's involved with ZKPs which I'll get around to publishing on the Corda blog+docsite at some point. But suffice it to say it's a lot, at least for the general case.

With SGX you can do arbitrary full speed computation on encrypted data, so then you can get both privacy and integrity. That's where we're heading.


Mike Hearn
 

Yes, partial validation is an interesting idea that crops up regularly. But it's very tricky to reason about the correctness of systems that use it. Is it still a transaction?

Why do databases have transactions anyway? Some don't. Very old SQL DBs often didn't have them and they came later as an extension. The initial wave of NoSQL databases didn't have them.

Developers like having transactions because otherwise it gets too hard to write bug-free code. Without them, developers can see a dataset that's not in a logically consistent state and have to anticipate that, which is really brain scrambling. A partially validated transaction opens the possibility that nobody verified the whole transaction at any point (e.g. with a non-validating notary). The ledger could then become logically inconsistent if one part of the partial tx was invalid.

Consider a three-way transaction in which party A pays party B tokens and in return party B sends an asset to party C. Corda's design says that nobody should be able to see the ledger in a state where someone might later claim there was a bug and things should be rolled back, as we assume (like all blockchain designs assume) that in a decentralised setting co-ordinating a rollback is very hard. So if the token transfer from A->B is valid but the asset transfer from B->C is rejected by C because the asset is invalid, the tokens shouldn't move. If they did then A might talk to C and discover the asset they paid for didn't move, and so the tokens should come back. Avoiding the need for inter-firm reconciliation was the original design brief for the whole project! Without atomic, valid transactions it's not clear how to achieve that.

Sometimes people counter with: maybe A, B and C can all just check the transaction and refuse to sign if it isn't valid? That works in some cases but not others. If validity could be determined entirely through signatures of the relevant parties, anyone who owned some tokens could just edit the value as they see fit and then sign its validity themselves. Or you have to define the issuer as a relevant party, but then they have to counter-sign every transaction and the entire payment system halts if they go offline or decide to start selectively blocking transactions. Normally people assume decentralised = robust, so if you compromise on that to try and get a different privacy model there will be many surprised faces when an outage takes out the entire (supposedly P2P) network.
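
The token-editing case can be illustrated with a small sketch. It assumes a toy model in which signatures are reduced to a set of party names; `Token`, `Tx`, `signed_by_owners` and `conserves_value` are hypothetical names, not Corda types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    owner: str
    amount: int


@dataclass(frozen=True)
class Tx:
    inputs: tuple
    outputs: tuple
    signers: frozenset  # parties with a signature attached (the signatures themselves are not modelled)


def signed_by_owners(tx: Tx) -> bool:
    """Signature-only rule: every owner of an input state has signed."""
    return all(t.owner in tx.signers for t in tx.inputs)


def conserves_value(tx: Tx) -> bool:
    """Contract-style rule: a transfer may not create or destroy tokens."""
    return sum(t.amount for t in tx.inputs) == sum(t.amount for t in tx.outputs)


# A owns 10 tokens, rewrites them as 1000 and signs the result itself.
inflate = Tx(
    inputs=(Token("A", 10),),
    outputs=(Token("A", 1000),),
    signers=frozenset({"A"}),
)
```

Here `signed_by_owners(inflate)` holds while `conserves_value(inflate)` does not, so a later recipient who only checks signatures would accept tokens created out of thin air.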

The reason we've invested in SGX is that it's very hard to combine correctness and privacy if you can't compute on private data and it's not clear that privacy should trump correctness or vice-versa. Real systems usually need both. For now we muddle through finding complex compromises in each case, but once we get SGX integrated in various forms we don't have to pick anymore. Of course, it introduces new tradeoffs and problems (e.g. what if SGX breaks). But that's a different story.

I think it's an open question to what extent supporting decentralised tokens warps the design of blockchain systems. That's where some of our own debates focus these days. If we didn't care about robust HA/private tokens, are there any other cases where validity can't be decided by a group of "relevant participants"? My suspicion is that tokens aren't special and the need to be able to update database entries without the involvement of potentially relevant parties is a requirement of many projects, e.g. anything where going offline at the "right" time would block a trade yielding a commercial advantage for someone. But I don't think we have a rigorous formal approach to analysing such requirements.


Immad Naseer
 

I'm in total agreement with the power of transactional semantics in constructing correct software. In the example you gave, it's much easier for the exchange between A, B and C to be part of the same transaction as opposed to being split in two as that can lead to messy out-of-band reconciliation.

 

I'm not entirely sure whether snipping certain parts of the tx tree necessarily leads to the loss of transactional semantics in the system. I should also clarify that by tx tree, I meant the tree of all transactions which lead to the input states for a particular transaction object.

 

I was asking a slightly different question. Is there value in _sometimes_ not sharing the entire transaction tree? Instead of each node validating the entire transaction tree each time, is it possible to snip certain branches of that tree if the application's integrity will not be jeopardized by it?

 

As a concrete example, consider parties A and B performing an atomic swap of a token for a unit of stock (taken from the first message in the thread). A and B should verify the full transaction dealing with that atomic swap, of course. But instead of verifying the full transaction tree of the unit of stock, can A be satisfied by just inspecting the state-chain of the stock, while the other parts of the transaction tree are torn off (*see footnote)? A will not be able to convince itself that the previous transaction, which dealt with the atomic swap between the stock and a crypto-kitty, was valid as a whole, and it won't inspect the history of the crypto-kitty in question, but this information might not matter to A as it's only concerned with the validity of the unit of stock which it's been given. If the stock being part of an invalid transaction invalidated the stock, this approach won't work, of course. But I wonder if there are cases where that's not a problem? We can argue that the onus of verifying the integrity of the transaction between the stock and the crypto-kitty is on parties B and X, which were part of that transaction, and party A doesn't need to concern itself with those details as it's only concerned with the unit of stock. If a party had to verify the entire tx tree each time, then the more times an asset changed hands as part of different txs, the more of the world state would have to be revealed to the validating party.

 

This is a contrived example, of course. And I completely understand why Corda made the design decisions it did. I do wonder if there are use cases which could benefit from the flexibility of choosing to hide some parts of the transaction tree in order to balance the competing concerns of privacy and integrity. Applications could then choose the right point on the privacy/integrity/complexity spectrum for themselves.

 

I do agree that SGX will address all these issues (modulo the concerns around the integrity and security of SGX itself).

 

Thanks,

Immad

 

*Over here, we're assuming that the contract will have a verify function for each state object, in addition to a verify function for the entire tx object. This way it will be possible to still determine the validity of an isolated state object even if the entire transaction object was not available.

 

From: corda-dev@groups.io <corda-dev@groups.io> On Behalf Of Mike Hearn via Groups.Io
Sent: Monday, January 20, 2020 5:11 AM
To: corda-dev@groups.io
Subject: Re: [EXTERNAL] Re: [corda-dev] Corda's privacy guarantees

 



Immad Naseer
 

Just a quick follow-up note: I've been using 'tx tree' and 'state-chain' as if they were widely understood terms, and they aren't. Hopefully the meaning was clear from context. The suggestion was to always validate the chain of state transitions all the way back to the genesis of the state object, but to optionally snip parts of the tx tree so that the state-chains of other, possibly irrelevant state objects are not validated, and to do this only if the application's integrity semantics aren't violated.
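
A rough sketch of that suggestion, under the assumption that each transaction can be torn into per-asset slices and that a per-state verify exists (reduced here to a boolean). `Slice`, `Ledger` and `validate_state_chain` are hypothetical names for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Slice:
    """The part of one transaction that concerns a single asset."""
    asset_id: str
    prev_tx: Optional[str]  # transaction that produced the consumed state; None at issuance
    locally_valid: bool     # stand-in for running a per-state verify function


# tx id -> {asset id -> slice}. A torn-off transaction simply omits the other assets' slices.
Ledger = Dict[str, Dict[str, Slice]]


def validate_state_chain(ledger: Ledger, tx_id: str, asset_id: str) -> bool:
    """Follow one asset back to its issuance, ignoring every other branch."""
    current: Optional[str] = tx_id
    while current is not None:
        piece = ledger.get(current, {}).get(asset_id)
        if piece is None or not piece.locally_valid:
            return False
        current = piece.prev_tx
    return True


ledger: Ledger = {
    "issue-stock": {"stock": Slice("stock", None, True)},
    # The stock was swapped for a crypto-kitty here; the kitty slice is withheld from A.
    "swap-kitty": {"stock": Slice("stock", "issue-stock", True)},
    "swap-token": {"stock": Slice("stock", "swap-kitty", True)},
}
```

`validate_state_chain(ledger, "swap-token", "stock")` succeeds without A ever seeing the crypto-kitty's history. What it cannot show is that "swap-kitty" was valid as a whole, which is exactly the trade-off in question.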

 

From: corda-dev@groups.io <corda-dev@groups.io> On Behalf Of Immad Naseer via Groups.Io
Sent: Monday, January 20, 2020 12:51 PM
To: corda-dev@groups.io
Subject: Re: [EXTERNAL] Re: [corda-dev] Corda's privacy guarantees

 



Mike Hearn
 

Right, I see. Tudor has done some thinking about this, perhaps he will chip in.

You could certainly implement that with the APIs. Partial resolution of a transaction graph would require some new flows that you can customise to specify the privacy policy. There'd be some work needed to store partial transactions in the database, but we need to do that work for SGX integration anyway. In fact it's next up on the list of tasks. So this is really a question of what attacks such an approach enables and whether anyone cares.

I think there are cases where it can be useful to do. One is observer nodes e.g. regulators. An observer is a special case because the cost of being temporarily mistaken can be much lower for them. If an observer traces the history of an asset through a partially visible transaction graph and then at some point someone pops up and says, hey, actually this transaction was reported to you but it wasn't valid and should have never happened, probably it's no big deal. They are just keeping statistics or collecting evidence for enforcement actions, something like that. The real world consequences of a Corda transaction arriving are very much delayed, and can be undone if the full tx is presented and the node is shown that the tx was never valid. In particular undoing mistakes doesn't imply transitively contacting lots of other nodes to say your entire chain of txns was invalid and must be rolled back. It's that chain-of-rollbacks-problem that leads to blockchain systems preferring validity over privacy. Where it doesn't apply you could optimise data propagation.

"Over here, we're assuming that the contract will have a verify function for each state object, in addition to a verify function for the entire tx object. This way it will be possible to still determine the validity of an isolated state object even if the entire transaction object was not available."

Yes absolutely.

One of the original motivations for using Java POJOs to model data was so that objects can use private fields, bean validation, constructors etc to enforce local validity of the object graph. In the end Bean Validation didn't quite happen because Hibernate Validator is very heavy and reflection driven, which doesn't mesh too well with deterministic execution. It doesn't have to be that way though, and a Bean Validation implementation that used code generation via annotation processors would be a nice thing to have, especially as it'd fit better with AOT compilers like GraalVM native-image. So it's more like a temporary roadblock or priorities issue than anything fundamental.

A contract verify that just invokes a verify method on each state object would therefore be a totally reasonable pattern to have, and ideally even encode into the platform itself. Albeit evolving the data model gets more expensive as time goes on. You end up needing a moral equivalent of polyfills to ensure newer apps can run on older nodes.
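
As a language-neutral sketch of that pattern (written in Python rather than as Corda code, with `VerifiableState`, `StockState` and `contract_verify` as hypothetical names), a contract-level verify that only delegates to each state might look like this:

```python
from dataclasses import dataclass
from typing import List, Protocol


class VerifiableState(Protocol):
    def verify(self) -> None: ...


@dataclass(frozen=True)
class StockState:
    owner: str
    quantity: int

    def verify(self) -> None:
        # Local invariants only: nothing here looks at any other state.
        if not self.owner:
            raise ValueError("stock must have an owner")
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")


def contract_verify(inputs: List[VerifiableState], outputs: List[VerifiableState]) -> None:
    """Contract-level verify that delegates to each state's own verify method."""
    for state in [*inputs, *outputs]:
        state.verify()
```

Because each state checks itself, a verifier holding a single state can still run `state.verify()` when the rest of the transaction is unavailable. Rules that relate several states to each other would still need a check that sees the whole transaction.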


Tudor Malene
 

Hi Immad,

 

This is a problem that also kept me up at night a bit. As Mike already mentioned, we are having internal debates about this already.

I will try to briefly describe one way we attempt to approach this problem.

This is by no means a design or something that we plan to implement in the foreseeable future.

For now it's just ideas, but it would be good to hear your thoughts and maybe those of the community.

 

 

This explores what it would take to allow verifying only partial transactions, without disclosing the irrelevant parts (which is also the problem you describe).

This is actually what you would expect intuitively to happen as it matches the real world more closely. For example when someone gives you a £10 note, you don't care what it was used for before you got it. All you care about is that it is a genuine banknote that other people will also accept.

Solving this problem would also unlock a number of other features that would make life for Corda Networks much easier.

 

As mentioned by Mike, SGX is one way to address the privacy issue. It will hide all the history, but at some performance cost, and with a slight risk that SGX itself can be hacked.

 

This is an idea that is totally orthogonal to the SGX approach.

To understand this line of reasoning, we first have to look at the incentives of different participants in a Corda network.

 

Let's take Mike's example of: "Consider a three-way transaction in which party A pays party B tokens and in return party B sends an asset to party C",

and the business goal of: "Corda's design says that nobody should be able to see the ledger in a state where someone might later claim there was a bug and things should be rolled back, as we assume (like all blockchain designs assume) that in a decentralised setting co-ordinating a rollback is very hard"

 

So what are the high level incentives in this case:

A – wants C to receive 1 AssetX from B, and also in this case wants C to acknowledge the fact that it was A who transferred this asset.

B – wants to receive 1 TokenY

C – just wants the 1 AssetX

IssuerX and IssuerY – want nobody to mess with the overall value of the asset they issued. Basically all they really care about is that transactions don't leave them undercapitalised.

A, B and C – want to be sure that nobody can steal their assets.

 

Let's see how these requirements are met with the current Corda model first:

Backchain resolution (calling the `verify` method on all historic transactions) ensures that the issuers' incentive, and the participants' incentive that nobody steals their assets, are enforced by general consensus. A future node will not accept a chain if it contains even one transaction that breaches one of these rules. Also, when redeeming some asset, the issuer itself will verify the entire backchain to further protect its own interests. (The assumption is that the smart contracts are written correctly.)

 

The fact that one slice of the transaction can't happen without the other is ensured by the fact that the transaction is always presented as a whole, so if any bit of it is invalid then all outputs are invalid. This is the critical piece that Mike also described that protects the incentives of the participants in the current transaction.

 

 

If we tear transactions into individual transitions, with the current design, it means that the last step can't be enforced any longer.

As a simple example: Tx(A transfers 1 MegaToken to B, B transfers 1 AssetX to A).

If A signs first, then one slice of the transaction becomes valid even if B doesn't sign the transaction, so neither participant can risk signing first.

 

In a nutshell, to maintain the atomicity property without disclosing the irrelevant bits of a transaction, we need to add another piece of information to the disclosed transaction: something like a conceptual “Zero knowledge proof” that the transaction is valid overall, even if you are presented with just one slice of it.

 

I think that this is very easy to solve if we consider the incentives described above, and the actors are rational.

 

Let’s say we add another Component group to the transaction Merkle Tree – “Required signers component group”.

Each participant with a stake in this transaction, when they are building it, would add the public key of the owner of the asset(s) they must receive to this list of "Required signers".

In our simple example: A would add B, and B would add A. (If the assets are owned by anonymous keys, it would be those keys of course).

Neither A, nor B would sign this transaction if it didn't contain the other party as a required signer.

Let's assume that in the future, this “Required signers component group” is the only thing that is disclosed, besides the relevant part of the transaction.

This would allow future holders of this token/asset to verify that atomicity was respected in the past by checking that the interest of the relevant Issuer/participant was respected. They would basically just check that the transaction is signed by all those keys.

The other incentives are verified the same way as before, by running the contract on the backchain of only the relevant asset. This would reduce the graph to verify to only the bits of the transactions containing that asset.

 

To recap, this new “Required signers component group” would act as an atomicity enforcer that is populated by the actual participants in the transaction, who are assumed to act rationally.

If everyone uses anonymous keys, then it would not disclose any useful information to future verifiers. It would be a Zero knowledge proof that the transaction is atomic.

 

Note: In the case of the three-way A, B and C example, A would add both B and C to the required signers.
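
The check a future holder would run can be sketched as follows, using the simple two-party swap. This is only an illustration of the idea as described: `DisclosedSlice` and `atomicity_respected` are hypothetical names, keys are plain strings, and signature verification itself is not modelled.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DisclosedSlice:
    """What a future holder is shown: its own asset's slice plus two sets of keys."""
    asset_id: str
    required_signers: FrozenSet[str]  # keys the participants listed while building the tx
    signatures: FrozenSet[str]        # keys with a signature on the tx (checking them is not modelled)


def atomicity_respected(piece: DisclosedSlice) -> bool:
    """Every key named as a required signer must have signed the transaction."""
    return piece.required_signers <= piece.signatures


# Simple swap: A added B's key and B added A's key to the required signers.
required = frozenset({"key-A", "key-B"})
complete = DisclosedSlice("MegaToken", required, frozenset({"key-A", "key-B"}))
half_signed = DisclosedSlice("MegaToken", required, frozenset({"key-A"}))
```

`atomicity_respected(complete)` holds, while `half_signed` is rejected, so the slice that A signed first does not become usable on its own. Running the contract over the asset's own backchain would still be needed for the other incentives.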

 

Note: This is just a very short description of the idea. There are many more things to consider, and some other small changes to the data model are required.

 

Looking forward to hearing your thoughts.

 

 

From: <corda-dev@groups.io> on behalf of "Mike Hearn via Groups.Io" <mike@...>
Reply to: "corda-dev@groups.io" <corda-dev@groups.io>
Date: Tuesday, 21 January 2020 at 09:53
To: "corda-dev@groups.io" <corda-dev@groups.io>
Subject: Re: [EXTERNAL] Re: [corda-dev] Corda's privacy guarantees

 

Right, I see. Tudor has done some thinking about this, perhaps he will chip in.

You could certainly implement that with the APIs. Partial resolution of a transaction graph would require some new flows that you can customise to specify the privacy policy. There'd be some work needed to store partial transactions in the database, but we need to do that work for SGX integration anyway. In fact it's next up on the list of tasks. So this is really a question of what attacks such an approach enables and whether anyone cares.

I think there are cases where it can be useful to do. One is observer nodes e.g. regulators. An observer is a special case because the cost of being temporarily mistaken can be much lower for them. If an observer traces the history of an asset through a partially visible transaction graph and then at some point someone pops up and says, hey, actually this transaction was reported to you but it wasn't valid and should have never happened, probably it's no big deal. They are just keeping statistics or collecting evidence for enforcement actions, something like that. The real world consequences of a Corda transaction arriving are very much delayed, and can be undone if the full tx is presented and the node is shown that the tx was never valid. In particular undoing mistakes doesn't imply transitively contacting lots of other nodes to say your entire chain of txns was invalid and must be rolled back. It's that chain-of-rollbacks-problem that leads to blockchain systems preferring validity over privacy. Where it doesn't apply you could optimise data propagation.

Over here, we're assuming that the contract will have a verify function for each state object, in addition to a verify function for the entire tx object. This way it will be possible to still determine the validity of an isolated state object even if the entire transaction object was not available.

Yes absolutely.

One of the original motivations for using Java POJOs to model data was so that objects can use private fields, bean validation, constructors etc to enforce local validity of the object graph. In the end Bean Validation didn't quite happen because Hibernate Validator is very heavy and reflection driven, which doesn't mesh too well with deterministic execution. It doesn't have to be that way though, and a Bean Validation implementation that used code generation via annotation processors would be a nice thing to have, especially as it'd fit better with AOT compilers like GraalVM native-image. So it's more like a temporary roadblock or priorities issue than anything fundamental.
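As a small illustration of the local-validity point, here is a minimal sketch of a state class that rejects bad values in its constructor, so no instance can exist in an invalid form. The class `TokenState` and its fields are hypothetical and are not Corda types.

```java
import java.util.Objects;

// Hypothetical state object, not a Corda type: invariants are enforced
// in the constructor, so every instance that exists is locally valid.
final class TokenState {
    private final String owner;  // private and final: fixed after construction
    private final long amount;

    TokenState(String owner, long amount) {
        this.owner = Objects.requireNonNull(owner, "owner");
        if (owner.isEmpty()) {
            throw new IllegalArgumentException("owner must not be empty");
        }
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        this.amount = amount;
    }

    String getOwner() { return owner; }

    long getAmount() { return amount; }
}
```

A Bean Validation style approach would express the same rules declaratively through annotations; the hand-written checks here only stand in for that.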

A contract verify that just invokes a verify method on each state object would therefore be a totally reasonable pattern to have, and ideally even one to encode into the platform itself, although evolving the data model gets more expensive as time goes on. You end up needing a moral equivalent of polyfills to ensure newer apps can run on older nodes.
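The pattern could look roughly like the sketch below: each state checks itself, and the contract then applies one rule that needs the whole transaction. `SelfVerifyingState`, `CashState` and `CashContractSketch` are hypothetical names for illustration only; Corda's real `Contract.verify` takes a `LedgerTransaction` and is not modelled here.

```java
import java.util.List;

// Hypothetical interface, not part of Corda: a state that can check itself.
interface SelfVerifyingState {
    void verify(); // throws IllegalArgumentException if the state is invalid
}

// Hypothetical cash-like state used only for this sketch.
final class CashState implements SelfVerifyingState {
    final String owner;
    final long amount;

    CashState(String owner, long amount) {
        this.owner = owner;
        this.amount = amount;
    }

    @Override
    public void verify() {
        if (owner == null || owner.isEmpty()) {
            throw new IllegalArgumentException("missing owner");
        }
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
    }
}

final class CashContractSketch {
    // Per-state checks first, then one rule that needs the whole transaction.
    static void verify(List<CashState> inputs, List<CashState> outputs) {
        for (CashState s : inputs) s.verify();
        for (CashState s : outputs) s.verify();

        long in = inputs.stream().mapToLong(s -> s.amount).sum();
        long out = outputs.stream().mapToLong(s -> s.amount).sum();
        if (in != out) {
            throw new IllegalArgumentException("value not conserved: " + in + " != " + out);
        }
    }
}
```

The per-state part can run on an isolated state; the conservation rule is the kind of check that still needs more of the transaction to be visible.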


Immad Naseer
 

Mike, good to know that the work on storing partial transactions in the vault is up there on the backlog. The ability to inform the transaction graph resolution strategy of the required privacy constraints should then enable this use case. Regarding the contract verify calling the verify of each state object in turn, that will of course work. It might still be useful to have a tx-level verify function which can validate whether all the state objects (which are all independently valid) are being combined in a valid way.

 

I totally hear you on the complexity of evolving the data model while maintaining backwards compatibility.

 

Tudor, your £10 note example was the one I had in mind as well. We typically don't care about the world state a bank note has been involved in modifying - we just care about the validity of the note itself.

 

The 'required signers' approach you presented is very insightful. It goes a step further than what I was originally trying to solve. I was thinking about hiding parts of the _historical_ transaction graph and having participants only validate the state-chain/back-chain of the assets they were dealing with. Your approach enables hiding parts of a _new_ transaction object while still retaining atomicity and commitment guarantees of a transaction object. I loved the approach.

 

It sounds like planned future work here (both shorter and longer term) should unlock this case, if anyone desires to use it.

 

Thanks,

Immad

 

From: corda-dev@groups.io <corda-dev@groups.io> On Behalf Of Tudor Malene via Groups.Io
Sent: Tuesday, January 21, 2020 3:01 AM
To: corda-dev@groups.io
Subject: Re: [EXTERNAL] Re: [corda-dev] Corda's privacy guarantees

 

Hi Immad,

 

This is a problem that also kept me up at night a bit. As Mike already mentioned, we are having internal debates about this already.

I will try to briefly describe one way we attempt to approach this problem.

This is by no means a design or something that we plan to implement in the foreseeable future.

For now it's just ideas, but it would be good to hear your thoughts and maybe those of the community.

 

 

This explores what it would take to allow verifying only partial transactions, without disclosing the irrelevant parts (which is also the problem you describe).

This is actually what you would intuitively expect to happen, as it matches the real world more closely. For example, when someone gives you a £10 note, you don't care what it was used for before you got it. All you care about is that it is a genuine banknote that other people will also accept.

Solving this problem would also unlock a number of other features that would make life for Corda Networks much easier.

 

As mentioned by Mike, SGX is one way to address the privacy issue. It will hide all the history, but at some performance cost, and with a slight risk that SGX itself can be hacked.

 

This is an idea that is totally orthogonal to the SGX approach.

To understand this line of reasoning, we first have to look at the incentives of different participants in a Corda network.

 

Let's take Mike's example of: "Consider a three-way transaction in which party A pays party B tokens and in return party B sends an asset to party C",

and the business goal of: "Corda's design says that nobody should be able to see the ledger in a state where someone might later claim there was a bug and things should be rolled back, as we assume (like all blockchain designs assume) that in a decentralised setting co-ordinating a rollback is very hard"

 

So what are the high level incentives in this case:

A – wants C to receive 1 AssetX from B, and also in this case wants C to acknowledge the fact that it was A who transferred this asset.

B – wants to receive 1 TokenY

C – just wants the 1 AssetX

IssuerX and IssuerY – want nobody to mess with the overall value of the asset they issued. Basically all they really care about is that transactions don't leave them undercapitalised.

A, B and C - want to be sure that nobody can steal their assets.

 

Let's see how these requirements are met with the current Corda model first:

Backchain resolution (calling the `verify` method on all historic transactions) ensures that both the issuers' incentive and the participants' incentive (that nobody steals their assets) are enforced by general consensus. A future node will not accept a chain if it contains even one transaction that breaches one of these rules. Also, when redeeming some asset, the issuer itself will verify the entire backchain to further protect its own interests. (The assumption is that the smart contracts are written correctly.)
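A minimal sketch of that backchain walk, under simplifying assumptions: `SketchTx` is a hypothetical stand-in for a stored transaction, and its `contractValid` flag stands in for the result of running the contract's `verify`. It is not how Corda's resolution flows are actually written.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified transaction: an id, the ids of the transactions
// whose outputs it consumes, and a flag standing in for the verify result.
final class SketchTx {
    final String id;
    final List<String> inputTxIds;
    final boolean contractValid;

    SketchTx(String id, List<String> inputTxIds, boolean contractValid) {
        this.id = id;
        this.inputTxIds = inputTxIds;
        this.contractValid = contractValid;
    }
}

final class BackchainSketch {
    // Walks every ancestor of tipId and rejects the chain if any transaction
    // is missing from storage or fails verification.
    static boolean resolveAndVerify(String tipId, Map<String, SketchTx> storage) {
        Deque<String> pending = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        pending.push(tipId);
        while (!pending.isEmpty()) {
            String id = pending.pop();
            if (!seen.add(id)) {
                continue; // already checked via another path in the graph
            }
            SketchTx tx = storage.get(id);
            if (tx == null || !tx.contractValid) {
                return false;
            }
            for (String parent : tx.inputTxIds) {
                pending.push(parent);
            }
        }
        return true;
    }
}
```

The point to notice is that a single invalid ancestor anywhere in the graph makes the tip unacceptable, which is why the whole history normally has to be visible.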

 

The fact that one slice of the transaction can't happen without the other is ensured by the fact that the transaction is always presented as a whole, so if any bit of it is invalid then all outputs are invalid. This is the critical piece that Mike also described that protects the incentives of the participants in the current transaction.

 

 

If we tear transactions into individual transitions, with the current design, it means that the last step can't be enforced any longer.

As a simple example: Tx(A transfers 1 MegaToken to B, B transfers 1 AssetX to A).

If A signs first, then one slice of the transaction becomes valid even if B doesn't sign the transaction, so neither participant can risk signing first.

 

In a nutshell, to still maintain the atomicity property without disclosing the irrelevant bits of a transaction, we need to add another piece of information to the disclosed transaction:

something like a conceptual “Zero knowledge proof” that the transaction is valid overall, even if you are presented with just one slice of it.

 

I think that this is very easy to solve if we consider the incentives described above, and the actors are rational.

 

Let’s say we add another Component group to the transaction Merkle Tree – “Required signers component group”.

Each participant with a stake in this transaction, when they are building it, would add the public key of the owner of the asset(s) they must receive to this list of "Required signers".

In our simple example: A would add B, and B would add A. (If the assets are owned by anonymous keys, it would be those keys of course).

Neither A, nor B would sign this transaction if it didn't contain the other party as a required signer.

Let's assume that in the future, this “Required signers component group” is the only thing that is disclosed, besides the relevant part of the transaction.

This would allow future holders of this token/asset to verify that atomicity was respected in the past by checking that the interest of the relevant Issuer/participant was respected. They would basically just check that the transaction is signed by all those keys.

The other incentives are verified the same as before, by running the contract on the entire backchain of only the relevant asset. This would reduce the graph to verify to only the bits of the transactions containing that asset.
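To make "disclosing only the relevant part" concrete, here is a simplified sketch of a Merkle inclusion proof: a verifier holding one component hash plus its sibling hashes can recompute the transaction id without seeing the other components. This is a generic binary Merkle tree over a power-of-two number of leaves, with made-up component names; it is an assumption-laden illustration and does not reproduce Corda's actual tree construction.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MerkleSketch {
    static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] p : parts) md.update(p);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 ships with every JDK
        }
    }

    static byte[] leaf(String component) {
        return sha256(component.getBytes(StandardCharsets.UTF_8));
    }

    // Hashes adjacent pairs to build the next level up.
    private static List<byte[]> parents(List<byte[]> level) {
        List<byte[]> next = new ArrayList<>();
        for (int i = 0; i < level.size(); i += 2) {
            next.add(sha256(level.get(i), level.get(i + 1)));
        }
        return next;
    }

    // Root of a tree whose leaf count is a power of two (a simplification).
    static byte[] root(List<byte[]> leaves) {
        List<byte[]> level = leaves;
        while (level.size() > 1) level = parents(level);
        return level.get(0);
    }

    // Sibling hashes, from leaf level upward, for the leaf at index.
    static List<byte[]> proof(List<byte[]> leaves, int index) {
        List<byte[]> path = new ArrayList<>();
        List<byte[]> level = leaves;
        int i = index;
        while (level.size() > 1) {
            path.add(level.get(i ^ 1)); // the neighbour in the same pair
            level = parents(level);
            i /= 2;
        }
        return path;
    }

    // Recomputes the root from one visible leaf and its sibling hashes.
    static boolean verify(byte[] leafHash, int index, List<byte[]> path, byte[] expectedRoot) {
        byte[] h = leafHash;
        int i = index;
        for (byte[] sibling : path) {
            h = (i % 2 == 0) ? sha256(h, sibling) : sha256(sibling, h);
            i /= 2;
        }
        return Arrays.equals(h, expectedRoot);
    }
}
```

With four components, the proof for any one of them is just two hashes, and those hashes reveal nothing readable about the hidden components.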

 

To recap, this new “Required signers component group” would act as an atomicity enforcer, populated by the actual participants in the transaction, who are assumed to act rationally.

If everyone uses anonymous keys, then it would not disclose any useful information to future verifiers. It would be a Zero knowledge proof that the transaction is atomic.
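The check a future holder would run could be sketched as follows: given the transaction id, the disclosed list of required signers and the attached signatures, confirm that every required key has signed. All names here (`RequiredSignersSketch`, `atomicityRespected` and so on) are hypothetical, and plain ECDSA from the JDK is used as a stand-in for whatever signature schemes a real network would use.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.util.List;

final class RequiredSignersSketch {
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("EC");
            gen.initialize(256);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] sign(KeyPair keyPair, byte[] txId) {
        try {
            Signature s = Signature.getInstance("SHA256withECDSA");
            s.initSign(keyPair.getPrivate());
            s.update(txId);
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if at least one attached signature over txId verifies under key.
    static boolean isSignedBy(PublicKey key, byte[] txId, List<byte[]> signatures) {
        for (byte[] sig : signatures) {
            try {
                Signature s = Signature.getInstance("SHA256withECDSA");
                s.initVerify(key);
                s.update(txId);
                if (s.verify(sig)) {
                    return true;
                }
            } catch (GeneralSecurityException e) {
                // Malformed signature or unsuitable key: treat as no match.
            }
        }
        return false;
    }

    // The atomicity check: every key in the disclosed "required signers"
    // group must have signed the transaction id.
    static boolean atomicityRespected(byte[] txId,
                                      List<PublicKey> requiredSigners,
                                      List<byte[]> signatures) {
        for (PublicKey key : requiredSigners) {
            if (!isSignedBy(key, txId, signatures)) {
                return false;
            }
        }
        return true;
    }
}
```

In the two-party example, a slice carrying only A's signature would fail this check because B's key is listed as required, so A signing first no longer makes any slice acceptable on its own.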

 

Note: In the case of the three-way A, B and C example, A would add both B and C to the required signers.

 

Note: This is just a very short description of the idea. There are many more things to consider, and some other small changes to the data model are required.

 

Looking forward to hearing your thoughts.

 

 
