[Ibmtpm20tss-users] tpm sessions

James Bottomley
 

On Mon, 2019-03-18 at 21:05 +0000, Doug Fraser wrote:
> Ken,
>
> We are interfacing to the TPM via openssl with engine = tpm2, so at
> that level, do we have direct access to session control?
>
> The test server requests openssl to encrypt a block of data, then
> passes it to the client. The client requests openssl to decrypt the
> block, and checks its content. At that point, at the application
> level, the transaction is complete.
>
> Is there some other operation that the application has to request
> from openssl to flush the related context?

No, as I explained in the other email, the openssl_tpm2_engine mostly
keeps the TPM clear when it's not doing anything, even if the key is
present and referenced from an openssl point of view.

> I'll go back to something James mentioned in his reply, the failure
> seems to be percolating up from the SPI device layer.
> Is there a serialization point within the SPI layer to marshal
> commands into and out of the TPM?

Yes, it's chip->tpm_mutex, which is taken in tpm_common_write via
tpm_try_get_ops.

> I noticed a flag based mutex() with tpm_command(), but I haven't
> spilled data from that yet.
> Is that the serialization point for the interface?
>
> I don't suspect signal integrity at the low clock rate we are
> currently running, particularly in regard to the extreme long term
> test that was run over the weekend.

Well, something caused the -EIO in tpm_try_transmit and that can only
come from the underlying driver. For the tis driver it means that
STS_DATA_EXPECT wasn't signalled as it should be for transmitting data.
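
For readers following the numbers in this thread, here is a minimal sketch of how the reported codes can be read. It assumes a Linux errno table and uses a deliberately partial table of TPM 2.0 warning codes (RC_WARN is 0x900 in Part 2 of the TPM 2.0 library specification). The helper names are made up for this illustration and are not part of any TSS or kernel API.

```python
import errno

# Partial table of TPM 2.0 warning codes (RC_WARN plus a small index).
# Only a few session-related entries are listed; this is not exhaustive.
RC_WARN = 0x900
TPM2_WARNINGS = {
    RC_WARN + 0x001: "TPM_RC_CONTEXT_GAP",
    RC_WARN + 0x003: "TPM_RC_SESSION_MEMORY",
    RC_WARN + 0x005: "TPM_RC_SESSION_HANDLES",
}


def describe_errno(ret: int) -> str:
    """Map a negative kernel return value such as -5 to its errno name."""
    return errno.errorcode.get(-ret, "unknown")


def describe_tpm_rc(rc: int) -> str:
    """Return the hex form plus a symbolic name if the partial table has one."""
    name = TPM2_WARNINGS.get(rc, "not in this partial table")
    return f"0x{rc:03x} {name}"


print(describe_errno(-5))     # the tpm_send error from the kernel log
print(describe_tpm_rc(2309))  # the TPM2_StartAuthSession failure
print(describe_tpm_rc(357))   # the code logged while flushing context
```

The first two calls give EIO and 0x905 TPM_RC_SESSION_HANDLES, matching what is reported in the thread. The third only converts 357 to hex (0x165); naming it would need the full response-code tables.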

James

James Bottomley
 

On Mon, 2019-03-18 at 15:48 -0500, Kenneth Goldman wrote:
> > From: Doug Fraser <doug.fraser@...>
> > To: Kenneth Goldman <kgoldman@...>
> > Cc: "Ibmtpm20tss-users@..." <Ibmtpm20tss-users@...>,
> > "openssl-tpm2-engine@groups.io" <openssl-tpm2-engine@groups.io>
> > Date: 03/18/2019 03:48 PM
> > Subject: RE: [Ibmtpm20tss-users] tpm sessions
> >
> > Ken,
> >
> > Thank you for that information.
> >
> > I don't believe we should be exceeding that limit.
> >
> > The nature of the test is that each thread spawns a shell that
> > invokes a client/server pair that each use openssl (and the tpm2
> > engine) to trade a piece of data. One side is encrypting, the
> > other decrypting.
> >
> > So for any given thread, there are two processes, that both
> > completely exit and then the shell terminates, and the process
> > repeats, with a new shell invocation.
> >
> > So for seven test threads, worst case is 2 * 7 simultaneous
> > applications active (14) which seems safely less than 21.
>
> 14 applications ... but how many sessions per application?

The openssl_tpm2_engine typically only uses 1 session for each command.
We use the same session for hmac and decryption and we don't run
audits. I think all of the commands we use only require at most one
authorization. We also don't keep sessions loaded, so every time you
use the key, we go through a start session/load key/do key op/flush
handles sequence.

> That is, a typical operation will use 2-3 sessions - 2 for
> authorization and perhaps one for audit.
>
> If an application doesn't either explicitly flush,
> or exit so that the resource manager will clean up, sessions
> can hang around. Eventually the 64 session limit is reached.

There's another possibility as well, which is that an application that
keeps a session for a long time can cause a gap error (the TPM doesn't
allow session sequence numbers to wrap around).

However, openssl_tpm2_engine was coded with the old /dev/tpm0 interface
in mind, so it tries to keep the session and volatile handle use
periods as short as possible.

> If it runs for a while and then fails, perhaps there is a
> 'session leak'?
>
> FWIW, when the RM was first coded, I regression tested it.
> I ran 21 processes, and each created 3 sessions, did something,
> and then flushed them, in a loop. It ran quite a while.
>
> However, I did not test a process closing without explicitly
> flushing. Should I do that?

We should already have tested that, but there's no harm in making sure.
The current smoke tests are in the kernel under
tools/testing/selftests/tpm2
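
The start session/load key/do key op/flush handles sequence described above can be sketched as a scoped resource pattern. This is only an illustration of the idea, not the engine's actual code: FakeTpm, allocate, flush and key_operation are hypothetical names, and the fake device does nothing except count handles.

```python
from contextlib import contextmanager


class FakeTpm:
    """Hypothetical stand-in for a TPM; it only tracks handles in use."""

    def __init__(self) -> None:
        self.open_handles = set()
        self._next = 0

    def allocate(self, kind: str) -> str:
        self._next += 1
        handle = f"{kind}-{self._next}"
        self.open_handles.add(handle)
        return handle

    def flush(self, handle: str) -> None:
        self.open_handles.discard(handle)


@contextmanager
def key_operation(tpm: FakeTpm):
    """Start a session, load the key, and always flush both afterwards."""
    session = tpm.allocate("session")
    try:
        key = tpm.allocate("key")
        try:
            yield session, key
        finally:
            tpm.flush(key)
    finally:
        # Runs even if the key operation raised, so no session is left behind.
        tpm.flush(session)


tpm = FakeTpm()
for _ in range(1000):
    with key_operation(tpm) as (session, key):
        pass  # the sign or decrypt step would go here

print(len(tpm.open_handles))  # 0
```

Because every operation releases what it took, the number of handles in use between operations stays at zero however many operations are run.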

James


> Doug, James:
>
> Should I rerun the test on a newer kernel? Perhaps there
> was a regression.
>
> I.e., if I can help, let me know.



Doug Fraser
 

Ken,

We are interfacing to the TPM via openssl with engine = tpm2, so at that level, do we have direct access to session control?

The test server requests openssl to encrypt a block of data, then passes it to the client. The client requests openssl to decrypt the block, and checks its content. At that point, at the application level, the transaction is complete.

Is there some other operation that the application has to request from openssl to flush the related context?

I'll go back to something James mentioned in his reply, the failure seems to be percolating up from SPI device layer.
Is there a serialization point within the SPI layer to marshal commands into and out of the TPM?

I noticed a flag based mutex() with tpm_command(), but I haven't spilled data from that yet.
Is that the serialization point for the interface?

I don't suspect signal integrity at the low clock rate we are currently running, particularly in regard to the extreme long term test that was run over the weekend.

Doug

Kenneth Goldman <kgoldman@...>
 


> From: Doug Fraser <doug.fraser@...>

> To: Kenneth Goldman <kgoldman@...>
> Cc: "Ibmtpm20tss-users@..." <Ibmtpm20tss-
> users@...>, "openssl-tpm2-engine@groups.io"
> <openssl-tpm2-engine@groups.io>

> Date: 03/18/2019 03:48 PM
> Subject: RE: [Ibmtpm20tss-users] tpm sessions
>
> Ken,

>  
> Thank you for that information.
>  
> I don’t believe we should be exceeding that limit.
>  
> The nature of the test is that each thread spawns a shell that
> invokes a client/server pair that each use openssl (and the tpm2
> engine) to trade a piece of data. One side is encrypting, the other
> decrypting.

>  
> So for any given thread, there are two processes, that both
> completely exit and then the shell terminates, and the process
> repeats, with a new shell invocation.

>  
> So for seven test threads, worst case is 2 * 7 simultaneous
> applications active (14) which seems safely less than 21.


14 applications ... but how many sessions per application?

That is, a typical operation will use 2-3 sessions - 2 for
authorization and perhaps one for audit.  
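
To put numbers on that question, here is a small worked calculation using only figures quoted in this thread: 64 active-session slots, seven test threads with two processes each, and 2 to 3 sessions per application. The constant and function names are made up for the sketch.

```python
ACTIVE_SESSION_SLOTS = 64   # typical table size quoted in this thread
THREADS = 7
PROCESSES_PER_THREAD = 2    # one client and one server per test thread


def sessions_needed(apps: int, sessions_per_app: int) -> int:
    """Sessions in use if every application holds its sessions at once."""
    return apps * sessions_per_app


apps = THREADS * PROCESSES_PER_THREAD
for per_app in (2, 3):
    used = sessions_needed(apps, per_app)
    print(per_app, used, used <= ACTIVE_SESSION_SLOTS)

# Applications supported if each one needs 3 sessions.
print(ACTIVE_SESSION_SLOTS // 3)
```

With 14 applications the worst case is 42 sessions, which fits in 64 slots, and 64 // 3 gives the figure of 21 applications. On these assumptions the steady-state load alone should not exhaust the table.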

If an application doesn't either explicitly flush,
or exit so that the resource manager will clean up, sessions
can hang around.  Eventually the 64 session limit is reached.

If it runs for a while and then fails, perhaps there is a
'session leak'?

FWIW, when the RM was first coded, I regression tested it.
I ran 21 processes, and each created 3 sessions, did something,
and then flushed them, in a loop.  It ran quite a while.

However, I did not test a process closing without explicitly
flushing.  Should I do that?

Doug, James:

Should I rerun the test on a newer kernel?  Perhaps there
was a regression.

I.e., if I can help, let me know.

Doug Fraser
 

Ken,

Thank you for that information.

I don't believe we should be exceeding that limit.

The nature of the test is that each thread spawns a shell that invokes a client/server pair that each use openssl (and the tpm2 engine) to trade a piece of data. One side is encrypting, the other decrypting.

So for any given thread, there are two processes, that both completely exit and then the shell terminates, and the process repeats, with a new shell invocation.

So for seven test threads, worst case is 2 * 7 simultaneous applications active (14) which seems safely less than 21.

I am currently rerunning with three active test threads, and so far that has issued about twice the number of openssl engine calls without leading to any failures. If that is still running error free at the end of my day, I am going to let it run through the night. If that runs without error, I am going to binary search between 3 and 7 to see if there is a 'cliff'.
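
The binary search between 3 and 7 threads could be organised roughly as below. This is a sketch under stated assumptions: 3 threads is known to pass, 7 is known to fail, and failure is monotonic in the thread count. find_cliff and the fails predicate are hypothetical; in practice each probe would be a long soak run, since the fault is intermittent.

```python
from typing import Callable


def find_cliff(lo: int, hi: int, fails: Callable[[int], bool]) -> int:
    """Return the smallest thread count in (lo, hi] for which fails() is true.

    Assumes lo passes, hi fails, and failure is monotonic in thread count.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Stand-in predicate for illustration only: pretend 5 or more threads fail.
print(find_cliff(3, 7, lambda n: n >= 5))  # 5
```

With the range 3 to 7 this needs at most two probes (5, then 4 or 6).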

Thanks again for the high level view of the numbers and limitations coming into play.

Doug


Kenneth Goldman <kgoldman@...>
 

> From: Doug Fraser <doug.fraser@...>
> To: "openssl-tpm2-engine@groups.io" <openssl-tpm2-engine@groups.io>,
> Doug Fraser <doug.fraser@...>, "Ibmtpm20tss-
> users@..." <Ibmtpm20tss-users@...>

> Date: 03/18/2019 02:19 PM
> Subject: [Ibmtpm20tss-users] tpm sessions
>
> So we have moved beyond the signaling issues on our TPM for now, but
> in ramping up performance saturation testing, I am pounding on the
> openssl engine with multiple threads of execution, and I am finding
> this fault.
>
> /var/log/messages:Mar 18 16:43:28 C05BCB00C0A000001153 kern.err
> kernel: [11840.869864] tpm tpm0: tpm_try_transmit: tpm_send: error -5
> /var/log/messages:Mar 18 16:43:28 C05BCB00C0A000001153 kern.err
> kernel: [11840.878969] tpm tpm0: A TPM error (357) occurred flushing context
>
> Those are within the kernel; they reflect up through the applications as:
>
> TPM2_StartAuthSession failed with 2309
> TPM_RC_SESSION_HANDLES - out of session handles - a session must be
> flushed before a new session may be created
> Failed to get Key Handle in TPM EC key routines
>
> The underlying tss code is built with:
>
> CCFLAGS +=  -DTPM_POSIX \
>         -DTPM_INTERFACE_TYPE_DEFAULT="\"dev\""  \
>         -DTPM_DEVICE_DEFAULT="\"/dev/tpmrm0\"" \
>         $(BLD_SYSROOT)
>
> So we should be using the tpmrm  resource manager within the kernel.
>
> If I run the test code as a single instance, this never occurs
> (within the bounds of 64 hours of constant running)
>
> Is there a practical limit to the openssl engine, underlying tpmrm,
> or even the underlying physical block that I am ignoring here?
> My view was that as long as you pass through the tpmrm, you might
> stall, but the resources would be managed.


Background:  

Sessions are different from keys, in that the TPM has to prevent a
replay attack on a saved session context. To do this, the TPM has a
table of active session contexts with a 'version number' to detect and
block the replay.

In TCG jargon (if I have it right):

A loaded session is actually on the TPM.
A saved session is off the TPM, but is still in the table.
An active session is either loaded or saved.

Result:

The table has typically 64 entries for active sessions.  Thus,
threads cannot make an unlimited number of sessions, even with
a resource manager.
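
A toy model may make the loaded/saved/active distinction concrete. It is a sketch only: SessionTable and its methods are invented for this illustration, the 64-slot size is the typical figure quoted above, and the much smaller limit on simultaneously loaded sessions is not modelled.

```python
class SessionTable:
    """Toy model of the TPM's table of active (loaded or saved) sessions."""

    def __init__(self, slots: int = 64) -> None:
        self.slots = slots
        self.state = {}  # handle -> "loaded" or "saved"
        self._next = 0

    def start(self) -> int:
        if len(self.state) >= self.slots:
            raise RuntimeError("TPM_RC_SESSION_HANDLES")
        self._next += 1
        self.state[self._next] = "loaded"
        return self._next

    def context_save(self, handle: int) -> None:
        # Saving moves the session off the TPM but keeps its table entry.
        self.state[handle] = "saved"

    def flush(self, handle: int) -> None:
        # Only a flush frees the table entry.
        del self.state[handle]


table = SessionTable()
handles = [table.start() for _ in range(64)]
for h in handles:
    table.context_save(h)  # roughly what a resource manager can do

try:
    table.start()
except RuntimeError as exc:
    print(exc)  # TPM_RC_SESSION_HANDLES

table.flush(handles[0])
print(table.start())  # 65
```

Saving all 64 sessions does not free a single slot in this model; only the flush does, which is why a resource manager alone cannot make the number of sessions unlimited.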

Again typically, an application needs at most 3 sessions, so
the TPM can handle 21 simultaneous applications.

Keys:

Since keys are not subject to a replay attack, the TPM does not
keep any state when a key is context saved.  There is thus
no TPM defined limit to the number of saved key contexts.

 
> I am going back to dig through tpm-tis, in particular, tpm2-cmd.c
> and tpm-interface.c.
>
> Doug
>
> _______________________________________________
> Ibmtpm20tss-users mailing list
> Ibmtpm20tss-users@...
> https://lists.sourceforge.net/lists/listinfo/ibmtpm20tss-users