Discussion:
[openpgp] respecting key flags for decryption
Vincent Breitmoser
2018-11-07 15:28:59 UTC
Permalink
Hey folks,

I'd like to get some opinions on a thing:

When a message is encrypted to a public key whose key flags indicate that it may
not be used for encryption - should the receiving implementation still decrypt
data using this key?

Personally I think the only cryptographically sane thing to do is to reject the
data. There is no valid use case for this, and if it happens it's either a bug
or an attack happening. I would also welcome a clarification in the spec that
explicitly states that a key MUST NOT be used for purposes that aren't indicated
by its key flags.

The reason why I bring this up is that it does come up in practice and at the
moment, there is no consensus on how to handle such a case between OpenKeychain
and GnuPG. See:
https://github.com/open-keychain/open-keychain/issues/2413
https://dev.gnupg.org/T4235
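For concreteness, the check at issue can be sketched in a few lines using the key-flags bits from RFC 4880 (section 5.2.3.21); the bit values are from the spec, but the helper itself is just an illustration, not code from either implementation:

```python
# Key flag bits, RFC 4880 section 5.2.3.21
FLAG_CERTIFY       = 0x01
FLAG_SIGN          = 0x02
FLAG_ENCRYPT_COMMS = 0x04  # "may be used to encrypt communications"
FLAG_ENCRYPT_STORE = 0x08  # "may be used to encrypt storage"

def may_decrypt_with(key_flags: int) -> bool:
    """True if the (sub)key's flags permit encryption, and hence
    decryption of data that was encrypted to it."""
    return bool(key_flags & (FLAG_ENCRYPT_COMMS | FLAG_ENCRYPT_STORE))
```

A signing subkey (flags 0x02) fails this check; the question in this thread is whether an implementation should then refuse to decrypt, warn, or proceed silently.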

Cheers

- V
Jon Callas
2018-11-07 22:11:29 UTC
Permalink
> On Nov 7, 2018, at 7:28 AM, Vincent Breitmoser <***@my.amazin.horse> wrote:
>
>
> Hey folks,
>
> I'd like to get some opinions on a thing:
>
> When a message is encrypted to a public key whose key flags indicate that it may
> not be used for encryption - should the receiving implementation still decrypt
> data using this key?
>
> Personally I think the only cryptographically sane thing to do is to reject the
> data. There is no valid use case for this, and if it happens it's either a bug
> or an attack happening. I would also welcome a clarification in the spec that
> explicitly states that a key MUST NOT be used for purposes that aren't indicated
> by its key flags.

I disagree.

Before going further, let’s remember that if I set up my key so that I have a signing key and an encryption key (or even no encryption key), and Alice sent me something encrypted to my signing key, it’s *Alice* who has broken protocol, not me. Why punish me because someone else has a broken implementation?

Also let me say that if there’s an implementation that wants to do this, that’s fine. What I’m objecting to is that this would be in OpenPGP. OpenPGP oughta say that an implementation MUST NOT encrypt to a sign-only key, but it should leave it at that.

Additionally, I see nothing wrong with an implementation letting me know that this thing it decrypted was encrypted to my signing key. In fact, I’d prefer to be told that. I can think of circumstances in which this is impractical, but in general, I’d like to be told.

If the software I run messes me up because someone else has a bug, I think I am justified in saying that my software has a resiliency problem. If the implementer came back and said that they do that because the OpenPGP standard says they MUST NOT, my reply would be that then the standard is broken with a resiliency problem.

Imagine this conversation between Alice, who has a broken implementation, and me with a “correct one.”

Me: Hey Alice, here’s a note about, blah blah blah, let me know what you think.
Alice: <Message I can’t decrypt>
Me: Alice, I can’t decrypt what you sent me because you encrypted to my signing key. Could you re-encrypt to my encryption key?
Alice: <Message I can’t decrypt>
Me: No, it happened again, could you try that again?
Alice: <Message I can’t decrypt>

Okay, so what do I do now? Here are some options:

* Change my key so that my signing-only key is dual-use.
* Give Alice my coordinates on Signal, Wire, etc.
* Send Alice an S/MIME cert for me and ask her to use that.

It’s suboptimal to set things up so that a single badly-behaving implementation can poison the community.

It would be completely reasonable in this situation to conclude that signing-only keys are utterly broken, because the XXX program doesn’t handle them correctly and that means you’re screwed when someone picked the wrong software. It would be completely reasonable to ask people not to use OpenPGP in any form at all, because you’re screwed by this situation.

In short, the standard should not mandate fragile, brittle, or user-surly behavior. The standard should not forbid resiliency in the face of a bug.

We have one instance in the standard of mandated fragility, and that’s the Critical flag in some subpackets. In this case, the whole point of the feature is to make it brittle, but this is also why in general people will say not to use the Critical flag unless it’s really, really, really critical because you’re almost always just shooting yourself in the foot.


>
> The reason why I bring this up is that it does come up in practice and at the
> moment, there is no consensus on how to handle such a case between OpenKeychain
> and GnuPG. See:
> https://github.com/open-keychain/open-keychain/issues/2413
> https://dev.gnupg.org/T4235

I’m with Werner on the gnupg side of this. In decrypting anyway, gnupg is doing something resilient. Moreover, gnupg is not merely a tool, it’s a reference implementation and as a reference implementation it needs to have meta-features for the purpose of debugging.

Jon
Vincent Breitmoser
2018-11-08 11:24:54 UTC
Permalink
Hey Jon,

thanks for your thoughtful reply!

> Moreover, gnupg is not merely a tool, it’s a reference implementation and as
> a reference implementation it needs to have meta-features for the purpose of
> debugging.

I agree with the premise here, but not the conclusion. In the concrete bug
report I have, a bank(!) implemented OpenPGP in a way that included two major
blunders: 1) they encrypt to a subkey that doesn't have the flag. 2) they
include a one-pass signature packet, but then no actual signature (which isn't
a valid OpenPGP message, by spec).
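The second blunder can be caught with a purely structural check: every one-pass signature packet (tag 4) in front of the literal data must be matched by a signature packet (tag 2) after it, per the message grammar in RFC 4880 section 11.3. A toy sketch over a list of packet tags (the tag numbers are from the spec; the helper is illustrative):

```python
# RFC 4880 packet tags
ONE_PASS_SIG = 4
SIGNATURE    = 2
LITERAL_DATA = 11

def is_valid_signed_message(tags):
    """Check that one-pass signature packets are balanced by
    trailing signature packets around the literal data."""
    try:
        lit = tags.index(LITERAL_DATA)
    except ValueError:
        return False  # no literal data at all
    one_pass = sum(1 for t in tags[:lit] if t == ONE_PASS_SIG)
    sigs     = sum(1 for t in tags[lit + 1:] if t == SIGNATURE)
    return one_pass == sigs
```

A message like the bank's reduces to something like `[4, 11]`, a one-pass packet with no closing signature, which this check rejects.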

The reason this happened, I strongly suspect, is exactly because they treated
GnuPG as a reference implementation: they tested that it worked against GnuPG
(or some frontend), found it worked in practice (without even a warning), and
then left it at that.

> it’s *Alice* who has broken protocol, not me. Why punish me because someone
> else has a broken implementation?

> OpenPGP oughta say that an implementation MUST NOT encrypt to a sign-only key,
> but it should leave it at that.

I believe that this all goes back to Postel's law of "conservative in what you
emit, liberal in what you accept". You're advocating for being "resilient" in
the face of bugs in implementations, and that sounds like a good idea on paper.

However, I've come to believe that this is harmful to the long-term health of an
ecosystem. The reason is that the "bug compatibility" required to actually be
interoperable accumulates over time. This leads to more and more implicit
knowledge (rather than explicit, as in by spec) being necessary to create or
maintain an implementation.

Martin Thomson put these thoughts into an I-D fairly recently:

https://www.ietf.org/archive/id/draft-thomson-postel-was-wrong-03.txt

It is at this point terribly hard and time consuming to write an OpenPGP
implementation that works interoperably, because of a general expectation that
everyone be 100% GnuPG bug compatible. It's just a blip on the radar, but the
case described above happened five years into working on OpenKeychain. And it's
not a fluke, I have more similar incidents (will post about them soon).

> The standard should not forbid resiliency in the face of a bug.

This is a very reasonable sentiment for specs like HTML, we certainly wouldn't
want to outright reject a website just because of a missing </i>. But in the
context of a cryptographic protocol, this is super dangerous.

Being overly relaxed in what we accept means giving attackers a large amount of
wiggle room. This is exactly what brought us EFAIL. We should learn from that,
and I hope we can do better in the future.

- V
Peter Gutmann
2018-11-08 22:31:50 UTC
Permalink
Vincent Breitmoser <***@my.amazin.horse> writes:

>The reason this happened, I strongly suspect, is exactly because they treated
>GnuPG as a reference implementation: they tested that it worked against GnuPG
>(or some frontend), found it worked in practice (without even a warning), and
>then left it at that.

This is a problem with several protocols where there's a single widely-used
implementation. It also affects SSH: a standards-conformant implementation
isn't something that follows RFC 4251-4254, it's something that you can connect
to with Putty (server) or that connects to OpenSSH (client). That really is
the conformance test for SSH: "we can connect to it with Putty, it's now
complete and fully standards-compliant".

Maybe all of these unofficial reference implementations need a strict-checking
mode for when they're being (incorrectly) used as reference implementations...

Peter.
Stephen Farrell
2018-11-09 01:36:49 UTC
Permalink
On 08/11/2018 22:31, Peter Gutmann wrote:
> Maybe all of these unofficial reference implementations need a strict-checking
> mode for when they're being (incorrectly) used as reference implementations...

Nice idea.

S.
Werner Koch
2018-11-09 08:06:54 UTC
Permalink
On Thu, 8 Nov 2018 23:31, ***@cs.auckland.ac.nz said:

> Maybe all of these unofficial reference implementations need a strict-checking
> mode for when they're being (incorrectly) used as reference
> implementations

gpg --compliance=openpgp ....

is intended to do just that. However, there is a wide range of cases
where things are not specified. For example, decryption with a key
whose usage flags do not allow it. Actually, I expected to see a warning
there, but that is unfortunately not the case, and currently master even
refuses to decrypt with bad usage flags [1].
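The two behaviours Werner describes (a warning versus an outright rejection, as current master does per T4246) amount to a policy switch around the same flags check. A sketch of what that gate might look like; the function name and policy values are illustrative, not GnuPG's actual API:

```python
import warnings

ENCRYPT_FLAGS = 0x04 | 0x08  # RFC 4880 encryption key flag bits

def use_key_for_decryption(key_flags: int, compliance: str = "gnupg") -> bool:
    """Gate decryption on the key's usage flags.

    compliance="openpgp" rejects outright; compliance="gnupg"
    merely warns and proceeds.  Illustrative only.
    """
    if key_flags & ENCRYPT_FLAGS:
        return True
    if compliance == "openpgp":
        raise ValueError("usage flags forbid decryption with this key")
    warnings.warn("key flags do not include encryption; decrypting anyway")
    return True
```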


Shalom-Salam,

Werner


[1] https://dev.gnupg.org/T4246
--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
Peter Gutmann
2018-11-10 03:12:10 UTC
Permalink
Werner Koch <***@gnupg.org> writes:

>> Maybe all of these unofficial reference implementations need a strict-
>> checking mode for when they're being (incorrectly) used as reference
>> implementations
>
> gpg --compliance=openpgp ....
>
>is intended to do just that.

In hindsight my phrasing of the problem wasn't the best, it needs both a
compliance-checking mode and someone who enforces it. At the moment the
compliance check is "the message is accepted by GPG/Putty/OpenSSH in its
default/most-tolerant configuration", in the sense of "XYZ accepts our
message, therefore it's not our fault if yours doesn't". So you'd need
either some certification body that says "yes, your implementation really is
compliant", or for the standard implementation to warn that the other side is
non-compliant when a message is received from it in order to force it to be
fixed - think Vista's UAC warnings that were created in order to deal with the
everyone-is-admin-all-the-time assumption of many/most Windows apps at the
time.

Unfortunately I don't think either of those will be terribly palatable...

Peter.
Jon Callas
2018-11-10 08:52:38 UTC
Permalink
> On Nov 8, 2018, at 3:24 AM, Vincent Breitmoser <***@my.amazin.horse> wrote:
>
>
> Hey Jon,
>
> thanks for your thoughtful reply!

And thank you for appreciating it.

>
>> Moreover, gnupg is not merely a tool, it’s a reference implementation and as
>> a reference implementation it needs to have meta-features for the purpose of
>> debugging.
>
> I agree with the premise here, but not the conclusion. In the concrete bug
> report I have, a bank(!) implemented OpenPGP in a way that included two major
> blunders: 1) they encrypt to a subkey that doesn't have the flag. 2) they
> include a one-pass signature packet, but then no actual signature (which isn't
> a valid OpenPGP message, by spec).
>
> The reason this happened, I strongly suspect, is exactly because they treated
> GnuPG as a reference implementation: they tested that it worked against GnuPG
> (or some frontend), found it worked in practice (without even a warning), and
> then left it at that.

I think you’re mischaracterizing my conclusion.

My point and conclusion is that you shouldn’t let people who do stupid stuff contaminate the ecosystem.

We are starting from people who have screwed up. They screwed up because they didn’t understand or they didn’t care. You’re not going to make them understand by typing harder and making your keys click louder. Most importantly of all, you’re not going to make a stupid person smart by modifying the standard. The whole premise that we are starting from is a botched implementation.

My conclusion is that we shouldn’t make working implementations brittle in the belief that this will make the botched implementations suddenly correct.

Now I did go off into a rant and a jeremiad, and that may not have been clear, because good rants and jeremiads tend to wander. So let me try to be clearer while still being entertainingly ranty.

The purpose of a standard is for interoperability. The standard describes a language that my software and your software can use so that together we are better than either one separately. However, the standard is a map, it is not the territory. The territory is the software in part and the people most of all.

If the software is inferior or brain-damaged by following the standard, then this is very bad. Sometimes this happens accidentally. It should never, ever happen that the standard *intentionally* forgets its place in the world and mandates bad experiences for the users. A good software developer makes good experiences for the users. Always. If that means breaking the standard, you break the standard. Especially when the precipitating action is that someone *else* broke the standard.

I never invoked Postel’s principle. If you read what I wrote again, I very carefully, *intentionally* did not invoke it. Trust me, I know what it is, and I have also read Thomson’s I-D. He has a point, but I think that his point is basically just the flip side of the problem. I believe that you’re bringing it up as a straw man to beat. Nonetheless, let’s examine it for a moment.

All virtues when taken too far end up being vices. Nearly every nourishing food is bad for you if you eat too much. Nearly every medicine is a poison if you use it improperly. Even moderation (which is what I am preaching here) can be taken too far, and be mealy-mouthed wishy-washiness. Sometimes one has to take a stand. The questions of course include which stand to take, how firmly one takes it, and so on.

There are a lot of people who use (or used) Postel’s principle as an excuse and justification for being a lazy programmer, for being a bad programmer, for writing misfeatures, and for being unwilling to fix their bugs. People use it as a way to say their shit doesn’t stink, if you’ll forgive my language. When someone uses it as an excuse for their bad behavior, that is bad. But it is not Postel’s principle that is bad, it is the people using it to justify bad behavior that is bad.

Let me digress into an example. This digression is very relevant as it is directly analogous to this situation as you will see.

For better or worse, the PGP software did not implement Blowfish. Personally, I think it was for worse, but it doesn’t matter. The point is that we didn’t, and while the reasons make for a fun anecdote, they genuinely are a digression. We had a problem, though, and that was that there was an implementation that encrypted using Blowfish even when someone’s key didn’t have Blowfish in their symmetric cipher preferences.

This is a violation of the protocol. It’s a bug. The protocol states that you are not to encrypt using a cipher that is not on the recipient’s preference list. Here’s where you should notice the similarity to the sign-only public key. In each case, the recipient has made statements to the sender and the sender botched them.

Our problem was that our users were getting messages that they couldn’t decrypt and they were justifiably irked. Our initial answer was the same as what this discussion advocated — passing the blame on to the other software. In each case, as well, the implementation that provides the bad experience is in the right. The other side is broken. However, it was *our* users who were getting the bad experience and the other people didn’t want to fix it.

What we did was to get off our high horse just a little and do the right thing for our users. We put Blowfish into PGP, but only on the decrypt path. It was never in the UI, we never encrypted to Blowfish. (For all of the wacky, downright stupid reasons that are orthogonal to this story.) In the case where someone else’s implementation was broken, we did the right thing for our users. We decrypted the message, and showed it to them as if everything was done correctly, despite the fact that someone else had a botched implementation, and arguably we were botching our own.
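Jon's Blowfish compromise amounts to keeping separate algorithm sets for each direction, so a cipher can be accepted on decrypt without ever being offered on encrypt. A minimal sketch of that design (names and cipher lists are illustrative, not PGP's actual configuration):

```python
# Ciphers we will emit vs. ciphers we will accept.  Blowfish sits in
# the decrypt set only: we never offer it, but we can still read
# messages that (incorrectly) used it.
ENCRYPT_CIPHERS = {"AES256", "AES128", "3DES"}
DECRYPT_CIPHERS = ENCRYPT_CIPHERS | {"Blowfish"}

def pick_cipher(recipient_prefs):
    """Choose an encryption algorithm honoring recipient preferences."""
    for algo in recipient_prefs:
        if algo in ENCRYPT_CIPHERS:
            return algo
    raise ValueError("no mutually supported cipher")

def can_decrypt(algo):
    return algo in DECRYPT_CIPHERS
```

The asymmetry is the whole point: `pick_cipher` will skip past "Blowfish" in a recipient's preference list, while `can_decrypt("Blowfish")` still succeeds.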

Our allegiance was to our users. That is good software engineering, in my opinion. We also went behind the scenes and got the other people to do the right thing, but it took a while, and even after the other implementation was fixed, the latent issue still hung around for a long, long time until all the other users upgraded.

I know what some people are thinking, they want to come up with some contrived example in which if this had happened or that had happened, it would be bad. They’re right that if it was bad, it would have been bad. It’s always true that bad things are bad. But sometimes, ugly things are less bad than the pretty ones. In the case of our Blowfish example, or this one with sign-only keys, it doesn’t *hurt* anything to make a good experience. It doesn’t ruin security. It doesn’t give wrong answers. (To the contrary, it gives *right* answers.) Yeah, it’s ugly. It’s inelegant. If you’d prefer that I characterize it as the “least bad” answer as opposed to a right answer, sure. I’ll concede that. Ultimately, I care about good experiences for the user over purity.

However, I also have a huge live and let live streak as well. If you disagree, and you want to do *your* implementation so that when your users say, “WTF” the answer is, “That’s what the standard says,” then more power to you! It’s your gun and your users’ foot, to my mind.

What I object to, and the whole reason I started this is that I object to changing the standard so that it says I should shoot my users in the foot. That’s the real point. I’m all in favor of the standard saying, “Look here, doofus, don’t encrypt to sign-only keys!” I am not in favor of it saying, “When some doofus has encrypted to your sign-only key, OMG, whatever you do don’t decrypt it!”

You talk below about the health of the ecosystem, and this is where we agree. What we apparently disagree on to my mind is how this case is handled. Remember once again that this starts with a broken implementation that was done by someone else. It isn’t, in my opinion, about Postel’s principle. My interpretation of his principle is about how you handle gray areas, or the inevitable case where reasonable people might disagree about an interpretation. This is a case where the other implementation is flat-out wrong. My principle is not Postel’s. My principle is to do what’s right for the user. In these two cases, the right thing for the user is to decrypt the message. In other cases, it’s going to be the opposite. (If this were, for example, about someone who botched an MDC, you’d be hearing a very different thing from me, as that can hurt the user.) It sounds to me like yours is to follow the spec. I also think that you believe that there is some greater good in following the spec. I confess that with me, it’s far more whether I as a user would rage-quit the software and just stop using it.

A few more inline comments below.


>> OpenPGP oughta say that an implementation MUST NOT encrypt to a sign-only key,
>> but it should leave it at that.
>
> I believe that this all goes back to Postel's law of "conservative in what you
> emit, liberal in what you accept". You're advocating for being "resilient" in
> the face of bugs in implementations, and that sounds like a good idea on paper.

Again, no, this isn’t Postel’s principle. This is about what’s right for the user. It’s also about freedom to implement.

I think it’s fine for you to be strict, what I object to is the meta-point, that there are no other reasonable interpretations.

>
> It is at this point terribly hard and time consuming to write an OpenPGP
> implementation that works interoperably, because of a general expectation that
> everyone be 100% GnuPG bug compatible.

Then those people are wrong.

As I mentioned above, PGP was not 100% compatible with GnuPG. It didn’t implement Blowfish. I’ve been involved in other implementations where we implemented a subset of all the possibilities of OpenPGP.

The most problematic part of subsetting OpenPGP is dealing with compression, but even that, if you make your keys correctly, will work just fine. You can make an OpenPGP implementation that just does a very few things.

> It's just a blip on the radar, but the
> case described above happened five years into working on OpenKeychain. And it's
> not a fluke, I have more similar incidents (will post about them soon).

Sure. People screw up implementations all the time. So what?

Repeating myself, if your software gives your users a bad experience, they’ll find other software. That software might be OpenPGP compliant, but likely they’ll say something like, “Why are you using OpenPGP when you could be using Signal?”

>
>> The standard should not forbid resiliency in the face of a bug.
>
> This is a very reasonable sentiment for specs like HTML, we certainly wouldn't
> want to outright reject a website just because of a missing </i>. But in the
> context of a cryptographic protocol, this is super dangerous.
>

Why? You assert that, but give no evidence. There is no cryptographic problem here. It’s a policy problem.

> Being overly relaxed in what we accept means giving attackers a large amount of
> wiggle room. This is exactly what brought us EFAIL. We should learn from that,
> and I hope we can do better in the future.

You used the weasel word “overly” and I think that’s the crux. The debate we’re having is over what “overly” means. And frankly, this has nothing to do with EFAIL. But that’s another discussion. I would be happy to talk about EFAIL, but it’s irrelevant to this discussion.

Jon