[tor-dev] HS desc replay protection and ed25519 malleability

Discussion:

George Kadianakis

2018-04-18 13:15:35 UTC

Hello Ian, isis, and other crypto people around here!

Here is an intro: In HSv3 we've been using revision counters
(integers++) to do HS desc replay protection, so that bad HSDirs cannot
replay old descs to other HSDirs. We recently learned that this is a bad
idea from a scalability prespective (multiple sites need to track rev
counter...), and also it's needless complexity in the code (tor needs to
cache the counters etc.). See ticket #25552 for more details:
https://trac.torproject.org/projects/tor/ticket/25552
https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt#n1078

In #25552 we've been making plans to ditch the rev counters and replace
them with a casual replay cache. (These replay caches also don't need to
be big, since descriptors are only replayable for a day before the
ephemeral blinded key changes, and the cache can be reset).

Anyhow, now we've been playing the game of "which part of the desc
should we use in the replay cache"? The latest plan here has been to use
the ed25519 descriptor signature since it's something small, simple and
necessarily changes with every fresh descriptor. And this is how we
entered the ed25519 malleability scene.

The basic question here is, can we use the ed25519 signature in our
replay cache and consider it immutable by attackers without the private
key? And should we use R, or S, or both?

According to RFC8032:

Ed25519 and Ed448 signatures are not malleable due to the
check that decoded S is smaller than l. Without this
check, one can add a multiple of l into a scalar part and
still pass signature verification, resulting in malleable
signatures.

However, neither donna or ref10 include such a check explicitly
IIUC. Instead they check whether (RS[63] & 224), which basically ensures
that the high 3 bits of S are zeroed, which ensures S < 2^253. Is that
equivalent to the RFC check? Because if I'm counting right, for most
legit S values you can still add a single l as the attacker and get an
S' = S + l < 2^253 equivalent signature (you can't add 2*l tho).

So what's the state of ed25519 malleability? I know that after the
Bitcoin incident, people have thought about this a lot, so I doubt we
are in a broken state, but I just wanted to make sure that we will not
mess something up. :)

Thanks for the help! :)

Watson Ladd

2018-04-18 13:21:10 UTC

Permalink

Post by George Kadianakis
Hello Ian, isis, and other crypto people around here!
Here is an intro: In HSv3 we've been using revision counters
(integers++) to do HS desc replay protection, so that bad HSDirs cannot
replay old descs to other HSDirs. We recently learned that this is a bad
idea from a scalability prespective (multiple sites need to track rev
counter...), and also it's needless complexity in the code (tor needs to
https://trac.torproject.org/projects/tor/ticket/25552
https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt#n1078
In #25552 we've been making plans to ditch the rev counters and replace
them with a casual replay cache. (These replay caches also don't need to
be big, since descriptors are only replayable for a day before the
ephemeral blinded key changes, and the cache can be reset).
Anyhow, now we've been playing the game of "which part of the desc
should we use in the replay cache"? The latest plan here has been to use
the ed25519 descriptor signature since it's something small, simple and
necessarily changes with every fresh descriptor. And this is how we
entered the ed25519 malleability scene.
The basic question here is, can we use the ed25519 signature in our
replay cache and consider it immutable by attackers without the private
key? And should we use R, or S, or both?
Ed25519 and Ed448 signatures are not malleable due to the
check that decoded S is smaller than l. Without this
check, one can add a multiple of l into a scalar part and
still pass signature verification, resulting in malleable
signatures.
However, neither donna or ref10 include such a check explicitly
IIUC. Instead they check whether (RS[63] & 224), which basically ensures
that the high 3 bits of S are zeroed, which ensures S < 2^253. Is that
equivalent to the RFC check? Because if I'm counting right, for most
legit S values you can still add a single l as the attacker and get an
S' = S + l < 2^253 equivalent signature (you can't add 2*l tho).

This seems right. Malleability is not part of the standard security
definition for signatures.

Post by George Kadianakis
So what's the state of ed25519 malleability? I know that after the
Bitcoin incident, people have thought about this a lot, so I doubt we
are in a broken state, but I just wanted to make sure that we will not
mess something up. :)
Thanks for the help! :)
_______________________________________________
tor-dev mailing list
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev

--
"Man is born free, but everywhere he is in chains".
--Rousseau.

George Kadianakis

2018-04-18 13:53:59 UTC

Permalink

Post by Watson Ladd

This seems right. Malleability is not part of the standard security
definition for signatures.

Thanks for the help!

Hmm, so everyone gets a shot at a single malleability "attack" with
everye d25519 sig? What's the point of the (RS[63] & 224) check then?

In this case, we can't use S as the replay cache index since the
attacker can mutate it and still get the sig to verify. Can we use R as
the replay cache index then? Can an attacker given (R,S) find (R',S')
such that the sig still verifies?

Ian Goldberg

2018-04-18 14:31:30 UTC

Permalink

[Warning: recovering from illness. Brain may not be operating at
nominal capacity. :-p ]

Post by George Kadianakis
Thanks for the help!
Hmm, so everyone gets a shot at a single malleability "attack" with
everye d25519 sig? What's the point of the (RS[63] & 224) check then?
In this case, we can't use S as the replay cache index since the
attacker can mutate it and still get the sig to verify.

You could still use (S mod l) as the cache index, though, right?

Post by George Kadianakis
Can we use R as the replay cache index then? Can an attacker given
(R,S) find (R',S') such that the sig still verifies?

Using R itself should, AFAICT, be safe against malleability. More
concretely, whatever representation of R is used in the hash H(R,A,M)
used to compute S (and to verify the signature). But is malleability
the only attack you should be worried about? What if one party produces
two descriptors for two different services, but reuses R to cause a
cache collision? Presumably some untoward badness would result.
Perhaps use the *output* of the hash H(R,A,M)? Or the pair
(R, S mod l)?

It's also not true that malleability is not part of the standard
security definition for signatures. That's exactly the difference
between the WUF and SUF (weakly / strongly unforgeable) security
properties. In a WUF system, the attacker, given a signing oracle,
cannot produce a valid signature on a message she does not present to
the signing orable. In a SUF system, the attacker cannot produce a
valid (message,signature) _pair_ she does not send to / receive from the
signing oracle. A system with malleable signatures can be WUF, but
cannot be SUF.

- Ian

isis agora lovecruft

2018-04-19 18:19:29 UTC

Permalink

Post by Ian Goldberg

You could still use (S mod l) as the cache index, though, right?

Yes, although with the caveat that this is somewhat expensive.

Post by Ian Goldberg

Post by George Kadianakis
Can we use R as the replay cache index then? Can an attacker given
(R,S) find (R',S') such that the sig still verifies?

R is malleable as well, because it's not guaranteed to not have a small
torsion component. Also, I think it's also not exactly strictly guaranteed
to be both random and tied to the domain of the secret key which produced
the signature, i.e. if normally R is produced as:

r = H(H(a)[32:] || M) mod l
R = rB

but one could also do:

r = H(H("yog-shoggoth")[32:] || M) mod l
R = rB

which is still a random thing, albeit not tied to the context of the creator
of the signature. That signature obviously wouldn't verify, but again,
you'd have to do the fairly expensive thing of double-checking that the
signature is valid before using the R as a source of randomness provided by
the actually legit and well-behaved onion service.

Post by Ian Goldberg
What if one party produces two descriptors for two different services, but
reuses R to cause a cache collision? Presumably some untoward badness
would result.

Yes, replay is also possible, and both the small torsion attack and the
wonkiness above is possible by any party with access to the HS descriptor,
and an unreduced scalar with some multiple(s) of R = 2^260 (mod 2^252 +
27742317777372353535851937790883648493) should also be possible by the HS
itself, depending on which curve library you're using.

Post by Ian Goldberg
Perhaps use the *output* of the hash H(R,A,M)? Or the pair
(R, S mod l)?

H(R || A || M) might be okayâŠ but it's still a little icky given that it's
still not strictly tied to the secret key that produces the eventual
signature (cf. the calculation of `V` below).

I would highly advise reusing Trevor Perrin's work on building a VRF into
Generalised EdDSA, [0] which solves precisely this problem (i.e. "how do I get
verifiable randomness, given an ed25519 signature?") in the following way:
|
| The VXEdDSA signing algorithm takes the same inputs as XEdDSA. The output
| is a pair of values. First, a signature (V || h || s), which is a byte
| sequence of length 3b bits, where V encodes a point and h and s encode
| integers modulo q. Second, a VRF output byte sequence v of length equal to
| b bits, formed by multiplying the V output by the cofactor c.
|
| The VXEdDSA verification algorithm takes the same inputs as XEdDSA, except
| with a VXEdDSA signature instead of an XEdDSA signature. If VXEdDSA
| verification is successful, it returns a VRF output byte sequence v of
| length equal to b bits; otherwise it returns false.)
|
| vxeddsa_sign(k, M, Z):
| A, a = calculate_key_pair(k)
| B_{v} = hash_to_point(A || M)
| V = a B_{v}
| r = hash3(a || V || Z) (mod q)
| R = r B
| R_{v} = r B_{v}
| h = hash4(A || V || R || Rv || M) (mod q)
| s = r + ha (mod q)
| v = hash5(c V) (mod 2^{b})
| return (V || h || s), v

(Trevor is using q, where you, Ian, and I are using \ell, and floodyberry
uses n. Also ignore the `calculate_key_pair` function, that's just because
Signal stores keys in Montgomery form. Also ignore the numbers after the
hashes, that's just denoting the labelset system for domain separation.)

Personally, I'd redefine `hash_to_point` such that it does two elligator2
encodings from a 512-bit hash, and then adds the points together, before
multiplying by the cofactor, to ensure it produces all (instead of roughly
half of all) possible points.

Also, (shameless plug) you get all of this basically for free if you just do
generalised EdDSA with Ristretto, [1] since the cofactor is elliminated.
I've been working with Trevor on this, it's called D.A.V.R.O.S. and it'll
be doneâŠ (as) soon (as I have more free time to finish it).

[0]: https://signal.org/docs/specifications/xeddsa/#vxeddsa
[1]: https://doc-internal.dalek.rs/curve25519_dalek/ristretto/index.html

Best regards,

--
â¥â¶ isis agora lovecruft
_________________________________________________________
OpenPGP: 4096R/0A6A58A14B5946ABDE18E207A3ADB67A2CDB8B35
Current Keys: https://fyb.patternsinthevoid.net/isis.txt

Ian Goldberg

2018-04-19 19:52:03 UTC

Permalink

Post by isis agora lovecruft

Post by Ian Goldberg

You could still use (S mod l) as the cache index, though, right?

Yes, although with the caveat that this is somewhat expensive.

Post by Ian Goldberg

Post by George Kadianakis
Can we use R as the replay cache index then? Can an attacker given
(R,S) find (R',S') such that the sig still verifies?

R is malleable as well, because it's not guaranteed to not have a small
torsion component.

But given (R,S), mutating R to R+kT will cause H(R,A,M) to change, and
so S will change unpredictably (to someone who doesn't know the private
key).

Post by isis agora lovecruft
Also, I think it's also not exactly strictly guaranteed
to be both random and tied to the domain of the secret key which produced
r = H(H(a)[32:] || M) mod l
R = rB
r = H(H("yog-shoggoth")[32:] || M) mod l
R = rB
which is still a random thing, albeit not tied to the context of the creator
of the signature. That signature obviously wouldn't verify, but again,
you'd have to do the fairly expensive thing of double-checking that the
signature is valid before using the R as a source of randomness provided by
the actually legit and well-behaved onion service.

I think *anything* you do with the cache label can only be done *after*
you verify the signature. Otherwise, there are way too many avenues for
attack on the cache that come from crafting arbitrary bitstrings that
don't even make sense.

Your point about everything being better if signatures are unique is
definitely a good one (and it obviously makes any WUF system
automatically SUF.)

George Kadianakis

2018-04-24 10:53:43 UTC

Permalink

Post by isis agora lovecruft

Post by Ian Goldberg

You could still use (S mod l) as the cache index, though, right?

Yes, although with the caveat that this is somewhat expensive.
<snip>

Post by Ian Goldberg
Perhaps use the *output* of the hash H(R,A,M)? Or the pair
(R, S mod l)?

H(R || A || M) might be okay… but it's still a little icky given that it's
still not strictly tied to the secret key that produces the eventual
signature (cf. the calculation of `V` below).
I would highly advise reusing Trevor Perrin's work on building a VRF into
Generalised EdDSA, [0] which solves precisely this problem (i.e. "how do I get
|
| The VXEdDSA signing algorithm takes the same inputs as XEdDSA. The output
| is a pair of values. First, a signature (V || h || s), which is a byte
| sequence of length 3b bits, where V encodes a point and h and s encode
| integers modulo q. Second, a VRF output byte sequence v of length equal to
| b bits, formed by multiplying the V output by the cofactor c.
|
| The VXEdDSA verification algorithm takes the same inputs as XEdDSA, except
| with a VXEdDSA signature instead of an XEdDSA signature. If VXEdDSA
| verification is successful, it returns a VRF output byte sequence v of
| length equal to b bits; otherwise it returns false.)
|
| A, a = calculate_key_pair(k)
| B_{v} = hash_to_point(A || M)
| V = a B_{v}
| r = hash3(a || V || Z) (mod q)
| R = r B
| R_{v} = r B_{v}
| h = hash4(A || V || R || Rv || M) (mod q)
| s = r + ha (mod q)
| v = hash5(c V) (mod 2^{b})
| return (V || h || s), v
(Trevor is using q, where you, Ian, and I are using \ell, and floodyberry
uses n. Also ignore the `calculate_key_pair` function, that's just because
Signal stores keys in Montgomery form. Also ignore the numbers after the
hashes, that's just denoting the labelset system for domain separation.)
Personally, I'd redefine `hash_to_point` such that it does two elligator2
encodings from a 512-bit hash, and then adds the points together, before
multiplying by the cofactor, to ensure it produces all (instead of roughly
half of all) possible points.
Also, (shameless plug) you get all of this basically for free if you just do
generalised EdDSA with Ristretto, [1] since the cofactor is elliminated.
I've been working with Trevor on this, it's called D.A.V.R.O.S. and it'll
be done… (as) soon (as I have more free time to finish it).

Hello Isis and thanks for the help!

I found some time to read about XEdDSA and VXEdDSA: they seem like very
interesting constructions and indeed the latter seems like the thing we
should be using here!

That said, my impression is that swapping ed25519 for VXEdDSA in the HS
protoocl is not gonna be easy (especially since we are hoping to fit
this in 034), because then HSDirs will need to be able to verify both
ed25519 and VXEdDSA signatures, which IIUC are different constructions
(that is, an HSDir node can't produce a VRF output with an ed25519 sig).

IMO, a reasonable plan for now is to use either (R, S mod l) or H(R,A,M)
in the replay cache and only *after* the desc signature has been
verified. Then perhaps in the future we should consider whether we can
eventually swap out ed25519 for VXEdDSA so that we can use its VRF
output directly.

I hope this makes sense. I'm gonna start implementing this concept this
week, in hopes that we can fit it in the 034 schedule, because otherwise
revision counters are gonna linger around for longer!

Thanks! :)