Discussion:
[tor-dev] WTF-PAD and the future
George Kadianakis
2018-07-27 15:26:54 UTC
Hello Mike,

I had a talk with Marc and Mohsen today about WTF-PAD. I now understand
much more about WTF-PAD and how it works with regard to histograms. I
think I might even understand enough to start some sort of conversation
about it.

Here are some takeaways:

1) Marc and Mohsen think that WTF-PAD might not be the way forward
because of its various drawbacks and its complexity. Apparently there
are various attacks on WTF-PAD that Roger has discovered (SENDME
cells side-channels?) and also the deep learning crowd has done some
pretty good damage to the WTF-PAD padding (90%-60% accuracy?). They
also told me that achieving needed precision on the timings might be
a PITA.

2) From what I understand you are also hoping to use WTF-PAD to protect
against circuit fingerprinting and not just website
fingerprinting. They told me that while this might be plausible,
there is no current research on how well it can achieve that. Are we
hoping to do that? And what research remains here? How can I help?
Which parts of the Tor circuit protocol are we hoping to hide?

3) Marc and Mohsen suggested using application-layer defences because
the application layer has a much better view of the actual structures
that are sent on the wire, instead of the black-box view that the
network layer has.

In particular they were mainly concerned about onion services
fingerprinting because they are part of a restricted closed world,
whereas they were less concerned about the entire internet because of
its vast size.

They suggested that we could investigate using the service-side
"alpaca" library for onion services (e.g. as part of securedrop?)
which should resolve the most pressing concern of HS identification.

4) They also told me of research by Tobias Pulls which eliminates the
need for histograms in WTF-PAD and instead samples from the
probability distribution directly. They think that this can simplify
things somewhat. Any thoughts on this?

Let me know what you think. I still don't understand the entire space
completely yet, so please be gentle. ;)

Cheers! :)
Mike Perry
2018-07-27 19:03:15 UTC
Post by George Kadianakis
Hello Mike,
I had a talk with Marc and Mohsen today about WTF-PAD. I now understand
much more about WTF-PAD and how it works with regards to histograms. I
think I might even understand enough to start some sort of conversation
1) Marc and Mohsen think that WTF-PAD might not be the way forward
because of its various drawbacks and its complexity. Apparently there
are various attacks on WTF-PAD that Roger has discovered (SENDME
cells side-channels?) and also the deep learning crowd has done some
pretty good damage to the WTF-PAD padding (90%-60% accuracy?). They
also told me that achieving needed precision on the timings might be
a PITA.
Are there citations for any of this? Last I heard Matt Wright was
working on a deep learning study but the results were mixed.

Furthermore, we need to do adversarial learning and other optimizations
on these histograms to tune them. They are a generalized approach. Just
like it is not a valid evaluation to train a classifier on a dataset and
then add a new defense and show that it can't classify the defended
traffic using the old model, it is similarly not accurate to develop an
attack on WTF-PAD with a new classifier without also adversarially
optimizing the WTF-PAD histograms under that classifier. When you do
this, your results are not invalidating WTF-PAD, they are only
invalidating the histograms that were tuned against the previous
classifier/attack.

The same thing applies to the SENDME concern. The core piece of the
SENDME issue is "Tor should never send more than 1000 cells without a
SENDME. So *IF* I can tell which cells are SENDMEs, and *IF* I see more
than 1000 cells between them, then AHA I know that some cells are
actually padding and not real traffic".

Both of these are very big *IF*s, and even if they were shown to be
valid assumptions (which AFAIK they have not been), that does not mean
that it is actually useful for a classifier to know the percentage of
padding after 1000 cells, and it also does not mean that there isn't a
simple tweak to the histograms that encodes what looks like SENDME
transmission to that classifier.
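
To make the two *IF*s concrete, here is a minimal sketch of the inference
the attacker would be making. It assumes the observer already holds a
perfectly labelled trace, which is exactly the first big *IF*. The function
name and the "SENDME"/"DATA" labels are made up for illustration and are
not taken from any Tor code.

```python
from typing import Iterable, List

# The "1000 cells" figure from the argument above.
SENDME_WINDOW = 1000


def min_padding_per_window(cells: Iterable[str],
                           window: int = SENDME_WINDOW) -> List[int]:
    """For each run of cells terminated by a SENDME, return a lower bound
    on how many cells in that run must have been padding.

    Assumption: every cell is already labelled "SENDME" or something else.
    A trailing run that is not closed by a SENDME is ignored.
    """
    excess = []
    run = 0
    for cell in cells:
        if cell == "SENDME":
            # Anything beyond the window cannot have been real traffic.
            excess.append(max(0, run - window))
            run = 0
        else:
            run += 1
    return excess


if __name__ == "__main__":
    trace = ["DATA"] * 1200 + ["SENDME"] + ["DATA"] * 800 + ["SENDME"]
    print(min_padding_per_window(trace))
```

Note that even in the best case this only yields a lower bound on the
padding count per window. It says nothing about which cells were padding.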
Post by George Kadianakis
2) From what I understand you are also hoping to use WTF-PAD to protect
against circuit fingerprinting and not just website
fingerprinting. They told me that while this might be plausible,
there is no current research on how well it can achieve that. Are we
hoping to do that? And what research remains here? How can I help?
Which parts of the Tor circuit protocol are we hoping to hide?
I am designing WTF-PAD to be a framework for deploying padding against
arbitrary traffic analysis attacks. It is meant to allow us to define
histograms on the fly (in the Tor consensus) as these are studied. The
fact that they have not yet been studied is not super relevant to
deploying the framework for it now.
Post by George Kadianakis
3) Marc and Mohsen suggested using application-layer defences because
the application-layer has much better view of the actual structures
that are sent on the wire, instead of the black box view that the
network layer has.
In particular they were mainly concerned about onion services
fingerprinting because they are part of a restricted closed world,
whereas they were less concerned about the entire internet because of
its vast size.
They suggested that we could investigate using the service-side
"alpaca" library for onion services (e.g. as part of securedrop?)
which should resolve the most pressing concern of HS identification.
I mean yeah application-layer defenses are useful for website traffic
fingerprinting, but that is a very narrow slice of the traffic analysis
problems that I want this framework to solve.

WTF-PAD also doesn't rule out hidden service operators using alpaca,
either.
Post by George Kadianakis
4) They also told me of research by Tobias Pulls which eliminates the
needs for histograms in WTF-PAD and instead it samples from the
probability distribution directly. They think that this can simplify
things somewhat. Any thoughts on this?
Yes this is actually exactly what I want to do with the next iteration
of WTF-PAD! The question is what form/model to use for these probability
distributions. Right now we're encoding inter-burst and inter-packet
timings with some weird geometric distribution determining how long
these bursts should go on for, when it might be more natural to encode
and sample from length-based distributions/histograms.

(Histograms vs distributions is not the problem -- it's what they encode
and how they encode it that matters).
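
A rough sketch of why the two are interchangeable as far as sampling goes:
both functions below return one padding delay. The bin edges, the counts
and the choice of a log-normal family are arbitrary assumptions for
illustration, not values from WTF-PAD or from Tobias's work.

```python
import random
from typing import Sequence


def sample_from_histogram(edges: Sequence[float],
                          counts: Sequence[int],
                          rng: random.Random) -> float:
    """Pick a bin with probability proportional to its count, then pick a
    delay uniformly inside that bin. Bin i spans [edges[i], edges[i+1]]."""
    i = rng.choices(range(len(counts)), weights=counts, k=1)[0]
    return rng.uniform(edges[i], edges[i + 1])


def sample_from_distribution(mu: float, sigma: float,
                             rng: random.Random) -> float:
    """Draw a delay directly from a parameterized distribution.
    Log-normal is only a placeholder family here."""
    return rng.lognormvariate(mu, sigma)


if __name__ == "__main__":
    rng = random.Random(1)
    edges = [0.0, 0.01, 0.1, 1.0]   # seconds, hypothetical
    counts = [5, 3, 0]              # hypothetical bin weights
    print(sample_from_histogram(edges, counts, rng))
    print(sample_from_distribution(-3.0, 1.0, rng))
```

Either way the caller just gets a delay back, so the interesting design
question stays the same: which quantity is being modelled.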

I don't see this paper on Tobias's website. Is it up anywhere yet?
Post by George Kadianakis
Let me know what you think. I still don't understand the entire space
completely yet, so please be gentle. ;)
I hope I was gentle enough. If there's anything that triggers rage mode
in me more than someone being wrong on the internet, it's FUD and
hand-wringing being spread on the internet. ;)
--
Mike Perry
George Kadianakis
2018-07-29 13:42:43 UTC
Post by Mike Perry
Post by George Kadianakis
Hello Mike,
I had a talk with Marc and Mohsen today about WTF-PAD. I now understand
much more about WTF-PAD and how it works with regards to histograms. I
think I might even understand enough to start some sort of conversation
1) Marc and Mohsen think that WTF-PAD might not be the way forward
because of its various drawbacks and its complexity. Apparently there
are various attacks on WTF-PAD that Roger has discovered (SENDME
cells side-channels?) and also the deep learning crowd has done some
pretty good damage to the WTF-PAD padding (90%-60% accuracy?). They
also told me that achieving needed precision on the timings might be
a PITA.
Are there citations for any of this? Last I heard Matt Wright was
working on a deep learning study but the results were mixed.
I think this is the best we have in terms of public results:
https://arxiv.org/abs/1801.02265
Post by Mike Perry
Post by George Kadianakis
2) From what I understand you are also hoping to use WTF-PAD to protect
against circuit fingerprinting and not just website
fingerprinting. They told me that while this might be plausible,
there is no current research on how well it can achieve that. Are we
hoping to do that? And what research remains here? How can I help?
Which parts of the Tor circuit protocol are we hoping to hide?
I am designing WTF-PAD to be a framework for deploying padding against
arbitrary traffic analysis attacks. It is meant to allow us to define
histograms on the fly (in the Tor consensus) as these are studied. The
fact that they have not yet been studied is not super relevant to
deploying the framework for it now.
ACK.

What other traffic analysis attacks are we looking at addressing here?

I'm thinking of stuff like "circuit fingerprinting of onion services",
but I wonder if histograms and random sampling are too crude to actually
help against sophisticated attacks. I don't have a suggestion
for something better currently.

On that topic, is it decided whether the adaptive padding of WTF-PAD
will also happen during circuit construction, or only after that?
Post by Mike Perry
Post by George Kadianakis
3) Marc and Mohsen suggested using application-layer defences because
the application-layer has much better view of the actual structures
that are sent on the wire, instead of the black box view that the
network layer has.
In particular they were mainly concerned about onion services
fingerprinting because they are part of a restricted closed world,
whereas they were less concerned about the entire internet because of
its vast size.
They suggested that we could investigate using the service-side
"alpaca" library for onion services (e.g. as part of securedrop?)
which should resolve the most pressing concern of HS identification.
I mean yeah application-layer defenses are useful for website traffic
fingerprinting, but that is a very narrow slice of the traffic analysis
problems that I want this framework to solve.
WTF-PAD also doesn't rule out hidden service operators using alpaca,
either.
Agreed.
Post by Mike Perry
Post by George Kadianakis
4) They also told me of research by Tobias Pulls which eliminates the
needs for histograms in WTF-PAD and instead it samples from the
probability distribution directly. They think that this can simplify
things somewhat. Any thoughts on this?
Yes this is actually exactly what I want to do with the next iteration
of WTF-PAD! The question is what form/model to use for these probability
distributions. Right now we're encoding inter-burst and inter-packet
timings with some weird geometric distribution determining how long
these bursts should go on for, when it might be more natural to encode
and sample from length-based distributions/histograms.
(Histograms vs distribution is not the problem -- its what they encode
and how they encode it that matters).
I don't see this paper on Tobias's website. Is it up anywhere yet?
Hmm. Looking at the README of wtfpad (see the APE section), I think this
blog post is the best resource we have on this:
https://www.cs.kau.se/pulls/hot/thebasketcase-ape/
Tobias Pulls
2018-07-29 16:08:11 UTC
Post by George Kadianakis
Post by Mike Perry
Post by George Kadianakis
4) They also told me of research by Tobias Pulls which eliminates the
needs for histograms in WTF-PAD and instead it samples from the
probability distribution directly. They think that this can simplify
things somewhat. Any thoughts on this?
Yes this is actually exactly what I want to do with the next iteration
of WTF-PAD! The question is what form/model to use for these probability
distributions. Right now we're encoding inter-burst and inter-packet
timings with some weird geometric distribution determining how long
these bursts should go on for, when it might be more natural to encode
and sample from length-based distributions/histograms.
(Histograms vs distribution is not the problem -- its what they encode
and how they encode it that matters).
I don't see this paper on Tobias's website. Is it up anywhere yet?
Hmm. Looking at the README of wtfpad (see the APE section), I think this
https://www.cs.kau.se/pulls/hot/thebasketcase-ape/
Hi George and Mike,

You found the main writeup of the hasty work I did in this direction a
while back; there are also some comments in the source [0]. Unfortunately
my funding took me in other directions and I didn't want to publish a
paper without spending more time on it. As written in the blog post, it
looks like a promising direction, but please also note that the Wa-kNN
attack implementation that was used has some rough edges, for example when
it comes to time-based features (so the robustness of the naive
distributions when moving the PT server around is far from a given). If
someone wants to collaborate on this I'd be more than happy to contribute;
I have funding to work on Tor-related things again starting in August.

Best,
Tobias

[0]: https://github.com/pylls/basket2/blob/master/padding_ape.go
Mike Perry
2018-08-02 20:26:21 UTC
Post by Tobias Pulls
Post by George Kadianakis
Post by Mike Perry
Post by George Kadianakis
4) They also told me of research by Tobias Pulls which eliminates the
needs for histograms in WTF-PAD and instead it samples from the
probability distribution directly. They think that this can simplify
things somewhat. Any thoughts on this?
Yes this is actually exactly what I want to do with the next iteration
of WTF-PAD! The question is what form/model to use for these probability
distributions. Right now we're encoding inter-burst and inter-packet
timings with some weird geometric distribution determining how long
these bursts should go on for, when it might be more natural to encode
and sample from length-based distributions/histograms.
(Histograms vs distribution is not the problem -- its what they encode
and how they encode it that matters).
I don't see this paper on Tobias's website. Is it up anywhere yet?
Hmm. Looking at the README of wtfpad (see the APE section), I think this
https://www.cs.kau.se/pulls/hot/thebasketcase-ape/
Hi George and Mike,
You found the main writeup of the hasty work I did in this direction a
while back, also some comments in the source [0]. Unfortunately my
funding took me in other directions and I didn't want to publish any
paper without spending more time on it. As written on the blog post it
looks like a promising direction, but please also note that the attack
implementation of Wa-kNN used has some rough edges for example when it
comes to time-based features (so robustness of the naive distributions
when moving around the PT server far from a given). If someone wants to
collaborate on this I'd be more than happy to contribute, got funding to
work on Tor-related things again starting August.
This is great! Sorry it took me so long to reply. I've been deep in it
thinking about related traffic analysis issues with onion services.

I'm very much interested in this direction. This is the post, right:
https://www.cs.kau.se/pulls/hot/thebasketcase-ape/

Did you handle depleting the distributions when normal traffic is
transmitted? Counting traffic that fits the target distribution as
"already sent padding" (and thus sending less padding traffic overall in
that case) is a key piece of WTF-PAD that allows it to have better
goodput. This is in fact why the original e2e defense was called
"Adaptive Padding": its padding distributions adapt to observed
traffic.

If we could alter the distribution in this same way, it may be a
good way to go. However, histograms tend to be easier to do this with,
and they also encode distributions (just perhaps more tediously and
verbosely).
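
To illustrate what "depleting" means with a histogram, here is a minimal
sketch, not the actual WTF-PAD code. Each bin holds tokens; a real packet
whose inter-arrival time falls in a bin consumes a token there, so less
padding is later drawn from that bin. The class name, the
nearest-non-empty-bin fallback and the refill-when-empty rule are
simplifying assumptions.

```python
import bisect
import random
from typing import Optional, Sequence


class AdaptiveHistogram:
    """Token histogram over inter-arrival delays (hypothetical sketch)."""

    def __init__(self, edges: Sequence[float], tokens: Sequence[int],
                 seed: Optional[int] = None) -> None:
        if len(edges) != len(tokens) + 1 or sum(tokens) <= 0:
            raise ValueError("need len(tokens)+1 edges and at least one token")
        self.edges = list(edges)
        self.initial = list(tokens)
        self.tokens = list(tokens)
        self.rng = random.Random(seed)

    def _refill_if_empty(self) -> None:
        # Assumption: start over from the initial shape once depleted.
        if sum(self.tokens) == 0:
            self.tokens = list(self.initial)

    def _bin_for(self, delay: float) -> int:
        i = bisect.bisect_right(self.edges, delay) - 1
        return min(max(i, 0), len(self.tokens) - 1)

    def observe_real(self, delay: float) -> None:
        """Real traffic counts as 'already sent padding': consume a token
        from the matching bin, or from the nearest non-empty one."""
        self._refill_if_empty()
        i = self._bin_for(delay)
        candidates = [j for j, t in enumerate(self.tokens) if t > 0]
        j = min(candidates, key=lambda k: abs(k - i))
        self.tokens[j] -= 1

    def sample_padding_delay(self) -> float:
        """Draw a padding delay from the remaining tokens and consume one."""
        self._refill_if_empty()
        i = self.rng.choices(range(len(self.tokens)),
                             weights=self.tokens, k=1)[0]
        self.tokens[i] -= 1
        return self.rng.uniform(self.edges[i], self.edges[i + 1])
```

Doing the same with a parameterized distribution would need some way to
reweight the density after each observation, which is why the histogram
form tends to be the easier one here.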

One of the other things I want to try, that may overlap, is changing the
type of information the distribution/histogram encodes. Inter-packet and
inter-burst delay (encoded as two separate states in the state machines)
is perhaps not as optimal or useful or easy to specify/optimize as
something more naturally resembling web traffic, such as a distribution
of request sizes and object sizes, plus some way to simulate concurrent
fetches (selection of overlap) of these objects, subtracting each
object-size instance from the distribution when we see it.

What do you think about that? Does that make sense?

Do you think we should try to do this as a parameterized distribution,
or as a histogram?

Are you interested in attempting to implement both/either?
Post by Tobias Pulls
[0]: https://github.com/pylls/basket2/blob/master/padding_ape.go
Ooh nice! This is done as a PT implementation.

You might like:
https://github.com/mikeperry-tor/vanguards/blob/master/README_SECURITY.md

In it, I recommend obfs4 with iat-mode=2 because it does some limited
packet size and timing obfuscation. Should we consider
recommending basket2 also? Is anyone running bridges with it? Probably
not, I guess :/.
--
Mike Perry
Yawning Angel
2018-08-03 12:26:53 UTC
Post by Mike Perry
Should we consider recommending basket2 also?
No.
Post by Mike Perry
Is anyone running bridges with it? Probably not, I guess :/.
No one should be, it is incomplete, buggy, and needs a re-design.

As a side note, I question the utility of a PT that has the AGPL3
network interaction requirement, though there is an exception for
bridges distributed via BridgeDB and those shipped with Tor Browser.

Regards,
--
Yawning Angel
Mike Perry
2018-08-03 21:41:59 UTC
Hi Yawning!
Post by Yawning Angel
Should we consider recommending basket2 also?
No.
Is anyone running bridges with it? Probably not, I guess :/.
No one should be, it is incomplete, buggy, and needs a re-design.
Thanks for the heads up.
Post by Yawning Angel
As a side note, I question the utility of a PT that has the AGPL3
network interaction requirement, though there is an exception for
bridges distributed via BridgeDB and those shipped with Tor Browser.
Would you recommend anything else other than obfs4 at this time, as per
that README_SECURITY doc?
(https://github.com/mikeperry-tor/vanguards/blob/master/README_SECURITY.md)
--
Mike Perry
teor
2018-07-30 03:37:29 UTC
Post by George Kadianakis
Post by Mike Perry
Post by George Kadianakis
2) From what I understand you are also hoping to use WTF-PAD to protect
against circuit fingerprinting and not just website
fingerprinting. They told me that while this might be plausible,
there is no current research on how well it can achieve that. Are we
hoping to do that? And what research remains here? How can I help?
Which parts of the Tor circuit protocol are we hoping to hide?
I am designing WTF-PAD to be a framework for deploying padding against
arbitrary traffic analysis attacks. It is meant to allow us to define
histograms on the fly (in the Tor consensus) as these are studied. The
fact that they have not yet been studied is not super relevant to
deploying the framework for it now.
ACK.
What other traffic analysis attacks are we looking at addressing here?
I'm thinking of stuff like "circuit fingerprinting of onion services",
but I wonder if histograms and random sampling is too crude to actually
be able to help against sophisticated attacks. I don't have a suggestion
for something better currently.
On that topic, is it decided whether the adaptive padding of WTF-PAD
will also happen during circuit construction, or only after that?
Padding during circuit construction should work with VPADDING cells:
https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt#n508

At least it did last time I checked:
https://github.com/teor2345/endosome/blob/master/client-or-22929.py
https://trac.torproject.org/projects/tor/ticket/22929

We should avoid using PADDING cells during the handshake, because Tor
sometimes closes the connection:
https://github.com/teor2345/endosome/blob/master/client-or-22934.py
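
For reference, a minimal sketch of what a VPADDING cell looks like on the
wire, following the variable-length cell layout in tor-spec (CircID,
Command, Length, Payload) with command value 128. This is not taken from
the endosome scripts above. The function name is made up, the all-zero
payload is just the simplest choice, and the CircID width depends on the
negotiated link protocol (2 bytes before link protocol 4, 4 bytes from
then on), so please check it against the current spec before relying on
it.

```python
import struct

# Command number for VPADDING (variable-length padding) in tor-spec.
CMD_VPADDING = 128


def pack_vpadding(payload_len: int, circ_id_len: int = 2) -> bytes:
    """Build a VPADDING cell: CircID (zero) | Command | Length | Payload.

    circ_id_len is an assumption the caller must get right: 2 bytes for
    link protocols below 4, 4 bytes otherwise.
    """
    if circ_id_len not in (2, 4):
        raise ValueError("circ_id_len must be 2 or 4")
    if not 0 <= payload_len <= 0xFFFF:
        raise ValueError("payload_len must fit in 16 bits")
    circ_id = (0).to_bytes(circ_id_len, "big")
    # "!" selects network byte order with no alignment padding.
    header = circ_id + struct.pack("!BH", CMD_VPADDING, payload_len)
    return header + bytes(payload_len)  # zero-filled payload


if __name__ == "__main__":
    print(pack_vpadding(3, circ_id_len=4).hex())
```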

T

--
teor

Please reply @torproject.org
New subkeys 1 July 2018
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B