Discussion:
[tor-dev] Bandwidth Authority Progress
teor
2017-12-19 00:38:12 UTC
Hi asn,

Original Subject: Re: [tor-project] Meeting notes, network team meeting, 18 Dec
- Met with David Stainton, Moritz and others. Talked about relay load
balancing and bandwidth dirauths. People are sad about the state of the
network: some relays are overloaded while others idle, many overloaded
relays can't even establish circuits to each other. Need to do something
about it: deploy bwscanner and start thinking about peerflow. What about
isis' bridge bandwidth scanner?
Can you tell us a bit more about this meeting?
What can people do if they want to be involved?
When is the next meeting, or how can people find out about it?

T

--
Tim / teor

PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
------------------------------------------------------------------------
George Kadianakis
2017-12-19 12:44:47 UTC
Post by teor
Hi asn,
Original Subject: Re: [tor-project] Meeting notes, network team meeting, 18 Dec
- Met with David Stainton, Moritz and others. Talked about relay load
balancing and bandwidth dirauths. People are sad about the state of the
network: some relays are overloaded while others idle, many overloaded
relays can't even establish circuits to each other. Need to do something
about it: deploy bwscanner and start thinking about peerflow. What about
isis' bridge bandwidth scanner?
Can you tell us a bit more about this meeting?
What can people do if they want to be involved?
When is the next meeting, or how can people find out about it?
Hey teor,

thanks for following through.

The "meeting" was impromptu and IRL because we all happened to be at the
same place. There is no next meeting and it's up to us (Tor/network
team) to figure out what the next steps are.

This week I did some digging to explore the possible ways forward,
partly based on your email here: https://www.mail-archive.com/tor-***@lists.torproject.org/msg09912.html
Here are my findings:

Possibility a) Develop peerflow and deploy it in place of torflow

Peerflow is an exciting and secure bandwidth measurement system
published in PETS 2017: https://ohmygodel.com/publications/peerflow-popets2017.pdf

Unfortunately, it seems quite complicated to develop from scratch and
will probably require _significant_ engineering time to actually make
it a deployed reality (understand, develop, test, deploy). This is
probably the solution we would like to pursue if we had a grant and a
dedicated developer.

Possibility b) Finalize bwscanner and deploy in place of torflow

bwscanner is a project by Aaron/David/Donncha and can be found here:
https://github.com/TheTorProject/bwscanner

It seems to implement the torflow design (2-hop circs && buckets) but
in a cleaner and better codebase. From what I understand, the main
part of the project is done, but there has been minimal testing on
the real network (there are unit tests, though) and also the final output
file with the bandwidth weights has not been completely finalized.

This project is not quite there yet, and will require some
non-trivial engineering time, but it's probably a much easier task
compared to peerflow due to the design being more understood and
already coded. I think 2-3 weeks of developer time could be quite
fruitful here. I also heard that some bw auth operators are eager to
run bwscanner instead of torflow on their setup in January.

Possibility c) Adapt the bridge bw scanner that is currently being developed

Apparently isis and another developer are currently writing a bridge
bandwidth scanner for bridgedb, that could in theory be extended to
scan the whole network. They are currently writing some sort of Rust
library that will be used by the scanner, and the project is ETA
around March 2018. The whole development process is pretty opaque so
I have no idea what's going on. Also, there probably needs to be
considerable work to extend it from a simple bridge scanner to a real
relay scanner, and the final result will probably look like bwscanner
above.

Currently my intuition is to work on (b) above, while also preparing the
ground for (a) which seems to be The Right Thing.

I'm not sure what's the right way forward here in terms of project
management, since the network team seems overloaded and I haven't heard
of anyone willing to take this on...

Ideally we would probably apply for some sort of grant on this work so
that some actual developer time is allocated. I think this is definitely
fundable work since it deeply impacts the *performance* and security of
the Tor network, and basically the network has no chance of surviving
under greater load if the status quo persists.

I'll try to think more about this problem in the future; these are just
my thoughts from a few hours of digging.

Cheers!
meejah
2017-12-19 19:06:28 UTC
Post by George Kadianakis
The "meeting" was impromptu and IRL because we all happened to be at
the same place. There is no next meeting and it's up to us
(Tor/network team) to figure out what are the next steps here.
I want to help.
Anyone please bug me on IRC for any Python etc help required to make
bwauth/scanners better. I don't have enough volunteer cycles right now
to "take over" bwscanner entirely though.
Post by George Kadianakis
This project is not quite there yet, and will require some
non-trivial engineering time, but it's probably a much easier task
compared to peerflow due to the design being more understood and
already coded.
I'm not convinced this part is completely accurate ;) because at TorDev
MTL it seems to me the consensus was that nobody actually knows what
torflow is doing and so answering the question "is bwscanner doing the
same thing" is approximately NP-hard.
Post by George Kadianakis
I think 2-3 weeks of developer time could be quite fruitful here. I
also heard that some bw auth operators are eager to run bwscanner
instead of torflow on their setup in January.
Wooo!
(I think the best path to answering "does bwscanner do the same thing as
torflow" is to Run It And See...) If any of these parties are having
problems deploying bwscanner this is probably something I can help with.
Post by George Kadianakis
Currently my intuition is to work on (b) above, while also preparing the
ground for (a) which seems to be The Right Thing.
+1
I think the next step for a) isn't "implement it", but "write a spec for
it" instead.
Post by George Kadianakis
Ideally we would probably apply for some sort of grant on this work so
that some actual developer time is allocated. I think this is definitely
fundable work since it deeply impacts the *performance* and security of
the Tor network [..]
+5
--
meejah
teor
2017-12-19 21:00:12 UTC
Post by meejah
Post by George Kadianakis
This project is not quite there yet, and will require some
non-trivial engineering time, but it's probably a much easier task
compared to peerflow due to the design being more understood and
already coded.
I'm not convinced this part is completely accurate ;) because at TorDev
MTL it seems to me the consensus was that nobody actually knows what
torflow is doing and so answering the question "is bwscanner doing the
same thing" is approximately NP-hard.
I have some idea what torflow is doing, in a broad sense:
* launch 2 tor clients
* repeat as often as possible, running 9 different scanners:
  * split relays into buckets by bandwidth percentile
  * build two hop paths with a relay and exit from relays in the bucket
  * download a file from a bandwidth server, choose the size based on the bucket
  * measure how long it takes
  * store the results in a database
* aggregate the results hourly:
  * produce a consensus weight to advertised bandwidth ratio
    * using a decaying weighted average
    * and some form of feedback (PID) control
  * and dump it to a file
Then authorities read this file and include it in their votes.
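
The bucket and averaging steps in the summary above could be sketched
roughly as follows. This is an illustration only, not torflow's code: the
function names, the equal-size slicing, and the smoothing factor `alpha`
are all assumptions made for the example.

```python
from typing import Dict, List


def split_into_buckets(bandwidths: Dict[str, float], n_buckets: int) -> List[List[str]]:
    """Rank relays fastest-first and cut the ranking into at most n_buckets slices.

    Hypothetical helper: torflow's real percentile boundaries may differ.
    """
    if n_buckets < 1:
        raise ValueError("n_buckets must be at least 1")
    # Sort by descending bandwidth; break ties by relay name for determinism.
    ranked = sorted(bandwidths, key=lambda relay: (-bandwidths[relay], relay))
    if not ranked:
        return []
    size = -(-len(ranked) // n_buckets)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


def decayed_average(previous: float, sample: float, alpha: float = 0.5) -> float:
    """Exponentially decaying average: newer samples count more than older ones.

    alpha is an assumed smoothing factor, not a value taken from torflow.
    """
    return alpha * sample + (1.0 - alpha) * previous


if __name__ == "__main__":
    relays = {"A": 100, "B": 50, "C": 400, "D": 10}
    print(split_into_buckets(relays, 2))
    print(decayed_average(100.0, 200.0))
```

Relays in the same bucket would then be paired into two-hop paths, so that
a slow relay is not measured through a much faster or much slower partner.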

I suspect that Mike Perry may remember more detail, or may want to
correct my summary, as he wrote most of torflow (I think?)

My conclusion at the Montreal meeting was that we don't have a
detailed spec (see below). So that makes it hard to tell if:
* torflow does what we want it to do
* the new bwauth project does what we want it to do
* they are similar enough for a staged or once-off transition
Post by meejah
Post by George Kadianakis
I think 2-3 weeks of developer time could be quite fruitful here. I
also heard that some bw auth operators are eager to run bwscanner
instead of torflow on their setup in January.
Wooo!
(I think the best path to answering "does bwscanner do the same thing as
torflow" is to Run It And See...) If any of these parties are having
problems deploying bwscanner this is probably something I can help with.
It doesn't produce an output file in the same format as torflow, so we need
to specify (see below) and implement that part first.

Otherwise, we would not have any results to compare.
Post by meejah
Post by George Kadianakis
Currently my intuition is to work on (b) above, while also preparing the
ground for (a) which seems to be The Right Thing.
+1
I think the next step for a) isn't "implement it", but "write a spec for
it" instead.
+1

Let's start by specifying what tor directory authorities expect from the
file format.

T
Mike Perry
2017-12-11 16:04:40 UTC
Post by teor
Post by meejah
Post by George Kadianakis
This project is not quite there yet, and will require some
non-trivial engineering time, but it's probably a much easier task
compared to peerflow due to the design being more understood and
already coded.
I'm not convinced this part is completely accurate ;) because at TorDev
MTL it seems to me the consensus was that nobody actually knows what
torflow is doing and so answering the question "is bwscanner doing the
same thing" is approximately NP-hard.
* launch 2 tor clients
* split relays into buckets by bandwidth percentile
* build two hop paths with a relay and exit from relays in the bucket
* download a file from a bandwidth server, choose the size based on the bucket
* measure how long it takes
* store the results in a database
* produce a consensus weight to advertised bandwidth ratio
* using a decaying weighted average
* and some form of feedback (PID) control
* and dump it to a file
Then authorities read this file and include it in their votes.
Yes, all of this is correct.

Technically, though, full PID feedback is disabled right now. The
PID-based implementation itself is enabled via bwauthpid=1 in the
consensus, but the PID constants are currently set such that there is no
actual feedback happening. See Section 3 of the Bw authority spec for
more info:
https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/README.spec.txt#n353

If feedback is enabled (via consensus parameters), it drives relays to
other forms of resource exhaustion which we do not currently measure
(primarily CPU exhaustion, which we could approximate by circuit
failure, but potentially also memory pressure, which we have no signal
for).
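
To make the idea of "PID enabled, but constants chosen so that no feedback
happens" concrete, here is a generic textbook PID step. It is not the
formula from the bw authority spec: the gain names `k_p`, `k_i`, `k_d`, the
`PIDState` class, and the multiplicative update are assumptions for
illustration. In this sketch, setting all gains to zero leaves the weight
untouched.

```python
from dataclasses import dataclass


@dataclass
class PIDState:
    """Per-relay controller memory (hypothetical structure)."""
    integral: float = 0.0
    previous_error: float = 0.0


def pid_step(weight: float, error: float, state: PIDState,
             k_p: float, k_i: float, k_d: float) -> float:
    """Apply one generic PID correction to a relay's weight.

    error is the relative deviation of the relay's measured speed from the
    target (for example, +0.2 means 20% faster than the target).
    """
    state.integral += error                      # accumulated past error
    derivative = error - state.previous_error    # change since last round
    state.previous_error = error
    correction = k_p * error + k_i * state.integral + k_d * derivative
    return weight * (1.0 + correction)


if __name__ == "__main__":
    # With every gain at zero the controller runs but never changes the weight.
    print(pid_step(1000.0, 0.2, PIDState(), 0.0, 0.0, 0.0))
    # With only a proportional gain, a relay 50% above target gains 50% weight.
    print(pid_step(1000.0, 0.5, PIDState(), 1.0, 0.0, 0.0))
```

The second call hints at the risk described above: repeated positive
corrections keep pushing load toward a relay until some unmeasured
resource gives out.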
Post by teor
I suspect that Mike Perry may remember more detail, or may want to
correct my summary, as he wrote most of torflow (I think?)
Yes.
Post by teor
Post by meejah
(I think the best path to answering "does bwscanner do the same thing as
torflow" is to Run It And See...) If any of these parties are having
problems deploying bwscanner this is probably something I can help with.
Karsten wrote some scripts that can produce CDF graphs of bw authority
votes for all of the flag combinations. This was very useful for
determining if different bw authorities were measuring the network
similarly. It will also be useful to see how closely bwscanner's output
matches the bwauth votes:
https://trac.torproject.org/projects/tor/ticket/2394

I am not sure what repo they are in, though.
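
A comparison of this kind could be sketched as follows, assuming two lists
of per-relay bandwidth values from two different scanners. This is not
Karsten's script: the function names are hypothetical, and the maximum gap
between the two CDFs is just one simple way to summarise how far apart
they are.

```python
from bisect import bisect_right
from typing import List, Sequence


def empirical_cdf(values: Sequence[float], points: Sequence[float]) -> List[float]:
    """Fraction of values that are <= each point."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    return [bisect_right(ordered, p) / len(ordered) for p in points]


def max_cdf_gap(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest vertical distance between the two empirical CDFs."""
    points = sorted(set(a) | set(b))
    cdf_a = empirical_cdf(a, points)
    cdf_b = empirical_cdf(b, points)
    return max(abs(x - y) for x, y in zip(cdf_a, cdf_b))


if __name__ == "__main__":
    scanner_one = [10, 20, 30, 40]   # made-up values for illustration
    scanner_two = [12, 18, 33, 41]
    print(empirical_cdf(scanner_one, [20]))
    print(max_cdf_gap(scanner_one, scanner_two))
```

A gap near 0 would suggest the two scanners see a similar distribution of
relay bandwidths; a gap near 1 would mean they disagree almost completely.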
Post by teor
Let's start by specifying what tor directory authorities expect from the
file format.
This format is already specified in Sections 2.4 and 3.4 of the bwauth spec
itself:
https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/README.spec.txt#n332
https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/README.spec.txt#n447

(This output should not be confused with Section 1.6, which specifies
the intermediate sub-process format before aggregating results).
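
As a rough sketch of how a consumer might read one relay line of such an
output file, assuming each line is a space-separated list of `key=value`
pairs: the field names used below (`node_id`, `bw`) and the sample values
are illustrative and should be checked against the spec sections linked
above, and `parse_bw_line` is a hypothetical helper, not part of tor or
torflow.

```python
from typing import Dict


def parse_bw_line(line: str) -> Dict[str, str]:
    """Split one 'key=value key=value ...' line into a dictionary of strings."""
    fields: Dict[str, str] = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"not a key=value token: {token!r}")
        fields[key] = value
    return fields


if __name__ == "__main__":
    # Made-up sample line; the fingerprint is shortened for readability.
    sample = "node_id=$ABCD bw=760"
    parsed = parse_bw_line(sample)
    print(parsed["node_id"], int(parsed["bw"]))
```

Keeping the values as strings and converting only the fields that are
needed makes it easier to tolerate extra keys that a new scanner might add.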
--
Mike Perry
Iain Learmonth
2017-12-19 22:57:36 UTC
Hi,
Post by Mike Perry
https://trac.torproject.org/projects/tor/ticket/2394
I am not sure what repo they are in, though.
https://gitweb.torproject.org/metrics-tasks.git/tree/task-2394

Thanks,
Iain.
