Support for migration
I am implementing support for connection migration. The work in progress is in the simulate-migration branch. I started with the easy part:
- Create a "path" structure that holds the per-path parameters for RTT, MTU, and congestion control.
- Initially, instantiate just a default path.
- Change the send processing (sender.c) so that all the packet preparation functions operate per path, as do the congestion control algorithm, RTT measurements, and path MTU.
- Change the retransmission process to remember the path on which a packet was sent, and use that information when processing acks or retransmissions.
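To make the list above concrete, here is a minimal sketch of what a per-path structure and a packet record that remembers its path could look like. All type and function names (`demo_path`, `demo_packet`, and so on) and the initial values are assumptions made for illustration; they are not the actual definitions in the simulate-migration branch.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical, not the actual picoquic definitions. */
typedef struct demo_path {
    uint64_t smoothed_rtt_us;  /* per-path RTT estimate, in microseconds */
    uint32_t send_mtu;         /* per-path MTU, in bytes */
    uint64_t cwnd;             /* per-path congestion window, in bytes */
    uint64_t bytes_in_flight;  /* bytes sent on this path and not yet acked */
} demo_path;

typedef struct demo_packet {
    uint64_t sequence_number;
    uint64_t send_time_us;
    size_t length;
    demo_path *send_path;      /* remembered so acks update the right path */
} demo_packet;

/* Instantiate the default path. The initial values are assumptions. */
void demo_path_init(demo_path *path)
{
    path->smoothed_rtt_us = 250000;
    path->send_mtu = 1280;
    path->cwnd = 10 * 1280;
    path->bytes_in_flight = 0;
}

/* Record a transmission and charge it to the path it was sent on. */
void demo_packet_sent(demo_packet *pkt, demo_path *path, uint64_t seq,
                      uint64_t now_us, size_t length)
{
    pkt->sequence_number = seq;
    pkt->send_time_us = now_us;
    pkt->length = length;
    pkt->send_path = path;
    path->bytes_in_flight += length;
}

/* On ack, update the RTT and bytes in flight of the path that carried the packet. */
void demo_packet_acked(const demo_packet *pkt, uint64_t now_us)
{
    demo_path *path = pkt->send_path;
    uint64_t sample_us = now_us - pkt->send_time_us;

    /* Simple 7/8 exponential smoothing, chosen only for illustration. */
    path->smoothed_rtt_us = (7 * path->smoothed_rtt_us + sample_us) / 8;
    if (path->bytes_in_flight >= pkt->length) {
        path->bytes_in_flight -= pkt->length;
    } else {
        path->bytes_in_flight = 0;
    }
}
```

The point of the `send_path` pointer is that ack processing never has to guess which path's RTT and congestion state a packet belongs to.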
The next issue is the handling of addresses: making sure that each source/destination pair of IP addresses and ports defines just one path, and adding some logic for creating new paths. Should we use some stateless verification on the server? We need protection against an attacker who captures a large number of legitimate packets and resends them from a fake address. So maybe something like:
- On receiving a packet from/to a different address, check whether it passes the usual tests (sequence number, decryption, etc.). If it does, allow the packet to proceed through the receive chain.
- If this is the first packet on that address pair, create a challenge frame. Maybe use some "cookie" approach, e.g. a hash of the addresses and a secret key to produce a 16-byte cookie. Send it back to/from exactly the specified address. Maybe use the same code path as stateless reset, etc.
- If the packet contains a challenge response frame, and the cookie matches, create a path.
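The cookie idea in the steps above could be sketched roughly as follows. This is only an illustration of the stateless shape of the check: the cookie is recomputed from the address pair and a server secret, so nothing has to be stored per challenge. The names (`demo_addr_pair`, `demo_make_cookie`, `demo_cookie_matches`) are hypothetical, and the FNV-1a based mixing is a placeholder that is NOT cryptographically secure; a real implementation would presumably use a proper keyed MAC.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_COOKIE_LEN 16

/* Hypothetical address pair; IPv4 addresses could be stored IPv4-mapped. */
typedef struct demo_addr_pair {
    uint8_t local_ip[16];
    uint8_t peer_ip[16];
    uint16_t local_port;
    uint16_t peer_port;
} demo_addr_pair;

/* FNV-1a step. Placeholder only: NOT a secure keyed hash. */
static uint64_t demo_fnv1a64(uint64_t h, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Derive a 16-byte cookie from the secret and the address pair. */
void demo_make_cookie(const uint8_t secret[16], const demo_addr_pair *ap,
                      uint8_t cookie[DEMO_COOKIE_LEN])
{
    /* Serialize the ports explicitly so the result does not depend on
     * struct padding or host byte order. */
    uint8_t ports[4] = {
        (uint8_t)(ap->local_port >> 8), (uint8_t)(ap->local_port & 0xff),
        (uint8_t)(ap->peer_port >> 8), (uint8_t)(ap->peer_port & 0xff)
    };

    for (int half = 0; half < 2; half++) {
        uint64_t h = 0xcbf29ce484222325ULL + (uint64_t)half;
        h = demo_fnv1a64(h, secret, 16);
        h = demo_fnv1a64(h, ap->local_ip, 16);
        h = demo_fnv1a64(h, ap->peer_ip, 16);
        h = demo_fnv1a64(h, ports, 4);
        h = demo_fnv1a64(h, secret, 16);
        for (int i = 0; i < 8; i++) {
            cookie[8 * half + i] = (uint8_t)(h >> (8 * i));
        }
    }
}

/* Recompute the cookie for this address pair and compare without early exit. */
int demo_cookie_matches(const uint8_t secret[16], const demo_addr_pair *ap,
                        const uint8_t received[DEMO_COOKIE_LEN])
{
    uint8_t expected[DEMO_COOKIE_LEN];
    uint8_t diff = 0;

    demo_make_cookie(secret, ap, expected);
    for (int i = 0; i < DEMO_COOKIE_LEN; i++) {
        diff |= (uint8_t)(expected[i] ^ received[i]);
    }
    return diff == 0;
}
```

A response that echoes the cookie but arrives on a different address pair would fail the comparison, which is the property needed against packets resent from a fake address.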
Of course, there will also need to be a way to delete a path. If we just support migration, the latest path created becomes the default path, no data is sent on the other path, and that path is closed after a while. If we support multipath, we need to be smarter: delete a path if its quality drops below a threshold, if there is not enough activity, etc. And of course only close the connection if all paths are closed, not just if one path behaves badly.
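For the migration-only case, the deletion logic described above might look like the sketch below: promoting a new path demotes the old default, demoted paths are closed after a linger delay, and the connection only counts as dead when no path remains. The names and the 3 second linger value are assumptions, not anything taken from the code base.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PATHS 4
#define DEMO_LINGER_US 3000000ULL /* assumed delay before closing an old path */

typedef struct demo_slot {
    int in_use;
    uint64_t demoted_at_us; /* when this path stopped being the default */
} demo_slot;

typedef struct demo_cnx {
    demo_slot paths[DEMO_MAX_PATHS];
    size_t default_path;
} demo_cnx;

/* Start with a single default path in slot 0. */
void demo_cnx_init(demo_cnx *cnx)
{
    for (size_t i = 0; i < DEMO_MAX_PATHS; i++) {
        cnx->paths[i].in_use = 0;
        cnx->paths[i].demoted_at_us = 0;
    }
    cnx->paths[0].in_use = 1;
    cnx->default_path = 0;
}

/* Migration: the latest path becomes the default, the old one starts lingering. */
void demo_promote(demo_cnx *cnx, size_t idx, uint64_t now_us)
{
    if (idx >= DEMO_MAX_PATHS || idx == cnx->default_path) {
        return;
    }
    cnx->paths[cnx->default_path].demoted_at_us = now_us;
    cnx->paths[idx].in_use = 1;
    cnx->default_path = idx;
}

/* Close non-default paths that have lingered long enough. */
void demo_expire(demo_cnx *cnx, uint64_t now_us)
{
    for (size_t i = 0; i < DEMO_MAX_PATHS; i++) {
        if (i != cnx->default_path && cnx->paths[i].in_use &&
            now_us - cnx->paths[i].demoted_at_us >= DEMO_LINGER_US) {
            cnx->paths[i].in_use = 0;
        }
    }
}

/* The connection should only be closed when this returns 0. */
int demo_has_open_path(const demo_cnx *cnx)
{
    for (size_t i = 0; i < DEMO_MAX_PATHS; i++) {
        if (cnx->paths[i].in_use) {
            return 1;
        }
    }
    return 0;
}
```

A multipath version would replace the fixed linger delay with the quality and activity thresholds mentioned above.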
For true multipath, we need to change the transmission scheduling. Currently, each context contains a variable that answers "when will the host be ready to send the next packet". It should be updated to answer "when and on which path". I need to update a "next wake time per path" when a packet is sent or received on a path, or when data is posted on a stream. In the short term, we are hoping to handle just a couple of paths, so there is no need to be too fancy; scanning all paths will work just fine.
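The "scan all paths" approach could be as simple as the following sketch, which answers both "when" and "on which path" in one linear pass. The structure and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-path scheduling state. */
typedef struct demo_sched_path {
    int in_use;
    uint64_t next_wake_time_us; /* updated on send, receive, or stream data */
} demo_sched_path;

/* Return the index of the path that is ready first, or -1 if no path is
 * in use. On success, *wake_time_us receives that path's wake time. */
int demo_next_path(const demo_sched_path *paths, size_t nb_paths,
                   uint64_t *wake_time_us)
{
    int best = -1;

    for (size_t i = 0; i < nb_paths; i++) {
        if (!paths[i].in_use) {
            continue;
        }
        if (best < 0 ||
            paths[i].next_wake_time_us < paths[best].next_wake_time_us) {
            best = (int)i;
        }
    }
    if (best >= 0 && wake_time_us != NULL) {
        *wake_time_us = paths[best].next_wake_time_us;
    }
    return best;
}
```

With only a couple of paths the linear scan costs next to nothing; a heap or sorted list would only be worth it with many more paths.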
There is a relation between path creation and connection ID. The server proposes a list of connection IDs, and the client picks one of them for each new path that is deliberately created. If the server proposes a connection ID, it should be able to process it on reception, i.e. associate it with the connection context. So we have the following plausible steps in development:
- No support for migration or multipath. This is the current state.
- Support for involuntary client migration:
  - Since this is involuntary, the client keeps using the same connection ID.
  - The server notices that packets are arriving on a new path.
  - The server sends a path challenge to verify connectivity on the new path.
  - Upon reception of the path response, the server just replaces the source/dest address and port on the default path.
- Support for voluntary client migration:
  - This can only work if the client has obtained a set of new connection IDs from the server.
  - The client opens a new socket, or picks a new IP source address on the existing socket. (Opening a new socket seems better, since otherwise the port number allows linkability. But then if behind a NAT this is not strictly necessary.)
  - The client picks a new connection ID and sends... What? Probably a path challenge, padded to the minimum size.
  - The server notices that packets are arriving on a new path, with a new connection ID.
  - The server creates a new path context and associates the connection ID with that context. (When asymmetric connection IDs are used, the server should pick a new client ID for the path.)
  - The server sends a path challenge to verify connectivity on the new path. If the client sent a path challenge, the path response should be sent in the same packet. It should also be padded to the minimum size, for PMTU discovery.
  - Upon reception of the path response, the client validates the path. It becomes the new path.
  - The client sends the path response, and whatever frames are queued.
  - Upon reception of the path response, the server validates the path and makes it the default path.
- Support for voluntary server migration:
  - Should be pretty much the same as the client migration, but needs to be initiated by the client because of NAT traversal.
  - The client learns the alternate server address somehow. (Either a new transport extension or a new frame.)
  - The client creates a new path with that new address. It may or may not create a new socket; same privacy/ease of development issues as for client migration.
  - The server notices that the packet is arriving at a different destination address, and treats that the same as a packet arriving from a different address.
- Support for multipath:
  - Same basic requirements as voluntary client or server migration, but the final step post-validation differs.
  - Instead of making the new path the default path, it just becomes one of many paths. This requires some kind of scheduler.
  - Each peer can send packets on any of the validated paths. This requires agreement to use multipath, which could be negotiated via a version number or via a transport extension.
  - Path closure needs to be coordinated. This requires some extension frame, "PATH_CLOSE". There are potential security implications, so this needs to be studied.
- Support for peer-to-peer:
  - Not in the roadmap yet, but the support for firewall and NAT traversal in P2P is very similar to the need to validate paths in multipath.
  - Peers could implement the ICE functionality by sending path challenges to the tentative addresses of the other peer. The path is selected when the response to the challenge is validated.
  - This works especially well if the connection starts with a globally reachable address, which could be provided by TURN or by a VPN. Start with a regular client-server connection, then once keys have been negotiated explore alternate direct paths.
  - One way to do this VPN would be a QUIC-in-QUIC connection. Each peer elects a proxy "in the cloud", and the other peer connects through that proxy. But that is definitely not on the roadmap yet.
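The steps listed above share one validation core and differ mainly in what happens after the path response arrives. A rough sketch of that shared core, under the assumption that the challenge is carried as a 64-bit value and with entirely hypothetical names, could be:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_VPATHS 4

typedef enum { DEMO_MODE_MIGRATION, DEMO_MODE_MULTIPATH } demo_mode;
typedef enum {
    DEMO_PATH_UNUSED,
    DEMO_PATH_CHALLENGED, /* challenge sent, waiting for the response */
    DEMO_PATH_VALIDATED
} demo_vstate;

typedef struct demo_vpath {
    demo_vstate state;
    uint64_t challenge; /* assumed 64-bit challenge value */
} demo_vpath;

typedef struct demo_vcnx {
    demo_vpath paths[DEMO_MAX_VPATHS];
    size_t default_path;
    demo_mode mode;
} demo_vcnx;

/* Start with one validated default path in slot 0. */
void demo_vcnx_init(demo_vcnx *cnx, demo_mode mode)
{
    for (size_t i = 0; i < DEMO_MAX_VPATHS; i++) {
        cnx->paths[i].state = DEMO_PATH_UNUSED;
        cnx->paths[i].challenge = 0;
    }
    cnx->paths[0].state = DEMO_PATH_VALIDATED;
    cnx->default_path = 0;
    cnx->mode = mode;
}

/* Packets showed up on a new path: allocate a context and record the
 * challenge that was sent. Returns the path index, or -1 if full. */
int demo_start_validation(demo_vcnx *cnx, uint64_t challenge)
{
    for (size_t i = 0; i < DEMO_MAX_VPATHS; i++) {
        if (cnx->paths[i].state == DEMO_PATH_UNUSED) {
            cnx->paths[i].state = DEMO_PATH_CHALLENGED;
            cnx->paths[i].challenge = challenge;
            return (int)i;
        }
    }
    return -1;
}

/* Path response received. Migration makes the validated path the default;
 * multipath just adds it to the set of usable paths. Returns 0 on success. */
int demo_on_path_response(demo_vcnx *cnx, size_t idx, uint64_t response)
{
    if (idx >= DEMO_MAX_VPATHS ||
        cnx->paths[idx].state != DEMO_PATH_CHALLENGED ||
        cnx->paths[idx].challenge != response) {
        return -1;
    }
    cnx->paths[idx].state = DEMO_PATH_VALIDATED;
    if (cnx->mode == DEMO_MODE_MIGRATION) {
        cnx->default_path = idx;
    }
    return 0;
}
```

The single `mode` branch at the end is where migration and multipath diverge; everything before it is common to both.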
I already found several issues that will need some resolution:
- If the MTU really is "per path", we can be in a situation where path #1 has an MTU of 1500 bytes and path #2 an MTU of 1280 bytes. It will not be possible to simply "repeat the stream frame". Maybe we really need to revisit issue #16.
- It is not clear at all that the ack delay should be per path. It is probably fine if it is a function of the fastest link. In any case, packets received on link 1 may well be acked on link 2.
- If the data is sent on one link and the ack on another, what is the effect on the RTT measurements?
- What to do with packets queued for retransmission when a path is closed? Should we delay actual path closure until there are no such packets anymore? What RTO should be used? How to do the comparison?
From the server's point of view, NAT rebinding happens when it receives a packet sent to the default connection ID but arriving from a different source address. There is a tension between two goals:
- Smooth support for NAT rebinding
- Avoiding some kind of connection hijacking
The support is smoothest if new addresses are just accepted, and the next packets sent to that new address. Unfortunately, that creates a hijacking risk. This needs to be mitigated by placing the path in some kind of probing state, and only allowing a few packets to go out before the connection can be verified. So the idea is to send a path challenge as soon as the new address shows up, keep using the latest address but with limited credits, and reopen the credits when the path response comes in.
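A minimal sketch of the limited-credits idea, assuming a fixed budget of three packets while the path is unverified (the budget and all names are hypothetical):

```c
#include <stdint.h>

#define DEMO_PROBE_CREDITS 3 /* assumed number of packets allowed while probing */

typedef struct demo_probe_path {
    int validated;
    uint32_t credits; /* packets still allowed before validation */
} demo_probe_path;

/* A packet arrived from a new address: switch to it, but in probing state. */
void demo_address_changed(demo_probe_path *path)
{
    path->validated = 0;
    path->credits = DEMO_PROBE_CREDITS;
}

/* Called before sending each packet. Consumes one credit while probing. */
int demo_may_send(demo_probe_path *path)
{
    if (path->validated) {
        return 1;
    }
    if (path->credits == 0) {
        return 0;
    }
    path->credits--;
    return 1;
}

/* The path response came in: reopen the credits. */
void demo_path_response_received(demo_probe_path *path)
{
    path->validated = 1;
}
```

An attacker who spoofs a new source address can then redirect at most `DEMO_PROBE_CREDITS` packets before the sender stops and waits for the path response.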
Of course, the path challenge could be lost. We have to be able to repeat it. But we cannot use the standard ack model because the ack must be the challenge response. So we end up adding a test in the sending of packets.
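The "test in the sending of packets" could take roughly the following shape: a check called during packet preparation that says whether a challenge must be included now. The doubling interval, the retry limit, and every name here are assumptions for illustration only.

```c
#include <stdint.h>

#define DEMO_CHALLENGE_MAX 4                    /* assumed retry limit */
#define DEMO_CHALLENGE_INTERVAL_US 250000ULL    /* assumed initial interval */

typedef struct demo_challenge {
    int pending;            /* a challenge is outstanding, no response yet */
    uint32_t nb_sent;       /* how many times it was sent */
    uint64_t next_time_us;  /* earliest time of the next repeat */
} demo_challenge;

/* A new address showed up: a challenge must go out right away. */
void demo_challenge_arm(demo_challenge *c, uint64_t now_us)
{
    c->pending = 1;
    c->nb_sent = 0;
    c->next_time_us = now_us;
}

/* Test called while preparing a packet: returns 1 if a path challenge
 * should be included now. The interval doubles after each transmission. */
int demo_challenge_due(demo_challenge *c, uint64_t now_us)
{
    if (!c->pending || now_us < c->next_time_us) {
        return 0;
    }
    if (c->nb_sent >= DEMO_CHALLENGE_MAX) {
        c->pending = 0; /* give up; the caller may treat the path as failed */
        return 0;
    }
    c->next_time_us = now_us + (DEMO_CHALLENGE_INTERVAL_US << c->nb_sent);
    c->nb_sent++;
    return 1;
}

/* The matching path response arrived: stop repeating. */
void demo_challenge_answered(demo_challenge *c)
{
    c->pending = 0;
}
```

Because the repeat is driven by the response and not by an ack of the packet that carried the challenge, this timer lives outside the normal retransmission queue.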