Skip to content

Ephemeral source port not random #632

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
owendelong opened this issue Jul 30, 2015 · 10 comments
Closed

Ephemeral source port not random #632

owendelong opened this issue Jul 30, 2015 · 10 comments

Comments

@owendelong
Copy link

When using the WiFiClient library as follows:

#include <ESP8266WiFi.h>
#define MY_SSID "ssid"
#define MY_WFPASSWORD "wpa2_key"

WiFiClient client;

void setup() {
  WiFi.begin(MY_SSID, MY_WFPASSWORD);
  while (WiFi.status() != WL_CONNECTED)
  {
    delay(50);
  }
  client.connect(HOST, HTTPPORT);
}

The source port utilized is always "patrolview" or 4097.

This can create a deadly embrace situation if the device is being used to log data and resets itself as a means to try again without delay on certain connection errors. It should be randomized so that a different port is used for each call to connect, even if the device is reset/rebooted between calls.

The problem that can occur is a web server which is in one of the wait states for closing on a previous attempt will see the new attempt to connect as matching the previous one since the SRC_ADDR,SRC_PORT,DST_ADDR,DST_PORT tuple is identical. This results in a retransmission of the previous FIN packet and a reset of the wait timer on the server side. So long as the ESP8266 continues to retry within before the timer expires (often these are ~5 minutes or so), it will be unable to initiate a new connection to the server without being mistaken for the old one.

The simplest solution is to ad code to WiFiClient::connect() that will randomize the local port number.

I will attempt to provide a workable patch shortly.

@owendelong
Copy link
Author

Additional information, looks like this problem is actually in the tcp_connect function in the underlying lwip library.

@igrr
Copy link
Member

igrr commented Jul 30, 2015

There is a static method in WiFiClient class, static void
setLocalPortStart(uint16_t port);
Find a random number generator which suits your requirements, seed it with
something (ADC reading perhaps?) and call this function somewhere in your
setup() method before you initialize WiFiClients.

On Thu, Jul 30, 2015, 07:58 owendelong [email protected] wrote:

Additional information, looks like this problem is actually in the
tcp_connect function in the underlying lwip library.


Reply to this email directly or view it on GitHub
#632 (comment).

@owendelong
Copy link
Author

Thanks, Igrr.

I tracked the problem down to this function in the LWIP library:

/**
 * Allocate a new local TCP port.
 *
 * @return a new (free) local TCP port number
 */
static u16_t
tcp_new_port(void)
{
  u8_t i;
  u16_t n = 0;
  struct tcp_pcb *pcb;

again:
  if (tcp_port++ == TCP_LOCAL_PORT_RANGE_END) {
    tcp_port = TCP_LOCAL_PORT_RANGE_START;
  }
  /* Check all PCB lists. */
  for (i = 0; i < NUM_TCP_PCB_LISTS; i++) {
    for(pcb = *tcp_pcb_lists[i]; pcb != NULL; pcb = pcb->next) {
      if (pcb->local_port == tcp_port) {
        if (++n > (TCP_LOCAL_PORT_RANGE_END - TCP_LOCAL_PORT_RANGE_START)) {
          return 0;
        }
        goto again;
      }
    }
  }
  return tcp_port;
}

I agree that your workaround is probably a good solution to my immediate problem (will require testing obviously), but I think it would make just as much sense to replace:

  if (tcp_port++ == TCP_LOCAL_PORT_RANGE_END) {
    tcp_port = TCP_LOCAL_PORT_RANGE_START;
  }

with

   tcp_port = random_port_number();

And create a function like

uint16_t random_port_number()
{
  static bool seeded;
  if (! seeded)
  {
    randomSeed(analogReed(A0));
    seeded=true;
  }
  return(random(TCP_LOCAL_PORT_RANGE_START, TCP_LOCAL_PORT_RANGE_END));
}

Since the function already has a goto again built in to deal with any duplicate port and will still terminate the search after a reasonable number of retries, the only loss is that an exhaustive search of all possible values is not possible. One simple solution to that issue would be to further modify the function to sequentially hunt for an available port from a random starting point. The reality is that collisions are probably going to be fairly rare given the large number space involved and the limit of simultaneous connections that can be supported by the available heap space on the ESP8266.

@owendelong
Copy link
Author

So on further research, I'm unable to find any such method in the version of WiFiClient class that ships with the ESP8266 board package used by the Arduino software in its current release.

Nor does it show up in the source code in this repository in either WiFiClient.h or WiFiClient.cpp.

@igrr
Copy link
Member

igrr commented Jul 30, 2015

You don't have to make any modifications to lwip to use the port you want.
Both tcp_bind and tcp_connect functions will call tcp_new_port only if the local port in the pcb was set to 0. So the way to override lwip's port allocation mechanism is through setting pcb->local_port, which is exactly what we do.
Plus, lwip is provided as a binary by Espressif, so we can't make such a modification.

The function in question is here:
https://github.com./esp8266/Arduino/blob/esp8266/hardware/esp8266com/esp8266/libraries/ESP8266WiFi/src/WiFiClient.h#L65

For a test, I have just compiled the following sketch with the latest boards manager package:

#include <ESP8266WiFi.h>
#include <WiFiClient.h>

void setup() {
  WiFiClient::setLocalPortStart(analogRead(A0));

}

void loop() {

}

@owendelong
Copy link
Author

Interesting... Somehow my boards manager isn't pulling the latest package then. Thanks!

@owendelong
Copy link
Author

So... The function you describe is in the development release, but not the stable release. I'm now using the development release and that problem is resolved. I'll mark this closed and take my LWIP issue to Espressif since as you mention, the Espressif port of LWIP is provided as a binary library.

@owendelong
Copy link
Author

FYI -- There is open source LWIP (without espconn) available:
http://www.esp8266.com/viewtopic.php?f=6&t=4295&p=24723#p24723

@igrr
Copy link
Member

igrr commented Jul 31, 2015

Thanks, unfortunately it doesn't work with recent SDKs — see this issue: https://github.com./kadamski/esp-lwip/issues/7.

@owendelong
Copy link
Author

Ugh... There's always a catch with these modules, isn't there.

d-a-v added a commit that referenced this issue Apr 5, 2019
ref: d-a-v/esp82xx-nonos-linklayer#31
origin: #5902
me-no-dev/ESPAsyncTCP#108

Following the links above is instructive.
To summarize:

    * currently and from a long time lwIP tcp client connections always uses the same tcp source port number right after boot
    * this port number is increased everytime a new one is needed (= new tcp client connection)
      (to be noted, linux has the same increasing behavior)
    * when connecting to the same server (right after boot), the triplet (esp-ip-address, source port, destination port) are the same, and may hit remote server list of sockets in time-wait-state (previous connection unproperly closed from the same esp). Consequently the new connection fails when it happens.
    * this is happening only when debugging (esp reboots often, in less time than time-wait expiration), so the nasty effect is amplified especially when bugs are being chased
    * efforts had been done when espressif's lwIP implementation wasn't open source, with WiFiClient::setLocalPortStart() #632 but it must be explicitely called with a different random number at every reboot. Efficient but not ideal.

This PR uses espressif firmware's r_rand() everytime a new local source port is needed. A different source port number is now showed by tcpdump right after boot. Source port range and duplication is verified everytime in lwIP's src/core/tcp.c:tcp_new_port(). It is implemented as a local patch for upstream lwIP so it is valid not only with WiFiClient but also with @me-no-dev's Async libraries (they don't use WiFiClient).

WiFiClient::setLocalPortStart() is still usable with the same effects as before.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants