SSH ServerAliveInterval

ServerAliveInterval is an SSH client configuration knob that could make matters worse or better if enabled.

    Host *
     ServerAliveInterval 17
     ...

ServerAliveInterval could be set to some number of seconds if you have an ISP (or some firewall along the way) that is aggressive about clearing out "idle" network connections. An SSH connection might sit there doing nothing for hours, just the sort of thing a connection pruner would look to remove. So, if you want your SSH connection to be more likely to stay up given such firewalls, set a ServerAliveInterval.

ServerAliveInterval could be turned off (set to 0, the OpenSSH default) if you have a microwave or otherwise bad network connection that causes the connection to fail because the ServerAliveInterval checks went unanswered (ssh gives up after ServerAliveCountMax of them in a row, 3 by default), and you want the connection to try to stay alive through such rough spots.
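
A minimal sketch of the turned-off case, assuming OpenSSH; the host name is made up for illustration. TCPKeepAlive is a separate option that defaults to yes and can also end a connection when the route is down for a while, so it may be worth disabling as well:

    Host flaky.example.org
     # 0 means no server alive messages are sent
     ServerAliveInterval 0
     # also skip the TCP-level keepalives
     TCPKeepAlive no

ssh uses the first value it finds for each option, so a block like this has to appear before any "Host *" block that sets the same options. The cost is that a connection that really has died may go unnoticed for a long time.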

These are contradictory use-cases, so if you have both a microwave and a firewall that prunes idle connections, well, maybe there's a ServerAliveInterval high enough to mostly ride through a microwave run while also being low enough to avoid the connection being marked as idle by the firewall.
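
One way to look for that middle ground is ServerAliveCountMax, which sets how many server alive messages in a row may go unanswered before ssh disconnects (the default is 3). A sketch, assuming a microwave run of a couple of minutes and a pruner that tolerates more than 17 seconds of silence, both of which are guesses:

    Host *
     ServerAliveInterval 17
     # about 17 * 10 = 170 seconds of silence before ssh gives up
     ServerAliveCountMax 10

The interval keeps traffic flowing often enough for the pruner, while the count decides how long an outage the connection may survive.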

Generally I use tmux, so I can reconnect to that with nothing lost when the connection goes down, but reconnecting can be annoying, especially when a connection pruner drops your connections over and over. Another option is Mosh (mobile shell), which may make sense if it works for you, for example if you're using a phone on a train or have some other use incompatible with a long-lived TCP connection.
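
For the tmux case, the attach step can be folded into the SSH configuration. A sketch, assuming OpenSSH 7.6 or newer (for RemoteCommand) and tmux installed on the remote side; the alias "work-tmux", the host name, and the session name "main" are made up:

    Host work-tmux
     HostName server.example.org
     # tmux needs a terminal
     RequestTTY yes
     # attach to session "main", creating it if it does not exist
     RemoteCommand tmux new-session -A -s main

Then "ssh work-tmux" should land in the same session after each reconnect. Using a separate alias keeps the command off other connections to that host.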

Why 17 and not 15?

A timer on a prime interval is less likely to fire at the same time as timers on other intervals: a 15-second timer lines up with a 60-second one every minute, while a 17-second timer lines up with it only every 1020 seconds (17 minutes). This is similar to the "do not run all your cron jobs at the top of the hour" advice; such jobs back in the day caused measurable latency for various applications at some site or other. Since that day various cron implementations have added knobs to "run this job at some random point during a given interval" to help avoid the problem of too many things all running at the same time. It's probably not important for a keepalive interval, unless you maybe have a lot of systems doing keepalives, in which case there are probably not enough usable keepalive intervals to help spread the requests apart. Anyways I default to using some prime number as an interval to help avoid things running at the same time, wherever possible, even if it may not make much sense.

P.S. A malicious ISP might inject RST (reset) packets into the TCP connection to drop it; countermeasures here require control of the firewalls on both the source and destination systems, and rules in those firewalls to ignore RST packets for the relevant SSH connection. This may cause other problems, such as lingering old connections that normally would have been closed by a legitimate RST. TCP connections can also be closed by an exchange of FIN and ACK packets, so RST may not be the usual way to tear the connection down.

P.P.S. An ISP may also have bad firewall rules that sometimes result in RST being sent, especially when the packets arrive in some unexpected order because the wifi or microwave garbled things. So the ISP may not be malicious, just incompetent (or both). In general I do not like firewalls that respond: a RST may drop a legitimate connection, and when an attacker is sending 6,000 packets per second you may not want your firewall helpfully sending another 6,000 packets per second back to them. "Attacker" here could be one of those special Dell systems running Windows that would go into energy savings mode and sometimes try to murder the network with huge volumes of forged IPv6 traffic. Unknown traffic? Just drop it.

