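Does a TCP socket connection have a "keep alive"?

I have heard of HTTP keep-alive, but for now I want to open a socket connection with a remote server.
Will this socket connection remain open forever, or is there a timeout limit associated with it, similar to HTTP keep-alive?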

TCP sockets remain open till they are closed.

That said, it's very difficult to detect a broken connection (broken, as in a router died, etc, as opposed to closed) without actually sending data, so most applications do some sort of ping/pong reaction every so often just to make sure the connection is still actually alive.

You are looking for the SO_KEEPALIVE socket option.

The Java Socket API exposes "keep-alive" to applications via the setKeepAlive and getKeepAlive methods.
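
As a minimal sketch of those two methods, the following enables SO_KEEPALIVE on a freshly connected socket and reads the flag back. The class and method names (KeepAliveExample, connectWithKeepAlive) are made up for illustration, and a throwaway loopback listener stands in for the remote server.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class KeepAliveExample {
    /** Hypothetical helper: connects to host:port and turns on SO_KEEPALIVE. */
    static Socket connectWithKeepAlive(InetAddress host, int port) throws IOException {
        Socket socket = new Socket(host, port);
        socket.setKeepAlive(true); // ask the OS to probe the peer when the connection is idle
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // A local listener on the loopback interface plays the role of the remote server.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = connectWithKeepAlive(server.getInetAddress(), server.getLocalPort())) {
            System.out.println("SO_KEEPALIVE enabled: " + socket.getKeepAlive());
        }
    }
}
```

Note that this only switches the feature on; the probe timing still comes from the operating system.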

EDIT: SO_KEEPALIVE is implemented in the OS network protocol stacks without sending any "real" data. The keep-alive interval is operating system dependent, and may be tuneable via a kernel parameter.

Since no data is sent, SO_KEEPALIVE can only test the liveness of the network connection, not the liveness of the service that the socket is connected to. To test the latter, you need to implement something that involves sending messages to the server and getting a response.
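
One way such an application-level check might look is sketched below. The PING/PONG line protocol and the Heartbeat class are assumptions made up for this example, not part of any standard; the server has to implement the matching reply.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class Heartbeat {
    /**
     * Sends "PING\n" and waits up to timeoutMillis for a "PONG" line.
     * Returns false if the peer answers with something else, has closed the
     * connection, or stays silent past the timeout.
     * Sketch only: a real client would keep one reader per socket rather than
     * creating a new one on every call.
     */
    static boolean ping(Socket socket, int timeoutMillis) {
        try {
            socket.setSoTimeout(timeoutMillis); // bound the blocking read below
            OutputStream out = socket.getOutputStream();
            out.write("PING\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            return "PONG".equals(in.readLine()); // readLine() returns null at end of stream
        } catch (IOException e) { // includes SocketTimeoutException
            return false;
        }
    }
}
```

Unlike SO_KEEPALIVE, this exercises the service itself: a hung server process fails the check even when its TCP stack still answers probes.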

If you're behind a masquerading NAT (as most home users are these days), there is a limited pool of external ports, and these must be shared among the TCP connections. Therefore masquerading NATs tend to assume a connection has been terminated if no data has been sent for a certain time period.

This and other such issues (anywhere in between the two endpoints) can mean the connection will no longer "work" if you try to send data after a reasonable idle period. However, you may not discover this until you try to send data.

Using keepalives both reduces the chance of the connection being interrupted somewhere down the line, and also lets you find out about a broken connection sooner.

TCP keepalive and HTTP keepalive are very different concepts. In TCP, a keepalive is an administrative packet sent to detect a stale connection. In HTTP, keepalive means the connection is kept open (persistent) so it can be reused for several requests.

This is from the TCP specification (RFC 1122):

Keep-alive packets MUST only be sent when no data or acknowledgement packets have been received for the connection within an interval. This interval MUST be configurable and MUST default to no less than two hours.

As you can see, the default TCP keepalive interval is too long for most applications. You might have to add keepalive in your application protocol.

Here is some supplemental literature on keepalive which explains it in much finer detail.

http://www.tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO

Since Java does not allow you to control the actual keepalive times, you can use the examples in that HOWTO to change them system-wide if you're using a Linux kernel (or another /proc-based OS).

Does a TCP socket connection have a "keep alive"?

The short answer is yes, there is a timeout enforced via TCP Keep-Alive. So no, the socket won't remain open forever, but will probably time out after a few hours.

If you would like to configure the Keep-Alive timeout on your machine, see the "Changing TCP Timeouts" section below. Otherwise read through the rest of the answer to learn how TCP Keep-Alive works.

Introduction

TCP connections consist of two sockets, one on each end of the connection. When one side wants to terminate the connection, it sends a FIN packet which the other side acknowledges, and both close their sockets.

Until that happens, however, both sides will keep their socket open indefinitely. This leaves open the possibility that one side may close their socket, either intentionally or due to some error, without informing the other end via FIN. In order to detect this scenario and close stale connections the TCP Keep Alive process is used.

Keep-Alive Process

There are three configurable properties that determine how Keep-Alives work. On Linux they are [1]:

    • tcp_keepalive_time: default 7200 seconds
    • tcp_keepalive_probes: default 9
    • tcp_keepalive_intvl: default 75 seconds

The process works like this:

  1. Client opens TCP connection
  2. If the connection is silent for tcp_keepalive_time seconds, send a single empty ACK packet. [2]
  3. Did the server respond with a corresponding ACK of its own?
    • No
      1. Wait tcp_keepalive_intvl seconds, then send another ACK
      2. Repeat until the number of ACK probes that have been sent equals tcp_keepalive_probes.
      3. If no response has been received at this point, send a RST and terminate the connection.
    • Yes: Return to step 2

Note that keep-alive has to be switched on for a socket (SO_KEEPALIVE); per the RFC quoted below it defaults to off. Once it is enabled, a dead TCP connection is pruned after the other end has been unresponsive for 2 hours 11 minutes with the Linux defaults (7200 seconds + 75 * 9 seconds).
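
The arithmetic can be checked with a few lines of Java. This is only a sketch of the formula idle + interval * probes; the class and method names are hypothetical.

```java
import java.time.Duration;

final class KeepAliveTiming {
    /** Time from the last traffic until an unresponsive peer is declared dead. */
    static Duration timeToDetect(int idleSeconds, int intervalSeconds, int probes) {
        return Duration.ofSeconds(idleSeconds + (long) intervalSeconds * probes);
    }

    public static void main(String[] args) {
        Duration d = timeToDetect(7200, 75, 9); // Linux defaults
        // prints: 7875 s = 2 h 11 min
        System.out.println(d.getSeconds() + " s = " + d.toHours() + " h " + d.toMinutesPart() + " min");
    }
}
```

With the tuned values used later in this answer (180 seconds idle, 3 probes, 10 seconds apart) the same formula gives 210 seconds.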

Gotchas

2 Hour Default

Since the process doesn't start until a connection has been idle for two hours by default, stale TCP connections can linger for a very long time before being pruned. This can be especially harmful for expensive connections such as database connections.

Keep-Alive is Optional

According to RFC 1122 4.2.3.6, responding to and/or relaying TCP Keep-Alive packets is optional:

Implementors MAY include "keep-alives" in their TCP implementations, although this practice is not universally accepted. If keep-alives are included, the application MUST be able to turn them on or off for each TCP connection, and they MUST default to off.

...

It is extremely important to remember that ACK segments that contain no data are not reliably transmitted by TCP.

The reasoning is that Keep-Alive packets contain no data, are not strictly necessary, and risk clogging up the tubes of the interwebs if overused.

In practice however, my experience has been that this concern has dwindled over time as bandwidth has become cheaper; and thus Keep-Alive packets are not usually dropped. Amazon EC2 documentation for instance gives an indirect endorsement of Keep-Alive, so if you're hosting with AWS you are likely safe relying on Keep-Alive, but your mileage may vary.

Changing TCP Timeouts

Per Socket

2022 Update: Apparently, as of Java 11, you may be able to set these on the Java TCP Socket itself.
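
A rough sketch of what that might look like, assuming the jdk.net.ExtendedSocketOptions constants (TCP_KEEPIDLE, TCP_KEEPINTERVAL, TCP_KEEPCOUNT) that ship with recent JDKs. Support depends on the JDK build and operating system, so the code checks supportedOptions() first. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.net.Socket;
import jdk.net.ExtendedSocketOptions;

final class PerSocketKeepAlive {
    /**
     * Enables keep-alive and, where the platform supports it, overrides the
     * idle time and probe interval (both in seconds) and the probe count
     * for this socket only. Returns true if all three options were applied.
     */
    static boolean tune(Socket socket, int idleSeconds, int intervalSeconds, int probes)
            throws IOException {
        socket.setKeepAlive(true);
        var supported = socket.supportedOptions();
        if (!supported.contains(ExtendedSocketOptions.TCP_KEEPIDLE)
                || !supported.contains(ExtendedSocketOptions.TCP_KEEPINTERVAL)
                || !supported.contains(ExtendedSocketOptions.TCP_KEEPCOUNT)) {
            return false; // not available here: the system-wide settings stay in effect
        }
        socket.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, idleSeconds);
        socket.setOption(ExtendedSocketOptions.TCP_KEEPINTERVAL, intervalSeconds);
        socket.setOption(ExtendedSocketOptions.TCP_KEEPCOUNT, probes);
        return true;
    }
}
```

For example, tune(socket, 180, 10, 3) would mirror the system-wide values used in the Linux section below, but scoped to a single connection.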

Unfortunately, since TCP connections are managed on the OS level, older versions of Java do not support configuring timeouts on a per-socket level such as in java.net.Socket. I have found some attempts [3] to use Java Native Interface (JNI) to create Java sockets that call native code to configure these options, but none appear to have widespread community adoption or support.

Instead, you may be forced to apply your configuration to the operating system as a whole. Be aware that this configuration will affect all TCP connections running on the entire system.

Linux

The currently configured TCP Keep-Alive settings can be found in

  • /proc/sys/net/ipv4/tcp_keepalive_time
  • /proc/sys/net/ipv4/tcp_keepalive_probes
  • /proc/sys/net/ipv4/tcp_keepalive_intvl
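
If you want to inspect these values from a Java program rather than a shell, a small sketch like the following reads them directly. The class name is hypothetical, and on systems without these /proc files the reader simply returns an empty result.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalInt;

final class LinuxKeepAliveSettings {
    private static final Path BASE = Path.of("/proc/sys/net/ipv4");

    /** Reads one tcp_keepalive_* value, or returns empty if the file is missing or unreadable. */
    static OptionalInt read(String name) {
        try {
            return OptionalInt.of(Integer.parseInt(Files.readString(BASE.resolve(name)).trim()));
        } catch (IOException | NumberFormatException e) {
            return OptionalInt.empty(); // e.g. not running on Linux
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {
                "tcp_keepalive_time", "tcp_keepalive_probes", "tcp_keepalive_intvl"}) {
            System.out.println(name + " = " + read(name));
        }
    }
}
```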

You can update any of these like so:

# Send first Keep-Alive packet when a TCP socket has been idle for 3 minutes
$ echo 180 > /proc/sys/net/ipv4/tcp_keepalive_time
# Send three Keep-Alive probes...
$ echo 3 > /proc/sys/net/ipv4/tcp_keepalive_probes
# ... spaced 10 seconds apart.
$ echo 10 > /proc/sys/net/ipv4/tcp_keepalive_intvl

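Such changes will not persist through a restart. The same values can be set with sysctl, which also takes effect immediately but is likewise lost on reboot:

sysctl -w net.ipv4.tcp_keepalive_time=180 net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=10

To make the changes persistent, add the same key=value settings to /etc/sysctl.conf.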

Mac OS X

The currently configured settings can be viewed with sysctl:

$ sysctl net.inet.tcp | grep -E "keepidle|keepintvl|keepcnt"
net.inet.tcp.keepidle: 7200000
net.inet.tcp.keepintvl: 75000
net.inet.tcp.keepcnt: 8

Of note, Mac OS X defines keepidle and keepintvl in units of milliseconds as opposed to Linux which uses seconds.

The properties can be set with sysctl. Note that sysctl -w takes effect immediately but does not by itself survive a reboot:

sysctl -w net.inet.tcp.keepidle=180000 net.inet.tcp.keepcnt=3 net.inet.tcp.keepintvl=10000

To keep the settings across reboots, you can add them to /etc/sysctl.conf (creating the file if it doesn't exist).

$ cat /etc/sysctl.conf
net.inet.tcp.keepidle=180000
net.inet.tcp.keepintvl=10000
net.inet.tcp.keepcnt=3

Windows

I don't have a Windows machine to confirm, but you should find the respective TCP Keep-Alive settings in the registry at

\HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\TCPIP\Parameters

Footnotes

1. See man tcp for more information.

2. This packet is often referred to as a "Keep-Alive" packet, but within the TCP specification it is just a regular ACK packet. Applications like Wireshark are able to label it as a "Keep-Alive" packet by meta-analysis of the sequence and acknowledgement numbers it contains in reference to the preceding communications on the socket.

3. Some examples I found from a basic Google search are lucwilliams/JavaLinuxNet and flonatel/libdontdie.

In Java, TCP connections are managed at the OS level, and java.net.Socket does not provide any built-in way to set keepalive timeouts on a per-socket level. We can enable the keepalive option on a Java socket, but with the default settings it takes about 2 hours 11 minutes (7200 seconds of idle time plus the probe phase) before a stale TCP connection is detected. This means a dead connection can linger for a very long time before it is purged. So we found a solution that uses the Java Native Interface (JNI) to call native code (C++) to configure these options.

Windows OS

On Windows, keepalive_time and keepalive_intvl are configurable, but tcp_keepalive_probes cannot be changed (on Vista and later). By default, when a TCP socket is initialized, the keep-alive timeout is set to 2 hours and the keep-alive interval to 1 second. The default system-wide value of the keep-alive timeout is controllable through the KeepAliveTime registry setting, which takes a value in milliseconds.

On Windows Vista and later, the number of keep-alive probes (data retransmissions) is set to 10 and cannot be changed.

On Windows Server 2003, Windows XP, and Windows 2000, the default number of keep-alive probes is 5, and that number is controllable. On Windows, the Winsock WSAIoctl function with the SIO_KEEPALIVE_VALS control code is used to configure the TCP keepalive parameters for a socket.

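#include <winsock2.h>
#include <mstcpip.h>   /* struct tcp_keepalive, SIO_KEEPALIVE_VALS */

/* Hypothetical helper: enable keep-alive on a socket; times are in milliseconds. */
/* Returns 0 on success, SOCKET_ERROR otherwise. */
int set_keepalive(SOCKET socketFD, u_long idle_ms, u_long interval_ms)
{
    struct tcp_keepalive ka = { 1, idle_ms, interval_ms }; /* onoff, keepalivetime, keepaliveinterval */
    DWORD bytesReturned = 0;
    return WSAIoctl(
        socketFD,            /* descriptor identifying a socket */
        SIO_KEEPALIVE_VALS,  /* dwIoControlCode */
        &ka,                 /* input buffer: pointer to tcp_keepalive struct */
        sizeof(ka),          /* length of input buffer */
        NULL,                /* output buffer (unused) */
        0,                   /* size of output buffer */
        &bytesReturned,      /* number of bytes returned */
        NULL,                /* OVERLAPPED structure */
        NULL);               /* completion routine */
}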

Linux OS

Linux has built-in support for keepalive; it is available whenever TCP/IP networking is enabled. Programs must request keepalive control for their sockets using the setsockopt interface.

int setsockopt(int socket, int level, int optname, const void *optval, socklen_t optlen)

Each client socket is created using java.net.Socket. The file descriptor for each socket is then retrieved using Java reflection so that it can be passed to the native code.

For Windows, according to the Microsoft docs:

  • KeepAliveTime (REG_DWORD, milliseconds, by default is not set which means 7,200,000 = 2 hours) - analogue to tcp_keepalive_time
  • KeepAliveInterval (REG_DWORD, milliseconds, by default is not set which means 1,000 = 1 second) - analogue to tcp_keepalive_intvl
  • Since Windows Vista there is no analogue to tcp_keepalive_probes, value is fixed to 10 and cannot be changed