大家好,本文结合具体的问题实例,详细讲解一下TCP/IP协议栈的心跳机制、丢包重传机制等内容,给大家提供一个借鉴和参考。
客户坚持要实现这个特殊的功能点,项目已经接近尾声,目前处于客户试用阶段,不实现该功能,项目无法通过验收,客户不给钱。
在出现网络不稳定掉会时,可能是系统TCPIP协议栈已经检测到网络异常,系统协议层已经将网络断开了;也可能软件应用层的心跳机制检测到网络故障,断开了与服务器的链接。 对于系统TCPIP协议栈自身检测出来的网络异常,则可能存在两种情况,一是TCPIP协议栈自身的心跳机制检测出来的;二是TCP连接的丢包重传机制检测出异常。
2、TCPIP协议栈的心跳机制
2.1、TCP中的ACK机制
1)网络正常时:服务器收到心跳包,会立即回复ACK包,客户端收到ACK包后,再等2个小时发送下一个心跳包。其中,心跳包发送时间间隔时间keepalivetime,Windows系统中默认是2小时,可配置。如果在2个小时的时间间隔内,客户端和服务器有数据交互,客户端会收到服务器的ACK包,也算作心跳机制的心跳包,2个小时的时间间隔会重新计时。 2)网络异常时:服务器收不到客户端发过去的心跳包,没法回复ACK,Windows系统中默认的是1秒超时,1秒后会重发心跳包。如果还收不到心跳包的ACK,则1秒后重发心跳包,如果始终收不到心跳包,则在发出10个心跳包就达到了系统的上限,就认为网络出故障了,协议栈就会直接将连接断开了。其中,发出心跳包收不到ACK的超时时间称为 keepaliveinterval,Windows系统中默认是1秒,可配置;收不到心跳包对应的ACK包的重发次数probe,Windows系统是固定的,是固定的10次,不可配置的。
SOCKET socket;
// ......(中间代码省略)
int optval = 1;
int nRet = setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, (const char *)&optval,
sizeof(optval));
if (nRet != 0)
return;
tcp_keepalive alive;
alive.onoff = TRUE;
alive.keepalivetime = 10*1000;
alive.keepaliveinterval = 2*1000;
DWORD dwBytesRet = 0;
nRet = WSAIoctl(socket, SIO_KEEPALIVE_VALS, &alive, sizeof(alive), NULL, 0,
&dwBytesRet, NULL, NULL);
if (nRet != 0)
return;
1)keepalivetime:默认2小时发送一次心跳保活包,比如发送第1个保活包之后,间隔2个小时后再发起下一个保活包。如果这期间有数据交互,也算是有效的保活包,这个时间段就不再发送保活包,发送下个保活包的时间间隔会从收发的最后一条数据的时刻开始重新从0计时。 2)keepaliveinterval:发送保活包后,没有收到对端的ack的超时时间默认为1秒。假设和对端的网络出问题了,给对端发送第1个保活包,1秒内没有收到对端的ack,则发第2个保活包,1秒内没有收到对端的保活包,再发送下一个保活包,.....,直到发送第10个保活包后,1秒钟还没收到ack回应,则达到发送10次保活包的探测次数上限,则认为网络出问题了。 3)probe探测次数:Windows系统上的探测次数被固定为10次,不可修改。
If a connection is dropped as the result of keep-alives the error code WSAENETRESET is returned to any calls in progress on the socket, and any subsequent calls will fail with WSAENOTCONN.
/**
* struct lws_context_creation_info - parameters to create context with
*
* This is also used to create vhosts.... if LWS_SERVER_OPTION_EXPLICIT_VHOSTS
* is not given, then for backwards compatibility one vhost is created at
* context-creation time using the info from this struct.
*
* If LWS_SERVER_OPTION_EXPLICIT_VHOSTS is given, then no vhosts are created
* at the same time as the context, they are expected to be created afterwards.
*
* @port: VHOST: Port to listen on... you can use CONTEXT_PORT_NO_LISTEN to
* suppress listening on any port, that's what you want if you are
* not running a websocket server at all but just using it as a
* client
* @iface: VHOST: NULL to bind the listen socket to all interfaces, or the
* interface name, eg, "eth2"
* If options specifies LWS_SERVER_OPTION_UNIX_SOCK, this member is
* the pathname of a UNIX domain socket. you can use the UNIX domain
* sockets in abstract namespace, by prepending an @ symbole to the
* socket name.
* @protocols: VHOST: Array of structures listing supported protocols and a protocol-
* specific callback for each one. The list is ended with an
* entry that has a NULL callback pointer.
* It's not const because we write the owning_server member
* @extensions: VHOST: NULL or array of lws_extension structs listing the
* extensions this context supports. If you configured with
* --without-extensions, you should give NULL here.
* @token_limits: CONTEXT: NULL or struct lws_token_limits pointer which is initialized
* with a token length limit for each possible WSI_TOKEN_***
* @ssl_cert_filepath: VHOST: If libwebsockets was compiled to use ssl, and you want
* to listen using SSL, set to the filepath to fetch the
* server cert from, otherwise NULL for unencrypted
* @ssl_private_key_filepath: VHOST: filepath to private key if wanting SSL mode;
* if this is set to NULL but sll_cert_filepath is set, the
* OPENSSL_CONTEXT_REQUIRES_PRIVATE_KEY callback is called
* to allow setting of the private key directly via openSSL
* library calls
* @ssl_ca_filepath: VHOST: CA certificate filepath or NULL
* @ssl_cipher_list: VHOST: List of valid ciphers to use (eg,
* "RC4-MD5:RC4-SHA:AES128-SHA:AES256-SHA:HIGH:!DSS:!aNULL"
* or you can leave it as NULL to get "DEFAULT"
* @http_proxy_address: VHOST: If non-NULL, attempts to proxy via the given address.
* If proxy auth is required, use format
* "username:password@server:port"
* @http_proxy_port: VHOST: If http_proxy_address was non-NULL, uses this port at
* the address
* @gid: CONTEXT: group id to change to after setting listen socket, or -1.
* @uid: CONTEXT: user id to change to after setting listen socket, or -1.
* @options: VHOST + CONTEXT: 0, or LWS_SERVER_OPTION_... bitfields
* @user: CONTEXT: optional user pointer that can be recovered via the context
* pointer using lws_context_user
* @ka_time: CONTEXT: 0 for no keepalive, otherwise apply this keepalive timeout to
* all libwebsocket sockets, client or server
* @ka_probes: CONTEXT: if ka_time was nonzero, after the timeout expires how many
* times to try to get a response from the peer before giving up
* and killing the connection
* @ka_interval: CONTEXT: if ka_time was nonzero, how long to wait before each ka_probes
* attempt
* @provided_client_ssl_ctx: CONTEXT: If non-null, swap out libwebsockets ssl
* implementation for the one provided by provided_ssl_ctx.
* Libwebsockets no longer is responsible for freeing the context
* if this option is selected.
* @max_http_header_data: CONTEXT: The max amount of header payload that can be handled
* in an http request (unrecognized header payload is dropped)
* @max_http_header_pool: CONTEXT: The max number of connections with http headers that
* can be processed simultaneously (the corresponding memory is
* allocated for the lifetime of the context). If the pool is
* busy new incoming connections must wait for accept until one
* becomes free.
* @count_threads: CONTEXT: how many contexts to create in an array, 0 = 1
* @fd_limit_per_thread: CONTEXT: nonzero means restrict each service thread to this
* many fds, 0 means the default which is divide the process fd
* limit by the number of threads.
* @timeout_secs: VHOST: various processes involving network roundtrips in the
* library are protected from hanging forever by timeouts. If
* nonzero, this member lets you set the timeout used in seconds.
* Otherwise a default timeout is used.
* @ecdh_curve: VHOST: if NULL, defaults to initializing server with "prime256v1"
* @vhost_name: VHOST: name of vhost, must match external DNS name used to
* access the site, like "warmcat.com" as it's used to match
* Host: header and / or SNI name for SSL.
* @plugin_dirs: CONTEXT: NULL, or NULL-terminated array of directories to
* scan for lws protocol plugins at context creation time
* @pvo: VHOST: pointer to optional linked list of per-vhost
* options made accessible to protocols
* @keepalive_timeout: VHOST: (default = 0 = 60s) seconds to allow remote
* client to hold on to an idle HTTP/1.1 connection
* @log_filepath: VHOST: filepath to append logs to... this is opened before
* any dropping of initial privileges
* @mounts: VHOST: optional linked list of mounts for this vhost
* @server_string: CONTEXT: string used in HTTP headers to identify server
* software, if NULL, "libwebsockets".
*/
struct lws_context_creation_info {
int port; /* VH */
const char *iface; /* VH */
const struct lws_protocols *protocols; /* VH */
const struct lws_extension *extensions; /* VH */
const struct lws_token_limits *token_limits; /* context */
const char *ssl_private_key_password; /* VH */
const char *ssl_cert_filepath; /* VH */
const char *ssl_private_key_filepath; /* VH */
const char *ssl_ca_filepath; /* VH */
const char *ssl_cipher_list; /* VH */
const char *http_proxy_address; /* VH */
unsigned int http_proxy_port; /* VH */
int gid; /* context */
int uid; /* context */
unsigned int options; /* VH + context */
void *user; /* context */
int ka_time; /* context */
int ka_probes; /* context */
int ka_interval; /* context */
#ifdef LWS_OPENSSL_SUPPORT
SSL_CTX *provided_client_ssl_ctx; /* context */
#else /* maintain structure layout either way */
void *provided_client_ssl_ctx;
#endif
short max_http_header_data; /* context */
short max_http_header_pool; /* context */
unsigned int count_threads; /* context */
unsigned int fd_limit_per_thread; /* context */
unsigned int timeout_secs; /* VH */
const char *ecdh_curve; /* VH */
const char *vhost_name; /* VH */
const char * const *plugin_dirs; /* context */
const struct lws_protocol_vhost_options *pvo; /* VH */
int keepalive_timeout; /* VH */
const char *log_filepath; /* VH */
const struct lws_http_mount *mounts; /* VH */
const char *server_string; /* context */
/* Add new things just above here ---^
* This is part of the ABI, don't needlessly break compatibility
*
* The below is to ensure later library versions with new
* members added above will see 0 (default) even if the app
* was not built against the newer headers.
*/
void *_unused[8];
};
static lws_context* CreateContext()
{
lws_set_log_level( 0xFF, NULL );
lws_context* plcContext = NULL;
lws_context_creation_info tCreateinfo;
memset(&tCreateinfo, 0, sizeof tCreateinfo);
tCreateinfo.port = CONTEXT_PORT_NO_LISTEN;
tCreateinfo.protocols = protocols;
tCreateinfo.ka_time = LWS_TCP_KEEPALIVE_TIME;
tCreateinfo.ka_interval = LWS_TCP_KEEPALIVE_INTERVAL;
tCreateinfo.ka_probes = LWS_TCP_KEEPALIVE_PROBES;
tCreateinfo.options = LWS_SERVER_OPTION_DISABLE_IPV6;
plcContext = lws_create_context(&tCreateinfo);
return plcContext;
}
LWS_VISIBLE int
lws_plat_set_socket_options(struct lws_vhost *vhost, lws_sockfd_type fd)
{
int optval = 1;
int optlen = sizeof(optval);
u_long optl = 1;
DWORD dwBytesRet;
struct tcp_keepalive alive;
int protonbr;
#ifndef _WIN32_WCE
struct protoent *tcp_proto;
#endif
if (vhost->ka_time) {
/* enable keepalive on this socket */
// 先调用setsockopt打开发送心跳包(设置)选项
optval = 1;
if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE,
(const char *)&optval, optlen) < 0)
return 1;
alive.onoff = TRUE;
alive.keepalivetime = vhost->ka_time*1000;
alive.keepaliveinterval = vhost->ka_interval*1000;
if (WSAIoctl(fd, SIO_KEEPALIVE_VALS, &alive, sizeof(alive),
NULL, 0, &dwBytesRet, NULL, NULL))
return 1;
}
/* Disable Nagle */
optval = 1;
#ifndef _WIN32_WCE
tcp_proto = getprotobyname("TCP");
if (!tcp_proto) {
lwsl_err("getprotobyname() failed with error %d\n", LWS_ERRNO);
return 1;
}
protonbr = tcp_proto->p_proto;
#else
protonbr = 6;
#endif
setsockopt(fd, protonbr, TCP_NODELAY, (const char *)&optval, optlen);
/* We are nonblocking... */
ioctlsocket(fd, FIONBIO, &optl);
return 0;
}
On a blocking socket, the return value indicates success or failure of the connection attempt.
With a nonblocking socket, the connection attempt cannot be completed immediately. In this case, connect will return SOCKET_ERROR, and WSAGetLastError will return WSAEWOULDBLOCK. In this case, there are three possible scenarios:
Use the select function to determine the completion of the connection request by checking to see if the socket is writeable.
select函数因为超时返回,会返回0;如果发生错误,则返回SOCKET_ERROR,所以判断时要判断select返回值,如果小于等于0,则是连接失败,立即将套接字关闭掉。 如果select返回值大于0,则该返回值是已经准备就绪的socket个数,比如连接成功的socket。我们判断套接字是否在可写集合writefds中,如果在该集合中,则表示连接成功。
bool ConnectDevice( char* pszIP, int nPort )
{
// 创建TCP套接字
SOCKET connSock = socket(AF_INET, SOCK_STREAM, 0);
if (connSock == INVALID_SOCKET)
{
return false;
}
// 填充IP和端口
SOCKADDR_IN devAddr;
memset(&devAddr, 0, sizeof(SOCKADDR_IN));
devAddr.sin_family = AF_INET;
devAddr.sin_port = htons(nPort);
devAddr.sin_addr.s_addr = inet_addr(pszIP);
// 将套接字设置为非阻塞式的,为下面的select做准备
unsigned long ulnoblock = 1;
ioctlsocket(connSock, FIONBIO, &ulnoblock);
// 发起connnect,该接口立即返回
connect(connSock, (sockaddr*)&devAddr, sizeof(devAddr));
FD_SET writefds;
FD_ZERO(&writefds);
FD_SET(connSock, &writefds);
// 设置连接超时时间为1秒
timeval tv;
tv.tv_sec = 1; //超时1s
tv.tv_usec = 0;
// The select function returns the total number of socket handles that are ready and contained
// in the fd_set structures, zero if the time limit expired, or SOCKET_ERROR(-1) if an error occurred.
if (select(0, NULL, &writefds, NULL, &tv) <= 0)
{
closesocket(connSock);
return false; //超时未连接上就退出
}
ulnoblock = 0;
ioctlsocket(connSock, FIONBIO, &ulnoblock);
closesocket(connSock);
return true;
}
原文:https://zhuanlan.zhihu.com/p/618044850
文章来源于网络,版权归原作者所有,如有侵权,请联系删除。
扫码,拉你进高质量嵌入式交流群 关注我【一起学嵌入式】,一起学习,一起成长。 觉得文章不错,点击“分享”、“赞”、“在看” 呗!