[dpdk-dev] [PATCH] vhost: fix connect hang in client mode
Ilya Maximets
i.maximets at samsung.com
Thu Jul 21 14:10:15 CEST 2016
On 21.07.2016 14:40, Yuanhan Liu wrote:
> On Thu, Jul 21, 2016 at 02:14:59PM +0300, Ilya Maximets wrote:
>>> Hmm, how about this fixup:
>>> ------------------------------------------------------------------------------
>>> diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
>>> index 8626d13..b0f45e6 100644
>>> --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
>>> +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
>>> @@ -537,18 +537,7 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz)
>>> errno = EINVAL;
>>>
>>> ret = connect(fd, un, sz);
>>> - if (ret == -1 && errno != EINPROGRESS)
>>> - return -1;
>>> - if (ret == 0)
>>> - goto connected;
>>> -
>>> - FD_ZERO(&fdset);
>>> - FD_SET(fd, &fdset);
>>> -
>>> - ret = select(fd + 1, NULL, &fdset, NULL, &tv);
>>> - if (!ret)
>>> - errno = ETIMEDOUT;
>>> - if (ret != 1)
>>> + if (ret < 0 && errno != EISCONN)
>>> return -1;
>>>
>>> ret = getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len);
>>> @@ -558,7 +547,6 @@ vhost_user_connect_nonblock(int fd, struct sockaddr *un, size_t sz)
>>> return -1;
>>> }
>>>
>>> -connected:
>>> flags = fcntl(fd, F_GETFL, 0);
>>> if (flags < 0) {
>>> RTE_LOG(ERR, VHOST_CONFIG,
>>> ------------------------------------------------------------------------------
>>> ?
>>>
>>> We will not check for EINPROGRESS, but a subsequent 'connect()' will return
>>> EISCONN if the connection is already established. getsockopt() is kept just in
>>> case. The subsequent 'connect()' will happen on the next iteration of the
>>> reconnection cycle (1 second sleep).
>>
>> I've sent v2 with these changes.
>
> Thanks. But still, it doesn't look clean to me. I was thinking the following
> might be cleaner?
>
> diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c
> b/lib/librte_vhost/vhost_user/vhost-net-user.
> index f0f92f8..c0ef290 100644
> --- a/lib/librte_vhost/vhost_user/vhost-net-user.c
> +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
> @@ -532,6 +532,10 @@ vhost_user_client_reconnect(void *arg __rte_unused)
> reconn != NULL; reconn = next) {
> next = TAILQ_NEXT(reconn, next);
>
> + if (reconn->conn_inprogress) {
> + /* do connect check here */
> + }
> +
> if (connect(reconn->fd, (struct sockaddr *)&reconn->un,
> sizeof(reconn->un)) < 0)
> continue;
> @@ -605,6 +609,7 @@ vhost_user_create_client(struct vhost_user_socket *vsocket)
> reconn->un = un;
> reconn->fd = fd;
> reconn->vsocket = vsocket;
> + reconn->conn_inprogress = errno == EINPROGRESS;
> pthread_mutex_lock(&reconn_list.mutex);
> TAILQ_INSERT_TAIL(&reconn_list.head, reconn, next);
> pthread_mutex_unlock(&reconn_list.mutex);
>
> It's just a rough diff, hopefully it shows my idea clearly. And of
> course, we should not call connect() anymore when conn_inprogress
> is set.
>
> What do you think of it?

I found that we can't check the connection status without calling
select/poll on the socket. 'getsockopt()' returns 0 with no error
reported when the connection is not yet established, exactly as it
does when it is.

So I think the first version of this patch is the only
acceptable solution.
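
For reference, a minimal sketch of the select/poll-based check described
above. It is not the code from the patch: the helper name
wait_connect_done() and the timeout_ms parameter are made up for
illustration, and it assumes 'fd' is a socket on which a non-blocking
connect() has already been issued. It waits for the socket to become
writable and only then reads SO_ERROR, since SO_ERROR alone does not
distinguish "still connecting" from "connected".

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of DPDK): wait up to timeout_ms for a
 * non-blocking connect() on 'fd' to complete.
 * Returns 0 if the connection is established, -1 with errno set otherwise.
 */
static int
wait_connect_done(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	int so_error = 0;
	socklen_t len = sizeof(so_error);
	int ret;

	/* A pending connect() completes when the socket becomes writable. */
	ret = poll(&pfd, 1, timeout_ms);
	if (ret == 0) {
		/* Not writable yet: the connection is still in progress. */
		errno = ETIMEDOUT;
		return -1;
	}
	if (ret < 0)
		return -1;

	/* Only now does SO_ERROR tell us whether connect() succeeded. */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) < 0)
		return -1;
	if (so_error != 0) {
		errno = so_error;
		return -1;
	}

	return 0;
}
```

With a timeout of 0 the helper only polls once and returns immediately,
so it could in principle be called from a periodic reconnection loop
without blocking it.
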
Best regards, Ilya Maximets.