[dpdk-dev] [PATCH v6] vhost: fix crash on port deletion

Xia, Chenbo chenbo.xia at intel.com
Mon Sep 6 05:32:32 CEST 2021


Hi Gaoxiang,

> -----Original Message-----
> From: dev <dev-bounces at dpdk.org> On Behalf Of Xia, Chenbo
> Sent: Monday, September 6, 2021 11:18 AM
> To: Gaoxiang Liu <gaoxiangliu0 at 163.com>
> Cc: maxime.coquelin at redhat.com; dev at dpdk.org; liugaoxiang at huawei.com
> Subject: Re: [dpdk-dev] [PATCH v6] vhost: fix crash on port deletion
> 
> Hi Gaoxiang,
> 
> >
> >
> >From: Gaoxiang Liu <gaoxiangliu0 at 163.com>
> >Sent: Thursday, September 2, 2021 11:38 PM
> >To: Xia, Chenbo <chenbo.xia at intel.com>
> >Cc: maxime.coquelin at redhat.com; dev at dpdk.org; liugaoxiang at huawei.com
> >Subject: Re:RE: [PATCH v6] vhost: fix crash on port deletion
> >
> >
> >Hi chenbo,
> >why this is not moved up?
> >>> +		if (vsocket->is_server) {
> >>> +			close(vsocket->socket_fd);
> >>> +			unlink(path);
> >>>  		}
> >==>Because if this is moved up, and if deleting conn fd from fdsets failed,
> >it will arrive the "again" label, then close vsocket->socket_fd and uplink
> "path" again.
> >it's not necessary.
> >And closing socket_fd does not  trigger vhost_user_server_new_connection.
> 
> But same issue happens when you deleted 'vsocket->socket_fd' but failed to
> delete one
> of the conn_fd: you will goto again and try to delete socket_fd again and then
> loop
> forever. So anyway you need to prevent this from happening.

Please ignore this. I thought delete socket_fd again will return -1.
So I am fine with putting it later otherwise we have to introduce a flag to know if
Socket_fd deletion happened once.

Thanks,
Chenbo

> 
> Thanks,
> Chenbo
> 
> >
> >Thanks.
> >Gaoxiang
> 
> At 2021-08-31 13:37:38, "Xia, Chenbo" <mailto:chenbo.xia at intel.com> wrote:
> >Hi Gaoxiang,
> >
> >> -----Original Message-----
> >> From: Gaoxiang Liu <mailto:gaoxiangliu0 at 163.com>
> >> Sent: Friday, August 27, 2021 10:19 PM
> >> To: mailto:maxime.coquelin at redhat.com; Xia, Chenbo
> <mailto:chenbo.xia at intel.com>
> >> Cc: mailto:dev at dpdk.org; mailto:liugaoxiang at huawei.com; Gaoxiang Liu
> <mailto:gaoxiangliu0 at 163.com>
> >> Subject: [PATCH v6] vhost: fix crash on port deletion
> >>
> >> The rte_vhost_driver_unregister() and vhost_user_read_cb()
> >> can be called at the same time by 2 threads.
> >> when memory of vsocket is freed in rte_vhost_driver_unregister(),
> >> the invalid memory of vsocket is accessed in vhost_user_read_cb().
> >> It's a bug of both mode for vhost as server or client.
> >>
> >> E.g.,vhostuser port is created as server.
> >
> >Put a space after ','
> >
> >> Thread1 calls rte_vhost_driver_unregister().
> >> Before the listen fd is deleted from poll waiting fds,
> >> "vhost-events" thread then calls vhost_user_server_new_connection(),
> >> then a new conn fd is added in fdset when trying to reconnect.
> >> "vhost-events" thread then calls vhost_user_read_cb() and
> >> accesses invalid memory of socket while thread1 frees the memory of
> >> vsocket.
> >>
> >> E.g.,vhostuser port is created as client.
> >
> >Same here.
> >
> >> Thread1 calls rte_vhost_driver_unregister().
> >> Before vsocket of reconn is deleted from reconn list,
> >> "vhost_reconn" thread then calls vhost_user_add_connection()
> >> then a new conn fd is added in fdset when trying to reconnect.
> >> "vhost-events" thread then calls vhost_user_read_cb() and
> >> accesses invalid memory of socket while thread1 frees the memory of
> >> vsocket.
> >>
> >> The fix is to move the "fdset_try_del" in front of free memory of conn,
> >> then avoid the race condition.
> >>
> >> The core trace is:
> >> Program terminated with signal 11, Segmentation fault.
> >>
> >> Fixes: 52d874dc6705 ("vhost: fix crash on closing in client mode")
> >>
> >> Signed-off-by: Gaoxiang Liu <mailto:liugaoxiang at huawei.com>
> >> ---
> >>
> >> v2:
> >> * Fix coding style issues.
> >>
> >> v3:
> >> * Add detailed log.
> >>
> >> v4:
> >> * Add the reason, when vhostuser port is created as server.
> >>
> >> v5:
> >> * Add detailed log when vhostuser port is created as client
> >>
> >> v6:
> >> * Add 'path' check before deleting listen fd
> >> * Fix spelling issues
> >> ---
> >>  lib/vhost/socket.c | 108 ++++++++++++++++++++++-----------------------
> >>  1 file changed, 54 insertions(+), 54 deletions(-)
> >>
> >> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> >> index 5d0d728d5..27d5e8695 100644
> >> --- a/lib/vhost/socket.c
> >> +++ b/lib/vhost/socket.c
> >> @@ -1023,66 +1023,66 @@ rte_vhost_driver_unregister(const char *path)
> >>
> >>  	for (i = 0; i < vhost_user.vsocket_cnt; i++) {
> >>  		struct vhost_user_socket *vsocket = vhost_user.vsockets[i];
> >> +		if (strcmp(vsocket->path, path)) {
> >> +			continue;
> >> +		}
> >
> >braces {} are not necessary for single statement blocks
> >
> >>
> >> -		if (!strcmp(vsocket->path, path)) {
> >> -			pthread_mutex_lock(&vsocket->conn_mutex);
> >> -			for (conn = TAILQ_FIRST(&vsocket->conn_list);
> >> -			     conn != NULL;
> >> -			     conn = next) {
> >> -				next = TAILQ_NEXT(conn, next);
> >> -
> >> -				/*
> >> -				 * If r/wcb is executing, release vsocket's
> >> -				 * conn_mutex and vhost_user's mutex locks, and
> >> -				 * try again since the r/wcb may use the
> >> -				 * conn_mutex and mutex locks.
> >> -				 */
> >> -				if (fdset_try_del(&vhost_user.fdset,
> >> -						  conn->connfd) == -1) {
> >> -					pthread_mutex_unlock(
> >> -							&vsocket->conn_mutex);
> >> -					pthread_mutex_unlock(&vhost_user.mutex);
> >> -					goto again;
> >> -				}
> >> -
> >> -				VHOST_LOG_CONFIG(INFO,
> >> -					"free connfd = %d for device '%s'\n",
> >> -					conn->connfd, path);
> >> -				close(conn->connfd);
> >> -				vhost_destroy_device(conn->vid);
> >> -				TAILQ_REMOVE(&vsocket->conn_list, conn, next);
> >> -				free(conn);
> >> -			}
> >> -			pthread_mutex_unlock(&vsocket->conn_mutex);
> >> -
> >> -			if (vsocket->is_server) {
> >> -				/*
> >> -				 * If r/wcb is executing, release vhost_user's
> >> -				 * mutex lock, and try again since the r/wcb
> >> -				 * may use the mutex lock.
> >> -				 */
> >> -				if (fdset_try_del(&vhost_user.fdset,
> >> -						vsocket->socket_fd) == -1) {
> >> -					pthread_mutex_unlock(&vhost_user.mutex);
> >> -					goto again;
> >> -				}
> >> -
> >> -				close(vsocket->socket_fd);
> >> -				unlink(path);
> >> -			} else if (vsocket->reconnect) {
> >> -				vhost_user_remove_reconnect(vsocket);
> >> +		if (vsocket->is_server) {
> >> +			/*
> >> +			 * If r/wcb is executing, release vhost_user's
> >> +			 * mutex lock, and try again since the r/wcb
> >> +			 * may use the mutex lock.
> >> +			 */
> >> +			if (fdset_try_del(&vhost_user.fdset, vsocket->socket_fd) ==
> >> -1) {
> >> +				pthread_mutex_unlock(&vhost_user.mutex);
> >> +				goto again;
> >>  			}
> >> +		} else if (vsocket->reconnect) {
> >> +			vhost_user_remove_reconnect(vsocket);
> >> +		}
> >>
> >> -			pthread_mutex_destroy(&vsocket->conn_mutex);
> >> -			vhost_user_socket_mem_free(vsocket);
> >> +		pthread_mutex_lock(&vsocket->conn_mutex);
> >> +		for (conn = TAILQ_FIRST(&vsocket->conn_list);
> >> +			 conn != NULL;
> >> +			 conn = next) {
> >> +			next = TAILQ_NEXT(conn, next);
> >>
> >> -			count = --vhost_user.vsocket_cnt;
> >> -			vhost_user.vsockets[i] = vhost_user.vsockets[count];
> >> -			vhost_user.vsockets[count] = NULL;
> >> -			pthread_mutex_unlock(&vhost_user.mutex);
> >> +			/*
> >> +			 * If r/wcb is executing, release vsocket's
> >> +			 * conn_mutex and vhost_user's mutex locks, and
> >> +			 * try again since the r/wcb may use the
> >> +			 * conn_mutex and mutex locks.
> >> +			 */
> >> +			if (fdset_try_del(&vhost_user.fdset,
> >> +					  conn->connfd) == -1) {
> >> +				pthread_mutex_unlock(&vsocket->conn_mutex);
> >> +				pthread_mutex_unlock(&vhost_user.mutex);
> >> +				goto again;
> >> +			}
> >> +
> >> +			VHOST_LOG_CONFIG(INFO,
> >> +				"free connfd = %d for device '%s'\n",
> >> +				conn->connfd, path);
> >> +			close(conn->connfd);
> >> +			vhost_destroy_device(conn->vid);
> >> +			TAILQ_REMOVE(&vsocket->conn_list, conn, next);
> >> +			free(conn);
> >> +		}
> >> +		pthread_mutex_unlock(&vsocket->conn_mutex);
> >>
> >> -			return 0;
> >> +		if (vsocket->is_server) {
> >> +			close(vsocket->socket_fd);
> >> +			unlink(path);
> >>  		}
> >
> >I think you miss my comment in V5 of asking why this is not moved up after
> >fdset_try_del server socket fd.
> >
> >Thanks,
> >Chenbo
> >
> >> +
> >> +		pthread_mutex_destroy(&vsocket->conn_mutex);
> >> +		vhost_user_socket_mem_free(vsocket);
> >> +
> >> +		count = --vhost_user.vsocket_cnt;
> >> +		vhost_user.vsockets[i] = vhost_user.vsockets[count];
> >> +		vhost_user.vsockets[count] = NULL;
> >> +		pthread_mutex_unlock(&vhost_user.mutex);
> >> +		return 0;
> >>  	}
> >>  	pthread_mutex_unlock(&vhost_user.mutex);
> >>
> >> --
> >> 2.32.0
> >>
> >
> 
> 


More information about the dev mailing list