[dpdk-dev] [1/5] vhost: enforce avail index and desc read ordering

Michael S. Tsirkin mst at redhat.com
Thu Dec 6 14:48:14 CET 2018


On Thu, Dec 06, 2018 at 12:17:38PM +0800, Jason Wang wrote:
> 
> On 2018/12/5 7:30 PM, Ilya Maximets wrote:
> > On 05.12.2018 12:49, Maxime Coquelin wrote:
> > > A read barrier is required to ensure the ordering between
> > > available index and the descriptor reads is enforced.
> > > 
> > > Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")
> > > Cc: stable at dpdk.org
> > > 
> > > Reported-by: Jason Wang <jasowang at redhat.com>
> > > Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> > > ---
> > >   lib/librte_vhost/virtio_net.c | 12 ++++++++++++
> > >   1 file changed, 12 insertions(+)
> > > 
> > > diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> > > index 5e1a1a727..f11ebb54f 100644
> > > --- a/lib/librte_vhost/virtio_net.c
> > > +++ b/lib/librte_vhost/virtio_net.c
> > > @@ -791,6 +791,12 @@ virtio_dev_rx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
> > >   	rte_prefetch0(&vq->avail->ring[vq->last_avail_idx & (vq->size - 1)]);
> > >   	avail_head = *((volatile uint16_t *)&vq->avail->idx);
> > > +	/*
> > > +	 * The ordering between avail index and
> > > +	 * desc reads needs to be enforced.
> > > +	 */
> > > +	rte_smp_rmb();
> > > +
> > Hmm. This looks weird to me.
> > Could you please describe the bad scenario here? (It'll be good to have it
> > in commit message too)
> > 
> > As I understand it, you're enforcing the read of avail->idx to happen
> > before reading the avail->ring[avail_idx]. Is that correct?
> > 
> > But we have the following code sequence:
> > 
> > 1. read avail->idx (avail_head).
> > 2. check that last_avail_idx != avail_head.
> > 3. read from the ring using last_avail_idx.
> > 
> > So, there is a strict dependency between all 3 steps and the memory
> > transaction will be finished at the step #2 in any case. There is no
> > way to read the ring before reading the avail->idx.
> > 
> > Am I missing something?
> 
> 
> Nope, I kind of get what you mean now. And even if we go on to
> 
> 4. read the descriptor from the descriptor ring using the id read in 3
> 
> 5. read the descriptor content according to the address from 4
> 
> these are still dependent memory accesses. So there's no need for rmb.

I am pretty sure a barrier is needed here on some architectures.
This is only a control dependency, not an address dependency, since
avail_head is not used as an index, and reads can be speculated. So the
read from the ring can be speculated and executed before the read of
avail_head and the check.

However, SMP rmb is/should be free on x86. So unless someone on this
thread is actually testing performance on non-x86, you are both wasting
cycles discussing the removal of nop macros and risking pushing untested
software on users.
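
To make the ordering concrete, here is a minimal standalone sketch of the
pattern, not the vhost code itself. The names toy_avail, toy_publish,
toy_consume and RING_SIZE are hypothetical, and a C11 acquire fence stands
in for rte_smp_rmb() as an assumption. The point is that the check on
avail_head only gates the ring read by control flow, so the fence is what
orders the two loads on weakly ordered CPUs.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8 /* hypothetical size, must be a power of two */

/* Toy stand-in for the avail ring; not the real vring layout. */
struct toy_avail {
	_Atomic uint16_t idx;
	uint16_t ring[RING_SIZE];
};

/* Producer (guest side in the real thing): fill the slot, then publish. */
static void toy_publish(struct toy_avail *a, uint16_t desc_id)
{
	uint16_t idx = atomic_load_explicit(&a->idx, memory_order_relaxed);

	a->ring[idx & (RING_SIZE - 1)] = desc_id;
	/* Release: the slot write becomes visible before the new index. */
	atomic_store_explicit(&a->idx, (uint16_t)(idx + 1),
			      memory_order_release);
}

/* Consumer: returns 1 and stores a descriptor id in *out, or 0 if empty. */
static int toy_consume(struct toy_avail *a, uint16_t *last_avail_idx,
		       uint16_t *out)
{
	uint16_t avail_head =
		atomic_load_explicit(&a->idx, memory_order_relaxed);

	if (*last_avail_idx == avail_head)
		return 0;

	/*
	 * avail_head is only compared, never used as an address, so the
	 * ring read below has just a control dependency on it. The fence
	 * (assumed analogue of rte_smp_rmb()) keeps the ring read from
	 * being satisfied before the index read.
	 */
	atomic_thread_fence(memory_order_acquire);

	*out = a->ring[*last_avail_idx & (RING_SIZE - 1)];
	(*last_avail_idx)++;
	return 1;
}
```

On x86 the acquire fence typically compiles to no instruction and acts only
as a compiler barrier, which is why the cost argument only matters on
weakly ordered targets.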


> 
> > 
> > >   	for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {
> > >   		uint32_t pkt_len = pkts[pkt_idx]->pkt_len + dev->vhost_hlen;
> > >   		uint16_t nr_vec = 0;
> > > @@ -1373,6 +1379,12 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
> > >   	if (free_entries == 0)
> > >   		return 0;
> > > +	/*
> > > +	 * The ordering between avail index and
> > > +	 * desc reads needs to be enforced.
> > > +	 */
> > > +	rte_smp_rmb();
> > > +
> > This one is strange too.
> > 
> > 	free_entries = *((volatile uint16_t *)&vq->avail->idx) -
> > 			vq->last_avail_idx;
> > 	if (free_entries == 0)
> > 		return 0;
> > 
> > The code reads the value of avail->idx and uses the value on the next
> > line even with any compiler optimizations. There is no way for CPU to
> > postpone the actual read.
> 
> 
> Yes.
> 
> Thanks
> 
> 
> > 
> > >   	VHOST_LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
> > >   	count = RTE_MIN(count, MAX_PKT_BURST);
> > > 

