[dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.

Ilya Maximets i.maximets at samsung.com
Wed Jul 6 13:19:12 CEST 2016


On 01.07.2016 10:35, Yuanhan Liu wrote:
> Hi,
> 
> Sorry for the long delay.
> 
> On Fri, May 20, 2016 at 03:50:04PM +0300, Ilya Maximets wrote:
>> In the current implementation, a guest application can reinitialize
>> vrings by executing start after stop. At the same time, the host
>> application can still poll the virtqueue while the device is stopped
>> in the guest, and it will crash with a segmentation fault during vring
>> reinitialization because it dereferences bad descriptor addresses.
> 
> Yes, you are right that vring will be reinitialized after restart.
> But even so, I don't see why it would cause a vhost crash,
> since the reinitialization resets all the vring memory to 0:
> 
>     memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
> 
> That means those bad descriptors will be skipped, safely, at vhost
> side by:
> 
> 	if (unlikely(desc->len < dev->vhost_hlen))
> 		return -1;
> 
>>
>> OVS crash for example:
>> <------------------------------------------------------------------------>
>> [test-pmd inside guest VM]
>>
>> 	testpmd> port stop all
>> 	    Stopping ports...
>> 	    Checking link statuses...
>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>> 	    Done
>> 	testpmd> port config all rxq 2
>> 	testpmd> port config all txq 2
>> 	testpmd> port start all
>> 	    Configuring Port 0 (socket 0)
>> 	    Port 0: 52:54:00:CB:44:C8
>> 	    Checking link statuses...
>> 	    Port 0 Link Up - speed 10000 Mbps - full-duplex
>> 	    Done
>>
>> [OVS on host]
>> 	Program received signal SIGSEGV, Segmentation fault.
>> 	rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000) at rte_memcpy.h
> 
> Interesting, so it bypasses the above check since desc->len is non-zero
> while desc->addr is zero. The size (2056) also looks weird.
> 
> Would you mind checking this issue a bit deeper, say, why desc->addr is
> zero while desc->len is not?

OK. I checked this a few more times. Actually, I see that desc->addr is
not zero. The whole desc memory looks like rubbish:

<------------------------------------------------------------------------------>
(gdb)
#3 copy_desc_to_mbuf (mbuf_pool=0x7fe9da9f4480, desc_idx=65363,
                      m=0x7fe9db269400, vq=0x7fe9fff7bac0, dev=0x7fe9fff7cbc0)
        desc = 0x2aabc00ff530
        desc_addr = 0
        mbuf_offset = 0
        prev = 0x7fe9db269400
        nr_desc = 1
        desc_offset = 12
        cur = 0x7fe9db269400
        hdr = 0x0
        desc_avail = 1012591375
        mbuf_avail = 1526
        cpy_len = 1526

(gdb) p *desc
$2 = {addr = 8507655620301055744, len = 1012591387, flags = 3845, next = 48516}

<------------------------------------------------------------------------------>

And 'desc_addr' equals zero because 'gpa_to_vva' just can't map this huge
guest address to a host address.
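
To illustrate why the translation ends up as zero, here is a minimal sketch
of a guest-physical to host-virtual lookup. This is not the actual vhost
code: 'struct mem_region', 'translate_gpa' and 'check_desc_addr' are
hypothetical names, and the single region used in the usage note is an
assumption. It only shows that an address outside every known region yields
0, so the caller has to check the result before copying from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one guest memory region known to the host. */
struct mem_region {
    uint64_t guest_phys_start;  /* first guest-physical address covered */
    uint64_t size;              /* length of the region in bytes */
    uint64_t host_virt_start;   /* where the region is mapped in the host */
};

/* Hypothetical lookup: host address for gpa, or 0 if no region covers it. */
static uint64_t
translate_gpa(const struct mem_region *regions, size_t nregions, uint64_t gpa)
{
    assert(regions != NULL || nregions == 0);

    for (size_t i = 0; i < nregions; i++) {
        const struct mem_region *r = &regions[i];

        /* Subtract first so the range check cannot overflow. */
        if (gpa >= r->guest_phys_start && gpa - r->guest_phys_start < r->size)
            return r->host_virt_start + (gpa - r->guest_phys_start);
    }
    return 0;
}

/* Returns 0 and fills *host_addr if usable, -1 if the descriptor is bad. */
static int
check_desc_addr(const struct mem_region *regions, size_t nregions,
                uint64_t desc_gpa, uint64_t *host_addr)
{
    uint64_t hva = translate_gpa(regions, nregions, desc_gpa);

    if (hva == 0)
        return -1;  /* unmapped guest address: skip instead of copying */
    *host_addr = hva;
    return 0;
}
```

With one assumed 1 GB region starting at guest address 0, the 'addr' value
from the gdb output above (8507655620301055744) falls outside the region,
so the check returns -1 instead of letting the copy run from a near-zero
source pointer.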

The scenario was the same: SIGSEGV was received right after 'port start all'.

Another thought:

	Actually, there is a race window between the 'memset' in the guest and
	the reading of 'desc->len' and 'desc->addr' on the host. So, it's
	possible to read a non-zero 'len' and a zero 'addr' right after that.
	But you're right, this case should be very rare.

> 
>> 	(gdb) bt
>> 	    #0  rte_memcpy (n=2056, src=0xc, dst=0x7ff4d5247000)
>> 	    #1  copy_desc_to_mbuf
>> 	    #2  rte_vhost_dequeue_burst
>> 	    #3  netdev_dpdk_vhost_rxq_recv
>> 	    ...
>>
>> 	(gdb) bt full
>> 	    #0  rte_memcpy
>> 	        ...
>> 	    #1  copy_desc_to_mbuf
>> 	        desc_addr = 0
>> 	        mbuf_offset = 0
>> 	        desc_offset = 12
>> 	        ...
>> <------------------------------------------------------------------------>
>>
>> Fix that by checking addresses of descriptors before using them.
>>
>> Note: For mergeable buffers this patch checks only the guest's address
>> for zero, but in the non-mergeable case the host's address is checked.
>> This is done because checking the host's address in the mergeable case
>> requires additional refactoring to keep the virtqueue in a consistent
>> state in case of error.
>>
>> Signed-off-by: Ilya Maximets <i.maximets at samsung.com>
>> ---
>>
>> Actually, the current virtio implementation looks broken to me, because
>> 'virtio_dev_start' breaks the virtqueue while it is still available from
>> the vhost side.
> 
> Yes, this sounds buggy. Maybe we could avoid resetting the avail idx; in
> that case vhost dequeue/enqueue will just return, as there are no more
> packets to dequeue and no more space to enqueue, respectively?

Maybe this will be a good fix for virtio, because vhost will not try to
receive from wrong descriptors. But this will not help if vhost is already
trying to receive something at the time of the guest's reconfiguration.
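
For reference, the effect of keeping or resetting the avail idx can be
shown with the usual 16-bit ring arithmetic. This is a simplified sketch
under the assumption that the host derives the number of pending entries
from the difference of two free-running 16-bit counters;
'pending_entries' is a hypothetical name, not a vhost function.

```c
#include <stdint.h>

/*
 * Number of entries the host believes are pending, computed modulo 2^16
 * from the guest's avail idx and the host's own last-used counter.
 */
static uint16_t
pending_entries(uint16_t avail_idx, uint16_t last_used_idx)
{
    return (uint16_t)(avail_idx - last_used_idx);
}
```

If the avail idx is left untouched, for example at 300 while the host's
counter is also 300, the result is 0 and the host simply returns. If the
guest resets the avail idx to 0 while the host's counter is still 300, the
result wraps to 65236 and the host walks entries that were never posted.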

Best regards, Ilya Maximets.

