[dpdk-dev] vhost: no protection against malformed queue descriptors in rte_vhost_dequeue_burst()

Patrik Andersson R patrik.r.andersson at ericsson.com
Thu Mar 17 07:37:16 CET 2016


Hi Huawei,

thank you for the quick response and for the pointer to  the 16.04-rc1
version. Nice!

I think it would be great also to have a sanity check on the gpa_to_vva().
Although nothing recent has hit it we had some problems in that area
in the past.

Regards,

Patrik

On 03/17/2016 02:35 AM, Xie, Huawei wrote:
> On 3/16/2016 8:53 PM, Patrik Andersson R wrote:
>> Hello,
>>
>> When taking a snapshot of a running VM instance, using OpenStack
>> "nova image-create", I noticed that one OVS pmd-thread eventually
>> failed in DPDK rte_vhost_dequeue_burst() with repeating log entries:
>>
>>     compute-0-6 ovs-vswitchd[38172]: VHOST_DATA: Failed to allocate
>> memory for mbuf.
>>
>>
>> Debugging (data included further down) this issue lead to the
>> observation that there is no protection against malformed vhost
>> queue descriptors, thus tenant separation might be violated as a
>> single faulty VM might bring down the connectivity of all VMs
>> connected to the same virtual switch.
>>
>> To avoid this, validation would be needed at some points in the
>> rte_vhost_dequeue_burst() code:
>>
>>    1) when the queue descriptor is picked up for processing,
>>        desc->flags and desc->len might both be 0
>>
>>         ...
>>         desc = &vq->desc[head[entry_success]];
>>         ...
>>         /* Discard first buffer as it is the virtio header */
>>         if (desc->flags & VRING_DESC_F_NEXT) {
>>              desc = &vq->desc[desc->next];
>>              vb_offset = 0;
>>              vb_avail = desc->len;
>>         } else {
>>              vb_offset = vq->vhost_hlen;
>>              vb_avail = desc->len - vb_offset;
>>         }
>>          ....
>>
>>    2) at buffer address translation gpa_to_vva(), might fail
>>        returning NULL as indication
>>
>>         vb_addr = gpa_to_vva(dev, desc->addr);
>>         ...
>>         while (cpy_len != 0) {
>>              rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, seg_offset),
>>                  (void *)((uintptr_t)(vb_addr + vb_offset)),
>>                  cpy_len);
>>         ...
>>         }
>>         ...
>>
>>
>> Wondering if there are any plans of adding any kind of validation in
>> DPDK, or if it would be useful to suggest specific implementation of
>> such validations in the DPDK code?
>>
>> Or is there some mechanism that gives us the confidence to trust
>> the vhost queue content absolutely?
>>
>>
>>
>> Debugging data:
>>
>> For my scenario the problem occurs in DPDK rte_vhost_dequeue_burst()
>> due to use of a vhost queue descriptor that has all fields 0:
>>
>>    (gdb) print *desc
>>             {addr = 0, len = 0, flags = 0, next = 0}
>>
>>
>> Subsequent use of desc->len to compute vb_avail = desc->len - vb_offset,
>> leads to the problem observed. What happens is that the packet needs to
>> be segmented -- on my system it fails roughly at segment 122000 when
>> memory available for mbufs run out.
>>
>> The relevant local variables for rte_vhost_dequeue_burst() when breaking
>> on the condition desc->len == 0:
>>
>>     vb_avail = 4294967284  (0xfffffff4)
>>     seg_avail = 2608
>>     vb_offset = 12
>>     cpy_len = 2608
>>     seg_num = 1
>>     desc = 0x2aadb6e5c000
>>     vb_addr = 46928960159744
>>     entry_success = 0
>>
>> Note also that there is no crash despite to the desc->addr being zero,
>> it is a valid address in the regions mapped to the device. Although, the
>> 3 regions mapped does not seem to be correct either at this stage.
>>
>>
>> The versions that I'm running are OVS 2.4.0, with corrections from the
>> 2.4 branch, and DPDK 2.1.0. QEMU emulator version 2.2.0 and
>> libvirt version 1.2.12.
>>
>>
>> Regards,
>>
>> Patrik
> Thanks Patrik. You are right. We had planned to enhance the robustness
> of vhost so that neither malicious nor buggy guest virtio driver could
> corrupt vhost. Actually the 16.04 RC1 has fixed some issues (the return
> of gpa_to_vva isn't checked).
>



More information about the dev mailing list