[dpdk-dev] vhost: no protection against malformed queue descriptors in rte_vhost_dequeue_burst()
Patrik Andersson R
patrik.r.andersson at ericsson.com
Thu Mar 17 07:37:16 CET 2016
Hi Huawei,
thank you for the quick response and for the pointer to the 16.04-rc1
version. Nice!
I think it would be great also to have a sanity check on the gpa_to_vva().
Although nothing recent has hit it we had some problems in that area
in the past.
Regards,
Patrik
On 03/17/2016 02:35 AM, Xie, Huawei wrote:
> On 3/16/2016 8:53 PM, Patrik Andersson R wrote:
>> Hello,
>>
>> When taking a snapshot of a running VM instance, using OpenStack
>> "nova image-create", I noticed that one OVS pmd-thread eventually
>> failed in DPDK rte_vhost_dequeue_burst() with repeating log entries:
>>
>> compute-0-6 ovs-vswitchd[38172]: VHOST_DATA: Failed to allocate
>> memory for mbuf.
>>
>>
>> Debugging (data included further down) this issue lead to the
>> observation that there is no protection against malformed vhost
>> queue descriptors, thus tenant separation might be violated as a
>> single faulty VM might bring down the connectivity of all VMs
>> connected to the same virtual switch.
>>
>> To avoid this, validation would be needed at some points in the
>> rte_vhost_dequeue_burst() code:
>>
>> 1) when the queue descriptor is picked up for processing,
>> desc->flags and desc->len might both be 0
>>
>> ...
>> desc = &vq->desc[head[entry_success]];
>> ...
>> /* Discard first buffer as it is the virtio header */
>> if (desc->flags & VRING_DESC_F_NEXT) {
>> desc = &vq->desc[desc->next];
>> vb_offset = 0;
>> vb_avail = desc->len;
>> } else {
>> vb_offset = vq->vhost_hlen;
>> vb_avail = desc->len - vb_offset;
>> }
>> ....
>>
>> 2) at buffer address translation gpa_to_vva(), might fail
>> returning NULL as indication
>>
>> vb_addr = gpa_to_vva(dev, desc->addr);
>> ...
>> while (cpy_len != 0) {
>> rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, seg_offset),
>> (void *)((uintptr_t)(vb_addr + vb_offset)),
>> cpy_len);
>> ...
>> }
>> ...
>>
>>
>> Wondering if there are any plans of adding any kind of validation in
>> DPDK, or if it would be useful to suggest specific implementation of
>> such validations in the DPDK code?
>>
>> Or is there some mechanism that gives us the confidence to trust
>> the vhost queue content absolutely?
>>
>>
>>
>> Debugging data:
>>
>> For my scenario the problem occurs in DPDK rte_vhost_dequeue_burst()
>> due to use of a vhost queue descriptor that has all fields 0:
>>
>> (gdb) print *desc
>> {addr = 0, len = 0, flags = 0, next = 0}
>>
>>
>> Subsequent use of desc->len to compute vb_avail = desc->len - vb_offset,
>> leads to the problem observed. What happens is that the packet needs to
>> be segmented -- on my system it fails roughly at segment 122000 when
>> memory available for mbufs run out.
>>
>> The relevant local variables for rte_vhost_dequeue_burst() when breaking
>> on the condition desc->len == 0:
>>
>> vb_avail = 4294967284 (0xfffffff4)
>> seg_avail = 2608
>> vb_offset = 12
>> cpy_len = 2608
>> seg_num = 1
>> desc = 0x2aadb6e5c000
>> vb_addr = 46928960159744
>> entry_success = 0
>>
>> Note also that there is no crash despite to the desc->addr being zero,
>> it is a valid address in the regions mapped to the device. Although, the
>> 3 regions mapped does not seem to be correct either at this stage.
>>
>>
>> The versions that I'm running are OVS 2.4.0, with corrections from the
>> 2.4 branch, and DPDK 2.1.0. QEMU emulator version 2.2.0 and
>> libvirt version 1.2.12.
>>
>>
>> Regards,
>>
>> Patrik
> Thanks Patrik. You are right. We had planned to enhance the robustness
> of vhost so that neither malicious nor buggy guest virtio driver could
> corrupt vhost. Actually the 16.04 RC1 has fixed some issues (the return
> of gpa_to_vva isn't checked).
>
More information about the dev
mailing list