[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

Tan, Jianfeng jianfeng.tan at intel.com
Mon Jun 6 12:50:08 CEST 2016


Hi,


On 6/6/2016 5:30 PM, Tetsuya Mukawa wrote:
> On 2016/06/06 17:49, Yuanhan Liu wrote:
>> On Mon, Jun 06, 2016 at 05:33:31PM +0900, Tetsuya Mukawa wrote:
>>>>> [My solution]
>>>>> - Pros
>>>>> The basic principle of my implementation is not to reinvent the wheel.
>>>> Yes, that's a good point. However, it's not as hard as we might have
>>>> thought at first: the tough part, dequeuing/enqueuing packets from/to
>>>> the vring, is actually offloaded to DPDK vhost-user. That means we
>>>> only need to re-implement the control path of the virtio-net device,
>>>> plus the vhost-user frontend. If you take a detailed look at your
>>>> patchset as well as Jianfeng's, you might find that the two patchsets
>>>> are actually about the same code size.
>>> Yes, I know this.
>>> So far, the amount of code is almost the same, but in the future we may
>>> need to implement more if the virtio-net specification is revised.
>> It didn't take too much effort to implement from scratch, and I doubt it
>> will for future revisions. Besides, the virtio-net spec is unlikely to be
>> revised, or to be precise, unlikely to be revised often. Therefore, I
>> don't see big issues here.
>>
>>>>> We can use QEMU's virtio-net device implementation, which means we
>>>>> don't need to maintain a virtio-net device ourselves, and we can use
>>>>> all of the functions supported by the QEMU virtio-net device.
>>>>> - Cons
>>>>> Need to invoke a QEMU process.
>>>> Another thing is that it makes usage a bit harder: look at the
>>>> long QEMU CLI options in your example usage. It also has some traps;
>>>> say, "--enable-kvm" is not allowed, even though it is an option
>>>> commonly used by default with QEMU.
>>> A shell script would probably help the users.
>> Yeah, that would help. But if we have the choice to make it simpler from
>> the beginning, why not? :-)
>>
>>>> And judging that it doesn't actually take too much effort to implement
>>>> a virtio device emulation, I'd slightly prefer it. I guess something
>>>> lightweight and easier to use is more important here.
>>> This is a very important point.
>>> If so, we won't need much effort when the virtio spec is changed.
>> I'd assume so.
>>
>>>> Actually, I foresee another benefit of adding virtio-user device
>>>> emulation: we might now be able to add a rte_vhost_dequeue/enqueue_burst()
>>>> unit test case. We simply couldn't do that before, since we depended on
>>>> QEMU for testing, which is not acceptable for a unit test case. Making it
>>>> a unit test case would help us spot, easily and automatically, any bad
>>>> changes that would introduce bugs.
>>> As you mentioned above, the QEMU process is not involved in
>>> dequeuing/enqueuing.
>>> So I guess we could have a test for rte_vhost_dequeue/enqueue_burst()
>>> regardless of the choice.
>> Yes, we don't need QEMU for the dequeue/enqueue part, but we did need the
>> vhost-user initialization part from QEMU's vhost-user. Now that we have a
>> vhost-user frontend from virtio-user, we have no dependency on QEMU any more.
>>
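(To illustrate the idea: with both ends living in one process, such a test
could boil down to a few calls. The below is only a minimal sketch; it
assumes the "virtio_user0,path=..." vdev syntax from the virtio-user
patchset and the vid-based vhost API, and it omits the vhost session
thread, the new_device() callback, and all error handling.)

    #include <rte_dev.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>
    #include <rte_virtio_net.h>

    static void
    test_setup(void)
    {
        /* Both ends live in the same process. */
        rte_vhost_driver_register("/tmp/vhost-test.sock");       /* backend */
        rte_eal_vdev_init("virtio_user0",
                          "path=/tmp/vhost-test.sock");          /* frontend */
        /* A separate thread still has to run
         * rte_vhost_driver_session_start() (omitted here). */
    }

    /* Called once the connection is up and the new_device() callback
     * has delivered "vid"; port_id is the virtio_user0 ethdev port. */
    static int
    test_loopback(int vid, uint8_t port_id, struct rte_mbuf *pkt)
    {
        struct rte_mbuf *rx_pkts[32];
        uint16_t sent, nb_rx;

        /* Backend -> frontend: enqueue one mbuf on the guest RX ring... */
        sent = rte_vhost_enqueue_burst(vid, VIRTIO_RXQ, &pkt, 1);
        /* ...and expect it back on the virtio-user port's RX queue. */
        nb_rx = rte_eth_rx_burst(port_id, 0, rx_pkts, 32);

        return (sent == 1 && nb_rx == 1) ? 0 : -1;
    }

So the whole thing could run as a normal unit test, with no external
process at all.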
>>>>> Anyway, we can choose one of the following:
>>>>> 1. Take advantage of invoking fewer processes.
>>>>> 2. Take advantage of the maintainability of the virtio-net device.
>>> If the container usage that DPDK assumes is to invoke hundreds of
>>> containers on one host,
>> I barely know about containers, but I would assume that's not rare.
> Hi Yuanhan,
>
> It's great to hear that it's not so hard to maintain Jianfeng's virtio-net
> device features.
>
> Please let me make sure how we can invoke many DPDK applications in
> hundreds of containers.
> (Do we have a way to do this? Or will we have one in the future?)

Just to add some options here: we cannot say no to that kind of use case.
To have many instances, we can:

(1) add a "cpu share" restriction on each instance, relying on the kernel
to schedule them.
(2) enable interrupt mode, so that an instance can go to sleep when it has
no packets to receive, and be woken up by the vhost backend when packets
arrive.

Option 2 is my choice; roughly, the per-queue receive loop would then look
like the sketch below.
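
(A minimal sketch only, along the lines of the l3fwd-power example: it
assumes the PMD supports RX interrupts and a single RX queue; the burst
size is an arbitrary placeholder and error handling is omitted.)

    #include <rte_ethdev.h>
    #include <rte_interrupts.h>

    #define BURST 32

    static void
    rx_loop(uint8_t port_id, uint16_t queue_id)
    {
        struct rte_epoll_event ev;
        struct rte_mbuf *pkts[BURST];
        uint16_t nb_rx;

        /* Hook this queue's interrupt into the per-thread epoll fd. */
        rte_eth_dev_rx_intr_ctl_q(port_id, queue_id,
                RTE_EPOLL_PER_THREAD, RTE_INTR_EVENT_ADD, NULL);

        for (;;) {
            nb_rx = rte_eth_rx_burst(port_id, queue_id, pkts, BURST);
            if (nb_rx == 0) {
                /* Nothing to do: arm the interrupt and sleep until
                 * the vhost backend kicks us. (A real loop should
                 * poll once more after arming, to avoid missing
                 * packets that arrived in between.) */
                rte_eth_dev_rx_intr_enable(port_id, queue_id);
                rte_epoll_wait(RTE_EPOLL_PER_THREAD, &ev, 1, -1);
                rte_eth_dev_rx_intr_disable(port_id, queue_id);
                continue;
            }
            /* ... process nb_rx packets ... */
        }
    }

With hundreds of instances, the cores are then shared naturally: an idle
instance consumes (almost) no CPU until the backend kicks its eventfd.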

Thanks,
Jianfeng

>
> Thanks,
> Tetsuya



