[dpdk-dev] virtio "how to restart applications" - //dpdk.org/doc/virtio-net-pmd
Gopakumar Choorakkot Edakkunni
gopakumar.c.e at gmail.com
Sat Mar 18 22:32:35 CET 2017
Hi Yuan,
As a "hack"/"workaround", in rte_eal_init(), if I can call vtpci_reset()
just before rte_eal_memory_init(), that should take care of the problem of
host zeroing out hugepages right ? As of today vtpci_reset() is called in
rte_eal_dev_init() which comes *after* rte_eal_memory_init()
Rgds,
Gopa.
On Thu, Mar 16, 2017 at 10:50 PM, Gopakumar Choorakkot Edakkunni <
gopakumar.c.e at gmail.com> wrote:
> Thanks again Yuanhan, you are the true expert!!
>
> Rgds,
> Gopa.
>
> On Thu, Mar 16, 2017 at 10:40 PM, Yuanhan Liu <yuanhan.liu at linux.intel.com
> > wrote:
>
>> On Thu, Mar 16, 2017 at 10:30:09PM -0700, Gopakumar Choorakkot Edakkunni
>> wrote:
>> > Thanks for the confirmation, glad I reached the person who knows the
>> nuts and
>> > bolts of virtio :-). So if the host is not in our control (ie if I am
>> just
>> > running as a VM on host provided by thirdparty vendor), is there any
>> workaround
>> > I can do from the guest side to prevent problems from happening on a
>> guest
>> > restart ?
>>
>> Not too much. You might want to hack the guest DPDK EAL memory initiation
>> part though, to not reset the hugepage memory on start. But that's too
>> hacky
>> that I will not recommend you to do so!
>>
>> > And if theres no workarounds at all and the host has to change, instead
>> of
>> > asking the third party vendor to do a wholesale upgrade to 16.04, is
>> there one/
>> > few commits that can be added to the host ovs-dpdk to take care of this
>> guest
>> > restart virtio-reset-before opening case ?
>>
>> Yes, backporting the commits I have mentioned should be able to fix it.
>> But please note that I did some code refactorings before those fixes: it
>> won't apply cleanly to DPDK v2.2.
>>
>> And if you want to upgrade, I'd suggest to upgrade to v16.11, which is
>> LTS release.
>>
>> --yliu
>> >
>> > Rgds,
>> > Gopa.
>> >
>> > On Thu, Mar 16, 2017 at 10:24 PM, Yuanhan Liu <
>> yuanhan.liu at linux.intel.com>
>> > wrote:
>> >
>> > On Thu, Mar 16, 2017 at 10:20:30PM -0700, Gopakumar Choorakkot
>> Edakkunni
>> > wrote:
>> > > >> When I was saying dpdk version, I meant the DPDK version with
>> OVS.
>> > >
>> > > Oh I see! My apologies for the misuderstanding. The dpdk version
>> used by
>> > host
>> > > ovs should be dpdk2.2, the guest process uses dpdk16.07. The OVS
>> process
>> > is not
>> > > getting restarted, what is getting restarted is the guest process
>> using
>> > > dpdk16.07 - so the above clarifications you had about virtio being
>> > > reset-before-opened on guest restart - does that still hold good
>> or does
>> > that
>> > > need the HOST side dpdk to be 16.04 or above ?
>> >
>> > Yes, the HOST dpdk should be >= v16.04.
>> >
>> > --yliu
>> > >
>> > > >> And yes, the fixes are not included in the DPDK required for
>> OVS 2.4.
>> > >
>> > > Thanks for the info.
>> > >
>> > > Rgds,
>> > > Gopa.
>> > >
>> > > On Thu, Mar 16, 2017 at 10:13 PM, Yuanhan Liu <
>> > yuanhan.liu at linux.intel.com>
>> > > wrote:
>> > >
>> > > On Thu, Mar 16, 2017 at 09:56:01PM -0700, Gopakumar Choorakkot
>> > Edakkunni
>> > > wrote:
>> > > > Hi Yuanhan,
>> > > >
>> > > > Thanks for the confirmation about not having to do anything
>> special
>> > to
>> > > close
>> > > > the ports on dpdk going down or coming up.
>> > > >
>> > > > As for the question about if I met any issue of ovs getting
>> stuck -
>> > yes,
>> > > my
>> > > > guest process runs dpdk 16.07 as I mentioned earlier - and
>> if I
>> > kill my
>> > > guest
>> > > > process, then the host OVS-dpdk on the host reports stall !
>> The
>> > OVS-dpdk
>> > > and
>> > > > emu versions I use are as below. But maybe that is because
>> of the
>> > ovs
>> > > missing
>> > > > the fixes you mentioned ?
>> > >
>> > > When I was saying dpdk version, I meant the DPDK version with
>> OVS.
>> > >
>> > > > ~# ovs-vswitchd --version
>> > > > ovs-vswitchd (Open vSwitch) 2.4.1
>> > >
>> > > And yes, the fixes are not included in the DPDK required for
>> OVS 2.4.
>> > >
>> > > --yliu
>> > >
>> > > > Compiled Nov 14 2016 06:53:31
>> > > > # kvm --version
>> > > > QEMU emulator version 2.2.0, Copyright (c) 2003-2008 Fabrice
>> > Bellard
>> > > > ~#
>> > > >
>> > > >
>> > > > Rgds,
>> > > > Gopa.
>> > > >
>> > > > On Thu, Mar 16, 2017 at 9:35 PM, Yuanhan Liu <
>> > yuanhan.liu at linux.intel.com
>> > > >
>> > > > wrote:
>> > > >
>> > > > On Thu, Mar 16, 2017 at 07:48:28PM -0700, Gopakumar
>> Choorakkot
>> > > Edakkunni
>> > > > wrote:
>> > > > > Thanks a lot for the response Yuanhan. I am using dpdk
>> > v16.07. So
>> > > what
>> > > > you are
>> > > > > saying is that in 16.07, we dont really need to call
>> > > rte_eth_dev_close()
>> > > > on
>> > > > > exit,
>> > > >
>> > > > It's not about "don't really need", it's more like
>> "it's hard
>> > to".
>> > > Just
>> > > > think that it may crash at any time.
>> > > >
>> > > > > because dpdk will ensure that it will do virtio reset
>> before
>> > init
>> > > when it
>> > > > > comes up right ?
>> > > >
>> > > > No, It just handles the abnormal case well when guest
>> APP
>> > restarts.
>> > > >
>> > > > > Regarding the vhost commits you mentioned - do we
>> still need
>> > those
>> > > fixes
>> > > > if we
>> > > > > have the "virtio reset before init" mechanism ?
>> > > >
>> > > > Yes, we still need them: just think some malicious
>> guest may
>> > also
>> > > forge
>> > > > data like that.
>> > > >
>> > > > I'm a bit confused then. Have you actually met any
>> issue (like
>> > got
>> > > stucked)
>> > > > with DPDK v16.07?
>> > > >
>> > > > --yliu
>> > > >
>> > > > > Or that is a seperate problem
>> > > > > altogether (and hence we would need those fixes) ?
>> > > > >
>> > > > > Rgds,
>> > > > > Gopa.
>> > > > >
>> > > > > On Thu, Mar 16, 2017 at 7:06 PM, Yuanhan Liu <
>> > > yuanhan.liu at linux.intel.com
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > > On Thu, Mar 16, 2017 at 12:39:16PM -0700,
>> Gopakumar
>> > Choorakkot
>> > > > Edakkunni
>> > > > > wrote:
>> > > > > > So the doc says we should call
>> rte_eth_dev_close()
>> > *before*
>> > > going
>> > > > down.
>> > > > > And I
>> > > > > > know that especially in dpdk-virtionet in the
>> guest +
>> > > ovs-dpdk in
>> > > > the
>> > > > > host,
>> > > > > > the ovs ends up getting stalled/stuck (!!) if I
>> dont
>> > close
>> > > the port
>> > > > > before
>> > > > > > starting() it when the guest dpdk process comes
>> back
>> > up.
>> > > > >
>> > > > > I'm assuming you were using an old version,
>> something
>> > like dpdk
>> > > v2.2?
>> > > > > IIRC, DPDK v16.04 should have fixed your issue.
>> > > > >
>> > > > > > Considering that this not done properly can
>> screw up
>> > the HOST
>> > > ovs,
>> > > > and I
>> > > > > want
>> > > > > > to do everything possible to avoid that, I want
>> to be
>> > 200%
>> > > sure
>> > > > that I
>> > > > > call
>> > > > > > close even if my process gets a kill -9 .. So
>> obviously
>> > the
>> > > only
>> > > > way of
>> > > > > doing
>> > > > > > that is to close the port when the dpdk process
>> comes
>> > back up
>> > > and
>> > > > > *before* we
>> > > > > > init the port. rte_eth_dev_close() is not
>> capable of
>> > doing
>> > > that as
>> > > > it
>> > > > > expects
>> > > > > > the port parameters to be initialized etc..
>> before it
>> > can be
>> > > > called.
>> > > > >
>> > > > > We do virtio reset before init, which is
>> basically what
>> > > > rte_eth_dev_close()
>> > > > > mainly does. So I see no big issue here.
>> > > > >
>> > > > > The stuck issue is due to hugepage reset by the
>> guest
>> > DPDK
>> > > > application,
>> > > > > leading all virtio vring elements being mem
>> zeroed. The
>> > old
>> > > vhost
>> > > > doesn't
>> > > > > handle it well, as a result, it got stuck. And
>> here are
>> > some
>> > > relevant
>> > > > > commits:
>> > > > >
>> > > > > a436f53 vhost: avoid dead loop chain
>> > > > > c687b0b vhost: check for ring descriptors
>> overflow
>> > > > > 623bc47 vhost: do sanity check for ring
>> descriptor
>> > length
>> > > > >
>> > > > > --yliu
>> > > > >
>> > > > > > Any other
>> > > > > > suggestions on what can be done to close on
>> restart
>> > rather
>> > > than
>> > > > close on
>> > > > > going
>> > > > > > down ? Thought of bouncing this by the alias
>> before I
>> > add a
>> > > version
>> > > > of
>> > > > > close
>> > > > > > myself that can do this close-on-restart
>> > > > >
>> > > > >
>> > > >
>> > > >
>> > >
>> > >
>> >
>> >
>>
>
>
More information about the dev
mailing list