[dpdk-dev] [PATCH v11 0/7] hot-unplug failure handle mechanism

Jerin Jacob jerin.jacob at caviumnetworks.com
Mon Oct 1 11:55:15 CEST 2018


-----Original Message-----
> Date: Mon, 1 Oct 2018 11:00:12 +0200
> From: Stephen Hemminger <stephen at networkplumber.org>
> To: Jeff Guo <jia.guo at intel.com>
> Cc: bruce.richardson at intel.com, ferruh.yigit at intel.com,
>  konstantin.ananyev at intel.com, gaetan.rivet at 6wind.com,
>  jingjing.wu at intel.com, thomas at monjalon.net, motih at mellanox.com,
>  matan at mellanox.com, harry.van.haaren at intel.com, qi.z.zhang at intel.com,
>  shaopeng.he at intel.com, bernard.iremonger at intel.com,
>  arybchenko at solarflare.com, wenzhuo.lu at intel.com,
>  anatoly.burakov at intel.com, jblunck at infradead.org, shreyansh.jain at nxp.com,
>  dev at dpdk.org, helin.zhang at intel.com
> Subject: Re: [dpdk-dev] [PATCH v11 0/7] hot-unplug failure handle mechanism
> 
> 
> On Sun, 30 Sep 2018 19:29:56 +0800
> Jeff Guo <jia.guo at intel.com> wrote:
> 
> > Hotplug is an important feature for use-cases like the datacenter device's
> > fail-safe and for SRIOV Live Migration in SDN/NFV. It could bring higher
> > flexibility and continuality to networking services in multiple use-cases
> > in the industry. So let's see how DPDK can help users implement hotplug
> > solutions.
> >
> > We already have a general device-event monitor mechanism, failsafe driver,
> > and hot plug/unplug API in DPDK. We have already got the solution of
> > “ethdev event + kernel PMD hotplug handler + failsafe”, but we still not
> > got “eal event + hotplug handler for pci PMD + failsafe” implement, and we
> > need to considerate 2 different solutions between uio pci and vfio pci.
> >
> > In the case of hotplug for igb_uio, when a hardware device be removed
> > physically or disabled in software, the application needs to be notified
> > and detach the device out of the bus, and then make the device invalidate.
> > The problem is that, the removal of the device is not instantaneous in
> > software. If the application data path tries to read/write to the device
> > when removal is still in process, it will cause an MMIO error and
> > application will crash.
> >
> > In this patch set, we propose a PCIe bus failure handler mechanism for
> > hot-unplug in igb_uio. It aims to guarantee that, when a hot-unplug occurs,
> > the application will not crash.
> >
> > The mechanism should work as below:
> >
> > First, the application enables the device event monitor, registers the
> > hotplug event’s callback and enable hotplug handling before running the
> > data path. Once the hot-unplug occurs, the mechanism will detect the
> > removal event and then accordingly do the failure handling. In order to
> > do that, the below functionality will be required:
> >  - Add a new bus ops “hot_unplug_handler” to handle hot-unplug failure.
> >  - Implement pci bus specific ops “pci_hot_unplug_handler”. For uio pci,
> >    it will be based on the failure address to remap memory for the corresponding
> >    device that unplugged. For vfio pci, could seperate implement case by case.
> >
> > For the data path or other unexpected behaviors from the control path
> > when a hot unplug occurs:
> >  - Add a new bus ops “sigbus_handler”, that is responsible for handling
> >    the sigbus error which is either an original memory error, or a specific
> >    memory error that is caused by a hot unplug. When a sigbus error is
> >    captured, it will call this function to handle sigbus error.
> >  - Implement PCI bus specific ops “pci_sigbus_handler”. It will iterate all
> >    device on PCI bus to find which device encounter the failure.
> >  - Implement a "rte_bus_sigbus_handler" to iterate all buses to find a bus
> >    to handle the failure.
> >  - Add a couple of APIs “rte_dev_hotplug_handle_enable” and
> >    “rte_dev_hotplug_handle_diable” to enable/disable hotplug handling.
> >    It will monitor the sigbus error by a handler which is per-process.
> >    Based on the signal event principle, the control path thread and the
> >    data path thread will randomly receive the sigbus error, but will call the
> >    common sigbus process. When sigbus be captured, it will call the above API
> >    to find bus to handle it.
> >
> > The mechanism could be used by app or PMDs. For example, the whole process
> > of hotplug in testpmd is:
> >  - Enable device event monitor->Enable hotplug handle->Register event callback
> >    ->attach port->start port->start forwarding->Device unplug->failure handle
> >    ->stop forwarding->stop port->close port->detach port.
> >
> > This patch set would not cover hotplug insert and binding, and it is only
> > implement the igb_uio failure handler, the vfio hotplug failure handler
> > will be in next coming patch set.
> >
> > patchset history:
> > v11->v10:
> > change the ops name, since both uio and vfio will use the hot-unplug ops.
> > add experimental tag.
> > since we plan to abandon RTE_ETH_EVENT_INTR_RMV, change to use
> > RTE_DEV_EVENT_REMOVE, so modify the hotplug event and callback usage.
> > move the igb_uio fixing part, since it is random issue and should be considarate
> > as kernel driver defect but not include as this failure handler mechanism.
> >
> > v10->v9:
> > modify the api name and exposure out for public use.
> > add hotplug handle enable/disable APIs
> > refine commit log
> >
> > v9->v8:
> > refine commit log to be more readable.
> >
> > v8->v7:
> > refine errno process in sigbus handler.
> > refine igb uio release process
> >
> > v7->v6:
> > delete some unused part
> >
> > v6->v5:
> > refine some description about bus ops
> > refine commit log
> > add some entry check.
> >
> > v5->v4:
> > split patches to focus on the failure handle, remove the event usage
> > by testpmd to another patch.
> > change the hotplug failure handler name.
> > refine the sigbus handle logic.
> > add lock for udev state in igb uio driver.
> >
> > v4->v3:
> > split patches to be small and clear.
> > change to use new parameter "--hotplug-mode" in testpmd to identify
> > the eal hotplug and ethdev hotplug.
> >
> > v3->v2:
> > change bus ops name to bus_hotplug_handler.
> > add new API and bus ops of bus_signal_handler distingush handle generic.
> > sigbus and hotplug sigbus.
> >
> > v2->v1(v21):
> > refine some doc and commit log.
> > fix igb uio kernel issue for control path failure rebase testpmd code.
> >
> > Since the hot plug solution be discussed serval around in the public,
> > the scope be changed and the patch set be split into many times. Coming
> > to the recently RFC and feature design, it just focus on the hot unplug
> > failure handler at this patch set, so in order let this topic more clear
> > and focus, summarize privours patch set in history “v1(v21)”, the v2 here
> > go ahead for further track.
> >
> > "v1(21)" == v21 as below:
> > v21->v20:
> > split function in hot unplug ops.
> > sync failure hanlde to fix multiple process issue fix attach port issue for multiple devices case.
> > combind rmv callback function to be only one.
> >
> > v20->v19:
> > clean the code.
> > refine the remap logic for multiple device.
> > remove the auto binding.
> >
> > v19->18:
> > note for limitation of multiple hotplug, fix some typo, sqeeze patch.
> >
> > v18->v15:
> > add document, add signal bus handler, refine the code to be more clear.
> >
> > the prior patch history please check the patch set "add device event monitor framework".
> >
> > Jeff Guo (7):
> >   bus: add hot-unplug handler
> >   bus/pci: implement hot-unplug handler ops
> >   bus: add sigbus handler
> >   bus/pci: implement sigbus handler ops
> >   bus: add helper to handle sigbus
> >   eal: add failure handle mechanism for hot-unplug
> >   testpmd: use hot-unplug failure handle mechanism
> >
> >  app/test-pmd/testpmd.c                  |  39 ++++++--
> >  doc/guides/rel_notes/release_18_08.rst  |   5 +
> >  drivers/bus/pci/pci_common.c            |  81 ++++++++++++++++
> >  drivers/bus/pci/pci_common_uio.c        |  33 +++++++
> >  drivers/bus/pci/private.h               |  12 +++
> >  lib/librte_eal/bsdapp/eal/eal_dev.c     |  14 +++
> >  lib/librte_eal/common/eal_common_bus.c  |  43 +++++++++
> >  lib/librte_eal/common/eal_private.h     |  39 ++++++++
> >  lib/librte_eal/common/include/rte_bus.h |  34 +++++++
> >  lib/librte_eal/common/include/rte_dev.h |  26 +++++
> >  lib/librte_eal/linuxapp/eal/eal_dev.c   | 162 +++++++++++++++++++++++++++++++-
> >  lib/librte_eal/rte_eal_version.map      |   2 +
> >  12 files changed, 481 insertions(+), 9 deletions(-)
> >
> 
> I am glad to see this, hotplug is needed. But have a somewhat controversial
> point of view. The DPDK project needs to do more to force users to go to
> more modern kernels and API's; there has been too much effort already to
> support new DPDK on older kernels and distributions. This leads to higher
> testing burden, technical debt and multiple API's.
> 
> To take the extreme point of view.
>         * igb_uio should be deprecated and all new work only use vfio and vfio-ionommu only
>         * kni should be deprecated and replaced by virtio

+1

I think, The only feature missing in upstream kernel for DPDK may be to
enable SRIOV on PF VFIO devices controlled by DPDK PMD.
I think, Binding a PF device to DPDK along with VFs(VF can be bound to netdev or DPDK
PMDs, Though binding VF to netdev considered as security breach) will be useful for 
a) rte_flow actions like redirecting the traffic to PF or VF on the given pattern
b) Some NICs can support promiscuous mode only on PF
c) Enable Switch representation devices
https://doc.dpdk.org/guides/prog_guide/switch_representation.html 

I think, igb_uio mainly used as the backdoor for this use case.

I think, there was some work in this area but it is not upstreamed due
to various reasons.
https://lkml.org/lkml/2018/3/8/1122

> 
> When there are N ways of doing things against X kernel versions,
> and Y distributions, and multiple device vendors; the combinational explosion of cases means
> that interfaces don't get the depth of testing they deserve.
> 
> That means why not support hotplug on VFIO only?
> 


More information about the dev mailing list