[dpdk-dev] [PATCH v1 2/3] net/hyperv: implement core functionality

Adrien Mazarguil adrien.mazarguil at 6wind.com
Thu Dec 21 17:19:15 CET 2017


Disclaimer: I agree with Thomas's suggestions in his reply [1] to your
message; I'm replying below as well to provide more details of my own and
to clarify the motivations behind this approach a bit more.

On Tue, Dec 19, 2017 at 12:44:35PM -0800, Ferruh Yigit wrote:
> On 12/19/2017 7:06 AM, Adrien Mazarguil wrote:
> > On Mon, Dec 18, 2017 at 05:54:45PM -0800, Ferruh Yigit wrote:
> >> On 12/18/2017 8:46 AM, Adrien Mazarguil wrote:
> >>> As described in more details in the attached documentation (see patch
> >>> contents), this virtual device driver manages NetVSC interfaces in virtual
> >>> machines hosted by Hyper-V/Azure platforms.
> >>>
> >>> This driver does not manage traffic nor Ethernet devices directly; it acts
> >>> as a thin configuration layer that automatically instantiates and controls
> >>> fail-safe PMD instances combining tap and PCI sub-devices, so that each
> >>> NetVSC interface is exposed as a single consolidated port to DPDK
> >>> applications.
> >>>
> >>> PCI sub-devices being hot-pluggable (e.g. during VM migration),
> >>> applications automatically benefit from increased throughput when present
> >>> and automatic fallback on NetVSC otherwise without interruption thanks to
> >>> fail-safe's hot-plug handling.
> >>>
> >>> Once initialized, the sole job of the hyperv driver is to regularly scan
> >>> for PCI devices to associate with NetVSC interfaces and feed their
> >>> addresses to corresponding fail-safe instances.
> >>>
> >>> Signed-off-by: Adrien Mazarguil <adrien.mazarguil at 6wind.com>
> >>
> >> <...>
> >>
> >>> +	RTE_ETH_FOREACH_DEV(port_id) {
> >> <..>
> >>> +			ret = rte_eal_hotplug_remove(bus->name, dev->name);
> >> <..>
> >>> +	ret = rte_eal_hotplug_add("vdev", ctx->devname, ctx->devargs);
> >>
> >> Overall, why is this logic implemented as a network PMD?
> >> Yes, technically you can implement *anything* as a PMD :), but should we?
> >>
> >> This code does EAL-level work (scanning buses, adding/removing devices) on
> >> the control path, and it is not a generic solution either (it is specific
> >> to netvsc and failsafe).
> >>
> >> Only the device argument part of a PMD seems used; the rest is unrelated
> >> to being a PMD. It scans for netvsc changes in the background and reflects
> >> them into the failsafe PMD...
> >>
> >> Why is this implemented as a PMD, and not as another entity, like a bus
> >> driver perhaps?
> >>
> >> Or indeed, why is this in DPDK at all instead of in the application?
> > 
> > I'll address that last question first: the point of this driver is enabling
> > existing applications to run within a Hyper-V environment unmodified,
> > because they'd otherwise need to manage two driver instances correctly on
> > their own in addition to hot-plug events during VM migration.
> > 
> > Some kind of driver generating a front end to what otherwise appears as
> > two distinct ethdevs to applications is therefore necessary.
> > 
> > Currently without it, users have to manually configure failsafe properly for
> > each NetVSC interface on their system. Besides the inconvenience, it's not
> > even a possibility with DPDK applications that don't rely on EAL
> > command-line arguments.
> > 
> > As such it's more correctly defined as a "platform" driver rather than a
> > true PMD. It leaves VF device handling to their respective PMDs while
> > automatically managing the platform-specific part itself. There's no simpler
> > alternative when running in blacklist mode (i.e. not specifying any device
> > parameters on the command line).
> > 
> > Regarding its presence in drivers/net rather than drivers/bus, the end
> > result from an application standpoint is that each instance exposes a single
> > ethdev, even if not its own (failsafe's). Busses don't do that. It also
> > allows passing arguments to individual devices through --vdev if needed.
> > 
> > You're right about putting device detection at the bus level though, and I
> > think there's work in progress to do just that; this driver will be
> > updated to benefit from it once that work is applied. In the meantime, the
> > code as submitted works fine with the current DPDK code base and addresses
> > an existing use case for which there is no solution at this point.
> 
> This may be working, but it looks like a hack to me.
> 
> If we need a platform driver, why not work on one properly? If we need to
> improve EAL hotplug, this is a good motivation to do so.

Hotplug surely can be improved, but I don't think that alone will be enough
for what this driver does. Here's how things are sequenced as currently
implemented (a condensed code sketch follows the list):

1. DPDK application starts.

2. EAL scans for PCI devices, ethdev ports are created for relevant ones.

3. The hyperv vdev scans the system for appropriate NetVSC netdevices and
   instantiates the failsafe PMD accordingly to create an ethdev port for
   each of them.

   At this stage, rte_eal_hotplug_remove() is also called on physical
   devices found in 2. that will be given to failsafe (see 4.), since
   they're not supposed to be seen or owned by the application (keep in mind
   this happens on Hyper-V platforms only).

4. From this point on, the application can use the remaining ports normally.

5. A PCI device gets plugged in; the kernel recognizes it and creates a
   netdevice for it.

6. hyperv's timer callback detects the new netdevice; if its properties
   match NetVSC's, it proceeds to tell failsafe its location.

7. failsafe probes the given address on the appropriate bus to instantiate
   another hidden ethdev out of it and primarily uses that device for TX
   until it gets unplugged. Meanwhile, RX is still performed on both
   underlying devices.
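
For reference, here is a condensed sketch of what the timer callback in
this patch boils down to. It is heavily simplified: struct hyperv_ctx and
netvsc_matches() stand in for the patch's actual per-interface context and
matching logic (needs rte_ethdev.h, rte_bus.h, rte_dev.h and rte_alarm.h):

  /* Condensed sketch of the periodic scan. */
  static void
  hyperv_alarm(void *arg)
  {
      struct hyperv_ctx *ctx = arg; /* one instance per NetVSC interface */
      uint16_t port_id;

      RTE_ETH_FOREACH_DEV(port_id) {
          struct rte_device *dev = rte_eth_devices[port_id].device;
          struct rte_bus *bus = rte_bus_find_by_device(dev);

          /* Hide matching PCI devices from the application; only
           * fail-safe is supposed to see or own them. */
          if (netvsc_matches(ctx, port_id))
              rte_eal_hotplug_remove(bus->name, dev->name);
      }
      /* Feed the updated sub-device list to the fail-safe instance. */
      rte_eal_hotplug_add("vdev", ctx->devname, ctx->devargs);
      /* Re-arm the periodic scan (interval in microseconds). */
      rte_eal_alarm_set(1000000, hyperv_alarm, ctx);
  }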

Let's now assume hot-plug is fully implemented in DPDK, along with Gaetan's
netdevice bus [2] (or an equivalent) that also supports hotplug (again with
a sketch after the list):

1. DPDK application starts.

2. EAL scans for PCI devices, ethdev ports are created for relevant ones.

3. EAL scans for net_bus devices, ethdev ports are created for relevant
   ones.

4. The piece of code formerly known as the hyperv driver looks at detected
   net_bus devices, finds the relevant ones with NetVSC properties and
   promptly kicks them out through rte_eal_hotplug_remove() (or equivalent)
   so that the application doesn't get a chance to "see" them.

   It then instantiates the fail-safe PMD like before, with fail-safe
   re-discovering those devices as its own.

5. From this point on, the application can use the remaining ports normally.

6. A PCI device gets plugged in; the kernel recognizes it and creates a
   netdevice for it.

7. EAL's net_bus hotplug handler kicks in and automatically creates a new
   ethdev port out of it (note: device properties such as MAC addresses are
   not known before the associated PMD is initialized and an ethdev
   created).

8. The piece of code formerly known as the hyperv driver, which happens to
   also be listening for hotplug events, sees that new ethdev port; if its
   properties match NetVSC's, it proceeds to hide it before telling failsafe
   its location.

9. failsafe probes the given address on the appropriate bus to instantiate
   another hidden ethdev out of it and primarily uses that device for TX
   until it gets unplugged. Meanwhile, RX is still performed on both
   underlying devices.
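
Purely illustrative: assuming such a net_bus came with some device event
notification hook, step 8. could collapse into a callback along these
lines. The registration mechanism, event type and netvsc_matches_name()
helper are all assumptions; none of this exists in DPDK today:

  /* Hypothetical hotplug event handler; everything here except the
   * rte_eal_hotplug_*() calls is an assumption. */
  static void
  hyperv_dev_event(const char *dev_name, enum net_bus_event type, void *arg)
  {
      struct hyperv_ctx *ctx = arg;

      if (type != NET_BUS_EVENT_ADD || !netvsc_matches_name(ctx, dev_name))
          return;
      /* Hide the new port from the application... */
      rte_eal_hotplug_remove("net_bus", dev_name);
      /* ...and hand its address over to fail-safe instead. */
      rte_eal_hotplug_add("vdev", ctx->devname, ctx->devargs);
  }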

Hotplug basically removes the timer callback and some of the probing code.
I agree it's perfectly fine to update this PMD once hotplug is implemented
that way. Now what about the rest?

Without a driver, there's no way to orchestrate all of the above. A separate
layer between applications and PMDs is necessary for that; the handover of
ethdev ports to failsafe is mandatory.
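
Concretely, that handover boils down to a single hotplug request carrying a
generated fail-safe devargs string. In the sketch below, the instance name
and sub-device list are illustrative only; the patch builds the actual
string from each NetVSC interface's properties:

  /* Illustrative only: one fail-safe instance per NetVSC interface,
   * combining a tap sub-device (always present) with the VF's PCI
   * address once discovered. */
  rte_eal_hotplug_add("vdev", "net_failsafe_hyperv0",
                      "dev(net_tap_hyperv0,iface=tap_hyperv0),"
                      "dev(0002:00:02.0)");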

> And if this logic needs to be in application let it be, your argument is to not
> change the existing application but this logic may lead implementing many
> unrelated things as PMD to not change application, what is the line here.

Well, for this particular case I don't think many applications want to
retrieve multicast and some other traffic from one ethdev and the rest from
another, and only when the latter is present. This complexity must be
handled by the framework, not by applications, which ideally are not
supposed to know much about the environment they're running in.

For this reason, even a specific API is out of the question.
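
To make the contrast concrete, the goal is that applications keep their
usual single-port datapath untouched, i.e. the canonical RX loop below,
where port_id merely happens to designate the consolidated fail-safe port:

  /* Unchanged application RX loop: fail-safe transparently merges
   * NetVSC and VF traffic on a single port and survives VF
   * hot-(un)plug underneath. */
  struct rte_mbuf *pkts[32];
  uint16_t nb_rx;

  for (;;) {
      nb_rx = rte_eth_rx_burst(port_id, 0, pkts, 32);
      /* ...process and free nb_rx packets as usual... */
  }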

> What is the work in progress, exact list, that will replace this solution? If
> this hackish solution will prevent that real work, I am against this solution.
> Is there a way to ensure this will be a temporary solution and that real work
> will happen?

I think Thomas answers this question [1]; I'll just add that the current
approach was developed and submitted in a way that doesn't have any impact
on public APIs, precisely to avoid conflicts with other ongoing EAL work in
the meantime.

If the hotplug subsystem evolves, this driver will catch up, particularly
since it's small and shouldn't be too complex to adapt. I volunteer for that
work once the APIs are ready in any case; failing that, the experimental tag
(I'll add it for v2) allows removing it outright.

I'd like your opinion on the current approach to determine the next steps:

- Do you agree that hotplug and platform-related functionality are two
  separate problems, and that the approach implementing the former doesn't
  address the latter?

- What about implementing the latter in DPDK as a kind of platform driver,
  so that applications don't need to be modified?

- If you had to choose between drivers/bus and drivers/net for it, which
  would it be? (Keep in mind that the ability to provide per-device options
  would be great.)

[1] http://dpdk.org/ml/archives/dev/2017-December/084558.html
[2] http://dpdk.org/ml/archives/dev/2017-June/067546.html

-- 
Adrien Mazarguil
6WIND

