[dpdk-dev] [PATCH 2/5] ethdev: add port ownership

Matan Azrad matan at mellanox.com
Sat Dec 23 23:36:34 CET 2017


Hi 
> -----Original Message-----
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Friday, December 22, 2017 4:27 PM
> To: Matan Azrad <matan at mellanox.com>
> Cc: Thomas Monjalon <thomas at monjalon.net>; dev at dpdk.org; Bruce
> Richardson <bruce.richardson at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>; Gaëtan Rivet <gaetan.rivet at 6wind.com>;
> Wu, Jingjing <jingjing.wu at intel.com>
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Dec 21, 2017 at 09:57:43PM +0000, Matan Azrad wrote:
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > Sent: Thursday, December 21, 2017 10:14 PM
> > > To: Matan Azrad <matan at mellanox.com>
> > > Cc: Thomas Monjalon <thomas at monjalon.net>; dev at dpdk.org; Bruce
> > > Richardson <bruce.richardson at intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev at intel.com>; Gaëtan Rivet
> > > <gaetan.rivet at 6wind.com>; Wu, Jingjing <jingjing.wu at intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > > > Hi
> > > >
> > <snip>
> > > > > > > > I think we need to clearly describe what is the
> > > > > > > > tread-safety policy in DPDK (especially in ethdev as a first
> example).
> > > > > > > > Let's start with obvious things:
> > > > > > > >
> > > > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > > > 			- no planned change because of performance
> > > > > purpose
> > > > > > > > 	2/ The list of devices is racy
> > > > > > > > 			- to be fixed with atomics
> > > > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > > > 			- the configurations are different per-device
> > > > > > > > 	4/ The configuration of a given device is racy
> > > > > > > > 			- can be managed by the owner of the device
> > > > > > > > 	5/ The device ownership is racy
> > > > > > > > 			- to be fixed with atomics
> > > > > > > >
> > > > > > > > What am I missing?
> > > > > > > >
> > > >
> > > > Thank you Thomas for this order.
> > > > Actually the port ownership is a good opportunity to redefine the
> > > > synchronization rules in ethdev :)
> > > >
> > > > > > > There is fan out to consider here:
> > > > > > >
> > > > > > > 1) Is device configuration racy with ownership?  That is to
> > > > > > > say, can I change ownership of a device safely while another
> > > > > > > thread that currently owns it modifies its configuration?
> > > > > >
> > > > > > If an entity steals ownership to another one, either it is
> > > > > > agreed earlier, or it is done by a central authority.
> > > > > > When it is acked that ownership can be moved, there should not
> > > > > > be any configuration in progress.
> > > > > > So it is more a communication issue than a race.
> > > > > >
> > > > > But if thats the case (specifically that mutual exclusion
> > > > > between port ownership and configuration is an exercize left to
> > > > > an application developer), then port ownership itself is largely
> > > > > meaningless within the dpdk, because the notion of who owns the
> > > > > port needs to be codified within the application anyway.
> > > > >
> > > >
> > > > Bruce, As I understand it, only the dpdk entity who took ownership
> > > > of a
> > > port successfully can configure the device by default, if other dpdk
> > > entities want to configure it too they must to be synchronized with
> > > the port owner while it is not recommended after the port ownership
> integration.
> > > >
> > > Can you clarify what you mean by "it is not recommended after the
> > > port ownership integration"?
> >
> > Sure,
> > The new defining of ethdev synchronization doesn't recommend to
> manage a port by 2 different dpdk entities, it can be done but not
> recommended.
> >
> Ok, thats just not what you said above.  Your suggestion made it sound like
> you thought that  after the integration of a port ownership model, that
> multiple dpdk entries should not synchronize with one another, which made
> no sense.
> 
OK, I can see that my sentence was ambiguous, sorry for that. I think we agree here.

> > >  I think there is consensus that the port owner must be the only
> > > entitiy to operate on a port (be that configuration/frame rx/tx, or
> > > some other operation).
> >
> > Your question above caused me to think that you don't understand it, How
> can someone who is not the port owner to change the port owner?
> > Changing the port owner, like port configuration and port release must be
> done by the owner itself except the case that there is no owner to the port.
> > See the API rte_eth_dev_owner_remove.
> >
> See above, your phrasing I don't think accurately reflected what you meant
> to convey. Or at least thats not how I read it
> 
> > > Multithreaded operation on a port always means some level of
> > > synchronization between application threads and the dpdk library,
> > Yes.
> >  >but I'm not sure why that would be different if we introduced a more
> > > concrete notion of port ownership via a new library.
> > >
> >
> > What do you mean by "new library"?, port is an ethdev instance and should
> be managed by ethdev.
> >
> I'm referring to the port ownership api that you proposed.  Apologies, I
> should not have used the term "new library", but rather "new api".
> 
> >  > > So, for example,  if the dpdk entity is an application, the
> > application should
> > >> take ownership of the port and manage the synchronization of this
> > >> port configuration between the application threads and its EAL host
> > >> thread callbacks, no other dpdk entity should configure the same
> > >> port because they should fail when they try to take ownership of the
> same port too.
> >
> > > Well, failing is one good approach, yes, blocking on port
> > > relenquishment could be another.  I'd recommend an API with the
> following interface:
> > >
> > > rte_port_ownership_claim(int port_id) - blocks execution of the
> > > calling thread until the previous owner releases ownership, then
> > > claims it and returns
> > >
> > > rte_port_ownership_release(int port_id) - releases ownership of
> > > port, or returns error if the port was not owned by this execution
> > > context
> > >
> > > rte_port_owernship_try_claim(int port_id) - same as
> > > rte_port_ownership_claim, but fails if the port is already owned.
> > >
> > > That would give the option for both semantics.
> >
> > I think the current APIs are better because of the next reasons:
> > - It defines well who is the owner.
> Theres no reason you can't integrate some ownership nonce to the above
> API as well, thats easy to add.  The relevant part is the ability to exclude
> those who are not owners (that is to say, block their progress until ownership
> is released by a preceding entity).
> 
> > - The owner structure includes string to allow better debug and printing for
> humans.
> I've got no problem with any such internals, its really the synchronization that
> I'm after.
> 
> > Did you read it?
> Yes, I don't see why you would think I hadn't, I think I've been very clear in
> my understanding of you initial patch.  Have you taken the time to
> understand my concerns?
>
OK, it just looked like you were suggesting new APIs instead of the V1 APIs.

Your concerns are about the races in port ownership management.
I agree with them only after Thomas's redefinition of the port synchronization rules.
That is, if port creation is made race-safe and the new rules are documented, port ownership should be race-safe too.
 
> > I can add there an API that wait until the port ownership is released as you
> suggested in V2.
> >
> I think that would be good.
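
To make the two semantics concrete (fail when the port is already owned, or wait until it is released), here is a minimal sketch using C11 atomics. It is not the V1 patch code and not the ethdev API: all the sketch_* names, the fixed-size owner table and the "0 means ownerless" convention are assumptions made only for illustration.

```c
#include <errno.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 32
#define SKETCH_NO_OWNER  0u /* assumption: 0 marks an ownerless port */

/* One owner slot per port; static storage starts zeroed, i.e. ownerless. */
static _Atomic uint64_t sketch_port_owner[SKETCH_MAX_PORTS];
static _Atomic uint64_t sketch_next_owner_id = 1;

/* Hand out a fresh, non-zero owner identifier. */
uint64_t sketch_owner_new(void)
{
	return atomic_fetch_add(&sketch_next_owner_id, 1);
}

/* Non-blocking claim: succeeds only if the port is currently ownerless. */
int sketch_owner_try_set(uint16_t port_id, uint64_t owner_id)
{
	uint64_t expected = SKETCH_NO_OWNER;

	if (port_id >= SKETCH_MAX_PORTS || owner_id == SKETCH_NO_OWNER)
		return -EINVAL;
	/* The compare-and-swap makes "check ownerless, then take" one atomic step. */
	if (atomic_compare_exchange_strong(&sketch_port_owner[port_id],
					   &expected, owner_id))
		return 0;
	return -EBUSY;
}

/* Blocking claim: retry until the current owner releases the port. */
int sketch_owner_wait_set(uint16_t port_id, uint64_t owner_id)
{
	int ret;

	while ((ret = sketch_owner_try_set(port_id, owner_id)) == -EBUSY)
		sched_yield(); /* simple polling; a real API might sleep instead */
	return ret;
}

/* Release: only the current owner may return the port to ownerless. */
int sketch_owner_remove(uint16_t port_id, uint64_t owner_id)
{
	uint64_t expected = owner_id;

	if (port_id >= SKETCH_MAX_PORTS || owner_id == SKETCH_NO_OWNER)
		return -EINVAL;
	if (atomic_compare_exchange_strong(&sketch_port_owner[port_id],
					   &expected, SKETCH_NO_OWNER))
		return 0;
	return -EPERM;
}
```

With this shape, a second entity calling sketch_owner_try_set() on an owned port gets -EBUSY, while sketch_owner_wait_set() keeps polling until the owner calls sketch_owner_remove(). Note that this only protects the ownership field itself; synchronizing the port configuration inside the owner is still the owner's job.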
> 
> > > > Each dpdk entity which wants to take ownership must to be able to
> > > >synchronize the port configuration in its level.
> >
> > > Can you elaborate on what you mean by level here?  Are you
> > > envisioning a scheme in which multiple execution contexts might own
> > > a port for various non-conflicting purposes?
> >
> > Sure,
> > 1) Application with 2 threads wanting to configure the same port:
> > 	level = application code.
> >
> > 	a. The main thread should create owner
> identifier(rte_eth_dev_owner_new).
> > 	b. The main thread should take the port
> ownership(rte_eth_dev_owner_set).
> > 	c. Synchronization between the two threads should be done for the
> conflicted 		configurations by the application.
> > 	d. when the application finishes the port usage it should release the
> owner(rte_eth_dev_owner_remove).
> >
> > 2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses alarm for
> hotplug detections which can configure the 2 ports(by the host thread).
> > 	Level = fail-safe code.
> > 	a. Application starts the eal and the fail-safe driver probing function is
> called.
> > 	b. Fail-safe probes the 2 sub-devices(2 ports are created) and takes
> ownership for them.
> > 	c. Failsafe creates itself port and leaves it ownerless.
> > 	d. Failsafe starts the hotplug alarm mechanism.
> > 	e. Application tries to take ownership for all ports and success only
> for failsafe port.
> > 	f. Application start to configure the failsafe port asynchronously to
> failsafe hotplug alarm.
> > 	g. Failsafe must use synchronization between failsafe alarm callback
> code and failsafe configuration APIs called by the application because they
> both try to configure the same sub-devices ports.
> > 	h. When fail-safe finishes with the two sub devices it should release
> the ports owner.
> >
> Ok, this I would describe as different use cases rather than parallel
> ownership, in that in both cases there is still a single execution context which
> is responsible for all aspects of a given port (which is fine with me, I'm just
> trying to be clear in our description of the model).
> 
Agree.
Can you think of a realistic scenario in which more than one execution entity must manage a port and has problems managing the synchronization of the port races?
 
> 
> > > >
> > > > >
> > > > > > > 2) Is device configuration racy with device addition/removal?
> > > > > > > That is to say, can one thread remove a device while another
> > > configures it.
> > > > > >
> > > > > > I think it is the same as two threads configuring the same
> > > > > > device (item 4/ above). It can be managed with port ownership.
> > > > > >
> > > > > Only if you assert that application is required to have the
> > > > > owning port be responsible for the ports deletion, which we can
> > > > > say, but that leads to the issue above again.
> > > > >
> > > > >
> > > > As Thomas said in item 2 the port creation must be synchronized by
> > > > ethdev
> > > and we need to add it there.
> > > > I think it is obvious that port removal must to be done only by
> > > > the port
> > > owner.
> > > >
> > > You say that, but its obvious to you as a developer who has looked
> > > extensively at the code.  It may well be less so to a consumer who
> > > is not an active member of the community, for instance one who
> > > obtains the dpdk via pre-built package.
> > >
> >
> > Yes I can understand, but new rules should be documented and be
> adjusted easy easy by the customers, no?
> Ostensibly, it should be easy, yes, but in practice its a bit murkier.  For
> instance, What if an application wants to enable packet capture on an
> interface via rte_pdump_enable?  Does preforming that action require that
> the execution context which calls that function own the port before doing
> so?  Digging through the code suggests to me that it (suprisingly) does not,
> because all that function does is set a socket to record packets too, but I
> would have intuitively thought that enabling packet capture would require
> turning off the mac filter table in the hardware, and so would have required
> ownership
> 
> Conversely, using the same example, calling rte_pdump_init, using the
> model from your last patch, would require that the calling execution context
> ensured that , at the time the cli application issued the monnitor request,
> that the port be unowned, because the pdump main thread needs to set
> rx_tx callbacks on the requested port, which I belive constitutes a
> configuration change needing port ownership.
> 
> My point being, I think saying that ownership is easy and obvious isn't
> accurate.

Agreed. As a rule of thumb, all the port-related APIs should require taking ownership, but it will take time to learn when we don't need to take ownership.

>  If we are to leave proper synchrnization of access to devices up to
> the application, we either need to:
> 
> 1) Assume downstream users are intimately familiar with the code or
> 2) Exhaustively document the conditions under which ownership needs to be
> held
> 
> (1) is a non starter, and 2 I think is a fairly large undertaking, but unless we
> are willing to codify synchronization in the code explicitly (via locking), (2) is
> what we have to do.
> 
Agree.
Maybe it would be good to document in the .h files, for each relevant API, whether it requires taking ownership or not. What do you think?

> Neil


