[dpdk-dev] [PATCH 2/5] ethdev: add port ownership

Neil Horman nhorman at tuxdriver.com
Fri Dec 22 15:26:51 CET 2017


On Thu, Dec 21, 2017 at 09:57:43PM +0000, Matan Azrad wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Thursday, December 21, 2017 10:14 PM
> > To: Matan Azrad <matan at mellanox.com>
> > Cc: Thomas Monjalon <thomas at monjalon.net>; dev at dpdk.org; Bruce
> > Richardson <bruce.richardson at intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev at intel.com>; Gaëtan Rivet <gaetan.rivet at 6wind.com>;
> > Wu, Jingjing <jingjing.wu at intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > > Hi
> > >
> <snip>
> > > > > > > > I think we need to clearly describe what is the thread-safety
> > > > > > > policy in DPDK (especially in ethdev as a first example).
> > > > > > > Let's start with obvious things:
> > > > > > >
> > > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > > 			- no planned change because of performance
> > > > purpose
> > > > > > > 	2/ The list of devices is racy
> > > > > > > 			- to be fixed with atomics
> > > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > > 			- the configurations are different per-device
> > > > > > > 	4/ The configuration of a given device is racy
> > > > > > > 			- can be managed by the owner of the device
> > > > > > > 	5/ The device ownership is racy
> > > > > > > 			- to be fixed with atomics
> > > > > > >
> > > > > > > What am I missing?
> > > > > > >
> > >
> > > Thank you Thomas for this order.
> > > Actually the port ownership is a good opportunity to redefine the
> > > synchronization rules in ethdev :)
> > >
> > > > > > There is fan out to consider here:
> > > > > >
> > > > > > 1) Is device configuration racy with ownership?  That is to say,
> > > > > > can I change ownership of a device safely while another thread
> > > > > > that currently owns it modifies its configuration?
> > > > >
> > > > > > If an entity steals ownership from another one, either it is agreed
> > > > > earlier, or it is done by a central authority.
> > > > > When it is acked that ownership can be moved, there should not be
> > > > > any configuration in progress.
> > > > > So it is more a communication issue than a race.
> > > > >
> > > > > But if that's the case (specifically that mutual exclusion between
> > > > > port ownership and configuration is an exercise left to an
> > > > application developer), then port ownership itself is largely
> > > > meaningless within the dpdk, because the notion of who owns the port
> > > > needs to be codified within the application anyway.
> > > >
> > >
> > > Bruce, as I understand it, only the dpdk entity that successfully took
> > > ownership of a port can configure the device by default. If other dpdk
> > > entities want to configure it too, they must be synchronized with the
> > > port owner, although it is not recommended after the port ownership
> > > integration.
> > >
> > Can you clarify what you mean by "it is not recommended after the port
> > ownership integration"?
> 
> Sure,
> The new definition of ethdev synchronization doesn't recommend managing a port from 2 different dpdk entities; it can be done, but it is not recommended.
>   
Ok, that's just not what you said above.  Your wording made it sound like you
thought that, after the integration of a port ownership model, multiple
dpdk entities should not synchronize with one another, which made no sense.

> >  I think there is consensus that the port owner must
> > be the only entity to operate on a port (be that configuration/frame rx/tx, or
> > some other operation).
> 
> Your question above caused me to think that you don't understand it. How can someone who is not the port owner change the port owner?
> Changing the port owner, like port configuration and port release, must be done by the owner itself, except in the case where the port has no owner.
> See the API rte_eth_dev_owner_remove.
> 
See above; I don't think your phrasing accurately reflected what you meant to
convey. Or at least that's not how I read it.

> > Multithreaded operation on a port always means
> > some level of synchronization between application threads and the dpdk
> > library,
> Yes.
> > but I'm not sure why that would be different if we introduced a more
> > concrete notion of port ownership via a new library.
> >
> 
> What do you mean by "new library"? A port is an ethdev instance and should be managed by ethdev.
> 
I'm referring to the port ownership api that you proposed.  Apologies, I should
not have used the term "new library", but rather "new api".

> > > So, for example, if the dpdk entity is an application, the application should
> > > take ownership of the port and manage the synchronization of this port
> > > configuration between the application threads and its EAL host thread
> > > callbacks; no other dpdk entity should configure the same port because they
> > > should fail when they try to take ownership of the same port too.
> 
> > Well, failing is one good approach, yes; blocking on port relinquishment
> > could be another.  I'd recommend an API with the following interface:
> > 
> > rte_port_ownership_claim(int port_id) - blocks execution of the calling
> > thread until the previous owner releases ownership, then claims it and
> > returns
> > 
> > rte_port_ownership_release(int port_id) - releases ownership of port, or
> > returns error if the port was not owned by this execution context
> >
> > rte_port_ownership_try_claim(int port_id) - same as
> > rte_port_ownership_claim, but fails if the port is already owned.
> > 
> > That would give the option for both semantics.
> 
> I think the current APIs are better for the following reasons:
> - It defines well who is the owner.
There's no reason you can't integrate some ownership nonce into the above API as
well; that's easy to add.  The relevant part is the ability to exclude those who
are not owners (that is to say, block their progress until ownership is released
by a preceding entity).
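
To make the blocking semantics concrete, here is a rough sketch of what the
three calls could look like with an ownership nonce added.  This is only an
illustration under stated assumptions, not code from any patch: every
sketch_* name, the fixed port table, and the use of a pthread mutex and
condition variable are hypothetical stand-ins for whatever ethdev would
actually use.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 32
#define SKETCH_NO_OWNER  0 /* owner id 0 is reserved to mean "unowned" */

/* Per-port ownership record; all names in this sketch are hypothetical. */
struct sketch_port_owner {
	pthread_mutex_t lock;     /* protects owner_id */
	pthread_cond_t  released; /* signalled when the port becomes unowned */
	uint64_t        owner_id; /* nonce of the current owner */
};

static struct sketch_port_owner sketch_ports[SKETCH_MAX_PORTS];
static pthread_once_t sketch_once = PTHREAD_ONCE_INIT;

static void sketch_init(void)
{
	for (int i = 0; i < SKETCH_MAX_PORTS; i++) {
		pthread_mutex_init(&sketch_ports[i].lock, NULL);
		pthread_cond_init(&sketch_ports[i].released, NULL);
		sketch_ports[i].owner_id = SKETCH_NO_OWNER;
	}
}

/* Validate the arguments and return the port record, or NULL if invalid. */
static struct sketch_port_owner *sketch_lookup(int port_id, uint64_t owner_id)
{
	if (port_id < 0 || port_id >= SKETCH_MAX_PORTS ||
	    owner_id == SKETCH_NO_OWNER)
		return NULL;
	pthread_once(&sketch_once, sketch_init);
	return &sketch_ports[port_id];
}

/* Claim the port if it is unowned; fail with -EBUSY instead of waiting. */
int sketch_port_ownership_try_claim(int port_id, uint64_t owner_id)
{
	struct sketch_port_owner *p = sketch_lookup(port_id, owner_id);
	int ret = 0;

	if (p == NULL)
		return -EINVAL;
	pthread_mutex_lock(&p->lock);
	if (p->owner_id != SKETCH_NO_OWNER)
		ret = -EBUSY;
	else
		p->owner_id = owner_id;
	pthread_mutex_unlock(&p->lock);
	return ret;
}

/*
 * Block until the port is unowned, then claim it.  A caller that already
 * owns the port would wait forever here; a real API would have to decide
 * how to treat that case.
 */
int sketch_port_ownership_claim(int port_id, uint64_t owner_id)
{
	struct sketch_port_owner *p = sketch_lookup(port_id, owner_id);

	if (p == NULL)
		return -EINVAL;
	pthread_mutex_lock(&p->lock);
	while (p->owner_id != SKETCH_NO_OWNER)
		pthread_cond_wait(&p->released, &p->lock);
	p->owner_id = owner_id;
	pthread_mutex_unlock(&p->lock);
	return 0;
}

/* Release the port; only the current owner may do so. */
int sketch_port_ownership_release(int port_id, uint64_t owner_id)
{
	struct sketch_port_owner *p = sketch_lookup(port_id, owner_id);
	int ret = 0;

	if (p == NULL)
		return -EINVAL;
	pthread_mutex_lock(&p->lock);
	if (p->owner_id != owner_id) {
		ret = -EPERM;
	} else {
		p->owner_id = SKETCH_NO_OWNER;
		pthread_cond_broadcast(&p->released); /* wake blocked claimers */
	}
	pthread_mutex_unlock(&p->lock);
	return ret;
}
```

The point of the sketch is only that the blocking and non-blocking variants
can share one ownership record, so offering both semantics costs very little.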

> - The owner structure includes string to allow better debug and printing for humans. 
I've got no problem with any such internals; it's really the synchronization that I'm after.

> Did you read it?
Yes, and I don't see why you would think I hadn't; I think I've been very clear in
my understanding of your initial patch.  Have you taken the time to understand my
concerns?

> I can add there an API that wait until the port ownership is released as you suggested in V2.
> 
I think that would be good.

> > > Each dpdk entity which wants to take ownership must be able to
> > > synchronize the port configuration at its level.
> 
> > Can you elaborate on what you mean by level here?  Are you envisioning a
> > scheme in which multiple execution contexts might own a port for various
> > non-conflicting purposes?
>  
> Sure,
> 1) Application with 2 threads wanting to configure the same port:
> 	level = application code.
> 	
> 	a. The main thread should create owner identifier(rte_eth_dev_owner_new).
> 	b. The main thread should take the port ownership(rte_eth_dev_owner_set).
> 	c. Synchronization between the two threads should be done for the conflicting configurations by the application.
> 	d. When the application finishes the port usage it should release the owner(rte_eth_dev_owner_remove).
> 
> 2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses alarm for hotplug detections which can configure the 2 ports(by the host thread).
> 	Level = fail-safe code.
> 	a. Application starts the eal and the fail-safe driver probing function is called.
> 	b. Fail-safe probes the 2 sub-devices(2 ports are created) and takes ownership for them.
> 	c. Failsafe creates its own port and leaves it ownerless.
> 	d. Failsafe starts the hotplug alarm mechanism.
> 	e. Application tries to take ownership of all ports and succeeds only for the failsafe port.
> 	f. Application starts to configure the failsafe port asynchronously to the failsafe hotplug alarm.
> 	g. Failsafe must use synchronization between failsafe alarm callback code and failsafe configuration APIs called by the application because they both try to configure the same sub-devices ports.
> 	h. When fail-safe finishes with the two sub devices it should release the ports owner.
> 
Ok, this I would describe as different use cases rather than parallel ownership,
in that in both cases there is still a single execution context which is
responsible for all aspects of a given port (which is fine with me, I'm just
trying to be clear in our description of the model).



> > >
> > > >
> > > > > > 2) Is device configuration racy with device addition/removal?
> > > > > > That is to say, can one thread remove a device while another
> > > > > > configures it?
> > > > >
> > > > > I think it is the same as two threads configuring the same device
> > > > > (item 4/ above). It can be managed with port ownership.
> > > > >
> > > > Only if you assert that application is required to have the owning
> > > > port be responsible for the ports deletion, which we can say, but
> > > > that leads to the issue above again.
> > > >
> > > >
> > > As Thomas said in item 2, the port creation must be synchronized by ethdev
> > > and we need to add it there.
> > > I think it is obvious that port removal must be done only by the port
> > > owner.
> > >
> > You say that, but it's obvious to you as a developer who has looked
> > extensively at the code.  It may well be less so to a consumer who is not an
> > active member of the community, for instance one who obtains the dpdk via
> > pre-built package.
> >
> 
> Yes, I can understand, but new rules should be documented and be easy for the customers to adopt, no?
Ostensibly, it should be easy, yes, but in practice it's a bit murkier.  For
instance, what if an application wants to enable packet capture on an interface
via rte_pdump_enable?  Does performing that action require that the execution
context which calls that function own the port before doing so?  Digging through
the code suggests to me that it (surprisingly) does not, because all that
function does is set a socket to record packets to, but I would have
intuitively thought that enabling packet capture would require turning off the
mac filter table in the hardware, and so would have required ownership.

Conversely, using the same example, calling rte_pdump_init, using the model from
your last patch, would require the calling execution context to ensure that,
at the time the cli application issues the monitor request, the port is
unowned, because the pdump main thread needs to set rx_tx callbacks on the
requested port, which I believe constitutes a configuration change needing port
ownership.

My point being, I think saying that ownership is easy and obvious isn't
accurate.  If we are to leave proper synchronization of access to devices up to
the application, we either need to:

1) Assume downstream users are intimately familiar with the code
or
2) Exhaustively document the conditions under which ownership needs to be held

(1) is a non-starter, and (2) I think is a fairly large undertaking, but unless we
are willing to codify synchronization in the code explicitly (via locking), (2)
is what we have to do.
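
As a rough illustration of what codifying the rule in the code might mean,
the sketch below puts ownership changes and a configuration change behind the
same per-port lock, so that a configuration call from a non-owner fails
rather than relying on documentation.  Everything here is an assumption made
for the example: the sketch_* names, the struct layout, and the MTU field
standing in for "some configuration" are hypothetical and are not existing
ethdev API.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical per-port state; mtu stands in for any configuration. */
struct sketch_port {
	pthread_mutex_t lock; /* serializes ownership and configuration */
	uint64_t owner_id;    /* 0 means unowned */
	uint16_t mtu;
};

void sketch_port_init(struct sketch_port *p)
{
	pthread_mutex_init(&p->lock, NULL);
	p->owner_id = 0;
	p->mtu = 1500;
}

/* Take ownership of an unowned port; -EBUSY if someone else holds it. */
int sketch_port_take(struct sketch_port *p, uint64_t owner_id)
{
	int ret = 0;

	if (owner_id == 0)
		return -EINVAL;
	pthread_mutex_lock(&p->lock);
	if (p->owner_id != 0)
		ret = -EBUSY;
	else
		p->owner_id = owner_id;
	pthread_mutex_unlock(&p->lock);
	return ret;
}

/* Drop ownership; only the current owner may do so. */
int sketch_port_drop(struct sketch_port *p, uint64_t owner_id)
{
	int ret = 0;

	pthread_mutex_lock(&p->lock);
	if (owner_id == 0 || p->owner_id != owner_id)
		ret = -EPERM;
	else
		p->owner_id = 0;
	pthread_mutex_unlock(&p->lock);
	return ret;
}

/*
 * Configuration entry point.  The owner check and the change happen under
 * the same lock that ownership changes take, so ownership cannot move
 * while a configuration change is in progress.
 */
int sketch_port_set_mtu(struct sketch_port *p, uint64_t caller_id, uint16_t mtu)
{
	int ret = 0;

	pthread_mutex_lock(&p->lock);
	if (caller_id == 0 || p->owner_id != caller_id)
		ret = -EPERM;
	else
		p->mtu = mtu;
	pthread_mutex_unlock(&p->lock);
	return ret;
}
```

With something along these lines, the question of whether a given call needs
ownership is answered by its return code instead of by reading the source.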

Neil


