[dpdk-dev] eventdev: method for finding out unlink status

Jerin Jacob jerin.jacob at caviumnetworks.com
Mon Jul 30 12:36:41 CEST 2018


-----Original Message-----
> Date: Mon, 30 Jul 2018 09:38:01 +0000
> From: "Van Haaren, Harry" <harry.van.haaren at intel.com>
> To: Jerin Jacob <jerin.jacob at caviumnetworks.com>, "Elo, Matias (Nokia -
>  FI/Espoo)" <matias.elo at nokia.com>
> CC: "dev at dpdk.org" <dev at dpdk.org>
> Subject: RE: [dpdk-dev] eventdev: method for finding out unlink status
> 
> 
> > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > Sent: Monday, July 30, 2018 10:29 AM
> > To: Elo, Matias (Nokia - FI/Espoo) <matias.elo at nokia.com>
> > Cc: dev at dpdk.org; Van Haaren, Harry <harry.van.haaren at intel.com>
> > Subject: Re: [dpdk-dev] eventdev: method for finding out unlink status
> >
> > -----Original Message-----
> > > Date: Mon, 30 Jul 2018 09:17:47 +0000
> > > From: "Elo, Matias (Nokia - FI/Espoo)" <matias.elo at nokia.com>
> > > To: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> > > CC: "dev at dpdk.org" <dev at dpdk.org>, "Van Haaren, Harry"
> > >  <harry.van.haaren at intel.com>
> > > Subject: Re: [dpdk-dev] eventdev: method for finding out unlink status
> > > x-mailer: Apple Mail (2.3445.9.1)
> > >
> > >
> > > >>
> > > > >> In bug report https://bugs.dpdk.org/show_bug.cgi?id=60 we have been discussing
> > > > >> issues related to events ending up in wrong ports after calling
> > > > >> rte_event_port_unlink(). In addition to finding a few bugs, we have identified a
> > > > >> need for a new API call (or documentation extension) for an application to be
> > > >
> > > > > From a HW perspective, a documentation extension should be enough: adding
> > > > > "there may be pre-scheduled events and the application is responsible for
> > > > > processing them" to unlink(). Since dequeue() reports which queue an event
> > > > > was dequeued from, the application can always act on that (i.e., whether
> > > > > the event is pre- or post-unlink).
> > >
> > > > At least in the case of the SW eventdev, the problem is how the application
> > > > can know that it has processed all pre-scheduled events. E.g. dequeue may
> > > > return nothing, but since the scheduler is running as a separate process,
> > > > events may still end up in the unlinked port asynchronously.
> >
> > Can't we call dequeue() in a loop to get all the events from the port? If
> > dequeue() returns zero events, then the port is drained. Right?
> 
> Nope - because the scheduler might not have performed and "Acked" the
> unlink(), and internally it has *just* scheduled an event, but it wasn't
> available in the dequeue ring yet.
> 
> Aka, it's racy behavior - and we need a way to retrieve this "Unlink Ack"
> from the scheduler (which runs in another thread in event/sw).
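
To make the race described above concrete, here is a minimal single-threaded
model of it. Everything in it is a hypothetical stand-in (struct sim_sched,
sim_dequeue(), sim_sched_iteration()); none of it is DPDK API, and it only
assumes that the scheduler can hold an event internally before publishing it
to the port's dequeue ring.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a software scheduler; NOT DPDK API. */
struct sim_sched {
    bool unlink_requested; /* set by the control plane via unlink() */
    bool unlink_acked;     /* set by the scheduler on a later iteration */
    int in_flight;         /* events already scheduled, not yet in the ring */
    int port_ring;         /* events visible to dequeue() */
};

/* A worker only sees events that are already in the port's ring. */
static int sim_dequeue(struct sim_sched *s)
{
    int n = s->port_ring;

    s->port_ring = 0;
    return n;
}

/* One scheduler iteration: publish in-flight events, then ack the unlink. */
static void sim_sched_iteration(struct sim_sched *s)
{
    s->port_ring += s->in_flight;
    s->in_flight = 0;
    if (s->unlink_requested)
        s->unlink_acked = true;
}

/* Returns how many events arrive after dequeue() has already returned 0. */
static int late_events_after_empty_dequeue(void)
{
    struct sim_sched s = { .in_flight = 1 }; /* one event *just* scheduled */
    int first, late;

    s.unlink_requested = true;   /* unlink() returns before the ack */
    first = sim_dequeue(&s);     /* ring is empty: port looks drained */
    sim_sched_iteration(&s);     /* scheduler catches up and acks */
    late = sim_dequeue(&s);      /* the in-flight event shows up now */

    return (first == 0 && s.unlink_acked) ? late : -1;
}
```

In this model late_events_after_empty_dequeue() returns 1: the first dequeue
sees nothing, yet one event still lands on the unlinked port afterwards.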

OK. Some bits specific to event/sw. We will address it.

> 
> 
> > > > >> able to find out when an unlink() call has finished and no new events are
> > > > >> scheduled anymore to the particular event port. This is required e.g. when doing
> > > > >> clean-up after an application thread stops processing events.
> > > >
> > > > > If a thread is stopping, then it is better to call dev_stop(). At least in
> > > > > the HW implementation,
> > >
> > > > For an application doing dynamic load balancing, stopping the whole eventdev
> > > > is not an option.
> >
> > OK. Makes sense. Doing unlink() and link() in fastpath is not a
> > problem.
> 
> Correct
> 
> 
> > Changing the core assignment of an event port is a problem without stop(). I
> > guess your application, or applications in general, would be OK with that
> > constraint.
> 
> 
> I don't think that the eventdev API requires 1:1 Lcore / Port mapping, so really a
> PMD should be able to handle any thread calling any port.
> 
> The event/sw PMD allows any thread to call dequeue/enqueue any port,
> so long as it is not being accessed by another thread.

Yes. True. The eventdev API does not require a 1:1 lcore/port mapping.
Just as event/sw requires some bits to be cleared for the "Unlink Ack", in
our HW implementation we need to clear some bits when we change the lcore to
port mapping. Currently we do that in the stop() call. If there is a real, valid
use case for changing the lcore to port mapping without stop(), we would like to
propose an API to flush/clear state on an lcore/port mapping change.
It can be a NOP for event/sw.

> 
> 
> > > > > if a given event port is assigned to an lcore other than
> > > > > its previous one, then we need to do some clean-up at the port level.
> > >
> > > > In my case I'm mapping an event port per thread statically (basically
> > > > thread_id == port_id), so this shouldn't be an issue.
> 
> This is the common case - but I don't think we should demand it.
> There is a valid scale-down model which just polls *all* ports using
> a single lcore, instead of unlink() of multiple ports.
> 
> 
> For this "runtime scale down" use-case the missing information is being
> able to identify when an unlink is complete. After that (and ensuring the
> port buffer is empty) the application can be guaranteed that there are no
> more events going to be sent to that port, and the application can take
> the worker lcore out of its polling-loop and put it to sleep.
> 
> As mentioned before, I think an "unlinks_in_progress()" function is perhaps
> the easiest way to achieve this functionality, as it allows relatively simple
> tracking of unlinks() using an atomic counter in sw. (Implementation details
> become complex when we have a separate core running event/sw, separate cores
> polling, and a control-plane thread calling unlink...)
> 
> I think the end result we're hoping for is something like pseudo code below,
> (keep in mind that the event/sw has a service-core thread running it, so no
> application code there):
> 
> int worker_poll = 1;
> 
> worker() {
>   while(worker_poll) {
>      // eventdev_dequeue_burst() etc
>   }
>   go_to_sleep(1);
> }
> 
> control_plane_scale_down() {
>   unlink(evdev, worker, queue_id);
>   while(unlinks_in_progress(evdev) > 0)
>       usleep(100);
> 
>   /* here we know that the unlink is complete.
>    * so we can now stop the worker from polling */
>   worker_poll = 0;
> }
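
For reference, the pseudo-code above can be turned into a compilable sketch
along these lines. It is only a model: the scheduler thread stands in for the
event/sw service core, and the counters (unlinks_pending, port_linked,
port_ring) and helper names are hypothetical, not DPDK API. It also adds the
"port buffer is empty" check mentioned earlier before stopping the worker.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stand-ins for eventdev state; NOT DPDK API. */
static atomic_int unlinks_pending; /* unlink requests not yet acked */
static atomic_int port_linked;     /* scheduler's view of the queue->port link */
static atomic_int port_ring;       /* events waiting in the port's ring */
static atomic_int sched_run;
static atomic_int worker_poll;
static atomic_long scheduled, handled;

static void nap_us(long us)
{
    struct timespec ts = { 0, us * 1000L };

    nanosleep(&ts, NULL);
}

/* Stand-in for the service core running the software scheduler. */
static void *scheduler(void *arg)
{
    (void)arg;
    while (atomic_load(&sched_run)) {
        if (atomic_load(&unlinks_pending) > 0) {
            atomic_store(&port_linked, 0);
            atomic_fetch_sub(&unlinks_pending, 1); /* the "Unlink Ack" */
        } else if (atomic_load(&port_linked)) {
            atomic_fetch_add(&scheduled, 1);
            atomic_fetch_add(&port_ring, 1);       /* schedule one event */
        }
        nap_us(50);
    }
    return NULL;
}

/* Worker: burst-dequeue everything in the ring while told to poll. */
static void *worker(void *arg)
{
    (void)arg;
    while (atomic_load(&worker_poll))
        atomic_fetch_add(&handled, atomic_exchange(&port_ring, 0));
    return NULL;
}

static void control_plane_scale_down(void)
{
    atomic_fetch_add(&unlinks_pending, 1);     /* models unlink() */
    while (atomic_load(&unlinks_pending) > 0)  /* models unlinks_in_progress() */
        nap_us(100);
    while (atomic_load(&port_ring) > 0)        /* wait for the port to drain */
        nap_us(100);
    atomic_store(&worker_poll, 0);             /* now safe to stop polling */
}

/* Returns 1 if every scheduled event was handled and none was stranded. */
static int scale_down_demo(void)
{
    pthread_t s, w;

    atomic_store(&port_linked, 1);
    atomic_store(&sched_run, 1);
    atomic_store(&worker_poll, 1);
    if (pthread_create(&s, NULL, scheduler, NULL) != 0)
        return -1;
    if (pthread_create(&w, NULL, worker, NULL) != 0) {
        atomic_store(&sched_run, 0);
        pthread_join(s, NULL);
        return -1;
    }
    nap_us(2000);                /* let some traffic flow first */
    control_plane_scale_down();
    pthread_join(w, NULL);
    atomic_store(&sched_run, 0);
    pthread_join(s, NULL);
    return atomic_load(&port_ring) == 0 &&
           atomic_load(&handled) == atomic_load(&scheduled);
}
```

Calling scale_down_demo() from main() (built with -pthread) should return 1,
because in this model the scheduler never adds an event after it has acked the
unlink.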


Makes sense. Instead of rte_event_is_unlink_in_progress(), how about
adding a callback to rte_event_port_unlink() that is called on
unlink completion? It would remove the need for ONE more API.

Anyway, it is RC2 now, so we cannot accept a new feature. So we will have
time for a deprecation notice.
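
A rough sketch of what the callback idea above could look like, purely to
illustrate the shape of it. The callback type, sim_port_unlink_cb() and
sim_sched_iteration() are all hypothetical names for this example, not an
existing or proposed DPDK signature, and the model tracks only one pending
unlink at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion callback type; NOT DPDK API. */
typedef void (*sim_unlink_done_cb)(int port_id, int queue_id, void *arg);

struct sim_pending_unlink {
    int port_id;
    int queue_id;
    sim_unlink_done_cb cb;
    void *arg;
    int active;
};

static struct sim_pending_unlink pending;

/* Models an unlink() variant that records a completion callback. */
static int sim_port_unlink_cb(int port_id, int queue_id,
                              sim_unlink_done_cb cb, void *arg)
{
    if (pending.active)
        return -1; /* this model handles one unlink at a time */
    pending = (struct sim_pending_unlink){ port_id, queue_id, cb, arg, 1 };
    return 0;
}

/* Models the scheduler noticing the unlink and signalling completion. */
static void sim_sched_iteration(void)
{
    if (!pending.active)
        return;
    pending.active = 0;
    if (pending.cb != NULL)
        pending.cb(pending.port_id, pending.queue_id, pending.arg);
}

/* Application callback: clear the worker's poll flag. */
static void stop_worker(int port_id, int queue_id, void *arg)
{
    (void)port_id;
    (void)queue_id;
    *(int *)arg = 0;
}

/* Returns 1 if the flag is cleared only once the scheduler has acked. */
static int callback_demo(void)
{
    int worker_poll = 1;
    int before;

    if (sim_port_unlink_cb(0, 0, stop_worker, &worker_poll) != 0)
        return -1;
    before = worker_poll;    /* still 1: unlink not yet acked */
    sim_sched_iteration();   /* ack happens, callback fires */
    return before == 1 && worker_poll == 0;
}
```

One thing to keep in mind with this shape: in event/sw the callback would
presumably run in the scheduler's thread context, so it should stay short and
only touch state that is safe to update from there.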


> 
> Hope my pseudo-code makes pseudo-sense :)
> 
> -Harry

