[dpdk-dev] eventdev: method for finding out unlink status

Liang, Ma liang.j.ma at intel.com
Mon Jul 30 17:32:56 CEST 2018


On 30 Jul 16:06, Jerin Jacob wrote:
> -----Original Message-----
> > Date: Mon, 30 Jul 2018 09:38:01 +0000
> > From: "Van Haaren, Harry" <harry.van.haaren at intel.com>
> > To: Jerin Jacob <jerin.jacob at caviumnetworks.com>, "Elo, Matias (Nokia -
> >  FI/Espoo)" <matias.elo at nokia.com>
> > CC: "dev at dpdk.org" <dev at dpdk.org>
> > Subject: RE: [dpdk-dev] eventdev: method for finding out unlink status
> > 
> > 
> > > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > > Sent: Monday, July 30, 2018 10:29 AM
> > > To: Elo, Matias (Nokia - FI/Espoo) <matias.elo at nokia.com>
> > > Cc: dev at dpdk.org; Van Haaren, Harry <harry.van.haaren at intel.com>
> > > Subject: Re: [dpdk-dev] eventdev: method for finding out unlink status
> > >
> > > -----Original Message-----
> > > > Date: Mon, 30 Jul 2018 09:17:47 +0000
> > > > From: "Elo, Matias (Nokia - FI/Espoo)" <matias.elo at nokia.com>
> > > > To: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> > > > CC: "dev at dpdk.org" <dev at dpdk.org>, "Van Haaren, Harry"
> > > >  <harry.van.haaren at intel.com>
> > > > Subject: Re: [dpdk-dev] eventdev: method for finding out unlink status
> > > > x-mailer: Apple Mail (2.3445.9.1)
> > > >
> > > >
> > > > >>
> > > > >> In bug report https://bugs.dpdk.org/show_bug.cgi?id=60 we have been
> > > > >> discussing issues related to events ending up in wrong ports after
> > > > >> calling rte_event_port_unlink(). In addition to finding a few bugs, we
> > > > >> have identified a need for a new API call (or documentation extension)
> > > > >> for an application to be
> > > > >
> > > > > From a HW perspective, a documentation extension should be enough:
> > > > > adding "there may be pre-scheduled events and the application is
> > > > > responsible for processing them" to unlink(). Since dequeue() reports
> > > > > which queue an event was dequeued from, the application can always act
> > > > > based on that (i.e., was the event scheduled before or after the unlink).
> > > >
> > > > At least in the case of SW eventdev, the problem is how the application
> > > > can know that it has processed all pre-scheduled events. E.g. dequeue may
> > > > return nothing, but since the scheduler is running as a separate process,
> > > > events may still end up in the unlinked port asynchronously.
> > >
> > > Can't we call dequeue() in a loop to get all the events from the port? If
> > > dequeue returns zero events, then the port is drained. Right?
> > 
> > Nope - because the scheduler might not yet have performed and "Acked" the
> > unlink(), and internally it may have *just* scheduled an event that is not
> > available in the dequeue ring yet.
> > 
> > In other words, it's racy behavior - and we need a way to retrieve this
> > "Unlink Ack" from the scheduler (which runs in another thread in event/sw).
> 
> OK. Some bits specific to event/sw. We will address it.
BTW: OPDL does not support unlink at runtime, so if needed we should suggest that the user query the capability (CAP) bits first.
> 
> > 
> > 
> > > > >> able to find out when an unlink() call has finished and no new events
> > > > >> are scheduled anymore to the particular event port. This is required
> > > > >> e.g. when doing clean-up after an application thread stops processing
> > > > >> events.
> > > > >
> > > > > If a thread is stopping, then it is better to call dev_stop(). At least
> > > > > in the HW implementation,
> > > >
> > > > For an application doing dynamic load balancing, stopping the whole
> > > > eventdev is not an option.
> > >
> > > OK. Makes sense. Doing unlink() and link() in fastpath is not a
> > > problem.
> > 
> > Correct
> > 
> > 
> > > > Changing the core assignment of an event port is a problem without
> > > > stop(). I guess your application, or applications in general, would be
> > > > OK with that constraint.
> > 
> > 
> > I don't think that the eventdev API requires 1:1 Lcore / Port mapping, so really a
> > PMD should be able to handle any thread calling any port.
> > 
> > The event/sw PMD allows any thread to call dequeue/enqueue any port,
> > so long as it is not being accessed by another thread.
> 
> Yes. True. The eventdev API does not require a 1:1 Lcore/Port mapping.
> Just as event/sw requires some bits for the "Unlink Ack", at least in
> our HW implementation we need to clear some bits when we change the lcore to
> port mapping. Currently we are doing it in the stop() call. If there is a real,
> valid use case for changing the lcore to port mapping without stop, we would
> like to propose an API to flush/clear state on Lcore/port mapping change.
> It can be a NOP for event/sw.
> 
> > 
> > 
> > > > > if a given event port is assigned to a new lcore other than
> > > > > its previous one, then we need to do some clean-up at the port level.
> > > >
> > > > In my case I'm mapping an event port per thread statically (basically
> > > > thread_id == port_id), so this shouldn't be an issue.
> > 
> > This is the common case - but I don't think we should demand it.
> > There is a valid scale-down model which just polls *all* ports using
> > a single lcore, instead of unlink() of multiple ports.
> > 
> > 
> > For this "runtime scale down" use-case the missing information is being
> > able to identify when an unlink is complete. After that (and ensuring the
> > port buffer is empty) the application can be guaranteed that there are no
> > more events going to be sent to that port, and the application can take
> > the worker lcore out of its polling-loop and put it to sleep.
> > 
> > As mentioned before, I think an "unlinks_in_progress()" function is perhaps
> > the easiest way to achieve this functionality, as it allows relatively simple
> > tracking of unlinks() using an atomic counter in sw. (Implementation details
> > become complex when we have a separate core running event/sw, separate cores
> > polling, and a control-plane thread calling unlink...)
> > 
> > I think the end result we're hoping for is something like pseudo code below,
> > (keep in mind that the event/sw has a service-core thread running it, so no
> > application code there):
> > 
> > int worker_poll = 1;
> > 
> > worker() {
> >   while(worker_poll) {
> >      // eventdev_dequeue_burst() etc
> >   }
> >   go_to_sleep(1);
> > }
> > 
> > control_plane_scale_down() {
> >   unlink(evdev, worker, queue_id);
> >   while(unlinks_in_progress(evdev) > 0)
> >       usleep(100);
> > 
> >   /* here we know that the unlink is complete.
> >    * so we can now stop the worker from polling */
> >   worker_poll = 0;
> > }
> 
> 
> Makes sense. Instead of rte_event_is_unlink_in_progress(), how about
> adding a callback in rte_event_port_unlink() which will be called on
> unlink completion? It will remove the need for ONE more API.
> 
> Anyway, it is RC2 now, so we cannot accept a new feature. So we will have
> time for a deprecation notice.
> 
> 
> > 
> > Hope my pseudo-code makes pseudo-sense :)
> > 
> > -Harry
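
To make the race above concrete, here is a minimal single-threaded sketch in C of why an empty dequeue() does not prove a port is drained, and how an unlinks-in-progress counter closes the gap. It is only a toy model, not the event/sw implementation: every toy_* name, the struct layout, and the stepping of the scheduler from the control-plane loop are assumptions made for illustration. In event/sw the scheduler would run on a service core and the control plane would sleep between polls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_RING_SIZE 8

/* Hypothetical stand-in for one event port as the scheduler sees it. */
struct toy_port {
    int ring[TOY_RING_SIZE]; /* events already visible to dequeue() */
    size_t ring_count;
    int staged;              /* event picked by the scheduler, not yet published */
    bool has_staged;
    bool linked;             /* scheduler's own view of the link */
    bool unlink_requested;   /* set by the control plane, cleared on ack */
};

/* Number of unlink requests the scheduler has not acknowledged yet. */
static atomic_int toy_unlink_count;

/* Control plane: request the unlink; it only completes once the scheduler acks it. */
static void toy_unlink(struct toy_port *p)
{
    p->unlink_requested = true;
    atomic_fetch_add(&toy_unlink_count, 1);
}

static int toy_unlinks_in_progress(void)
{
    return atomic_load(&toy_unlink_count);
}

/* One scheduler iteration: publish the staged event, then ack any pending unlink. */
static void toy_sched_iteration(struct toy_port *p)
{
    if (p->has_staged) {
        assert(p->ring_count < TOY_RING_SIZE);
        p->ring[p->ring_count++] = p->staged;
        p->has_staged = false;
    }
    if (p->unlink_requested) {
        p->linked = false;
        p->unlink_requested = false;
        atomic_fetch_sub(&toy_unlink_count, 1);
    }
}

/* Worker: drain everything currently visible on the port; returns the count. */
static size_t toy_dequeue_all(struct toy_port *p)
{
    size_t n = p->ring_count;
    p->ring_count = 0;
    return n;
}

/* Returns 0 if the scale-down sequence behaves as described in the thread. */
int run_scale_down_demo(void)
{
    /* One event has been scheduled internally but is not in the ring yet. */
    struct toy_port port = { .staged = 42, .has_staged = true, .linked = true };
    size_t drained_early, drained_late;

    toy_unlink(&port);

    /* The ring is empty here, yet an event is still on its way to the port. */
    drained_early = toy_dequeue_all(&port);

    /* Wait for the ack; the scheduler is stepped inline only because this is a toy. */
    while (toy_unlinks_in_progress() > 0)
        toy_sched_iteration(&port);

    /* Only now does a final drain pick up the late event. */
    drained_late = toy_dequeue_all(&port);

    return (drained_early == 0 && drained_late == 1 && !port.linked) ? 0 : -1;
}
```

Calling run_scale_down_demo() from a small main() returns 0 in this model: the first drain sees nothing, the event shows up only after the scheduler has acknowledged the unlink, and the worker can be stopped after the second drain.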

