[dpdk-stable] [PATCH v2] doc/compress: clarify error handling on data-plane

Trahe, Fiona fiona.trahe at intel.com
Wed May 8 16:00:56 CEST 2019


Hi Shally,

> -----Original Message-----
> From: Shally Verma [mailto:shallyv at marvell.com]
> Sent: Wednesday, May 8, 2019 1:41 PM
> To: Trahe, Fiona <fiona.trahe at intel.com>; dev at dpdk.org
> Cc: akhil.goyal at nxp.com; Ashish Gupta <ashishg at marvell.com>; Daly, Lee <lee.daly at intel.com>; Sunila
> Sahu <ssahu at marvell.com>; stable at dpdk.org
> Subject: RE: [PATCH v2] doc/compress: clarify error handling on data-plane
> 
> Hi Fiona
> 
> 
> > -----Original Message-----
> > From: Trahe, Fiona <fiona.trahe at intel.com>
> > Sent: Tuesday, May 7, 2019 11:54 PM
> > To: Shally Verma <shallyv at marvell.com>; dev at dpdk.org
> > Cc: akhil.goyal at nxp.com; Ashish Gupta <ashishg at marvell.com>; Daly, Lee
> > <lee.daly at intel.com>; Sunila Sahu <ssahu at marvell.com>; stable at dpdk.org;
> > Trahe, Fiona <fiona.trahe at intel.com>
> > Subject: RE: [PATCH v2] doc/compress: clarify error handling on data-plane
> >
> > Hi Shally
> >
> > > > > > +
> > > > > > +There are some exceptions whereby errors can occur on the
> > > > ``enqueue``.
> > > > > > +For any error which can occur in a production environment and
> > > > > > +can be successful after a retry with the same op the PMD may
> > > > > > +return the error on the enqueue.
> > > > > This statement looks a bit confusing.
> > > > > It seems we are trying to add a description about checking op status
> > > > > even after the enqueue call, unlike the current scenario, where the
> > > > > app only checks it after dequeue?
> > > > [Fiona] The line following this explains that there is no need to
> > > > check op.status in this case.
> > > > Maybe it's not obvious that the application SHOULD check that all
> > > > ops are enqueued?
> > > > I can reword as:
> > > > The application should always check the value returned by the enqueue.
> > > > If less than the full burst is enqueued there's no need for the
> > > > application to check op.status of any or every op - it can simply
> > > > retry from the return
> > > > value+1 in a later enqueue and expect success.
> > > >
> > > I agree with the purpose of the patch, but I am confused by the
> > > description above:
> > >
> > > My understanding is that if the op status is INVALID_ARGS or any ERROR
> > > that is permanent in nature, then the nb_enqd returned will be less
> > > than the number actually passed.
> > [Fiona] True.
> >
> > > Whatever the reason, any time the app gets nb_enqd < the number
> > > actually passed, it should check the status of the (nb_enqd + 1)th op
> > [Fiona] No, that's exactly what I was proposing to avoid.
> >
> > > to find the exact cause of the failure and then either attempt a
> > > re-enqueue, correct the op preparation, or take any other appropriate action.
> > [Fiona] I was proposing to constrain PMDs to only return a subset of errors
> > on the enqueue, so apps could be optimised.
> > But if you think it's not possible for PMDs to comply with it, then yes, apps
> > would always have to check status of nb_enqd + 1th op, and fork depending
> > on the status.
> > Is this the case?
> > If so, much of this patch is unnecessary and I'll send a simplified v3 as almost
> > any status can be returned anywhere.
> >
> [Shally] Okay. I seem to understand it now.
> The purpose seems reasonable; a simpler rephrasing would help.
> It will be easier for me to give further feedback on the first v2 patch sent, so I will send that in another email.
[Fiona] ok, will look for that.
 

> > > Also, STATUS_ERROR is very generic. It can occur when the queue is full,
> > > in which case the app can re-attempt an enqueue of the same op,
> > > OR
> > > it can indicate an irrecoverable error on enqueue, in which case the app
> > > probably just has to reset everything. In such a case, it might
> > > not be possible for the PMD design even to push the op into the completion
> > > queue for the app to dequeue. I would suggest adding another status code
> > > that reflects a permanent error condition, i.e. an irrecoverable error code
> > > that tells the app to perform a PMD qp reset/re-init to recover, and
> > > simplifying the description to just state the expected app behavior, to
> > > avoid an infinite loop condition.
> > > It is then the app's choice whether or not to check the op status for
> > > errors after enqueue, depending on whether it is running in a production
> > > environment or a dev environment.
> > [Fiona] I wouldn't expect ERROR in a queue full case. I'd see ERROR as the
> > catch-all when some other specific status isn't appropriate. If you think
> > there's a need for another specific status then best send an API patch
> > proposing it. This patch is only documenting the existing set.
> 
> [Shally] Sorry, I missed that. STATUS_NOT_PROCESSED can be an indication of queue_full.
> STATUS_ERROR on dequeue = any catch-all error case
> STATUS_ERROR on enqueue = any irrecoverable error on the op. The app should not re-attempt the same op,
> and may need to reset the queue pair or PMD.
> 
> Is this interpretation correct?
[Fiona] Yes
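
For illustration only, the interpretation agreed above could be sketched on
the application side roughly as below. This is a minimal sketch, not the
compressdev API: struct comp_op, the ST_* values, struct mock_qp and
mock_enqueue() are hypothetical stand-ins (for struct rte_comp_op, the
RTE_COMP_OP_STATUS_* values and rte_compressdev_enqueue_burst()) so that the
example is self-contained. It assumes a PMD that leaves a rejected op as
NOT_PROCESSED when the queue is full and marks it ERROR when the op can never
succeed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the op status values and op structure. */
enum op_status { ST_SUCCESS, ST_NOT_PROCESSED, ST_ERROR };
struct comp_op { enum op_status status; };

/* Hypothetical queue pair: free_slots models queue depth, bad_index
 * marks one op position that the mock rejects permanently (-1 = none). */
struct mock_qp { uint16_t free_slots; int bad_index; };

/* Stand-in for an enqueue burst: returns the number of ops accepted. */
static uint16_t
mock_enqueue(struct mock_qp *qp, struct comp_op **ops, uint16_t nb_ops)
{
	uint16_t i;

	for (i = 0; i < nb_ops; i++) {
		if ((int)i == qp->bad_index) {
			ops[i]->status = ST_ERROR; /* irrecoverable */
			break;
		}
		if (qp->free_slots == 0)
			break; /* queue full: status stays NOT_PROCESSED */
		qp->free_slots--;
	}
	return i;
}

enum enq_result { ENQ_DONE, ENQ_RETRY_LATER, ENQ_FATAL };

/* Enqueue once and classify a short return by looking only at the
 * first op that was not accepted, i.e. ops[*nb_enqd] (0-based). */
static enum enq_result
enqueue_and_classify(struct mock_qp *qp, struct comp_op **ops,
		     uint16_t nb_ops, uint16_t *nb_enqd)
{
	assert(nb_ops > 0);

	*nb_enqd = mock_enqueue(qp, ops, nb_ops);
	if (*nb_enqd == nb_ops)
		return ENQ_DONE;
	if (ops[*nb_enqd]->status == ST_ERROR)
		return ENQ_FATAL; /* do not retry the same op */
	return ENQ_RETRY_LATER;   /* retry from ops[*nb_enqd] later */
}
```

On ENQ_RETRY_LATER the application would call again with ops + nb_enqd and
nb_ops - nb_enqd; on ENQ_FATAL it would drop or fix the op, or reset the
queue pair or PMD, rather than loop on the same op.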

