[dpdk-dev] [PATCH] net/failsafe: stat support enhancement
    Matan Azrad 
    matan at mellanox.com
       
    Wed Sep  6 08:54:35 CEST 2017
    
    
  
Hi Gaetan,
> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet at 6wind.com]
> Sent: Tuesday, September 5, 2017 7:34 PM
> To: Matan Azrad <matan at mellanox.com>
> Cc: dev at dpdk.org
> Subject: Re: [PATCH] net/failsafe: stat support enhancement
> 
> On Tue, Sep 05, 2017 at 03:12:55PM +0000, Matan Azrad wrote:
> > Hi Gaetan,
> >
> > > -----Original Message-----
> > > From: Gaëtan Rivet [mailto:gaetan.rivet at 6wind.com]
> > > Sent: Tuesday, September 5, 2017 4:23 PM
> > > To: Matan Azrad <matan at mellanox.com>
> > > Cc: dev at dpdk.org
> > > Subject: Re: [PATCH] net/failsafe: stat support enhancement
> > >
> > > Hi Matan,
> > >
> > > On Tue, Sep 05, 2017 at 12:56:34PM +0300, Matan Azrad wrote:
> > > > The previous stats code returned only the current TX sub device stats.
> > > >
> > > > This enhancement extends it to return the sum of all sub devices
> > > > stats with history of removed sub-devices.
> > > >
> > > > Dedicated stats accumulator saves the stat history of all sub
> > > > device remove events.
> > > >
> > > > Each failsafe sub device contains the last stats asked by the user
> > > > and updates the accumulator in removal time.
> > > >
> > > > I would like to implement ultimate snapshot on removal time.
> > > > The stats_get API needs to be changed to return error in the case
> > > > it is too late to retrieve statistics.
> > >
> > > You need to have this API evolution first. It is not assured to be
> > > accepted by the community.
> > >
> >
> > I already sent suggestion to mailing list, find "stats_get API: return value
> suggestion":
> > No objections for now.
> >
> >
> > > As it is, this version is incorrect, because the only available
> > > stats for a removed slave is the last snapshot.
> > >
> >
> > It is for sure better from the previous version, the accuracy can be
> > better if my suggestion will be accepted.
> > If it is, I will improve it.
> >
> > > Thus it complicates the rules while still being incorrect. Even if
> > > you were able to push for this API evolution, PMDs with hard
> > > counters would be the ones to return an error code on stat read while
> removed.
> > > You may be lucky at the moment because MLX drivers do not support
> > > hard counters, but this won't always be true (and it will be false soon
> enough).
> > > You will thus be back at square one, with a new useless API and
> > > still incorrect stats in the fail-safe. On the other hand, the
> > > fail-safe should strive to be as easy to use as possible with most
> > > drivers, and not cater specifically to soft- counters ones.
> > >
> > > So, my take on this: I understand that aggregated stats would be
> interesting.
> > > Keep it simple & stupid: simple aggregation on stats_get.
> > > You will have a rollback at some point on device removal, but this
> > > is still easier to detect / deal with than partially / sometimes incorrect
> stats.
> > >
> >
> > 100% accuracy is impossible when removal event is in the game, but we
> > can do it better as we can.
> >
> > As I wrote in commit log(if stat gets will return value):
> > When some drivers can give the stats after RMV interrupt and before
> > dev_close, We just can to improve accuracy and be closer to 100%.
> > The stats of the sub devices which return error will be taken from the last
> snapshot.
> >
> > It is just enhancement which is important for customers.
> >
> 
> The trade-off is between accuracy and complexity, both at the fail-safe level
> and at the application level.
> 
>                   Your solution       |          KISS solution
>              -------------------------+--------------------------------
> Accuracy     Closer to reality.       | Farther from reality.
>              Upon device removal,     | Upon device removal,
>              only packets received    | all previous accounted received
>              since the last stats_get | packets are missing, inducing
>              are missing.             | a stats rollback from the app
>                                       | PoV.
>              -------------------------+--------------------------------
> Complexity   Medium implementation    | Simple implementation.
> (fast-path)  complexity.              | Two hotplug sequences will be
>              The accumulator induces  | identical (from a stats PoV).
>              a trailing effect        |
>              between device remove /  |
>              plugin, thus incurring   |
>              edge-effects and         |
>              reducing idempotency of  |
>              hotplug sequences.       |
>              -------------------------+--------------------------------
> Complexity   Statistical errors       | Glaring statistical errors
> (application) induced by device       | induced by device removal.
>              removal is hard.         | Detecting those (stats rollback)
>              Device removal is        | is trivial and can thus be
>              pretty much invisible.   | mitigated.
>                                       |
>              Plays nice with the      | While easier to deal with,
>              "seamless" feature of    | it actually pushes application
>              fail-safe, but might be  | to have custom code-path for
>              a curse in disguise.     | the fail-safe PMD, which goes
>                                       | against its design principles.
> 
> 
Thanks a lot for the analysis.
I agree very much with your complexity analysis, I have version to deliver the accumulator update 
function from the remove interrupt to the fs_stats_get function with more complexity in fs_stats_get.
Is it interesting you?
> I consider that your proposed API will only somewhat fix errors with PMDs
> having software counters. Thus it being accepted or rejected is irrelevant for
> this assessment.
And improve the applications (and failsafe) report about optional errors in the current stats_get.
> 
> While I strongly support the simplest solution for any implementation, I
> dislike going against the fail-safe core principles.
> 
> This is a very flimsy sheet of ice we are walking upon. Keep in mind that at
> the first complication due to your proposal, it will be reverted and the KISS
> solution used instead.
> 
But sometimes if the accuracy is just beside us and we can take it easily, 
why not?  
> But as it is, I thus accept it. I will do a proper review (there are a few things to
> fix still, now that the `why` is dealt with) soon.
> 
> > > > By this way, failsafe can get stats snapshot in removal interrupt
> > > > callback for each PMD which can give stats after removal event.
> > > >
> > > > Signed-off-by: Matan Azrad <matan at mellanox.com>
> > > > ---
> > > >  drivers/net/failsafe/failsafe_ether.c   | 33
> > > +++++++++++++++++++++++++++++++++
> > > >  drivers/net/failsafe/failsafe_ops.c     | 16 ++++++++++++----
> > > >  drivers/net/failsafe/failsafe_private.h |  5 +++++
> > > >  3 files changed, 50 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/net/failsafe/failsafe_ether.c
> > > > b/drivers/net/failsafe/failsafe_ether.c
> > > > index a3a8cce..133080d 100644
> > > > --- a/drivers/net/failsafe/failsafe_ether.c
> > > > +++ b/drivers/net/failsafe/failsafe_ether.c
> > > > @@ -399,6 +399,37 @@ failsafe_eth_dev_state_sync(struct
> > > > rte_eth_dev
> > > *dev)
> > > >  	return ret;
> > > >  }
> > > >
> > > > +void
> > > > +fs_increment_stats(struct rte_eth_stats *from, struct
> > > > +rte_eth_stats
> > > > +*to) {
> > > > +	uint8_t i;
> > > > +
> > > > +	RTE_ASSERT(from != NULL && to != NULL);
> > > > +	to->ipackets += from->ipackets;
> > > > +	to->opackets += from->opackets;
> > > > +	to->ibytes += from->ibytes;
> > > > +	to->obytes += from->obytes;
> > > > +	to->imissed += from->imissed;
> > > > +	to->ierrors += from->ierrors;
> > > > +	to->oerrors += from->oerrors;
> > > > +	to->rx_nombuf += from->rx_nombuf;
> > > > +	for (i = 0; i < RTE_ETHDEV_QUEUE_STAT_CNTRS; i++) {
> > > > +		to->q_ipackets[i] += from->q_ipackets[i];
> > > > +		to->q_opackets[i] += from->q_opackets[i];
> > > > +		to->q_ibytes[i] += from->q_ibytes[i];
> > > > +		to->q_obytes[i] += from->q_obytes[i];
> > > > +		to->q_errors[i] += from->q_errors[i];
> > > > +	}
> > > > +}
> > > > +
> > > > +void
> > > > +fs_increment_stats_accumulator(struct sub_device *sdev) {
> > > > +	fs_increment_stats(&sdev->stats_snapshot,
> > > > +			&PRIV(sdev->fs_dev)->stats_accumulator);
> > > > +	memset(&sdev->stats_snapshot, 0, sizeof(struct rte_eth_stats));
> > > > +}
> > > > +
> > > >  int
> > > >  failsafe_eth_rmv_event_callback(uint8_t port_id __rte_unused,
> > > >  				enum rte_eth_event_type event
> > > __rte_unused, @@ -410,6 +441,8 @@
> > > > failsafe_eth_rmv_event_callback(uint8_t port_id __rte_unused,
> > > >  	fs_switch_dev(sdev->fs_dev, sdev);
> > > >  	/* Use safe bursts in any case. */
> > > >  	set_burst_fn(sdev->fs_dev, 1);
> > > > +	/* Increment the stats accumulator by the last stats snapshot. */
> > > > +	fs_increment_stats_accumulator(sdev);
> > > >  	/*
> > > >  	 * Async removal, the sub-PMD will try to unregister
> > > >  	 * the callback at the source of the current thread context.
> > > > diff --git a/drivers/net/failsafe/failsafe_ops.c
> > > > b/drivers/net/failsafe/failsafe_ops.c
> > > > index ff9ad15..e47cc85 100644
> > > > --- a/drivers/net/failsafe/failsafe_ops.c
> > > > +++ b/drivers/net/failsafe/failsafe_ops.c
> > > > @@ -586,9 +586,14 @@ static void
> > > >  fs_stats_get(struct rte_eth_dev *dev,
> > > >  	     struct rte_eth_stats *stats)  {
> > > > -	if (TX_SUBDEV(dev) == NULL)
> > > > -		return;
> > > > -	rte_eth_stats_get(PORT_ID(TX_SUBDEV(dev)), stats);
> > > > +	struct sub_device *sdev;
> > > > +	uint8_t i;
> > > > +
> > > > +	memcpy(stats, &PRIV(dev)->stats_accumulator, sizeof(*stats));
> > > > +	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
> > > > +		rte_eth_stats_get(PORT_ID(sdev), &sdev->stats_snapshot);
> > > > +		fs_increment_stats(&sdev->stats_snapshot, stats);
> > > > +	}
> > > >  }
> > > >
> > > >  static void
> > > > @@ -597,8 +602,11 @@ fs_stats_reset(struct rte_eth_dev *dev)
> > > >  	struct sub_device *sdev;
> > > >  	uint8_t i;
> > > >
> > > > -	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE)
> > > > +	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
> > > >  		rte_eth_stats_reset(PORT_ID(sdev));
> > > > +		memset(&sdev->stats_snapshot, 0, sizeof(struct
> > > rte_eth_stats));
> > > > +	}
> > > > +	memset(&PRIV(dev)->stats_accumulator, 0, sizeof(struct
> > > > +rte_eth_stats));
> > > >  }
> > > >
> > > >  /**
> > > > diff --git a/drivers/net/failsafe/failsafe_private.h
> > > > b/drivers/net/failsafe/failsafe_private.h
> > > > index 0361cf4..267c749 100644
> > > > --- a/drivers/net/failsafe/failsafe_private.h
> > > > +++ b/drivers/net/failsafe/failsafe_private.h
> > > > @@ -102,6 +102,8 @@ struct sub_device {
> > > >  	uint8_t sid;
> > > >  	/* Device state machine */
> > > >  	enum dev_state state;
> > > > +	/* Last stats snapshot passed to user */
> > > > +	struct rte_eth_stats stats_snapshot;
> > > >  	/* Some device are defined as a command line */
> > > >  	char *cmdline;
> > > >  	/* fail-safe device backreference */ @@ -140,6 +142,7 @@ struct
> > > > fs_priv {
> > > >  	 * synchronized state.
> > > >  	 */
> > > >  	enum dev_state state;
> > > > +	struct rte_eth_stats stats_accumulator;
> > > >  	unsigned int pending_alarm:1; /* An alarm is pending */
> > > >  	/* flow isolation state */
> > > >  	int flow_isolated:1;
> > > > @@ -180,6 +183,8 @@ int failsafe_eal_uninit(struct rte_eth_dev
> > > > *dev);
> > > >
> > > >  int failsafe_eth_dev_state_sync(struct rte_eth_dev *dev);  void
> > > > failsafe_dev_remove(struct rte_eth_dev *dev);
> > > > +void fs_increment_stats(struct rte_eth_stats *from, struct
> > > > +rte_eth_stats *to); void fs_increment_stats_accumulator(struct
> > > > +sub_device *sdev);
> > > >  int failsafe_eth_rmv_event_callback(uint8_t port_id,
> > > >  				    enum rte_eth_event_type type,
> > > >  				    void *arg, void *out);
> > > > --
> > > > 2.7.4
> > > >
> > >
> > > --
> > > Gaëtan Rivet
> > > 6WIND
> 
> --
> Gaëtan Rivet
> 6WIND
Thanks,
Matan azrad
    
    
More information about the dev
mailing list