[dpdk-dev] [PATCH v4 10/11] eal: replace rte_panic instances in init sequence

Arnon Warshavsky arnon at qwilt.com
Fri Apr 20 15:55:49 CEST 2018


 Thanks Aaron

Previously, it wasn't possible for mem_cfg_fd to be reused after a

> failure.  Now it is - please reset it to -1. in these close conditions.
>
> Will do.

>
>
> Again, previously this would have aborted on a failure.  So it needs to
> be reset to a value that allows retry.
>

Same here

>
> >       switch (rte_config.process_type){
> >       case RTE_PROC_PRIMARY:
> > -             rte_eal_config_create();
> > +             if (rte_eal_config_create())
> > +                     return -1;
> >               break;
> >       case RTE_PROC_SECONDARY:
> > -             rte_eal_config_attach();
> > +             if (rte_eal_config_attach())
> > +                     return -1;
> >               rte_eal_mcfg_wait_complete(rte_config.mem_config);
> >               break;
> >       case RTE_PROC_AUTO:
> >       case RTE_PROC_INVALID:
>
> Not for this patch, but I just noticed that this should probably use a
> 'default' case.
>
> Will add this while Im here


>
> Use rte_eal_init_alert to indicate why you are failing the init.
>
Will do

>
> >       if (rte_mp_channel_init() < 0) {
> >               rte_eal_init_alert("failed to init mp channel\n");
> > @@ -652,7 +676,8 @@ static void rte_eal_init_alert(const char *msg)
> >
> >       eal_check_mem_on_local_socket();
> >
> > -     eal_thread_init_master(rte_config.master_lcore);
> > +     if (eal_thread_init_master(rte_config.master_lcore) != 0)
> > +             return -1;
>
> Is it ever possible to recover from this?


Definitely not recoverable, but not different than the other cases where
panic propagate all the way up rather than aborting

> Still needs
> rte_eal_init_alert() call.
>
Will do

>
>
> How are you cleaning up the threads that were spawned?  Lets say this
> loop will execute 5 times, and on the 3rd entry, these errors happen.
> You now leave DPDK 'half-initialized' - you've spun up threads and
> allocated memory.
>
...

>
> I don't see how any of this is better for the user.  In fact, I think
> this is worse because it will make portions of the application stop
> working without any way to move forward.  rte_panic() will at least give
> the process a chance to recover from a potentially ephemeral condition.
>
> As I wrote in a different reply on this patch
I was probably too eager to get rid of this panic taking some wrong
assumptions on the way the library will be called.
Removing the panic from the thread is obviously more complex and also ABI
breaking.
>From my own bw, I will not make it with a proper change to this version, so
I will revert back to panicing on this patchset
and aim for the thread in the next build.


>
> This seems to only exist as a way of triggering the run_once check in
> the eal_init.  It doesn't add anything except one more state variable to
> check against.  What is the purpose?
>

Actually this is not a run-once in purpose, rather an attempt to define a
state for the device
and on the way work around breaking abi on the the void function called
before that.

> +     if (rte_get_panic_state())
> +             return -1;
> +

Please just use run_once.  That's a better way of preventing this.
>


> As stated above - no a run-once
>
> All of the comments from the bsd side apply here.
>


More information about the dev mailing list