[dpdk-dev] [PATCH 2/2] examples/kni: stop lcores while doing kni ops

Tomasz Duszynski tdu at semihalf.com
Wed Oct 25 11:28:46 CEST 2017


Hi Ferruh,

On Fri, Oct 20, 2017 at 05:42:13PM -0700, Ferruh Yigit wrote:
> On 10/16/2017 4:45 AM, Tomasz Duszynski wrote:
> > Since the transmit and receive functions should not be invoked when
> > the device is stopped, stop lcores during kni ops and restart them
> > after device is started once again.
>
> Hi Tomasz,
>
> Are you observing any error or unexpected behavior because of rx/tx functions? I
> am not sure about the patch, please check below logs, and trying to understand
> scope of the patch, can you please give more details what happens if this patch
> is missing?

Right, calling rx/tx functions after device was stopped will break
things.

For instace, putting interface up will call rte_eth_dev_stop() which
frees port resources. If that happens (and usually happens) during rx/tx
driver would break, as resources used in library functions calls
are bogus at that point.

So we end up with SIGSEGV.

According to dpdk documentation rx/tx functions cannot be called before
dev_start(). Thus I think cores that do rx/tx should be stopped just
before rte_eth_dev_stop() is called.

That could be fixed inside pmd but would require locks in rx/tx path which
is rather a no-go.

>
> >
> > Signed-off-by: Tomasz Duszynski <tdu at semihalf.com>
> > ---
> >  examples/kni/main.c | 28 ++++++++++++++++++++++++++++
> >  1 file changed, 28 insertions(+)
> >
> > diff --git a/examples/kni/main.c b/examples/kni/main.c
> > index cb48fb5..5c50448 100644
> > --- a/examples/kni/main.c
> > +++ b/examples/kni/main.c
> > @@ -166,6 +166,23 @@ static int kni_change_mtu(uint16_t port_id, unsigned int new_mtu);
> >  static int kni_config_network_interface(uint16_t port_id, uint8_t if_up);
> >
> >  static rte_atomic32_t kni_stop = RTE_ATOMIC32_INIT(0);
> > +static rte_atomic32_t kni_restart = RTE_ATOMIC32_INIT(0);
> > +
> > +static void
> > +kni_stop_lcores(void)
> > +{
> > +	unsigned int i;
> > +
> > +	rte_atomic32_inc(&kni_restart);
> > +	rte_atomic32_inc(&kni_stop);
> > +
> > +	RTE_LCORE_FOREACH(i) {
> > +		if (i == rte_lcore_id())
> > +			continue;
>
> This function called by port Rx core [1], and since the thread can't wait itself
> to finish, specially if nb_kni > 1, the Rx core still can do some work even
> after exit from this function.

Ack.

>
> > +
> > +		rte_eal_wait_lcore(i);
>
> The API documentation says: "To be executed on the MASTER lcore only."
> Not sure what happens when called from slave core, as we did here.

OK.

>
> > +	}
> > +}
> >
> >  /* Print out statistics on packets handled */
> >  static void
> > @@ -712,6 +729,7 @@ kni_change_mtu(uint16_t port_id, unsigned int new_mtu)
> >
> >  	RTE_LOG(INFO, APP, "Change MTU of port %d to %u\n", port_id, new_mtu);
> >
> > +	kni_stop_lcores();
> >  	/* Stop specific port */
> >  	rte_eth_dev_stop(port_id);
> >
> > @@ -755,6 +773,8 @@ kni_config_network_interface(uint16_t port_id, uint8_t if_up)
> >  	RTE_LOG(INFO, APP, "Configure network interface of %d %s\n",
> >  					port_id, if_up ? "up" : "down");
> >
> > +	kni_stop_lcores();
> > +
> >  	if (if_up != 0) { /* Configure network interface up */
> >  		rte_eth_dev_stop(port_id);
> >  		ret = rte_eth_dev_start(port_id);
> > @@ -911,6 +931,7 @@ main(int argc, char** argv)
> >  	}
> >  	check_all_ports_link_status(nb_sys_ports, ports_mask);
> >
> > +restart:
> >  	/* Launch per-lcore function on every lcore */
> >  	rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
> >  	RTE_LCORE_FOREACH_SLAVE(i) {
> > @@ -918,6 +939,13 @@ main(int argc, char** argv)
> >  			return -1;
> >  	}
> >
> > +	if (rte_atomic32_read(&kni_restart)) {
> > +		rte_atomic32_dec(&kni_stop);
> > +		rte_atomic32_dec(&kni_restart);
>
> kni_stop_lcores() called per port, so it is possible that kni_stop and
> kni_restart increased parallel, many times. But this decrement is per
> application, so they will be decremented sequentially, casing app stop - start
> unnecessarily.

Right.

>
> > +
> > +		goto restart;
>
> This will cause assigning tasks to cores again, and will produce all related
> logs again, if you enable debug logs you will see it [2]. I believe confusing to
> have those logs every time mtu updated etc...

Right.

>
> > +	}
> > +
> >  	/* Release resources */
> >  	for (port = 0; port < nb_sys_ports; port++) {
> >  		if (!(ports_mask & (1 << port)))
> > --
> > 2.7.4
> >
>
> [1]
> main_loop
>   kni_ingress
>     rte_kni_handle_request
>       kni_change_mtu
>       ||
>       kni_config_network_interface
>         kni_stop_lcores
>
>
> [2]
> APP: Change MTU of port 0 to 1402
> PMD: ixgbe_set_rx_function(): Vector rx enabled, please make sure RX burst size
> no less than 4 (port=0).
> PMD: ixgbe_dev_link_status_print():  Port 0: Link Down
> PMD: ixgbe_dev_link_status_print(): PCI Address: 0000:08:00.1
> APP: Lcore 1 is reading from port 0
> APP: Lcore 2 is writing to port 0
> APP: Lcore 3 is reading from port 1
> APP: Lcore 4 is writing to port 1
> APP: Lcore 5 has nothing to do
> APP: Lcore 6 has nothing to do
> APP: Lcore 7 has nothing to do
> APP: Lcore 8 has nothing to do
> APP: Lcore 9 has nothing to do
> APP: Lcore 10 has nothing to do
> APP: Lcore 11 has nothing to do
> APP: Lcore 12 has nothing to do
> APP: Lcore 13 has nothing to do
> APP: Lcore 14 has nothing to do
> APP: Lcore 15 has nothing to do
> APP: Lcore 16 has nothing to do
> APP: Lcore 17 has nothing to do
> APP: Lcore 18 has nothing to do
> APP: Lcore 19 has nothing to do
> APP: Lcore 20 has nothing to do
> APP: Lcore 21 has nothing to do
> APP: Lcore 22 has nothing to do
> APP: Lcore 23 has nothing to do
> APP: Lcore 24 has nothing to do
> APP: Lcore 25 has nothing to do
> APP: Lcore 26 has nothing to do
> APP: Lcore 27 has nothing to do
> APP: Lcore 28 has nothing to do
> APP: Lcore 29 has nothing to do
> APP: Lcore 30 has nothing to do
> APP: Lcore 31 has nothing to do
> APP: Lcore 0 has nothing to do

--
- Tomasz Duszyński


More information about the dev mailing list