[dpdk-dev] rte_eal_hotplug_remove() generates error message

Hideyuki Yamashita yamashita.hideyuki at po.ntt-tx.co.jp
Mon Dec 17 12:03:11 CET 2018


> On 17-Dec-18 10:23 AM, Burakov, Anatoly wrote:
> > On 17-Dec-18 10:02 AM, Hideyuki Yamashita wrote:
> >> Dear Thomas and all,
> >>
> >> I took a look on dpdk code.
> >> I firstly write qustions and my analisys
> >> on the current dpdk code follows after that.
> >>
> >> [1.Questions]
> >> I have several questions to ask again.
> >> Is my understanding correct about followings
> >>
> >> Q1: "EAL:ERROR, Invalid memory" is ignorable
> >>
> >> Q2: there is no big difference between calling
> >> rte_eal_hotplug_remove(bus->name, dev->name)
> >> and
> >> rte_dev_remove(dev) because anyway it
> >> reaches to rte_pmd_vhost_remove and encounter
> >> the same error.
> >>
> >> [2.snip from my code]
> >> .....
> >> ???????? rte_eth_dev_close(port_id);
> >> ???????? ret = rte_dev_remove(dev);
> >> ???????? if (ret < 0)
> >> ???????????????? return ret;
> >> ???????? rte_eth_dev_release_port(&rte_eth_devices[port_id]);
> >>
> >> [3. My analysis on dpdk code]
> >> static int
> >> ?? rte_pmd_vhost_remove(struct rte_vdev_device *dev)
> >> ?? {
> >> ??? ...........
> >> ????????  eth_dev_close(eth_dev);
> >>
> >> ?????????? rte_free(vring_states[eth_dev->data->port_id]);
> >> ?????????? vring_states[eth_dev->data->port_id] = NULL;
> >>
> >> ?????????? rte_eth_dev_release_port(eth_dev);
> >>
> >> As you can see in rte_eth_vhost.c
> >> It calls both eth_dev_close and rte_eth_dev_release_port.
> >> And inside both functions, it tries to free mac_addrs.
> >> ???????? rte_free(dev->data->mac_addrs);?????? //in rth_dev_close
> >> ???????? rte_free(eth_dev->data->mac_addrs);? //in
> >> rte_eth_dev_release_port
> >>
> >> I understand that is the reason why
> >> /* Free the memory space back to heap */
> >> void rte_free(void *addr)
> >> {
> >> ???????? if (addr == NULL) return;
> >> ???????? if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
> >> ???????????????? RTE_LOG(ERR, EAL, "Error: Invalid memory\n");
> >> }
> >> encounter the error.
> >> As an experiment, I commented out one of them, "ERR, Invalid memory"
> >> disappered.
> >>
> >> Thanks and BR,
> >> Hideyuki Yamashita
> >> NTT TechnoCross
> >>
> >>> Adding my colleague Yasufumi and Hiroyuki as CC.
> >>>
> >>> We are waiting valuable advice from you.
> >>>
> >>> Thanks in advance,
> >>> Hideyuki Yamashita
> >>> NTT TechnoCross
> >>>
> >>>>
> >>>> Dear Thomas and all,
> >>>>
> >>>> I hope you all get safely back home after DPDK summit.
> >>>> (When I get back Japan, it is chilling. (start of winter))
> >>>>
> >>>> On DPDK 18.11.0, we tried to remove vhost device by using
> >>>> rte_eal_hotplug_remove().
> >>>> However, following syslog message is printed.
> >>>> “EAL: Error: Invalid memory”
> >>>>
> >>>> At DPDK summit San Jose, we had chance to ask Thomas how to handle
> >>>> the error message, and he gave us following advice:
> >>>> Replace “rte_eal_hotplug_add()” to “rte_dev_probe(devargs)” and
> >>>> “rte_eal_hotplug_remove()” to “rte_eth_dev_close() and
> >>>> rte_dev_remove(rte_dev)”
> >>>>
> >>>> We tested above changes, but the result is the same (i.e., same
> >>>> error message is printed).
> >>>> The debug log message says:
> >>>> ---
> >>>> [primary]
> >>>> VHOST_CONFIG: vhost-user server: socket created, fd: 17
> >>>> VHOST_CONFIG: bind to /tmp/sock0
> >>>> EAL: Error: Invalid memory
> >>>> VHOST_CONFIG: vhost-user server: socket created, fd: 17
> >>>> VHOST_CONFIG: bind to /tmp/sock0
> >>>>
> >>>> [secondary]
> >>>> APP: devargs=eth_vhost0,iface=/tmp/sock0,queues=1
> >>>> EAL: request: eal_dev_mp_request
> >>>> EAL: msg: eal_dev_mp_request
> >>>> EAL: request: bus_vdev_mp
> >>>> EAL: msg: bus_vdev_mp
> >>>> EAL: msg: bus_vdev_mp
> >>>> EAL: reply: eal_dev_mp_request
> >>>> EAL: msg: eal_dev_mp_request
> >>>> rte_eth_promiscuous_disable: Function not supported
> >>>> rte_eth_allmulticast_disable: Function not supported
> >>>> APP: To Server: add
> >>>> EAL: request: eal_dev_mp_request
> >>>> EAL: msg: eal_dev_mp_request
> >>>> EAL: reply: eal_dev_mp_request
> >>>> EAL: msg: eal_dev_mp_request
> >>>> APP: To Server: del
> >>>> APP: devargs=eth_vhost0,iface=/tmp/sock0,queues=1
> >>>> EAL: request: eal_dev_mp_request
> >>>> EAL: msg: eal_dev_mp_request
> >>>> EAL: request: bus_vdev_mp
> >>>> EAL: msg: bus_vdev_mp
> >>>> EAL: msg: bus_vdev_mp
> >>>> EAL: reply: eal_dev_mp_request
> >>>> EAL: msg: eal_dev_mp_request
> >>>> rte_eth_promiscuous_disable: Function not supported
> >>>> rte_eth_allmulticast_disable: Function not supported
> >>>> APP: To Server: add
> >>>> ---
> >>>>
> >>>> We would like to ask:
> >>>> 1)??? Is the message “EAL: Error: Invalid memory” ignorable or not?
> >>>> There is no obstacle except this message to re-add the vhost device.
> >>>> 2)??? Which is the better(best?) way to add/del vhost device
> >>>> “rte_eal_hotplug_add/remove()” or the way Thomas suggested?
> >>>>
> >>>> Thanks in advance and have a nice day.
> >>>>
> >>>> BR,
> >>>> Hideyuki Yamashita
> >>>> NTT TechnoCross
> >>>
> >>
> >>
> >>
> >
> > Hi Hideyuki,
> >
> > The error you're referring to (about invalid memory) means that you're
> > trying to free a pointer that points to invalid memory. Meaning, either
> > the pointer itself is not pointing to an allocated area, or it points to
> > memory that has already been freed.
> >
> > If dev->data->mac_addrs and eth_dev->data->mac_addrs point to the same
> > area, this is a bug, because this would lead to double free, and
> > rte_malloc will rightly complain about invalid memory. Now, malloc won't
> > try to do anything with the invalid memory, so the error itself is
> > harmless *as far as malloc is concerned*, but these attempts to free the
> > memory twice should be fixed whereever they happen.
> >
> > I'm not well-versed in dev infrastructure, so i wouldn't be able to say
> > which one of the rte_free calls is an extra, unneeded one. This is
> > something e.g. Thomas could help with, or the driver maintainer.
> > 
> ...with that said, to me, this looks suspicious:
> 
> 
>  >          rte_eth_dev_close(port_id);
>  >          ret = rte_dev_remove(dev);
>  >          if (ret < 0)
>  >                  return ret;
>  >          rte_eth_dev_release_port(&rte_eth_devices[port_id]);
> 
> So, you're closing the port, then you remove the device, then you  release port. From your own analysis, what dev_remove does is the following:
> 
>  >    rte_pmd_vhost_remove(struct rte_vdev_device *dev)
>  >    {
>  >     ...........
>  >           eth_dev_close(eth_dev);
>  >
>  >            rte_free(vring_states[eth_dev->data->port_id]);
>  >            vring_states[eth_dev->data->port_id] = NULL;
>  >
>  >            rte_eth_dev_release_port(eth_dev);
> 
> So, you have duplicate call to eth_dev_close() (which is likely not a  problem), and a duplicate call to rte_eth_dev_release_port(). So, if  rte_dev_remove() already calls rte_eth_dev_release_port(), why are you  calling it in your code?
> 
> Now, as i said above, i'm not well-versed enough in the dev  infrastructure to know which of the calls should go where, but i'm  pretty sure you shouldn't call rte_eth_dev_release_port twice - this is  likely what's causing your double-free.
> 
> -- Thanks,
> Anatoly

Hello Anatoly,

You are right, I should not call rte_eth_dev_release_port.
As an experiment, I commented out rte_eth_dev_release_port.
But the same error remains.
(I compiled my app(SPP) after the modification)

VHOST_CONFIG: vhost-user server: socket created, fd: 15
VHOST_CONFIG: bind to /tmp/sock0
EAL: Error: Invalid memory

So I think something is still there in DPDK.

BR,
Hideyuki Yamashita
NTT TechnoCross



More information about the dev mailing list