[dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu Jan 18 15:17:02 CET 2018


Hi Matan,

> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Another thing - you'll probably need
> > > > > > > > > > > > > > > > > > > to
> > > > > grab/release
> > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > It is a public function used by
> > > > > > > > > > > > > > > > > > > drivers, so need to be protected
> > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Yes, I thought about it, but decided not
> > > > > > > > > > > > > > > > > > to use lock in
> > > > > > > next:
> > > > > > > > > > > > > > > > > > rte_eth_dev_allocated rte_eth_dev_count
> > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > As I can see in patch #3 you protect by
> > > > > > > > > > > > > > > > > lock access to rte_eth_dev_data[].name
> > > > > > > > > > > > > > > > > (which seems like a good
> > > > > > > thing).
> > > > > > > > > > > > > > > > > So I think any other public function that
> > > > > > > > > > > > > > > > > access rte_eth_dev_data[].name should be
> > > > > > > > > > > > > > > > > protected by the
> > > > > > > same
> > > > > > > > > lock.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I don't think so, I can understand to use
> > > > > > > > > > > > > > > > the ownership lock here(as in port
> > > > > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > Don't you think it is just timing?(ask in
> > > > > > > > > > > > > > > > the next moment and you may get another
> > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > crash.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > > As I understand rte_eth_dev_data[].name unique
> > > > > identifies
> > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > allocation/release/find
> > > > > functions.
> > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > "1. The port allocation and port release
> > > > > > > > > > > > > > > synchronization will be managed by ethdev."
> > > > > > > > > > > > > > > To me it means that ethdev layer has to make
> > > > > > > > > > > > > > > sure that all accesses to
> > > > > > > > > > > > > > > rte_eth_dev_data[].name are
> > > atomic.
> > > > > > > > > > > > > > > Otherwise what would prevent the situation
> > > > > > > > > > > > > > > when one
> > > > > > > process
> > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Under race condition - in the worst case it might
> > > > > > > > > > > > > crash, though for that you'll have to be really unlucky.
> > > > > > > > > > > > > Though in most cases as you said it would just not
> > > > > > > > > > > > > operate
> > > > > > > correctly.
> > > > > > > > > > > > > I think if we start to protect dev->name by lock
> > > > > > > > > > > > > we need to do it for all instances (both read and write).
> > > > > > > > > > > > >
> > > > > > > > > > > > Since under the ownership rules, the user must take
> > > > > > > > > > > > ownership
> > > > > of a
> > > > > > > > > > > > port
> > > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > > >
> > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > >
> > > > > > > > > > So? The user still should take ownership of a device
> > > > > > > > > > before using it
> > > > > (by
> > > > > > > > > name or by port id).
> > > > > > > > > > It can just read it without owning it, but no managing it.
> > > > > > > > > >
> > > > > > > > > > > > Please, Can you describe specific crash scenario and
> > > > > > > > > > > > explain how could the
> > > > > > > > > > > locking fix it?
> > > > > > > > > > >
> > > > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1
> > > > > > > > > > > >doing
> > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()-
> > > >strcmp().
> > > > > > > > > > > And because of race condition -
> > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > return
> > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > Which wrong device do you mean? I guess it is the device
> > > > > > > > > > which
> > > > > > > currently is
> > > > > > > > > being created by thread 0.
> > > > > > > > > > > Then rte_pmd_ring_remove() will call rte_free() for
> > > > > > > > > > > related resources, while It can still be in use by someone
> > else.
> > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity) must
> > > > > > > > > > take
> > > > > > > ownership
> > > > > > > > > > (or validate that he is the owner) of a port before
> > > > > > > > > > doing it(free,
> > > > > > > release), so
> > > > > > > > > no issue here.
> > > > > > > > >
> > > > > > > > > Forget about ownership for a second.
> > > > > > > > > Suppose we have a process it created ring port for itself
> > > > > > > > > (without
> > > > > setting
> > > > > > > any
> > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > rte_pmd_ring_remove()
> > > for it.
> > > > > > > > > At the same time second process decides to call
> > > > > rte_eth_dev_allocate()
> > > > > > > (let
> > > > > > > > > say for anither ring port).
> > > > > > > > > They could collide trying to read (process 0) and modify
> > > > > > > > > (process 1)
> > > > > same
> > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > >
> > > > > > > > Do you mean that process 0 will compare successfully the
> > > > > > > > process 1
> > > > > new
> > > > > > > port name?
> > > > > > >
> > > > > > > Yes.
> > > > > > >
> > > > > > > > The state are in local process memory - so process 0 will
> > > > > > > > not compare
> > > > > the
> > > > > > > process 1 port, from its point of view this port is in UNUSED
> > > > > > > > state.
> > > > > > > >
> > > > > > >
> > > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > > >
> > > > > > Someone in process 0 should attach it using protected
> > > > > > attach_secondary
> > > > > somewhere in your scenario.
> > > > >
> > > > > Yes, process 0 can have this port attached too, why not?
> > > > See the function with inline comments:
> > > >
> > > > struct rte_eth_dev *
> > > > rte_eth_dev_allocated(const char *name) {
> > > > 	unsigned i;
> > > >
> > > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > >
> > > > 	    	The below state are in local process memory,
> > > > 		So, if here process 1 will allocate a new port (the current i),
> > > update its local state to ATTACHED and write the name,
> > > > 		the state is not visible by process 0 until someone in process
> > > 0 will attach it by rte_eth_dev_attach_secondary.
> > > > 		So, to use rte_eth_dev_attach_secondary process 0 must
> > > take the lock
> > > > and it can't, because it is currently locked by process 1.
> > >
> > > Ok I see.
> > > Thanks for your patience.
> > > BTW, that means that if let say process 0 will call
> > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > rte_eth_dev_allocate("yyy") we can endup with same port_id be used for
> > > different devices and 2 processes will overwrite the same
> > rte_eth_dev_data[port_id]?
> >
> > No, contrary to the state, the lock itself is in shared memory, so 2 processes
> > cannot allocate port in the same time.(you can see it in the next patch of this
> > series).

I am not talking about racing here.
Let say process 0 calls rte_pmd_ring_probe()->....->rte_eth_dev_allocate("xxx")
rte_eth_dev_allocate() finds that port N is 'free', i.e.
local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED
so it assigns new dev ("xxx") to port N.
Then after some time process 1 calls rte_pmd_ring_probe()->....->rte_eth_dev_allocate("yyy").
>From its perspective port N is still free:  rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED,
so it will assign new dev ("yyy") to the same port.
Konstantin
  

> >
> 
> Actually I think only one process(primary) should allocate ports, the others should attach them.
> The race of port allocation is only between the threads of the primary process.
> 
> 
> > > Konstantin
> > >
> > > >
> > > > 		if ((rte_eth_devices[i].state == RTE_ETH_DEV_ATTACHED)
> > > &&
> > > > 		strcmp(rte_eth_devices[i].data->name, name) == 0)
> > > > 			return &rte_eth_devices[i];
> > > > 	}
> > > > 	return NULL;
> > > >
> > > >



More information about the dev mailing list