[dpdk-dev] [PATCH 04/22] ethdev: enable hotplug on multi-process
Zhang, Qi Z
qi.z.zhang at intel.com
Tue Jun 19 05:22:53 CEST 2018
> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Monday, June 18, 2018 4:18 PM
> To: Zhang, Qi Z <qi.z.zhang at intel.com>; thomas at monjalon.net
> Cc: Ananyev, Konstantin <konstantin.ananyev at intel.com>; dev at dpdk.org;
> Richardson, Bruce <bruce.richardson at intel.com>; Yigit, Ferruh
> <ferruh.yigit at intel.com>; Shelton, Benjamin H
> <benjamin.h.shelton at intel.com>; Vangati, Narender
> <narender.vangati at intel.com>
> Subject: Re: [PATCH 04/22] ethdev: enable hotplug on multi-process
>
> On 07-Jun-18 1:38 PM, Qi Zhang wrote:
> > The patch introduce the solution to handle different hotplug cases in
> > multi-process situation, it include below scenario:
> >
> > 1. Attach a share device from primary
> > 2. Detach a share device from primary
> > 3. Attach a share device from secondary 4. Detach a share device from
> > secondary 5. Attach a private device from secondary 6. Detach a
> > private device from secondary 7. Detach a share device from secondary
> > privately 8. Attach a share device from secondary privately
> >
> > In primary-secondary process model, we assume device is shared by
> default.
> > that means attach or detach a device on any process will broadcast to
> > all other processes through mp channel then device information will be
> > synchronized on all processes.
> >
> > Any failure during attaching process will cause inconsistent status
> > between processes, so proper rollback action should be considered.
> > Also it is not safe to detach a share device when other process still
> > use it, so a handshake mechanism is introduced, it will be implemented
> > in following separate patch.
> >
> > Scenario for Case 1, 2:
> >
> > attach device
> > a) primary attach the new device if failed goto h).
> > b) primary send attach sync request to all secondary.
> > c) secondary receive request and attach device and send reply.
> > d) primary check the reply if all success go to i).
> > e) primary send attach rollback sync request to all secondary.
> > f) secondary receive the request and detach device and send reply.
> > g) primary receive the reply and detach device as rollback action.
> > h) attach fail
> > i) attach success
> >
> > detach device
> > a) primary perform pre-detach check, if device is locked, goto i).
> > b) primary send pre-detach sync request to all secondary.
> > c) secondary perform pre-detach check and send reply.
> > d) primary check the reply if any fail goto i).
> > e) primary send detach sync request to all secondary
> > f) secondary detach the device and send reply (assume no fail)
> > g) primary detach the device.
> > h) detach success
> > i) detach failed
> >
> > Case 3, 4:
> > This will be implemented in following patch.
>
> If these will be implemented in following patch, why spend half the commit
> message talking about it? :)
Sorry, I didn't get your point about "see half commit to talk about it" :)
This patch covered an overview, and also the implementation of case 1,2,5,6,7,8
For case 3, 4, just below 4 lines to describe it
3. Attach a share device from secondary.
4. Detach a share device from secondary.
Case 3, 4:
This will be implemented in following patch.
> is commit doesn't implement secondary
> process functionality at all, so the commit message should probably be
> reworded to only include primary process logic, no?
OK, I will reword it to highlight the patch's scope as description at above.
>
> >
> > Case 5, 6:
> > Secondary process can attach private device which only visible to itself,
> > in this case no IPC is involved, primary process is not allowed to have
> > private device so far.
> >
> > Case 7, 8:
> > Secondary process can also temporally to detach a share device "privately"
> > then attach it back later, this action also not impact other processes.
> >
> > APIs chenages:
>
> Multiple typos - "chenages", "temporally", "allowd", etc.
Thanks
>
> >
> > rte_eth_dev_attach and rte_eth_dev_attach are extended to support
> > share device attach/detach in primary-secondary process model, it will
> > be called in case 1,2,3,4.
> >
> > New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are
> > introduced to cover case 5,6,7,8, this API can only be invoked in secondary
> > process.
> > > Signed-off-by: Qi Zhang <qi.z.zhang at intel.com>
> > ---
>
> <snip>
>
> > rte_eal_mcfg_complete();
> >
> > + if (rte_eth_dev_mp_init()) {
> > + rte_eal_init_alert("rte_eth_dev_mp_init() failed\n");
> > + rte_errno = ENOEXEC;
> > + return -1;
> > + }
> > +
>
> Why is this done after the end of init? rte_eal_mcfg_complete() makes it
> so that secondaries can initialize, at that point all initialization
> should have been finished. I would expect this to be called after
> (before?) bus probe, since this is device-related.
OK will move ahead.
>
> > return fctret;
> > }
> >
> > diff --git a/lib/librte_ethdev/Makefile b/lib/librte_ethdev/Makefile
> > index c2f2f7d82..04e93f337 100644
> > --- a/lib/librte_ethdev/Makefile
> > +++ b/lib/librte_ethdev/Makefile
> > @@ -19,6 +19,7 @@ EXPORT_MAP := rte_ethdev_version.map
> > LIBABIVER := 9
> >
>
> <snip>
>
> > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > +
> > + /**
> > + * If secondary process, we just send request to primray
> > + * to start the process.
> > + */
> > + req.t = REQ_TYPE_ATTACH;
> > + strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN);
> > +
> > + ret = rte_eth_dev_request_to_primary(&req);
> > + if (ret) {
> > + ethdev_log(ERR, "Failed to send device attach request to
> primary\n");
>
> The log message is a little misleading. It can be that secondary has
> failed to send request. It can also be that it succeeded, but the attach
> itself has failed. I think a better message would be "attach request has
> failed" or something to that effect.
The return value of rte_eth_dev_request_to_primary only means communication fail,
(message not able to send, or not get reply in time).
but not the fail on attach/detach itself. (which comes from req->result)
>
> > + return ret;
> > + }
> > +
> > + *port_id = req.port_id;
> > + return req.result;
> > + }
> > +
> > + ret = do_eth_dev_attach(devargs, port_id);
> > + if (ret)
> > + return ret;
> > +
> > + /* send attach request to seoncary */
> > + req.t = REQ_TYPE_ATTACH;
> > + strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN);
> > + req.port_id = *port_id;
> > + ret = rte_eth_dev_request_to_secondary(&req);
> > + if (ret) {
> > + ethdev_log(ERR, "Failed to send device attach request to
> secondary\n");
>
> Same as above - log message can/might be misleading. There are a few
> other places where similar log message is present, those should be
> corrected too.
Same as above
>
> > + goto rollback;
> > + }
> > +
> > + if (req.result)
> > + goto rollback;
> > +
> > + return 0;
>
> <snip>
>
> > +{
> > + uint32_t dev_flags;
> > +
> > + if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > + return -ENOTSUP;
> > +
> > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> > +
> > + dev_flags = rte_eth_devices[port_id].data->dev_flags;
> > + if (dev_flags & RTE_ETH_DEV_BONDED_SLAVE) {
> > + ethdev_log(ERR,
> > + "Port %" PRIu16 " is bonded, cannot detach", port_id);
> > + return -ENOTSUP;
> > + }
>
> Do we have to do a similar check for failsafe devices?
Just keep it same logic as before, it could be a separate patch to fix I guess.
>
> > +
> > + return do_eth_dev_detach(port_id);
> > +}
> > +
> > static int
> > rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t
> nb_queues)
> > {
> > diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> > index 36e3984ea..bb03d613b 100644
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
>
> <snip>
>
> > /**
> > + * Attach a private Ethernet device specified by arguments.
> > + * A private device is invisible to other process.
> > + * Can only be invoked in secondary process.
> > + *
> > + * @param devargs
> > + * A pointer to a strings array describing the new device
> > + * to be attached. The strings should be a pci address like
> > + * '0000:01:00.0' or virtual device name like 'net_pcap0'.
> > + * @param port_id
> > + * A pointer to a port identifier actually attached.
> > + * @return
> > + * 0 on success and port_id is filled, negative on error
> > + */
> > +int rte_eth_dev_attach_private(const char *devargs, uint16_t *port_id);
>
> New API's should be marked as __rte_experimental.
OK
>
> > +
> > +/**
> > * Detach a Ethernet device specified by port identifier.
> > * This function must be called when the device is in the
> > * closed state.
> > + * In multi-process mode, it will sync with other process
> > + * to detach the device.
> > *
> > * @param port_id
> > * The port identifier of the device to detach.
> > @@ -1490,6 +1511,22 @@ int rte_eth_dev_attach(const char *devargs,
> uint16_t *port_id);
>
> <snip>
>
> > + * Detach a Ethernet device in current process.
> > + *
> > + * @param port_id
> > + * The port identifier of the device to detach.
> > + * @param devname
> > + * A pointer to a buffer that will be filled with the device name.
> > + * This buffer must be at least RTE_DEV_NAME_MAX_LEN long.
> > + * @return
> > + * 0 on success and devname is filled, negative on error
> > + */
> > +int do_eth_dev_detach(uint16_t port_id);
> > +
>
> Why is this made part of an external API? You should have a separate,
> private header file for these.
OK, will add to ethdev_private.h in v2.
>
> > #ifdef __cplusplus
> > }
> > #endif
> > diff --git a/lib/librte_ethdev/rte_ethdev_mp.c
> b/lib/librte_ethdev/rte_ethdev_mp.c
> > new file mode 100644
> > index 000000000..8ede8151d
> > --- /dev/null
> > +++ b/lib/librte_ethdev/rte_ethdev_mp.c
> > @@ -0,0 +1,195 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2010-2018 Intel Corporation
> > + */
> > +
> > +#include "rte_ethdev_driver.h"
> > +#include "rte_ethdev_mp.h"
> > +
> > +static int detach_on_secondary(uint16_t port_id)
>
> <snip>
>
> > + free(da.args);
> > + return 0;
> > +}
> > +
> > +static int handle_secondary_request(const struct rte_mp_msg *msg, const
> void *peer)
> > +{
> > + (void)msg;
> > + (void)(peer);
> > + return -ENOTSUP;
>
> Please either mark arguments as __rte_unused, or use RTE_SET_USED(blah)
> macro. Same in other similar places.
OK.
>
> > +}
> > +
> > +static int handle_primary_response(const struct rte_mp_msg *msg, const
> void *peer)
> > +{
> > + (void)msg;
> > + (void)(peer);
> > + return -ENOTSUP;
> > +}
> > +
> > +static int handle_primary_request(const struct rte_mp_msg *msg, const
> void *peer)
> > +{
> > + const struct eth_dev_mp_req *req =
> > + (const struct eth_dev_mp_req *)msg->param;
>
> <snip>
>
> > + case REQ_TYPE_DETACH:
> > + case REQ_TYPE_ATTACH_ROLLBACK:
> > + ret = detach_on_secondary(req->port_id);
> > + break;
> > + default:
> > + ret = -EINVAL;
> > + }
> > +
> > + strcpy(mp_resp.name, ETH_DEV_MP_ACTION_REQUEST);
>
> Here and in other places: rte_strlcpy?
OK
Thanks!
Qi
>
> --
> Thanks,
> Anatoly
More information about the dev
mailing list