[dpdk-dev] [PATCH 04/22] ethdev: enable hotplug on multi-process
Burakov, Anatoly
anatoly.burakov at intel.com
Mon Jun 18 10:18:17 CEST 2018
On 07-Jun-18 1:38 PM, Qi Zhang wrote:
> The patch introduce the solution to handle different hotplug cases in
> multi-process situation, it include below scenario:
>
> 1. Attach a share device from primary
> 2. Detach a share device from primary
> 3. Attach a share device from secondary
> 4. Detach a share device from secondary
> 5. Attach a private device from secondary
> 6. Detach a private device from secondary
> 7. Detach a share device from secondary privately
> 8. Attach a share device from secondary privately
>
> In primary-secondary process model, we assume device is shared by default.
> that means attach or detach a device on any process will broadcast to
> all other processes through mp channel then device information will be
> synchronized on all processes.
>
> Any failure during attaching process will cause inconsistent status
> between processes, so proper rollback action should be considered.
> Also it is not safe to detach a share device when other process still use
> it, so a handshake mechanism is introduced, it will be implemented in
> following separate patch.
>
> Scenario for Case 1, 2:
>
> attach device
> a) primary attach the new device if failed goto h).
> b) primary send attach sync request to all secondary.
> c) secondary receive request and attach device and send reply.
> d) primary check the reply if all success go to i).
> e) primary send attach rollback sync request to all secondary.
> f) secondary receive the request and detach device and send reply.
> g) primary receive the reply and detach device as rollback action.
> h) attach fail
> i) attach success
>
> detach device
> a) primary perform pre-detach check, if device is locked, goto i).
> b) primary send pre-detach sync request to all secondary.
> c) secondary perform pre-detach check and send reply.
> d) primary check the reply if any fail goto i).
> e) primary send detach sync request to all secondary
> f) secondary detach the device and send reply (assume no fail)
> g) primary detach the device.
> h) detach success
> i) detach failed
>
> Case 3, 4:
> This will be implemented in following patch.
If these will be implemented in following patch, why spend half the
commit message talking about it? :) This commit doesn't implement
secondary process functionality at all, so the commit message should
probably be reworded to only include primary process logic, no?
>
> Case 5, 6:
> Secondary process can attach private device which only visible to itself,
> in this case no IPC is involved, primary process is not allowd to have
> private device so far.
>
> Case 7, 8:
> Secondary process can also temporally to detach a share device "privately"
> then attach it back later, this action also not impact other processes.
>
> APIs chenages:
Multiple typos - "chenages", "temporally", "allowd", etc.
>
> rte_eth_dev_attach and rte_eth_dev_attach are extended to support
> share device attach/detach in primary-secondary process model, it will
> be called in case 1,2,3,4.
>
> New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are
> introduced to cover case 5,6,7,8, this API can only be invoked in secondary
> process.
> > Signed-off-by: Qi Zhang <qi.z.zhang at intel.com>
> ---
<snip>
> rte_eal_mcfg_complete();
>
> + if (rte_eth_dev_mp_init()) {
> + rte_eal_init_alert("rte_eth_dev_mp_init() failed\n");
> + rte_errno = ENOEXEC;
> + return -1;
> + }
> +
Why is this done after the end of init? rte_eal_mcfg_complete() makes it
so that secondaries can initialize, at that point all initialization
should have been finished. I would expect this to be called after
(before?) bus probe, since this is device-related.
> return fctret;
> }
>
> diff --git a/lib/librte_ethdev/Makefile b/lib/librte_ethdev/Makefile
> index c2f2f7d82..04e93f337 100644
> --- a/lib/librte_ethdev/Makefile
> +++ b/lib/librte_ethdev/Makefile
> @@ -19,6 +19,7 @@ EXPORT_MAP := rte_ethdev_version.map
> LIBABIVER := 9
>
<snip>
> + if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +
> + /**
> + * If secondary process, we just send request to primray
> + * to start the process.
> + */
> + req.t = REQ_TYPE_ATTACH;
> + strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN);
> +
> + ret = rte_eth_dev_request_to_primary(&req);
> + if (ret) {
> + ethdev_log(ERR, "Failed to send device attach request to primary\n");
The log message is a little misleading. It can be that secondary has
failed to send request. It can also be that it succeeded, but the attach
itself has failed. I think a better message would be "attach request has
failed" or something to that effect.
> + return ret;
> + }
> +
> + *port_id = req.port_id;
> + return req.result;
> + }
> +
> + ret = do_eth_dev_attach(devargs, port_id);
> + if (ret)
> + return ret;
> +
> + /* send attach request to seoncary */
> + req.t = REQ_TYPE_ATTACH;
> + strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN);
> + req.port_id = *port_id;
> + ret = rte_eth_dev_request_to_secondary(&req);
> + if (ret) {
> + ethdev_log(ERR, "Failed to send device attach request to secondary\n");
Same as above - log message can/might be misleading. There are a few
other places where similar log message is present, those should be
corrected too.
> + goto rollback;
> + }
> +
> + if (req.result)
> + goto rollback;
> +
> + return 0;
<snip>
> +{
> + uint32_t dev_flags;
> +
> + if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> + return -ENOTSUP;
> +
> + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> +
> + dev_flags = rte_eth_devices[port_id].data->dev_flags;
> + if (dev_flags & RTE_ETH_DEV_BONDED_SLAVE) {
> + ethdev_log(ERR,
> + "Port %" PRIu16 " is bonded, cannot detach", port_id);
> + return -ENOTSUP;
> + }
Do we have to do a similar check for failsafe devices?
> +
> + return do_eth_dev_detach(port_id);
> +}
> +
> static int
> rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues)
> {
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index 36e3984ea..bb03d613b 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
<snip>
> /**
> + * Attach a private Ethernet device specified by arguments.
> + * A private device is invisible to other process.
> + * Can only be invoked in secondary process.
> + *
> + * @param devargs
> + * A pointer to a strings array describing the new device
> + * to be attached. The strings should be a pci address like
> + * '0000:01:00.0' or virtual device name like 'net_pcap0'.
> + * @param port_id
> + * A pointer to a port identifier actually attached.
> + * @return
> + * 0 on success and port_id is filled, negative on error
> + */
> +int rte_eth_dev_attach_private(const char *devargs, uint16_t *port_id);
New API's should be marked as __rte_experimental.
> +
> +/**
> * Detach a Ethernet device specified by port identifier.
> * This function must be called when the device is in the
> * closed state.
> + * In multi-process mode, it will sync with other process
> + * to detach the device.
> *
> * @param port_id
> * The port identifier of the device to detach.
> @@ -1490,6 +1511,22 @@ int rte_eth_dev_attach(const char *devargs, uint16_t *port_id);
<snip>
> + * Detach a Ethernet device in current process.
> + *
> + * @param port_id
> + * The port identifier of the device to detach.
> + * @param devname
> + * A pointer to a buffer that will be filled with the device name.
> + * This buffer must be at least RTE_DEV_NAME_MAX_LEN long.
> + * @return
> + * 0 on success and devname is filled, negative on error
> + */
> +int do_eth_dev_detach(uint16_t port_id);
> +
Why is this made part of an external API? You should have a separate,
private header file for these.
> #ifdef __cplusplus
> }
> #endif
> diff --git a/lib/librte_ethdev/rte_ethdev_mp.c b/lib/librte_ethdev/rte_ethdev_mp.c
> new file mode 100644
> index 000000000..8ede8151d
> --- /dev/null
> +++ b/lib/librte_ethdev/rte_ethdev_mp.c
> @@ -0,0 +1,195 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2010-2018 Intel Corporation
> + */
> +
> +#include "rte_ethdev_driver.h"
> +#include "rte_ethdev_mp.h"
> +
> +static int detach_on_secondary(uint16_t port_id)
<snip>
> + free(da.args);
> + return 0;
> +}
> +
> +static int handle_secondary_request(const struct rte_mp_msg *msg, const void *peer)
> +{
> + (void)msg;
> + (void)(peer);
> + return -ENOTSUP;
Please either mark arguments as __rte_unused, or use RTE_SET_USED(blah)
macro. Same in other similar places.
> +}
> +
> +static int handle_primary_response(const struct rte_mp_msg *msg, const void *peer)
> +{
> + (void)msg;
> + (void)(peer);
> + return -ENOTSUP;
> +}
> +
> +static int handle_primary_request(const struct rte_mp_msg *msg, const void *peer)
> +{
> + const struct eth_dev_mp_req *req =
> + (const struct eth_dev_mp_req *)msg->param;
<snip>
> + case REQ_TYPE_DETACH:
> + case REQ_TYPE_ATTACH_ROLLBACK:
> + ret = detach_on_secondary(req->port_id);
> + break;
> + default:
> + ret = -EINVAL;
> + }
> +
> + strcpy(mp_resp.name, ETH_DEV_MP_ACTION_REQUEST);
Here and in other places: rte_strlcpy?
--
Thanks,
Anatoly
More information about the dev
mailing list