[dpdk-dev] [PATCH v4 06/24] ethdev: enable hotplug on multi-process

Burakov, Anatoly anatoly.burakov at intel.com
Tue Jun 26 14:09:17 CEST 2018


On 26-Jun-18 8:08 AM, Qi Zhang wrote:
> We are going to introduce the solution to handle different hotplug
> cases in multi-process situation, it include below scenario:
> 
> 1. Attach a share device from primary
> 2. Detach a share device from primary
> 3. Attach a share device from secondary
> 4. Detach a share device from secondary
> 5. Attach a private device from secondary
> 6. Detach a private device from secondary
> 7. Detach a share device from secondary privately
> 8. Attach a share device from secondary privately
> 
> In primary-secondary process model, we assume device is shared by default.
> that means attach or detach a device on any process will broadcast to
> all other processes through mp channel then device information will be
> synchronized on all processes.
> 
> Any failure during attaching process will cause inconsistent status
> between processes, so proper rollback action should be considered.
> Also it is not safe to detach a share device when other process still use
> it, so a handshake mechanism is introduced.
> 
> This patch covers the implementation of case 1,2,5,6,7,8.
> Case 3,4 will be implemented on separate patch as well as handshake
> mechanism.
> 
> Scenario for Case 1, 2:
> 
> attach device
> a) primary attach the new device if failed goto h).
> b) primary send attach sync request to all secondary.
> c) secondary receive request and attach device and send reply.
> d) primary check the reply if all success go to i).
> e) primary send attach rollback sync request to all secondary.
> f) secondary receive the request and detach device and send reply.
> g) primary receive the reply and detach device as rollback action.
> h) attach fail
> i) attach success
> 
> detach device
> a) primary perform pre-detach check, if device is locked, goto i).
> b) primary send pre-detach sync request to all secondary.
> c) secondary perform pre-detach check and send reply.
> d) primary check the reply if any fail goto i).
> e) primary send detach sync request to all secondary
> f) secondary detach the device and send reply (assume no fail)
> g) primary detach the device.
> h) detach success
> i) detach failed
> 
> Case 5, 6:
> Secondary process can attach private device which only visible to itself,
> in this case no IPC is involved, primary process is not allowed to have
> private device so far.
> 
> Case 7, 8:
> Secondary process can also temporally to detach a share device "privately"
> then attach it back later, this action also not impact other processes.
> 
> APIs changes:
> 
> rte_eth_dev_attach and rte_eth_dev_attach are extended to support
> share device attach/detach in primary-secondary process model, it will
> be called in case 1,2,3,4.
> 
> New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are
> introduced to cover case 5,6,7,8, this API can only be invoked in secondary
> process.
> 
> Signed-off-by: Qi Zhang <qi.z.zhang at intel.com>
> ---

<snip>

> +static int
> +handle_primary_request(const struct rte_mp_msg *msg, const void *peer)
> +{
> +
> +	struct rte_mp_msg mp_resp;
> +	const struct eth_dev_mp_req *req =
> +		(const struct eth_dev_mp_req *)msg->param;
> +	struct eth_dev_mp_req *resp =
> +		(struct eth_dev_mp_req *)mp_resp.param;
> +	struct mp_reply_bundle *bundle;
> +	int ret = 0;
> +
> +	memset(&mp_resp, 0, sizeof(mp_resp));
> +	strlcpy(mp_resp.name, ETH_DEV_MP_ACTION_REQUEST, sizeof(mp_resp.name));
> +	mp_resp.len_param = sizeof(*req);
> +	memcpy(resp, req, sizeof(*resp));
> +
> +	bundle = calloc(1, sizeof(*bundle));
> +	if (bundle == NULL) {
> +		resp->result = -ENOMEM;
> +		ret = rte_mp_reply(&mp_resp, peer);
> +		if (ret) {
> +			ethdev_log(ERR, "failed to send reply to primary request\n");
> +			return ret;
> +		}
> +	}
> +
> +	bundle->msg = *msg;
> +	bundle->peer = peer;
> +
> +	ret = rte_eal_mp_task_add(__handle_primary_request, bundle);
> +	if (ret) {
> +		resp->result = ret;
> +		ret = rte_mp_reply(&mp_resp, peer);
> +		if (ret) {
> +			ethdev_log(ERR, "failed to send reply to primary request\n");
> +			return ret;
> +		}
> +	}

What you're doing here is quite dangerous. The parameter "const void 
*peer" is only guaranteed to be valid at the time of the callback - not 
necessarily afterwards. So, if you're handing off sending replies to a 
separate thread, things might blow up because the pointer may no longer 
be valid.

-- 
Thanks,
Anatoly


More information about the dev mailing list