<div dir="ltr">Acked-by: Igor Ryzhov <<a href="mailto:iryzhov@nfware.com">iryzhov@nfware.com</a>></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Oct 9, 2021 at 2:58 AM Ferruh Yigit <<a href="mailto:ferruh.yigit@intel.com">ferruh.yigit@intel.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">To enable bifurcated device support, rtnl_lock is released before calling<br>
userspace callbacks and asynchronous requests are enabled.<br>
<br>
But these changes caused more issues, like bug #809, #816. To reduce the<br>
scope of the problems, the bifurcated device support related changes are<br>
only enabled when it is requested explicitly with new 'enable_bifurcated'<br>
module parameter.<br>
And bifurcated device support is disabled by default.<br>
<br>
So the bifurcated device related problems are isolated and they can be<br>
fixed without impacting all use cases.<br>
<br>
Bugzilla ID: 816<br>
Fixes: 631217c76135 ("kni: fix kernel deadlock with bifurcated device")<br>
Cc: <a href="mailto:stable@dpdk.org" target="_blank">stable@dpdk.org</a><br>
<br>
Signed-off-by: Ferruh Yigit <<a href="mailto:ferruh.yigit@intel.com" target="_blank">ferruh.yigit@intel.com</a>><br>
---<br>
Cc: Igor Ryzhov <<a href="mailto:iryzhov@nfware.com" target="_blank">iryzhov@nfware.com</a>><br>
Cc: Elad Nachman <<a href="mailto:eladv6@gmail.com" target="_blank">eladv6@gmail.com</a>><br>
Cc: Eric Christian <<a href="mailto:erclists@gmail.com" target="_blank">erclists@gmail.com</a>><br>
Cc: Stephen Hemminger <<a href="mailto:stephen@networkplumber.org" target="_blank">stephen@networkplumber.org</a>><br>
---<br>
.../prog_guide/kernel_nic_interface.rst | 28 +++++++++++++<br>
kernel/linux/kni/kni_dev.h | 3 ++<br>
kernel/linux/kni/kni_misc.c | 36 ++++++++++++++++<br>
kernel/linux/kni/kni_net.c | 42 +++++++++++--------<br>
4 files changed, 92 insertions(+), 17 deletions(-)<br>
<br>
diff --git a/doc/guides/prog_guide/kernel_nic_interface.rst b/doc/guides/prog_guide/kernel_nic_interface.rst<br>
index 1ce03ec1a374..771c7d7fdac4 100644<br>
--- a/doc/guides/prog_guide/kernel_nic_interface.rst<br>
+++ b/doc/guides/prog_guide/kernel_nic_interface.rst<br>
@@ -56,6 +56,12 @@ can be specified when the module is loaded to control its behavior:<br>
off Interfaces will be created with carrier state set to off.<br>
on Interfaces will be created with carrier state set to on.<br>
(charp)<br>
+ parm: enable_bifurcated: Enable request processing support for<br>
+ bifurcated drivers, which means releasing rtnl_lock before calling<br>
+ userspace callback and supporting async requests (default=off):<br>
+ on Enable request processing support for bifurcated drivers.<br>
+ (charp)<br>
+<br>
<br>
Loading the ``rte_kni`` kernel module without any optional parameters is<br>
the typical way a DPDK application gets packets into and out of the kernel<br>
@@ -174,6 +180,28 @@ To set the default carrier state to *off*:<br>
If the ``carrier`` parameter is not specified, the default carrier state<br>
of KNI interfaces will be set to *off*.<br>
<br>
+.. _kni_bifurcated_device_support:<br>
+<br>
+Bifurcated Device Support<br>
+~~~~~~~~~~~~~~~~~~~~~~~~~<br>
+<br>
+User callbacks are executed while kernel module holds the ``rtnl`` lock, this<br>
+causes a deadlock when callbacks run control commands on another Linux kernel<br>
+network interface.<br>
+<br>
+Bifurcated devices has kernel network driver part and to prevent deadlock for<br>
+them ``enable_bifurcated`` is used.<br>
+<br>
+To enable bifurcated device support:<br>
+<br>
+.. code-block:: console<br>
+<br>
+ # insmod <build_dir>/kernel/linux/kni/rte_kni.ko enable_bifurcated=on<br>
+<br>
+Enabling bifurcated device support releases ``rtnl`` lock before calling<br>
+callback and locks it back after callback. Also enables asynchronous request to<br>
+support callbacks that requires rtnl lock to work (interface down).<br>
+<br>
KNI Creation and Deletion<br>
-------------------------<br>
<br>
diff --git a/kernel/linux/kni/kni_dev.h b/kernel/linux/kni/kni_dev.h<br>
index c15da311ba25..e8633486eeb8 100644<br>
--- a/kernel/linux/kni/kni_dev.h<br>
+++ b/kernel/linux/kni/kni_dev.h<br>
@@ -34,6 +34,9 @@<br>
/* Default carrier state for created KNI network interfaces */<br>
extern uint32_t kni_dflt_carrier;<br>
<br>
+/* Request processing support for bifurcated drivers. */<br>
+extern uint32_t bifurcated_support;<br>
+<br>
/**<br>
* A structure describing the private information for a kni device.<br>
*/<br>
diff --git a/kernel/linux/kni/kni_misc.c b/kernel/linux/kni/kni_misc.c<br>
index 2b464c438113..aae977c187a9 100644<br>
--- a/kernel/linux/kni/kni_misc.c<br>
+++ b/kernel/linux/kni/kni_misc.c<br>
@@ -41,6 +41,10 @@ static uint32_t multiple_kthread_on;<br>
static char *carrier;<br>
uint32_t kni_dflt_carrier;<br>
<br>
+/* Request processing support for bifurcated drivers. */<br>
+static char *enable_bifurcated;<br>
+uint32_t bifurcated_support;<br>
+<br>
#define KNI_DEV_IN_USE_BIT_NUM 0 /* Bit number for device in use */<br>
<br>
static int kni_net_id;<br>
@@ -568,6 +572,22 @@ kni_parse_carrier_state(void)<br>
return 0;<br>
}<br>
<br>
+static int __init<br>
+kni_parse_bifurcated_support(void)<br>
+{<br>
+ if (!enable_bifurcated) {<br>
+ bifurcated_support = 0;<br>
+ return 0;<br>
+ }<br>
+<br>
+ if (strcmp(enable_bifurcated, "on") == 0)<br>
+ bifurcated_support = 1;<br>
+ else<br>
+ return -1;<br>
+<br>
+ return 0;<br>
+}<br>
+<br>
static int __init<br>
kni_init(void)<br>
{<br>
@@ -593,6 +613,13 @@ kni_init(void)<br>
else<br>
pr_debug("Default carrier state set to on.\n");<br>
<br>
+ if (kni_parse_bifurcated_support() < 0) {<br>
+ pr_err("Invalid parameter for bifurcated support\n");<br>
+ return -EINVAL;<br>
+ }<br>
+ if (bifurcated_support == 1)<br>
+ pr_debug("bifurcated support is enabled.\n");<br>
+<br>
#ifdef HAVE_SIMPLIFIED_PERNET_OPERATIONS<br>
rc = register_pernet_subsys(&kni_net_ops);<br>
#else<br>
@@ -659,3 +686,12 @@ MODULE_PARM_DESC(carrier,<br>
"\t\ton Interfaces will be created with carrier state set to on.\n"<br>
"\t\t"<br>
);<br>
+<br>
+module_param(enable_bifurcated, charp, 0644);<br>
+MODULE_PARM_DESC(enable_bifurcated,<br>
+"Enable request processing support for bifurcated drivers, "<br>
+"which means releasing rtnl_lock before calling userspace callback and "<br>
+"supporting async requests (default=off):\n"<br>
+"\t\ton Enable request processing support for bifurcated drivers.\n"<br>
+"\t\t"<br>
+);<br>
diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c<br>
index 611719b5ee27..29e5b9e21f9e 100644<br>
--- a/kernel/linux/kni/kni_net.c<br>
+++ b/kernel/linux/kni/kni_net.c<br>
@@ -113,12 +113,14 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)<br>
<br>
ASSERT_RTNL();<br>
<br>
- /* If we need to wait and RTNL mutex is held<br>
- * drop the mutex and hold reference to keep device<br>
- */<br>
- if (req->async == 0) {<br>
- dev_hold(dev);<br>
- rtnl_unlock();<br>
+ if (bifurcated_support) {<br>
+ /* If we need to wait and RTNL mutex is held<br>
+ * drop the mutex and hold reference to keep device<br>
+ */<br>
+ if (req->async == 0) {<br>
+ dev_hold(dev);<br>
+ rtnl_unlock();<br>
+ }<br>
}<br>
<br>
mutex_lock(&kni->sync_lock);<br>
@@ -132,12 +134,14 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)<br>
goto fail;<br>
}<br>
<br>
- /* No result available since request is handled<br>
- * asynchronously. set response to success.<br>
- */<br>
- if (req->async != 0) {<br>
- req->result = 0;<br>
- goto async;<br>
+ if (bifurcated_support) {<br>
+ /* No result available since request is handled<br>
+ * asynchronously. set response to success.<br>
+ */<br>
+ if (req->async != 0) {<br>
+ req->result = 0;<br>
+ goto async;<br>
+ }<br>
}<br>
<br>
ret_val = wait_event_interruptible_timeout(kni->wq,<br>
@@ -160,9 +164,11 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)<br>
<br>
fail:<br>
mutex_unlock(&kni->sync_lock);<br>
- if (req->async == 0) {<br>
- rtnl_lock();<br>
- dev_put(dev);<br>
+ if (bifurcated_support) {<br>
+ if (req->async == 0) {<br>
+ rtnl_lock();<br>
+ dev_put(dev);<br>
+ }<br>
}<br>
return ret;<br>
}<br>
@@ -207,8 +213,10 @@ kni_net_release(struct net_device *dev)<br>
/* Setting if_up to 0 means down */<br>
req.if_up = 0;<br>
<br>
- /* request async because of the deadlock problem */<br>
- req.async = 1;<br>
+ if (bifurcated_support) {<br>
+ /* request async because of the deadlock problem */<br>
+ req.async = 1;<br>
+ }<br>
<br>
ret = kni_net_process_request(dev, &req);<br>
<br>
-- <br>
2.31.1<br>
<br>
</blockquote></div>