[dpdk-dev] [PATCH v8 1/3] ethdev: new API to free consumed buffers in Tx ring

Billy McFall bmcfall at redhat.com
Fri Mar 24 19:55:53 CET 2017


Add a new API to force free consumed buffers on the Tx ring. The API returns
the number of packets freed (0-n), or an error code if the feature is not
supported (-ENOTSUP) or the input is invalid (-ENODEV).

Signed-off-by: Billy McFall <bmcfall at redhat.com>
Acked-by: Keith Wiles <Keith.Wiles at intel.com>
---
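Usage note (not part of the commit): a minimal sketch of the "reset between
runs" scenario from the programmer's guide text in this patch. It assumes the
documented contract only: free_cnt == 0 asks for all possible packets, a
negative return is -ENODEV or -ENOTSUP, and a non-negative return is the
number of packets freed. So that it builds without DPDK, the API is passed in
as a function pointer and exercised against a stand-in stub; tx_cleanup_fn,
release_all_tx_mbufs(), fake_done_pkts and fake_tx_done_cleanup() are
hypothetical names for illustration, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same signature as rte_eth_tx_done_cleanup() in this patch. Taking it as a
 * pointer is an assumption of this sketch; a real application would call
 * the API directly. */
typedef int (*tx_cleanup_fn)(uint8_t port_id, uint16_t queue_id,
		uint32_t free_cnt);

/* Hypothetical helper: ask every Tx queue of a port to release all of its
 * consumed mbufs. Returns the total number of packets freed, or the first
 * negative error code (-ENODEV, -ENOTSUP) reported by the cleanup call. */
int
release_all_tx_mbufs(tx_cleanup_fn cleanup, uint8_t port_id,
		uint16_t nb_tx_queues)
{
	int total = 0;

	assert(cleanup != NULL);
	for (uint16_t q = 0; q < nb_tx_queues; q++) {
		/* free_cnt == 0 requests all possible packets. */
		int ret = cleanup(port_id, q, 0);

		if (ret < 0)
			return ret;
		total += ret;
	}
	return total;
}

/* Stand-in state for the stub: consumed packets per queue on port 0. */
uint32_t fake_done_pkts[2] = { 3, 2 };

/* Stand-in for the real API, mimicking only the documented return values. */
int
fake_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt)
{
	uint32_t n;

	if (port_id != 0)
		return -ENODEV;
	if (queue_id >= 2)
		return 0;
	n = fake_done_pkts[queue_id];
	if (free_cnt != 0 && free_cnt < n)
		n = free_cnt;
	fake_done_pkts[queue_id] -= n;
	return (int)n;
}
```

With a real port, the application would pass rte_eth_tx_done_cleanup in place
of the stub. A return of -ENOTSUP then means the driver does not implement
the callback, and the application has to fall back to its own handling.
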
 doc/guides/conf.py                      |  7 +++++--
 doc/guides/nics/features/default.ini    |  4 +++-
 doc/guides/prog_guide/poll_mode_drv.rst | 33 +++++++++++++++++++++++++++++++++
 doc/guides/rel_notes/release_17_05.rst  |  6 ++++++
 lib/librte_ether/rte_ethdev.c           | 14 ++++++++++++++
 lib/librte_ether/rte_ethdev.h           | 31 +++++++++++++++++++++++++++++++
 6 files changed, 92 insertions(+), 3 deletions(-)

diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index 34c62de..4cac26d 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -64,6 +64,9 @@
 
 master_doc = 'index'
 
+# Maximum feature description string length
+feature_str_len = 25
+
 # Figures, tables and code-blocks automatically numbered if they have caption
 numfig = True
 
@@ -300,7 +303,7 @@ def print_table_body(outfile, num_cols, ini_files, ini_data, default_features):
 def print_table_row(outfile, feature, line):
     """ Print a single row of the table with fixed formatting. """
     line = line.rstrip()
-    print('   {:<20}{}'.format(feature, line), file=outfile)
+    print('   {:<{}}{}'.format(feature, feature_str_len, line), file=outfile)
 
 
 def print_table_divider(outfile, num_cols):
@@ -309,7 +312,7 @@ def print_table_divider(outfile, num_cols):
     column_dividers = ['='] * num_cols
     line += ' '.join(column_dividers)
 
-    feature = '=' * 20
+    feature = '=' * feature_str_len
 
     print_table_row(outfile, feature, line)
 
diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index 299078f..0135c0c 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -3,7 +3,8 @@
 ;
 ; This file defines the features that are valid for inclusion in
 ; the other driver files and also the order that they appear in
-; the features table in the documentation.
+; the features table in the documentation. The feature description
+; string should not exceed feature_str_len defined in conf.py.
 ;
 [Features]
 Speed capabilities   =
@@ -11,6 +12,7 @@ Link status          =
 Link status event    =
 Queue status event   =
 Rx interrupt         =
+Free Tx mbuf on demand =
 Queue start/stop     =
 MTU update           =
 Jumbo frame          =
diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index d4c92ea..e714bbe 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -249,6 +249,39 @@ One descriptor in the TX ring is used as a sentinel to avoid a hardware race con
 
     When configuring for DCB operation, at port initialization, both the number of transmit queues and the number of receive queues must be set to 128.
 
+Free Tx mbuf on Demand
+~~~~~~~~~~~~~~~~~~~~~~
+
+Many of the drivers don't release the mbuf back to the mempool, or local cache, immediately after the packet has been
+transmitted.
+Instead, they leave the mbuf in their Tx ring and either perform a bulk release when the ``tx_rs_thresh`` has been
+crossed or free the mbuf when a slot in the Tx ring is needed.
+
+An application can request the driver to release used mbufs with the ``rte_eth_tx_done_cleanup()`` API.
+This API requests the driver to release mbufs that are no longer in use, independent of whether or not the
+``tx_rs_thresh`` has been crossed.
+There are two scenarios when an application may want the mbuf released immediately:
+
+* When a given packet needs to be sent to multiple destination interfaces (either for Layer 2 flooding or Layer 3
+  multi-cast).
+  One option is to make a copy of the packet or a copy of the header portion that needs to be manipulated.
+  A second option is to transmit the packet and then poll the ``rte_eth_tx_done_cleanup()`` API until the reference
+  count on the packet is decremented.
+  Then the same packet can be transmitted to the next destination interface.
+  The application is still responsible for managing any packet manipulations needed between the different destination
+  interfaces, but a packet copy can be avoided.
+  This API does not depend on whether the packet was transmitted or dropped, only on the mbuf no longer being in use by
+  the interface.
+
+* Some applications are designed to make multiple runs, like a packet generator.
+  For performance reasons and consistency between runs, the application may want to reset back to an initial state
+  between each run, where all mbufs are returned to the mempool.
+  In this case, it can call the ``rte_eth_tx_done_cleanup()`` API for each destination interface it has been using
+  to request it to release all of its used mbufs.
+
+To determine if a driver supports this API, check for the *Free Tx mbuf on demand* feature in the *Network Interface
+Controller Drivers* document.
+
 Hardware Offload
 ~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst
index 918f483..25ae319 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -49,6 +49,12 @@ New Features
 
   sPAPR IOMMU based pci probing enabled for vfio-pci devices.
 
+* **Added free Tx mbuf on demand API.**
+
+  Added a new function ``rte_eth_tx_done_cleanup()`` which allows an application
+  to request the driver to release mbufs from their Tx ring that are no longer
+  in use, independent of whether or not the ``tx_rs_thresh`` has been crossed.
+
 Resolved Issues
 ---------------
 
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index eb0a94a..b796e7d 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1275,6 +1275,20 @@ rte_eth_tx_buffer_init(struct rte_eth_dev_tx_buffer *buffer, uint16_t size)
 	return ret;
 }
 
+int
+rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt)
+{
+	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+
+	/* Validate Input Data. Bail if not valid or not supported. */
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_done_cleanup, -ENOTSUP);
+
+	/* Call driver to free pending mbufs. */
+	return (*dev->dev_ops->tx_done_cleanup)(dev->data->tx_queues[queue_id],
+			free_cnt);
+}
+
 void
 rte_eth_promiscuous_enable(uint8_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 4be217c..b3ee872 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1183,6 +1183,9 @@ typedef int (*eth_fw_version_get_t)(struct rte_eth_dev *dev,
 				     char *fw_version, size_t fw_size);
 /**< @internal Get firmware information of an Ethernet device. */
 
+typedef int (*eth_tx_done_cleanup_t)(void *txq, uint32_t free_cnt);
+/**< @internal Force mbufs to be freed from TX ring. */
+
 typedef void (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
 	uint16_t rx_queue_id, struct rte_eth_rxq_info *qinfo);
 
@@ -1488,6 +1491,7 @@ struct eth_dev_ops {
 	eth_rx_disable_intr_t      rx_queue_intr_disable; /**< Disable Rx queue interrupt. */
 	eth_tx_queue_setup_t       tx_queue_setup;/**< Set up device TX queue. */
 	eth_queue_release_t        tx_queue_release; /**< Release TX queue. */
+	eth_tx_done_cleanup_t      tx_done_cleanup;/**< Free tx ring mbufs */
 
 	eth_dev_led_on_t           dev_led_on;    /**< Turn on LED. */
 	eth_dev_led_off_t          dev_led_off;   /**< Turn off LED. */
@@ -3178,6 +3182,33 @@ rte_eth_tx_buffer_count_callback(struct rte_mbuf **pkts, uint16_t unsent,
 		void *userdata);
 
 /**
+ * Request the driver to free mbufs currently cached by the driver. The
+ * driver will only free the mbuf if it is no longer in use. It is the
+ * application's responsibility to ensure rte_eth_tx_buffer_flush(..) is
+ * called if needed.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the transmit queue whose consumed mbufs are to be
+ *   freed.
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param free_cnt
+ *   Maximum number of packets to free. Use 0 to indicate all possible packets
+ *   should be freed. Note that a packet may be using multiple mbufs.
+ * @return
+ *   Failure: < 0
+ *     -ENODEV: Invalid interface
+ *     -ENOTSUP: Driver does not support function
+ *   Success: >= 0
+ *     0-n: Number of packets freed. More packets may still remain in the ring
+ *     that are in use.
+ */
+int
+rte_eth_tx_done_cleanup(uint8_t port_id, uint16_t queue_id, uint32_t free_cnt);
+
+/**
  * The eth device event type for interrupt, and maybe others in the future.
  */
 enum rte_eth_event_type {
-- 
2.9.3


