[dpdk-dev] [PATCH v2] examples/skeleton-cat: PQoS CAT and CDP, example of libpqos usage

Wojciech Andralojc wojciechx.andralojc at intel.com
Fri Feb 26 16:07:13 CET 2016


Following feedback received off the mailing list that extending EAL commands
is not an option, because CAT is specific to Intel Architecture,
we have changed the design of the PQoS patch.

The current V2 patch implements sample code, based on the DPDK skeleton
example app, that links against the existing 01.org PQoS library
(https://github.com/01org/intel-cmt-cat).
This eliminates the need for librte_pqos and EAL extensions introduced in
the V1 patch. The sample code implements a C module that parses
the application specific part of the command line with CAT configuration
options (--l3ca, same format as the V1 patch EAL command, but expects
CPU ids rather than lcores).
The module is easy to re-use in other applications as needed.

Signed-off-by: Wojciech Andralojc <wojciechx.andralojc at intel.com>
Signed-off-by: Tomasz Kantecki <tomasz.kantecki at intel.com>
Signed-off-by: Marcel D Cornu <marcel.d.cornu at intel.com>
---
Version 2:
* Added signal handlers to do clean-up on SIGINT and SIGTERM
* Clean-up function modified to zero globals (needed for testing)
* Init function modified to return more applicable errnos

Version 1:
* Initial version
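
As an illustration of the Version 2 signal handling, the sketch below shows
one common pattern: the handler only sets a flag, and the clean-up then runs
from normal (non-signal) context. This is an assumption for illustration, not
the handler from cat.c, which is not reproduced in this description; the names
stop_requested, on_signal and install_stop_handlers are hypothetical.

```c
#include <signal.h>

/* Hypothetical flag; a forwarding loop would poll it and then clean up. */
static volatile sig_atomic_t stop_requested;

/* Signal handler: only records that a stop was requested. */
static void
on_signal(int signum)
{
    (void)signum;
    stop_requested = 1;
}

/* Install the handler for SIGINT and SIGTERM; returns 0 on success. */
static int
install_stop_handlers(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (signal(SIGTERM, on_signal) == SIG_ERR)
        return -1;
    return 0;
}
```

With this pattern the main loop would test stop_requested on each iteration
and call the CAT clean-up function before returning.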

Details of "--l3ca" app parameter to configure Intel CAT and CDP features:
--l3ca=bitmask@<cpu_list>
--l3ca=(code_bitmask,data_bitmask)@<cpu_list>

makes the selected CPUs use the specified CAT bitmasks; bitmasks must be
expressed in hexadecimal form
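
The two forms above could be split into masks and a CPU list roughly as
follows. This is a minimal sketch, not the parser from cat.c: struct
l3ca_masks, parse_hex and parse_l3ca_masks are hypothetical names, and
rejecting a zero mask is an assumption of the sketch.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type; not the struct cat_config used by the patch. */
struct l3ca_masks {
    int cdp;            /* 1 if the (code,data) form was used */
    uint64_t mask;      /* common mask, valid when cdp == 0 */
    uint64_t code_mask; /* valid when cdp == 1 */
    uint64_t data_mask; /* valid when cdp == 1 */
};

/* Parse one hexadecimal mask at *s and advance *s past it. */
static int
parse_hex(const char **s, uint64_t *out)
{
    char *end;
    unsigned long long v;

    /* Reject leading blanks and signs, which strtoull() would accept. */
    if (!isxdigit((unsigned char)**s))
        return -1;

    errno = 0;
    v = strtoull(*s, &end, 16);   /* base 16 accepts an optional "0x" */
    if (errno != 0 || end == *s || v == 0)
        return -1;                /* assumption: a zero mask is invalid */

    *out = v;
    *s = end;
    return 0;
}

/*
 * Parse "mask@..." or "(code,data)@...".
 * Returns a pointer to the CPU list after '@', or NULL on a syntax error.
 */
static const char *
parse_l3ca_masks(const char *s, struct l3ca_masks *m)
{
    if (*s == '(') {
        s++;
        if (parse_hex(&s, &m->code_mask) != 0 || *s++ != ',')
            return NULL;
        if (parse_hex(&s, &m->data_mask) != 0 || *s++ != ')')
            return NULL;
        m->cdp = 1;
    } else {
        if (parse_hex(&s, &m->mask) != 0)
            return NULL;
        m->cdp = 0;
    }
    return (*s == '@') ? s + 1 : NULL;
}
```

The returned pointer would then be handed to a CPU list parser, for example
the parse_set() function in cat.c.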

CAT and CDP features allow management of the CPU's last level cache.
CAT introduces classes of service (COS) that are essentially bitmasks.
In current CAT implementations, a bit in a COS bitmask corresponds to
one cache way in the last level cache.
A CPU core is always assigned to one of the CAT classes.
By programming CPU core assignment and COS bitmasks, applications can be
given exclusive, shared, or mixed access to the CPU's last level cache.
CDP extends CAT so that there are two bitmasks per COS,
one for data and one for code.
The number of classes and the number of valid bits in a COS bitmask are CPU
model specific, and COS bitmasks need to be contiguous. Sample code calls
this bitmask a cbm or a capacity bitmask.
By default, after reset, all CPU cores are assigned to COS 0 and all
classes are programmed to allow fill into all cache ways.
CDP is off by default.
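
A minimal sketch of the bitmask arithmetic described above, assuming 64-bit
masks. The helpers cbm_is_contiguous, cbm_ways, cbm_shared and cbm_exclusive
are hypothetical and are not part of libpqos or of this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the set bits of cbm form exactly one contiguous run. */
static bool
cbm_is_contiguous(uint64_t cbm)
{
    uint64_t lowest;

    if (cbm == 0)
        return false;

    /*
     * Adding the lowest set bit carries through the lowest run of ones.
     * Any bit still shared with the original belongs to a second run.
     */
    lowest = cbm & (~cbm + 1);
    return (cbm & (cbm + lowest)) == 0;
}

/* Number of cache ways selected by a mask (one bit per way). */
static unsigned
cbm_ways(uint64_t cbm)
{
    unsigned count = 0;

    for (; cbm != 0; count++)
        cbm &= cbm - 1;
    return count;
}

/* Ways that two classes can both fill into. */
static uint64_t
cbm_shared(uint64_t a, uint64_t b)
{
    return a & b;
}

/* Ways that class a can fill into and class b cannot. */
static uint64_t
cbm_exclusive(uint64_t a, uint64_t b)
{
    return a & ~b;
}
```

For example, with the illustrative masks 0x00F00 and 0x0FF00, cbm_shared()
gives 0x00F00 (4 ways) and cbm_exclusive(0x0FF00, 0x00F00) gives 0x0F000
(4 ways).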

For more information about CAT please see
https://github.com/01org/intel-cmt-cat

Known issues and limitations:
- --l3ca must be the first app parameter
---
 MAINTAINERS                               |   4 +
 doc/guides/sample_app_ug/index.rst        |   1 +
 doc/guides/sample_app_ug/skeleton-cat.rst | 461 ++++++++++++++
 examples/Makefile                         |   1 +
 examples/skeleton-cat/Makefile            |  68 ++
 examples/skeleton-cat/basicfwd-cat.c      | 220 +++++++
 examples/skeleton-cat/cat.c               | 992 ++++++++++++++++++++++++++++++
 examples/skeleton-cat/cat.h               |  72 +++
 8 files changed, 1819 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/skeleton-cat.rst
 create mode 100644 examples/skeleton-cat/Makefile
 create mode 100644 examples/skeleton-cat/basicfwd-cat.c
 create mode 100644 examples/skeleton-cat/cat.c
 create mode 100644 examples/skeleton-cat/cat.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 628bc05..7a6702b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -600,3 +600,7 @@ F: doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst
 M: Pablo de Lara <pablo.de.lara.guarch at intel.com>
 M: Daniel Mrzyglod <danielx.t.mrzyglod at intel.com>
 F: examples/ptpclient/
+
+M: Tomasz Kantecki <tomasz.kantecki at intel.com>
+F: examples/skeleton-cat/
+F: doc/guides/sample_app_ug/skeleton-cat.rst
\ No newline at end of file
diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst
index 8a646dd..f065e54 100644
--- a/doc/guides/sample_app_ug/index.rst
+++ b/doc/guides/sample_app_ug/index.rst
@@ -41,6 +41,7 @@ Sample Applications User Guide
     exception_path
     hello_world
     skeleton
+    skeleton-cat
     rxtx_callbacks
     ip_frag
     ipv4_multicast
diff --git a/doc/guides/sample_app_ug/skeleton-cat.rst b/doc/guides/sample_app_ug/skeleton-cat.rst
new file mode 100644
index 0000000..cc174fc
--- /dev/null
+++ b/doc/guides/sample_app_ug/skeleton-cat.rst
@@ -0,0 +1,461 @@
+..  BSD LICENSE
+    Copyright(c) 2016 Intel Corporation. All rights reserved.
+    All rights reserved.
+
+    Redistribution and use in source and binary forms, with or without
+    modification, are permitted provided that the following conditions
+    are met:
+
+    * Redistributions of source code must retain the above copyright
+    notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in
+    the documentation and/or other materials provided with the
+    distribution.
+    * Neither the name of Intel Corporation nor the names of its
+    contributors may be used to endorse or promote products derived
+    from this software without specific prior written permission.
+
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+Cache Allocation Technology (CAT) enabled Basic Forwarding Sample Application
+=============================================================================
+
+The Basic Forwarding sample application is a simple *skeleton* example of
+a forwarding application. It has been extended to make use of CAT via extended
+command line options and linking against the libpqos library.
+
+It is intended as a demonstration of the basic components of a DPDK forwarding
+application and use of the libpqos library to program CAT.
+For more detailed implementations see the L2 and L3 forwarding
+sample applications.
+
+CAT and CDP features allow management of the CPU's last level cache.
+CAT introduces classes of service (COS) that are essentially bitmasks.
+In current CAT implementations, a bit in a COS bitmask corresponds to
+one cache way in last level cache.
+A CPU core is always assigned to one of the CAT classes.
+By programming CPU core assignment and COS bitmasks, applications can be given
+exclusive, shared, or mixed access to the CPU's last level cache.
+CDP extends CAT so that there are two bitmasks per COS,
+one for data and one for code.
+The number of classes and the number of valid bits in a COS bitmask are CPU
+model specific, and COS bitmasks need to be contiguous. Sample code calls this
+bitmask ``cbm`` or capacity bitmask.
+By default, after reset, all CPU cores are assigned to COS 0 and all classes
+are programmed to allow fill into all cache ways.
+CDP is off by default.
+
+For more information about CAT please see https://github.com/01org/intel-cmt-cat
+
+
+Compiling the Application
+-------------------------
+
+The application requires ``libpqos`` from Intel's
+`intel-cmt-cat software package <https://github.com/01org/intel-cmt-cat>`_,
+hosted on GitHub. For installation notes, see the package's ``README`` file.
+
+GIT:
+
+* https://github.com/01org/intel-cmt-cat
+
+To compile the application, export the path to the DPDK source tree and,
+if libpqos is not installed in the default location (``/usr/local``), the path
+to the PQoS library, then go to the example directory:
+
+.. code-block:: console
+
+    export PQOS_INSTALL_PATH=/path/to/libpqos
+    export RTE_SDK=/path/to/rte_sdk
+
+    cd ${RTE_SDK}/examples/skeleton-cat
+
+Set the target, for example:
+
+.. code-block:: console
+
+    export RTE_TARGET=x86_64-native-linuxapp-gcc
+
+See the *DPDK Getting Started* Guide for possible ``RTE_TARGET`` values.
+
+Build the application as follows:
+
+.. code-block:: console
+
+    make
+
+
+Running the Application
+-----------------------
+
+To run the example in a ``linuxapp`` environment and enable CAT on cpus 0-2:
+
+.. code-block:: console
+
+    ./build/basicfwd-cat -c 2 -n 4 -- --l3ca="0x3@(0-2)"
+
+or to enable CAT and CDP on cpus 1,3:
+
+.. code-block:: console
+
+    ./build/basicfwd-cat -c 2 -n 4 -- --l3ca="(0x00C00,0x00300)@(1,3)"
+
+The option to enable CAT is:
+
+* ``--l3ca='<common_cbm@cpus>[,<(code_cbm,data_cbm)@cpus>...]'``:
+
+  where ``cbm`` stands for capacity bitmask and must be expressed in
+  hexadecimal form.
+
+  ``common_cbm`` is a single mask. For a CDP enabled system, a group of two
+  masks (``code_cbm`` and ``data_cbm``) is used instead.
+
+  ``(`` and ``)`` are necessary if it's a group.
+
+  ``cpus`` could be a single digit/range or a group and must be expressed in
+  decimal form.
+
+  ``(`` and ``)`` are necessary if it's a group.
+
+  e.g. ``--l3ca='0x00F00@(1,3),0x0FF00@(4-6),0xF0000@7'``
+
+  * cpus 1 and 3 share their 4 ways with cpus 4, 5 and 6;
+
+  * cpus 4, 5 and 6 share half (4 out of 8 ways) of their L3 with cpus 1 and 3;
+
+  * cpus 4, 5 and 6 have exclusive access to the other 4 of their 8 ways;
+
+  * cpu 7 has exclusive access to all of its 4 ways;
+
+  e.g. ``--l3ca='(0x00C00,0x00300)@(1,3)'`` for CDP enabled system
+
+  * cpus 1 and 3 have access to 2 ways for code and 2 ways for data; the code
+    and data ways do not overlap.
+
+
+Refer to *DPDK Getting Started Guide* for general information on running
+applications and the Environment Abstraction Layer (EAL) options.
+
+
+To reset or list the CAT configuration and to control CDP, use the ``pqos``
+tool from Intel's
+`intel-cmt-cat software package <https://github.com/01org/intel-cmt-cat>`_.
+
+To enable or disable CDP:
+
+.. code-block:: console
+
+    sudo ./pqos -S cdp-on
+
+    sudo ./pqos -S cdp-off
+
+To reset the CAT configuration:
+
+.. code-block:: console
+
+    sudo ./pqos -R
+
+To list the CAT configuration:
+
+.. code-block:: console
+
+    sudo ./pqos -s
+
+For more information about the ``pqos`` tool, see its man page or the
+`intel-cmt-cat wiki <https://github.com/01org/intel-cmt-cat/wiki>`_.
+
+
+Explanation
+-----------
+
+The following sections provide an explanation of the main components of the
+code.
+
+All DPDK library functions used in the sample code are prefixed with ``rte_``
+and are explained in detail in the *DPDK API Documentation*.
+
+
+The Main Function
+~~~~~~~~~~~~~~~~~
+
+The ``main()`` function performs the initialization and calls the execution
+threads for each lcore.
+
+The first task is to initialize the Environment Abstraction Layer (EAL).  The
+``argc`` and ``argv`` arguments are provided to the ``rte_eal_init()``
+function. The value returned is the number of parsed arguments:
+
+.. code-block:: c
+
+    int ret = rte_eal_init(argc, argv);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");
+
+The next task is to initialize the PQoS library and configure CAT. The
+``argc`` and ``argv`` arguments are provided to the ``cat_init()``
+function. The value returned is the number of parsed arguments:
+
+.. code-block:: c
+
+    int ret = cat_init(argc, argv);
+    if (ret < 0)
+        rte_exit(EXIT_FAILURE, "PQOS: L3CA init failed!\n");
+
+``cat_init()`` is a wrapper function which parses the command line, validates
+the requested parameters and configures CAT accordingly.
+
+Parsing of command line arguments is done in ``parse_args(...)``.
+libpqos is then initialized with the ``pqos_init(...)`` call. Next, libpqos is
+queried for system CPU information and L3CA capabilities via
+``pqos_cap_get(...)`` and ``pqos_cap_get_type(..., PQOS_CAP_TYPE_L3CA, ...)``
+calls. When all capability and topology information is collected, the requested
+CAT configuration is validated. A check is then performed (on a per-socket basis)
+for a sufficient number of unassociated COS. COS are selected and
+configured via the ``pqos_l3ca_set(...)`` call. Finally, COS are associated to
+relevant CPUs via ``pqos_l3ca_assoc_set(...)`` calls.
+
+``atexit(...)`` is used to register ``cat_exit(...)`` to be called on
+a clean exit. ``cat_exit(...)`` performs a simple CAT clean-up, by associating
+COS 0 to all involved CPUs via ``pqos_l3ca_assoc_set(...)`` calls.
+
+The ``main()`` also allocates a mempool to hold the mbufs (Message Buffers)
+used by the application:
+
+.. code-block:: c
+
+    mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL",
+                                        NUM_MBUFS * nb_ports,
+                                        MBUF_CACHE_SIZE,
+                                        0,
+                                        RTE_MBUF_DEFAULT_BUF_SIZE,
+                                        rte_socket_id());
+
+    if (mbuf_pool == NULL)
+        rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+
+Mbufs are the packet buffer structures used by DPDK. They are explained in
+detail in the "Mbuf Library" section of the *DPDK Programmer's Guide*.
+
+The ``main()`` function also initializes all the ports using the user defined
+``port_init()`` function which is explained in the next section:
+
+.. code-block:: c
+
+    for (portid = 0; portid < nb_ports; portid++) {
+        if (port_init(portid, mbuf_pool) != 0) {
+            rte_exit(EXIT_FAILURE,
+                     "Cannot init port %" PRIu8 "\n", portid);
+        }
+    }
+
+
+Once the initialization is complete, the application is ready to launch a
+function on an lcore. In this example ``lcore_main()`` is called on a single
+lcore.
+
+
+.. code-block:: c
+
+    lcore_main();
+
+The ``lcore_main()`` function is explained below.
+
+
+
+The Port Initialization Function
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The main functional part of the port initialization used in the Basic
+Forwarding application is shown below:
+
+.. code-block:: c
+
+    static inline int
+    port_init(uint8_t port, struct rte_mempool *mbuf_pool)
+    {
+        struct rte_eth_conf port_conf = port_conf_default;
+        const uint16_t rx_rings = 1, tx_rings = 1;
+        struct ether_addr addr;
+        int retval;
+        uint16_t q;
+
+        if (port >= rte_eth_dev_count())
+            return -1;
+
+        /* Configure the Ethernet device. */
+        retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
+        if (retval != 0)
+            return retval;
+
+        /* Allocate and set up 1 RX queue per Ethernet port. */
+        for (q = 0; q < rx_rings; q++) {
+            retval = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
+                    rte_eth_dev_socket_id(port), NULL, mbuf_pool);
+            if (retval < 0)
+                return retval;
+        }
+
+        /* Allocate and set up 1 TX queue per Ethernet port. */
+        for (q = 0; q < tx_rings; q++) {
+            retval = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
+                    rte_eth_dev_socket_id(port), NULL);
+            if (retval < 0)
+                return retval;
+        }
+
+        /* Start the Ethernet port. */
+        retval = rte_eth_dev_start(port);
+        if (retval < 0)
+            return retval;
+
+        /* Enable RX in promiscuous mode for the Ethernet device. */
+        rte_eth_promiscuous_enable(port);
+
+        return 0;
+    }
+
+The Ethernet ports are configured with default settings using the
+``rte_eth_dev_configure()`` function and the ``port_conf_default`` struct:
+
+.. code-block:: c
+
+    static const struct rte_eth_conf port_conf_default = {
+        .rxmode = { .max_rx_pkt_len = ETHER_MAX_LEN }
+    };
+
+For this example the ports are set up with 1 RX and 1 TX queue using the
+``rte_eth_rx_queue_setup()`` and ``rte_eth_tx_queue_setup()`` functions.
+
+The Ethernet port is then started:
+
+.. code-block:: c
+
+        retval  = rte_eth_dev_start(port);
+
+
+Finally the RX port is set in promiscuous mode:
+
+.. code-block:: c
+
+        rte_eth_promiscuous_enable(port);
+
+
+The Lcores Main
+~~~~~~~~~~~~~~~
+
+As we saw above the ``main()`` function calls an application function on the
+available lcores. For the Basic Forwarding application the lcore function
+looks like the following:
+
+.. code-block:: c
+
+    static __attribute__((noreturn)) void
+    lcore_main(void)
+    {
+        const uint8_t nb_ports = rte_eth_dev_count();
+        uint8_t port;
+
+        /*
+         * Check that the port is on the same NUMA node as the polling thread
+         * for best performance.
+         */
+        for (port = 0; port < nb_ports; port++)
+            if (rte_eth_dev_socket_id(port) >= 0 &&
+                    rte_eth_dev_socket_id(port) !=
+                            (int)rte_socket_id())
+                printf("WARNING, port %u is on remote NUMA node to "
+                        "polling thread.\n\tPerformance will "
+                        "not be optimal.\n", port);
+
+        printf("\nCore %u forwarding packets. [Ctrl+C to quit]\n",
+                rte_lcore_id());
+
+        /* Run until the application is quit or killed. */
+        for (;;) {
+            /*
+             * Receive packets on a port and forward them on the paired
+             * port. The mapping is 0 -> 1, 1 -> 0, 2 -> 3, 3 -> 2, etc.
+             */
+            for (port = 0; port < nb_ports; port++) {
+
+                /* Get burst of RX packets, from first port of pair. */
+                struct rte_mbuf *bufs[BURST_SIZE];
+                const uint16_t nb_rx = rte_eth_rx_burst(port, 0,
+                        bufs, BURST_SIZE);
+
+                if (unlikely(nb_rx == 0))
+                    continue;
+
+                /* Send burst of TX packets, to second port of pair. */
+                const uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0,
+                        bufs, nb_rx);
+
+                /* Free any unsent packets. */
+                if (unlikely(nb_tx < nb_rx)) {
+                    uint16_t buf;
+                    for (buf = nb_tx; buf < nb_rx; buf++)
+                        rte_pktmbuf_free(bufs[buf]);
+                }
+            }
+        }
+    }
+
+
+The main work of the application is done within the loop:
+
+.. code-block:: c
+
+        for (;;) {
+            for (port = 0; port < nb_ports; port++) {
+
+                /* Get burst of RX packets, from first port of pair. */
+                struct rte_mbuf *bufs[BURST_SIZE];
+                const uint16_t nb_rx = rte_eth_rx_burst(port, 0,
+                        bufs, BURST_SIZE);
+
+                if (unlikely(nb_rx == 0))
+                    continue;
+
+                /* Send burst of TX packets, to second port of pair. */
+                const uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0,
+                        bufs, nb_rx);
+
+                /* Free any unsent packets. */
+                if (unlikely(nb_tx < nb_rx)) {
+                    uint16_t buf;
+                    for (buf = nb_tx; buf < nb_rx; buf++)
+                        rte_pktmbuf_free(bufs[buf]);
+                }
+            }
+        }
+
+Packets are received in bursts on the RX ports and transmitted in bursts on
+the TX ports. The ports are grouped in pairs with a simple mapping scheme
+using an XOR on the port number::
+
+    0 -> 1
+    1 -> 0
+
+    2 -> 3
+    3 -> 2
+
+    etc.
+
+The ``rte_eth_tx_burst()`` function frees the memory buffers of packets that
+are transmitted. If packets fail to transmit, ``(nb_tx < nb_rx)``, then they
+must be freed explicitly using ``rte_pktmbuf_free()``.
+
+The forwarding loop can be interrupted and the application closed using
+``Ctrl-C``.
diff --git a/examples/Makefile b/examples/Makefile
index 1665df1..10168a9 100644
--- a/examples/Makefile
+++ b/examples/Makefile
@@ -74,6 +74,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += qos_sched
 DIRS-y += quota_watermark
 DIRS-$(CONFIG_RTE_ETHDEV_RXTX_CALLBACKS) += rxtx_callbacks
 DIRS-y += skeleton
+DIRS-y += skeleton-cat
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += tep_termination
 DIRS-$(CONFIG_RTE_LIBRTE_TIMER) += timer
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += vhost
diff --git a/examples/skeleton-cat/Makefile b/examples/skeleton-cat/Makefile
new file mode 100644
index 0000000..e3ef642
--- /dev/null
+++ b/examples/skeleton-cat/Makefile
@@ -0,0 +1,68 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+# Default location of PQoS library and includes,
+# can be overridden by command line or environment
+PQOS_INSTALL_PATH ?= /usr/local/
+PQOS_LIBRARY_PATH = $(PQOS_INSTALL_PATH)/lib/libpqos.a
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = basicfwd-cat
+
+# all source are stored in SRCS-y
+SRCS-y := basicfwd-cat.c cat.c
+
+CFLAGS += $(WERROR_FLAGS)
+
+# workaround for a gcc bug with noreturn attribute
+# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603
+ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
+CFLAGS_basicfwd-cat.o += -Wno-return-type
+endif
+
+EXTRA_CFLAGS += -O3 -g -Wfatal-errors
+
+CFLAGS += -I$(PQOS_INSTALL_PATH)/include
+CFLAGS_cat.o := -D_GNU_SOURCE
+
+LDLIBS += -L$(PQOS_INSTALL_PATH)/lib
+LDLIBS += $(PQOS_LIBRARY_PATH)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/skeleton-cat/basicfwd-cat.c b/examples/skeleton-cat/basicfwd-cat.c
new file mode 100644
index 0000000..cb68bef
--- /dev/null
+++ b/examples/skeleton-cat/basicfwd-cat.c
@@ -0,0 +1,220 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <inttypes.h>
+#include <rte_eal.h>
+#include <rte_ethdev.h>
+#include <rte_cycles.h>
+#include <rte_lcore.h>
+#include <rte_mbuf.h>
+
+#include "cat.h"
+
+#define RX_RING_SIZE 128
+#define TX_RING_SIZE 512
+
+#define NUM_MBUFS 8191
+#define MBUF_CACHE_SIZE 250
+#define BURST_SIZE 32
+
+static const struct rte_eth_conf port_conf_default = {
+	.rxmode = { .max_rx_pkt_len = ETHER_MAX_LEN }
+};
+
+/* basicfwd-cat.c: CAT enabled, basic DPDK skeleton forwarding example. */
+
+/*
+ * Initializes a given port using global settings and with the RX buffers
+ * coming from the mbuf_pool passed as a parameter.
+ */
+static inline int
+port_init(uint8_t port, struct rte_mempool *mbuf_pool)
+{
+	struct rte_eth_conf port_conf = port_conf_default;
+	const uint16_t rx_rings = 1, tx_rings = 1;
+	int retval;
+	uint16_t q;
+
+	if (port >= rte_eth_dev_count())
+		return -1;
+
+	/* Configure the Ethernet device. */
+	retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
+	if (retval != 0)
+		return retval;
+
+	/* Allocate and set up 1 RX queue per Ethernet port. */
+	for (q = 0; q < rx_rings; q++) {
+		retval = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
+				rte_eth_dev_socket_id(port), NULL, mbuf_pool);
+		if (retval < 0)
+			return retval;
+	}
+
+	/* Allocate and set up 1 TX queue per Ethernet port. */
+	for (q = 0; q < tx_rings; q++) {
+		retval = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
+				rte_eth_dev_socket_id(port), NULL);
+		if (retval < 0)
+			return retval;
+	}
+
+	/* Start the Ethernet port. */
+	retval = rte_eth_dev_start(port);
+	if (retval < 0)
+		return retval;
+
+	/* Display the port MAC address. */
+	struct ether_addr addr;
+	rte_eth_macaddr_get(port, &addr);
+	printf("Port %u MAC: %02" PRIx8 " %02" PRIx8 " %02" PRIx8
+			   " %02" PRIx8 " %02" PRIx8 " %02" PRIx8 "\n",
+			(unsigned)port,
+			addr.addr_bytes[0], addr.addr_bytes[1],
+			addr.addr_bytes[2], addr.addr_bytes[3],
+			addr.addr_bytes[4], addr.addr_bytes[5]);
+
+	/* Enable RX in promiscuous mode for the Ethernet device. */
+	rte_eth_promiscuous_enable(port);
+
+	return 0;
+}
+
+/*
+ * The lcore main. This is the main thread that does the work, reading from
+ * an input port and writing to an output port.
+ */
+static __attribute__((noreturn)) void
+lcore_main(void)
+{
+	const uint8_t nb_ports = rte_eth_dev_count();
+	uint8_t port;
+
+	/*
+	 * Check that the port is on the same NUMA node as the polling thread
+	 * for best performance.
+	 */
+	for (port = 0; port < nb_ports; port++)
+		if (rte_eth_dev_socket_id(port) >= 0 &&
+				rte_eth_dev_socket_id(port) !=
+						(int)rte_socket_id())
+			printf("WARNING, port %u is on remote NUMA node to "
+					"polling thread.\n\tPerformance will "
+					"not be optimal.\n", port);
+
+	printf("\nCore %u forwarding packets. [Ctrl+C to quit]\n",
+			rte_lcore_id());
+
+	/* Run until the application is quit or killed. */
+	for (;;) {
+		/*
+		 * Receive packets on a port and forward them on the paired
+		 * port. The mapping is 0 -> 1, 1 -> 0, 2 -> 3, 3 -> 2, etc.
+		 */
+		for (port = 0; port < nb_ports; port++) {
+
+			/* Get burst of RX packets, from first port of pair. */
+			struct rte_mbuf *bufs[BURST_SIZE];
+			const uint16_t nb_rx = rte_eth_rx_burst(port, 0,
+					bufs, BURST_SIZE);
+
+			if (unlikely(nb_rx == 0))
+				continue;
+
+			/* Send burst of TX packets, to second port of pair. */
+			const uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0,
+					bufs, nb_rx);
+
+			/* Free any unsent packets. */
+			if (unlikely(nb_tx < nb_rx)) {
+				uint16_t buf;
+				for (buf = nb_tx; buf < nb_rx; buf++)
+					rte_pktmbuf_free(bufs[buf]);
+			}
+		}
+	}
+}
+
+/*
+ * The main function, which does initialization and calls the per-lcore
+ * functions.
+ */
+int
+main(int argc, char *argv[])
+{
+	struct rte_mempool *mbuf_pool;
+	unsigned nb_ports;
+	uint8_t portid;
+
+	/* Initialize the Environment Abstraction Layer (EAL). */
+	int ret = rte_eal_init(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");
+
+	argc -= ret;
+	argv += ret;
+
+	ret = cat_init(argc, argv);
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "PQOS: L3CA init failed!\n");
+
+	argc -= ret;
+	argv += ret;
+
+	/* Check that there is an even number of ports to send/receive on. */
+	nb_ports = rte_eth_dev_count();
+	if (nb_ports < 2 || (nb_ports & 1))
+		rte_exit(EXIT_FAILURE, "Error: number of ports must be even\n");
+
+	/* Creates a new mempool in memory to hold the mbufs. */
+	mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
+		MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+
+	if (mbuf_pool == NULL)
+		rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");
+
+	/* Initialize all ports. */
+	for (portid = 0; portid < nb_ports; portid++)
+		if (port_init(portid, mbuf_pool) != 0)
+			rte_exit(EXIT_FAILURE, "Cannot init port %"PRIu8 "\n",
+					portid);
+
+	if (rte_lcore_count() > 1)
+		printf("\nWARNING: Too many lcores enabled. Only 1 used.\n");
+
+	/* Call lcore_main on the master core only. */
+	lcore_main();
+
+	return 0;
+}
diff --git a/examples/skeleton-cat/cat.c b/examples/skeleton-cat/cat.c
new file mode 100644
index 0000000..f594b4d
--- /dev/null
+++ b/examples/skeleton-cat/cat.c
@@ -0,0 +1,992 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <getopt.h>
+#include <inttypes.h>
+#include <limits.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+
+#include <rte_common.h>
+#include <rte_memcpy.h>
+
+#include <pqos.h>
+
+#include "cat.h"
+
+#define BITS_PER_HEX		4
+#define PQOS_MAX_SOCKETS	8
+#define PQOS_MAX_SOCKET_CORES	64
+#define PQOS_MAX_CORES		(PQOS_MAX_SOCKET_CORES * PQOS_MAX_SOCKETS)
+
+static const struct pqos_cap *m_cap;
+static const struct pqos_cpuinfo *m_cpu;
+static const struct pqos_capability *m_cap_l3ca;
+static unsigned m_sockets[PQOS_MAX_SOCKETS];
+static unsigned m_sock_count;
+static struct cat_config m_config[PQOS_MAX_CORES];
+static unsigned m_config_count;
+
+static unsigned
+bits_count(uint64_t bitmask)
+{
+	unsigned count = 0;
+
+	for (; bitmask != 0; count++)
+		bitmask &= bitmask - 1;
+
+	return count;
+}
+
+/*
+ * Parse one element: a single number, a range, or a '(' ')' group.
+ * 1) A single number is a plain decimal value, e.g. 9
+ * 2) A single range is two numbers with a '-' between them, e.g. 2-6
+ * 3) A group combines several of 1) or 2) inside '( )', e.g. (0,2-4,6)
+ *    Within a group, '-' separates the two ends of a range;
+ *                    ',' separates the individual entries.
+ */
+static int
+parse_set(const char *input, rte_cpuset_t *cpusetp)
+{
+	unsigned idx;
+	const char *str = input;
+	char *end = NULL;
+	unsigned min, max;
+	const unsigned num = PQOS_MAX_CORES;
+
+	CPU_ZERO(cpusetp);
+
+	while (isblank(*str))
+		str++;
+
+	/* only a digit or a left bracket qualifies as a start point */
+	if ((!isdigit(*str) && *str != '(') || *str == '\0')
+		return -1;
+
+	/* process single number or single range of number */
+	if (*str != '(') {
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		while (isblank(*end))
+			end++;
+
+		min = idx;
+		max = idx;
+		if (*end == '-') {
+			/* process single <number>-<number> */
+			end++;
+			while (isblank(*end))
+				end++;
+			if (!isdigit(*end))
+				return -1;
+
+			errno = 0;
+			idx = strtoul(end, &end, 10);
+			if (errno || end == NULL || idx >= num)
+				return -1;
+			max = idx;
+			while (isblank(*end))
+				end++;
+			if (*end != ',' && *end != '\0')
+				return -1;
+		}
+
+		if (*end != ',' && *end != '\0' && *end != '@')
+			return -1;
+
+		for (idx = RTE_MIN(min, max); idx <= RTE_MAX(min, max);
+				idx++)
+			CPU_SET(idx, cpusetp);
+
+		return end - input;
+	}
+
+	/* process set within bracket */
+	str++;
+	while (isblank(*str))
+		str++;
+	if (*str == '\0')
+		return -1;
+
+	min = PQOS_MAX_CORES;
+	do {
+
+		/* go ahead to the first digit */
+		while (isblank(*str))
+			str++;
+		if (!isdigit(*str))
+			return -1;
+
+		/* get the digit value */
+		errno = 0;
+		idx = strtoul(str, &end, 10);
+		if (errno || end == NULL || idx >= num)
+			return -1;
+
+		/* go ahead to separator '-',',' and ')' */
+		while (isblank(*end))
+			end++;
+		if (*end == '-') {
+			if (min == PQOS_MAX_CORES)
+				min = idx;
+			else /* avoid continuous '-' */
+				return -1;
+		} else if ((*end == ',') || (*end == ')')) {
+			max = idx;
+			if (min == PQOS_MAX_CORES)
+				min = idx;
+			for (idx = RTE_MIN(min, max); idx <= RTE_MAX(min, max);
+					idx++)
+				CPU_SET(idx, cpusetp);
+
+			min = PQOS_MAX_CORES;
+		} else
+			return -1;
+
+		str = end + 1;
+	} while (*end != '\0' && *end != ')');
+
+	return str - input;
+}
+
+/* Test if bitmask is contiguous */
+static int
+is_contiguous(uint64_t bitmask)
+{
+	/* check if bitmask is contiguous */
+	unsigned i = 0;
+	unsigned j = 0;
+	const unsigned max_idx = (sizeof(bitmask) * CHAR_BIT);
+
+	if (bitmask == 0)
+		return 0;
+
+	for (i = 0; i < max_idx; i++) {
+		if (((1ULL << i) & bitmask) != 0)
+			j++;
+		else if (j > 0)
+			break;
+	}
+
+	if (bits_count(bitmask) != j) {
+		printf("PQOS: mask 0x%llx is not contiguous.\n",
+			(unsigned long long)bitmask);
+		return 0;
+	}
+
+	return 1;
+}
+
+/*
+ * The format pattern: --l3ca='<cbm@cpus>[,<(ccbm,dcbm)@cpus>...]'
+ * cbm can be a single mask or, on a CDP enabled system, a group of two masks
+ * ("code cbm" and "data cbm").
+ * '(' and ')' are necessary if it's a group.
+ * cpus can be a single number/range or a group.
+ * '(' and ')' are necessary if it's a group.
+ *
+ * e.g. '0x00F00@(1,3), 0x0FF00@(4-6), 0xF0000@7'
+ * - CPUs 1 and 3 share their 4 ways with CPUs 4, 5 and 6;
+ * - CPUs 4, 5 and 6 share half (4 out of 8 ways) of their L3 with 1 and 3;
+ * - CPUs 4, 5 and 6 have exclusive access to 4 out of 8 ways;
+ * - CPU 7 has exclusive access to all of its 4 ways.
+ *
+ * e.g. '(0x00C00,0x00300)@(1,3)' for a CDP enabled system
+ * - CPUs 1 and 3 have access to 2 ways for code and 2 ways for data;
+ *   code and data ways are not overlapping.
+ */
+static int
+parse_l3ca(const char *l3ca)
+{
+	unsigned idx = 0;
+	const char *cbm_start = NULL;
+	char *cbm_end = NULL;
+	const char *end = NULL;
+	int offset;
+	rte_cpuset_t cpuset;
+	uint64_t mask = 0;
+	uint64_t cmask = 0;
+
+	if (l3ca == NULL)
+		goto err;
+
+	/* Get cbm */
+	do {
+		CPU_ZERO(&cpuset);
+		mask = 0;
+		cmask = 0;
+
+		while (isblank(*l3ca))
+			l3ca++;
+
+		if (*l3ca == '\0')
+			goto err;
+
+		/* record mask_set start point */
+		cbm_start = l3ca;
+
+		/* go across a complete bracket */
+		if (*cbm_start == '(') {
+			l3ca += strcspn(l3ca, ")");
+			if (*l3ca++ == '\0')
+				goto err;
+		}
+
+		/* scan the separator '@', ','(next) or '\0'(finish) */
+		l3ca += strcspn(l3ca, "@,");
+
+		if (*l3ca == '@') {
+			/* explicit assign cpu_set */
+			offset = parse_set(l3ca + 1, &cpuset);
+			if (offset < 0 || CPU_COUNT(&cpuset) == 0)
+				goto err;
+
+			end = l3ca + 1 + offset;
+		} else
+			goto err;
+
+		if (*end != ',' && *end != '\0')
+			goto err;
+
+		/* parse mask_set from start point */
+		if (*cbm_start == '(') {
+			cbm_start++;
+
+			while (isblank(*cbm_start))
+				cbm_start++;
+
+			if (!isxdigit(*cbm_start))
+				goto err;
+
+			errno = 0;
+			cmask = strtoull(cbm_start, &cbm_end, 16);
+			if (errno != 0 || cbm_end == NULL || cmask == 0)
+				goto err;
+
+			while (isblank(*cbm_end))
+				cbm_end++;
+
+			if (*cbm_end != ',')
+				goto err;
+
+			cbm_end++;
+
+			while (isblank(*cbm_end))
+				cbm_end++;
+
+			if (!isxdigit(*cbm_end))
+				goto err;
+
+			errno = 0;
+			mask = strtoull(cbm_end, &cbm_end, 16);
+			if (errno != 0 || cbm_end == NULL || mask == 0)
+				goto err;
+		} else {
+			while (isblank(*cbm_start))
+				cbm_start++;
+
+			if (!isxdigit(*cbm_start))
+				goto err;
+
+			errno = 0;
+			mask = strtoull(cbm_start, &cbm_end, 16);
+			if (errno != 0 || cbm_end == NULL || mask == 0)
+				goto err;
+
+		}
+
+		if (mask == 0 || is_contiguous(mask) == 0)
+			goto err;
+
+		if (cmask != 0 && is_contiguous(cmask) == 0)
+			goto err;
+
+		rte_memcpy(&m_config[idx].cpumask,
+			&cpuset, sizeof(rte_cpuset_t));
+
+		if (cmask != 0) {
+			m_config[idx].cdp = 1;
+			m_config[idx].code_mask = cmask;
+			m_config[idx].data_mask = mask;
+		} else
+			m_config[idx].mask = mask;
+
+		m_config_count++;
+
+		l3ca = end + 1;
+		idx++;
+	} while (*end != '\0' && idx < PQOS_MAX_CORES);
+
+	if (m_config_count == 0)
+		goto err;
+
+	return 0;
+
+err:
+	return -EINVAL;
+}
+
+static int
+check_cpus_overlapping(void)
+{
+	unsigned i = 0;
+	unsigned j = 0;
+	rte_cpuset_t mask;
+
+	CPU_ZERO(&mask);
+
+	for (i = 0; i < m_config_count; i++) {
+		for (j = i + 1; j < m_config_count; j++) {
+			CPU_AND(&mask,
+				&m_config[i].cpumask,
+				&m_config[j].cpumask);
+
+			if (CPU_COUNT(&mask) != 0) {
+				printf("PQOS: Requested CPUs sets are "
+					"overlapping.\n");
+				return -EINVAL;
+			}
+		}
+	}
+
+	return 0;
+}
+
+static int
+check_cpus(void)
+{
+	unsigned i = 0;
+	unsigned cpu_id = 0;
+	unsigned cos_id = 0;
+	int ret = 0;
+
+	for (i = 0; i < m_config_count; i++) {
+		for (cpu_id = 0; cpu_id < PQOS_MAX_CORES; cpu_id++) {
+			if (CPU_ISSET(cpu_id, &m_config[i].cpumask) != 0) {
+
+				ret = pqos_cpu_check_core(m_cpu, cpu_id);
+				if (ret != PQOS_RETVAL_OK) {
+					printf("PQOS: %u is not a valid "
+						"logical core id.\n", cpu_id);
+					ret = -ENODEV;
+					goto exit;
+				}
+
+				ret = pqos_l3ca_assoc_get(cpu_id, &cos_id);
+				if (ret != PQOS_RETVAL_OK) {
+					printf("PQOS: Failed to read COS "
+						"associated to cpu %u.\n",
+						cpu_id);
+					ret = -EFAULT;
+					goto exit;
+				}
+
+				/* Check if COS assigned to lcore is different
+				 * than the default one (#0) */
+				if (cos_id != 0) {
+					printf("PQOS: cpu %u has already "
+						"associated COS#%u. "
+						"Please reset L3CA.\n",
+						cpu_id, cos_id);
+					ret = -EBUSY;
+					goto exit;
+				}
+			}
+		}
+	}
+
+exit:
+	return ret;
+}
+
+static int
+check_cdp(void)
+{
+	unsigned i = 0;
+
+	for (i = 0; i < m_config_count; i++) {
+		if (m_config[i].cdp == 1 && m_cap_l3ca->u.l3ca->cdp_on == 0) {
+			if (m_cap_l3ca->u.l3ca->cdp == 0) {
+				printf("PQOS: CDP requested but not "
+					"supported.\n");
+			} else {
+				printf("PQOS: CDP requested but not enabled. "
+					"Please enable CDP.\n");
+			}
+			return -ENOTSUP;
+		}
+	}
+
+	return 0;
+}
+
+static int
+check_cbm_len_and_contention(void)
+{
+	unsigned i = 0;
+	uint64_t mask = 0;
+	const uint64_t not_cbm = (UINT64_MAX << (m_cap_l3ca->u.l3ca->num_ways));
+	const uint64_t cbm_contention_mask = m_cap_l3ca->u.l3ca->way_contention;
+	int ret = 0;
+
+	for (i = 0; i < m_config_count; i++) {
+		if (m_config[i].cdp == 1)
+			mask = m_config[i].code_mask | m_config[i].data_mask;
+		else
+			mask = m_config[i].mask;
+
+		if ((mask & not_cbm) != 0) {
+			printf("PQOS: One or more of requested CBM masks not "
+				"supported by system (too long).\n");
+			ret = -ENOTSUP;
+			break;
+		}
+
+		/* Just a warning */
+		if ((mask & cbm_contention_mask) != 0) {
+			printf("PQOS: One or more of requested CBM masks "
+				"overlap CBM contention mask.\n");
+			break;
+		}
+
+	}
+
+	return ret;
+}
+
+static int
+check_and_select_classes(unsigned cos_id_map[][PQOS_MAX_SOCKETS])
+{
+	unsigned i = 0;
+	unsigned j = 0;
+	unsigned phy_pkg_id = 0;
+	unsigned cos_id = 0;
+	unsigned cpu_id = 0;
+	unsigned phy_pkg_lcores[PQOS_MAX_SOCKETS][m_config_count];
+	const unsigned cos_num = m_cap_l3ca->u.l3ca->num_classes;
+	unsigned used_cos_table[PQOS_MAX_SOCKETS][cos_num];
+	int ret = 0;
+
+	memset(phy_pkg_lcores, 0, sizeof(phy_pkg_lcores));
+	memset(used_cos_table, 0, sizeof(used_cos_table));
+
+	/* detect currently used COS */
+	for (j = 0; j < m_cpu->num_cores; j++) {
+		cpu_id = m_cpu->cores[j].lcore;
+
+		ret = pqos_l3ca_assoc_get(cpu_id, &cos_id);
+		if (ret != PQOS_RETVAL_OK) {
+			printf("PQOS: Failed to read COS associated to "
+				"cpu %u.\n", cpu_id);
+			ret = -EFAULT;
+			goto exit;
+		}
+
+		ret = pqos_cpu_get_socketid(m_cpu, cpu_id, &phy_pkg_id);
+		if (ret != PQOS_RETVAL_OK) {
+			printf("PQOS: Failed to get socket for cpu %u\n",
+				cpu_id);
+			ret = -EFAULT;
+			goto exit;
+		}
+
+		/* Mark COS as used */
+		if (used_cos_table[phy_pkg_id][cos_id] == 0)
+			used_cos_table[phy_pkg_id][cos_id]++;
+	}
+
+	/* look for avail. COS to fulfill requested config */
+	for (i = 0; i < m_config_count; i++) {
+		for (j = 0; j < m_cpu->num_cores; j++) {
+			cpu_id = m_cpu->cores[j].lcore;
+			if (CPU_ISSET(cpu_id, &m_config[i].cpumask) == 0)
+				continue;
+
+			ret = pqos_cpu_get_socketid(m_cpu, cpu_id, &phy_pkg_id);
+			if (ret != PQOS_RETVAL_OK) {
+				printf("PQOS: Failed to get socket for "
+					"cpu %u\n", cpu_id);
+				ret = -EFAULT;
+				goto exit;
+			}
+
+			/* Check if we already have COS selected
+			 * to be used for that group on that socket */
+			if (phy_pkg_lcores[phy_pkg_id][i] != 0)
+				continue;
+
+			phy_pkg_lcores[phy_pkg_id][i]++;
+
+			/* Search for avail. COS to be used on that socket */
+			for (cos_id = 0; cos_id < cos_num; cos_id++) {
+				if (used_cos_table[phy_pkg_id][cos_id] == 0) {
+					used_cos_table[phy_pkg_id][cos_id]++;
+					cos_id_map[i][phy_pkg_id] = cos_id;
+					break;
+				}
+			}
+
+			/* If there is no COS available ...*/
+			if (cos_id == cos_num) {
+				ret = -E2BIG;
+				goto exit;
+			}
+		}
+	}
+
+exit:
+	if (ret == -E2BIG)
+		printf("PQOS: Not enough available COS to configure "
+			"requested configuration.\n");
+
+	return ret;
+}
+
+static int
+configure_cat(unsigned cos_id_map[][PQOS_MAX_SOCKETS])
+{
+	unsigned phy_pkg_id = 0;
+	unsigned cpu_id = 0;
+	unsigned cos_id = 0;
+	unsigned i = 0;
+	unsigned j = 0;
+	struct pqos_l3ca l3ca = {0};
+	int ret = 0;
+
+	for (i = 0; i < m_config_count; i++) {
+		memset(&l3ca, 0, sizeof(l3ca));
+
+		l3ca.cdp = m_config[i].cdp;
+		if (m_config[i].cdp == 1) {
+			l3ca.code_mask = m_config[i].code_mask;
+			l3ca.data_mask = m_config[i].data_mask;
+		} else
+			l3ca.ways_mask = m_config[i].mask;
+
+		for (j = 0; j < m_sock_count; j++) {
+			phy_pkg_id = m_sockets[j];
+			if (cos_id_map[i][phy_pkg_id] == 0)
+				continue;
+
+			l3ca.class_id = cos_id_map[i][phy_pkg_id];
+
+			ret = pqos_l3ca_set(phy_pkg_id, 1, &l3ca);
+			if (ret != PQOS_RETVAL_OK) {
+				printf("PQOS: Failed to set COS %u on "
+					"phy_pkg %u.\n", l3ca.class_id,
+					phy_pkg_id);
+				ret = -EFAULT;
+				goto exit;
+			}
+		}
+	}
+
+	for (i = 0; i < m_config_count; i++) {
+		for (j = 0; j < m_cpu->num_cores; j++) {
+			cpu_id = m_cpu->cores[j].lcore;
+			if (CPU_ISSET(cpu_id, &m_config[i].cpumask) == 0)
+				continue;
+
+			ret = pqos_cpu_get_socketid(m_cpu, cpu_id, &phy_pkg_id);
+			if (ret != PQOS_RETVAL_OK) {
+				printf("PQOS: Failed to get socket for "
+					"cpu %u\n", cpu_id);
+				ret = -EFAULT;
+				goto exit;
+			}
+
+			cos_id = cos_id_map[i][phy_pkg_id];
+
+			ret = pqos_l3ca_assoc_set(cpu_id, cos_id);
+			if (ret != PQOS_RETVAL_OK) {
+				printf("PQOS: Failed to associate COS %u to "
+					"cpu %u\n", cos_id, cpu_id);
+				ret = -EFAULT;
+				goto exit;
+			}
+		}
+	}
+
+exit:
+	return ret;
+}
+
+
+/* Parse the argument given in the command line of the application */
+static int
+parse_args(int argc, char **argv)
+{
+	int opt = 0;
+	int retval = 0;
+	int oldopterr = 0;
+	char **argvopt = argv;
+	char *prgname = argv[0];
+
+	static struct option lgopts[] = {
+		{ "l3ca", required_argument, 0, 0 },
+		{ NULL, 0, 0, 0 }
+	};
+
+	/* Disable printing messages within getopt() */
+	oldopterr = opterr;
+	opterr = 0;
+
+	opt = getopt_long(argc, argvopt, "", lgopts, NULL);
+	if (opt == 0) {
+		retval = parse_l3ca(optarg);
+		if (retval != 0) {
+			printf("PQOS: Invalid L3CA parameters!\n");
+			goto exit;
+		}
+
+		argv[optind - 1] = prgname;
+		retval = optind - 1;
+	} else
+		retval = 0;
+
+exit:
+	/* reset getopt lib */
+	optind = 0;
+
+	/* Restore opterr value */
+	opterr = oldopterr;
+
+	return retval;
+}
+
+static void
+print_cmd_line_config(void)
+{
+	char cpustr[PQOS_MAX_CORES * 3] = {0};
+	unsigned i = 0;
+	unsigned j = 0;
+
+	for (i = 0; i < m_config_count; i++) {
+		unsigned len = 0;
+		memset(cpustr, 0, sizeof(cpustr));
+
+		/* Generate CPU list */
+		for (j = 0; j < PQOS_MAX_CORES; j++) {
+			if (CPU_ISSET(j, &m_config[i].cpumask) == 0)
+				continue;
+
+			len += snprintf(cpustr + len, sizeof(cpustr) - len - 1,
+				"%u,", j);
+
+			if (len >= sizeof(cpustr) - 1)
+				break;
+		}
+
+		if (m_config[i].cdp == 1) {
+			printf("PQOS: CPUs: %s cMASK: 0x%llx, dMASK: "
+				"0x%llx\n", cpustr,
+				(unsigned long long)m_config[i].code_mask,
+				(unsigned long long)m_config[i].data_mask);
+		} else {
+			printf("PQOS: CPUs: %s MASK: 0x%llx\n", cpustr,
+					(unsigned long long)m_config[i].mask);
+		}
+	}
+}
+
+/**
+ * @brief Prints CAT configuration
+ */
+static void
+print_cat_config(void)
+{
+	int ret = PQOS_RETVAL_OK;
+	unsigned i = 0;
+
+	for (i = 0; i < m_sock_count; i++) {
+		struct pqos_l3ca tab[PQOS_MAX_L3CA_COS] = {{0} };
+		unsigned num = 0;
+		unsigned n = 0;
+
+		ret = pqos_l3ca_get(m_sockets[i], PQOS_MAX_L3CA_COS, &num, tab);
+		if (ret != PQOS_RETVAL_OK) {
+			printf("PQOS: Error retrieving COS!\n");
+			return;
+		}
+
+		printf("PQOS: COS definitions for Socket %u:\n", m_sockets[i]);
+		for (n = 0; n < num; n++) {
+			if (tab[n].cdp == 1) {
+				printf("PQOS: COS: %u, cMASK: 0x%llx, "
+					"dMASK: 0x%llx\n", tab[n].class_id,
+					(unsigned long long)tab[n].code_mask,
+					(unsigned long long)tab[n].data_mask);
+			} else {
+				printf("PQOS: COS: %u, MASK: 0x%llx\n",
+					tab[n].class_id,
+					(unsigned long long)tab[n].ways_mask);
+			}
+		}
+	}
+
+	for (i = 0; i < m_sock_count; i++) {
+		unsigned lcores[PQOS_MAX_SOCKET_CORES] = {0};
+		unsigned lcount = 0;
+		unsigned n = 0;
+
+		ret = pqos_cpu_get_cores(m_cpu, m_sockets[i],
+				PQOS_MAX_SOCKET_CORES, &lcount, &lcores[0]);
+		if (ret != PQOS_RETVAL_OK) {
+			printf("PQOS: Error retrieving core information!\n");
+			return;
+		}
+
+		printf("PQOS: CPU information for socket %u:\n", m_sockets[i]);
+		for (n = 0; n < lcount; n++) {
+			unsigned class_id = 0;
+
+			ret = pqos_l3ca_assoc_get(lcores[n], &class_id);
+			if (ret == PQOS_RETVAL_OK)
+				printf("PQOS: CPU: %u, COS: %u\n", lcores[n],
+					class_id);
+			else
+				printf("PQOS: CPU: %u, ERROR\n", lcores[n]);
+		}
+	}
+
+}
+
+static int
+cat_validate(void)
+{
+	int ret = 0;
+
+	ret = check_cpus();
+	if (ret != 0)
+		return ret;
+
+	ret = check_cdp();
+	if (ret != 0)
+		return ret;
+
+	ret = check_cbm_len_and_contention();
+	if (ret != 0)
+		return ret;
+
+	ret = check_cpus_overlapping();
+	if (ret != 0)
+		return ret;
+
+	return 0;
+}
+
+static int
+cat_set(void)
+{
+	int ret = 0;
+	unsigned cos_id_map[m_config_count][PQOS_MAX_SOCKETS];
+
+	memset(cos_id_map, 0, sizeof(cos_id_map));
+
+	ret = check_and_select_classes(cos_id_map);
+	if (ret != 0)
+		return ret;
+
+	ret = configure_cat(cos_id_map);
+	if (ret != 0)
+		return ret;
+
+	return 0;
+}
+
+static void
+cat_fini(void)
+{
+	int ret = 0;
+
+	printf("PQOS: Shutting down PQoS library...\n");
+
+	/* deallocate all the resources */
+	ret = pqos_fini();
+	if (ret != PQOS_RETVAL_OK && ret != PQOS_RETVAL_INIT)
+		printf("PQOS: Error shutting down PQoS library!\n");
+
+	m_cap = NULL;
+	m_cpu = NULL;
+	m_cap_l3ca = NULL;
+	memset(m_sockets, 0, sizeof(m_sockets));
+	m_sock_count = 0;
+	memset(m_config, 0, sizeof(m_config));
+	m_config_count = 0;
+}
+
+void
+cat_exit(void)
+{
+	unsigned i = 0;
+	unsigned j = 0;
+	unsigned cpu_id = 0;
+	int ret = 0;
+
+	/* if lib is not initialized, do nothing */
+	if (m_cap == NULL && m_cpu == NULL)
+		return;
+
+	printf("PQOS: Reverting CAT configuration...\n");
+
+	for (i = 0; i < m_config_count; i++) {
+		for (j = 0; j < m_cpu->num_cores; j++) {
+			cpu_id = m_cpu->cores[j].lcore;
+			if (CPU_ISSET(cpu_id, &m_config[i].cpumask) == 0)
+				continue;
+
+			ret = pqos_l3ca_assoc_set(cpu_id, 0);
+			if (ret != PQOS_RETVAL_OK) {
+				printf("PQOS: Failed to associate COS 0 to "
+					"cpu %u\n", cpu_id);
+			}
+		}
+	}
+
+	cat_fini();
+}
+
+static void
+signal_handler(int signum)
+{
+	if (signum == SIGINT || signum == SIGTERM) {
+		printf("\nPQOS: Signal %d received, preparing to exit...\n",
+				signum);
+
+		cat_exit();
+
+		/* exit with the expected status */
+		signal(signum, SIG_DFL);
+		kill(getpid(), signum);
+	}
+}
+
+int
+cat_init(int argc, char **argv)
+{
+	int ret = 0;
+	int args_num = 0;
+	struct pqos_config cfg = {0};
+
+	if (m_cap != NULL || m_cpu != NULL) {
+		printf("PQOS: CAT module already initialized!\n");
+		return -EEXIST;
+	}
+
+	/* Parse cmd line args */
+	ret = parse_args(argc, argv);
+
+	if (ret <= 0)
+		goto err;
+
+	args_num = ret;
+
+	/* Print cmd line configuration */
+	print_cmd_line_config();
+
+	/* PQoS Initialization - Check and initialize CAT capability */
+	cfg.fd_log = STDOUT_FILENO;
+	cfg.verbose = 0;
+	cfg.cdp_cfg = PQOS_REQUIRE_CDP_ANY;
+	ret = pqos_init(&cfg);
+	if (ret != PQOS_RETVAL_OK) {
+		printf("PQOS: Error initializing PQoS library!\n");
+		ret = -EFAULT;
+		goto err;
+	}
+
+	/* Get capability and CPU info pointer */
+	ret = pqos_cap_get(&m_cap, &m_cpu);
+	if (ret != PQOS_RETVAL_OK || m_cap == NULL || m_cpu == NULL) {
+		printf("PQOS: Error retrieving PQoS capabilities!\n");
+		ret = -EFAULT;
+		goto err;
+	}
+
+	/* Get L3CA capabilities */
+	ret = pqos_cap_get_type(m_cap, PQOS_CAP_TYPE_L3CA, &m_cap_l3ca);
+	if (ret != PQOS_RETVAL_OK || m_cap_l3ca == NULL) {
+		printf("PQOS: Error retrieving PQOS_CAP_TYPE_L3CA "
+			"capabilities!\n");
+		ret = -EFAULT;
+		goto err;
+	}
+
+	/* Get CPU socket information */
+	ret = pqos_cpu_get_sockets(m_cpu, PQOS_MAX_SOCKETS, &m_sock_count,
+		m_sockets);
+	if (ret != PQOS_RETVAL_OK) {
+		printf("PQOS: Error retrieving CPU socket information!\n");
+		ret = -EFAULT;
+		goto err;
+	}
+
+	/* Validate cmd line configuration */
+	ret = cat_validate();
+	if (ret != 0) {
+		printf("PQOS: Requested CAT configuration is not valid!\n");
+		goto err;
+	}
+
+	/* configure system */
+	ret = cat_set();
+	if (ret != 0) {
+		printf("PQOS: Failed to configure CAT!\n");
+		goto err;
+	}
+
+	signal(SIGINT, signal_handler);
+	signal(SIGTERM, signal_handler);
+
+	ret = atexit(cat_exit);
+	if (ret != 0) {
+		printf("PQOS: Cannot set exit function\n");
+		goto err;
+	}
+
+	/* Print CAT configuration */
+	print_cat_config();
+
+	return args_num;
+
+err:
+	/* deallocate all the resources */
+	cat_fini();
+	return ret;
+}
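
For reference, the contiguity test that is_contiguous() in cat.c performs with
a loop can probably also be expressed with a couple of bit operations. The
following is only a standalone sketch and not part of the patch; the names
mask_bits_count() and mask_is_contiguous() are made up for illustration.

```c
#include <stdint.h>

/* Count set bits by clearing the lowest set bit on each pass
 * (the same approach as bits_count() in cat.c). */
static unsigned
mask_bits_count(uint64_t bitmask)
{
	unsigned count = 0;

	for (; bitmask != 0; count++)
		bitmask &= bitmask - 1;

	return count;
}

/* Return 1 if the set bits form one unbroken run, 0 otherwise. */
static int
mask_is_contiguous(uint64_t bitmask)
{
	uint64_t lowest;
	uint64_t filled;

	if (bitmask == 0)
		return 0;

	/* Isolate the lowest set bit (two's complement trick). */
	lowest = bitmask & (~bitmask + 1);

	/* Adding it carries through the lowest run of ones. If that run
	 * was the only one, no bit of the original mask survives. */
	filled = bitmask + lowest;

	return (filled & bitmask) == 0;
}
```

With these helpers, 0x0FF00 would be accepted (one run of 8 bits) while
0x0F0F0 would be rejected (two separate runs).
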
diff --git a/examples/skeleton-cat/cat.h b/examples/skeleton-cat/cat.h
new file mode 100644
index 0000000..aef2b76
--- /dev/null
+++ b/examples/skeleton-cat/cat.h
@@ -0,0 +1,72 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _CAT_H
+#define _CAT_H
+
+/**
+ * @file
+ * PQoS CAT
+ */
+
+#include <stdint.h>
+#include <string.h>
+
+#include <rte_lcore.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* L3 cache allocation class of service data structure */
+struct cat_config {
+	rte_cpuset_t cpumask;		/* CPUs bitmask */
+	int cdp;			/* data & code masks used if true */
+	union {
+		uint64_t mask;		/* capacity bitmask (CBM) */
+		struct {
+			uint64_t data_mask; /* data capacity bitmask (CBM) */
+			uint64_t code_mask; /* code capacity bitmask (CBM) */
+		};
+	};
+};
+
+int cat_init(int argc, char **argv);
+
+void cat_exit(void);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _CAT_H */
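
To illustrate how the cdp flag selects which union member of struct cat_config
is meaningful, here is a small standalone sketch. It is not part of the patch:
struct demo_config is a simplified stand-in (the cpumask field is left out),
and demo_config_ways() and demo_config_fits() are hypothetical helpers that
mirror the length check done in check_cbm_len_and_contention().

```c
#include <stdint.h>

/* Simplified stand-in for struct cat_config, without the cpumask field. */
struct demo_config {
	int cdp;			/* code & data masks used if true */
	union {
		uint64_t mask;		/* single capacity bitmask */
		struct {
			uint64_t data_mask;
			uint64_t code_mask;
		};
	};
};

/* All cache ways the entry touches, regardless of CDP mode. */
static uint64_t
demo_config_ways(const struct demo_config *cfg)
{
	if (cfg->cdp)
		return cfg->code_mask | cfg->data_mask;

	return cfg->mask;
}

/* Return 1 if no bit of the entry lies at or above num_ways. */
static int
demo_config_fits(const struct demo_config *cfg, unsigned num_ways)
{
	uint64_t not_cbm;

	/* Shifting a 64-bit value by 64 is undefined, so treat it apart. */
	not_cbm = (num_ways >= 64) ? 0 : (UINT64_MAX << num_ways);

	return (demo_config_ways(cfg) & not_cbm) == 0;
}
```

For the CDP example '(0x00C00,0x00300)@(1,3)', the combined ways would be
0x00F00, which fits a 20-way cache but not an 8-way one.
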
-- 
1.9.3
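
As a worked example of the sharing arithmetic behind
'0x00F00@(1,3), 0x0FF00@(4-6), 0xF0000@7', the shared and exclusive ways of
each group can be derived with plain bitwise operations. This is a standalone
sketch, not part of the patch; ways_shared() and ways_exclusive() are
hypothetical helper names.

```c
#include <stdint.h>

/* Ways that two capacity bitmasks have in common. */
static uint64_t
ways_shared(uint64_t a, uint64_t b)
{
	return a & b;
}

/* Ways in 'own' that appear in none of the 'n' other masks. */
static uint64_t
ways_exclusive(uint64_t own, const uint64_t *others, unsigned n)
{
	uint64_t taken = 0;
	unsigned i;

	/* Collect every way claimed by some other group. */
	for (i = 0; i < n; i++)
		taken |= others[i];

	return own & ~taken;
}
```

Under these definitions, CPUs 1 and 3 (0x00F00) share ways 0x00F00 with
CPUs 4-6 (0x0FF00) and keep none to themselves, CPUs 4-6 keep 0x0F000
exclusively, and CPU 7 keeps all of 0xF0000.
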


