[dpdk-ci] [PATCH] ci: added patch parser for patch files

Owen Hilyard ohilyard at iol.unh.edu
Fri Dec 4 20:45:11 CET 2020

This commit adds a script, patch_parser.py, and a config file,
patch_parser.cfg. The UNH CI team has been testing this tooling to
reduce the number of tests that need to be run per patch. It grew out
of our push to increase the number of functional tests running in the
CI. While expanding test coverage, we found that DTS could easily take
over 6 hours to run, so we decided to tag patches and run only the
tests each patch requires.

The script takes the path to the config file followed by a list of
patch files. It parses the patches and produces a list of tags for
them based on the config file. The config file maps each base path to
a set of tags. It also contains a priority-ordered list of tags, so
that the output can also be used by hierarchical tools rather than
only modular ones.
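
As a rough illustration of the mapping described above (a sketch, not
the script itself), the following resolves changed paths to tags by
prefix match and then orders the result by priority. PATH_TAGS and
PRIORITY mirror a subset of patch_parser.cfg; the helper name
tags_for_files is made up for this example.

```python
from typing import Dict, Iterable, List, Set

# Subset of the path-to-tag mapping from patch_parser.cfg.
PATH_TAGS: Dict[str, Set[str]] = {
    "drivers": {"driver", "core"},
    "doc": {"documentation"},
    "lib": {"core"},
    "app": {"application"},
}

# Highest priority first, as in priority_list.
PRIORITY: List[str] = ["core", "driver", "application", "documentation"]


def tags_for_files(files: Iterable[str]) -> List[str]:
    """Hypothetical helper: collect tags for all files, ordered by priority."""
    found: Set[str] = set()
    for path in files:
        for prefix, tags in PATH_TAGS.items():
            # A file picks up the tags of every base path it starts with.
            if path.startswith(prefix):
                found |= tags
    # Keep only tags that were found, in priority order.
    return [tag for tag in PRIORITY if tag in found]


if __name__ == "__main__":
    print(tags_for_files(["doc/guides/nics/ixgbe.rst", "app/test-pmd/config.c"]))
```

A tool that can only handle one classification per patch could take
the first element of the returned list.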

In giving this tooling to the wider DPDK community, the UNH team
intends for people more familiar with the internal functionality of
DPDK to provide most of the tagging. This would give UNH a better
turnaround time for testing by eliminating unnecessary tests, while
still increasing the number of tests in the CI.

The different patch tags are currently defined as follows:

core:
    Core DPDK functionality. Examples include kernel modules and
    librte_eal. This tag should be used sparingly, as it signals to
    automated test suites that most of the tests for DPDK must be
    run, and as such will consume CI resources for a long period of
    time.

driver:
    For NIC drivers and other hardware interface code. This should be
    used as a generic tag, with each driver also getting its own tag.

application:
    Used in a similar manner to "driver". This tag is intended for
    code used only in applications that DPDK provides, such as
    testpmd or helloworld. This tag should be accompanied by a tag
    which denotes which application specifically has been changed.

documentation:
    This is intended to be used as a tag for paths which only contain
    documentation, such as "doc/". Its intended use is as a way to
    trigger the automatic re-building of the documentation website.
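
One way the "generic tag plus specific tag" idea for drivers might
look in practice is sketched below. The drivers/net/ixgbe entry and
the ixgbe tag are hypothetical and not part of this patch. The sketch
only shows that a more specific path can contribute an extra tag
alongside the generic one, assuming the same prefix matching as the
script, and that a tag has to appear in priority_list to show up in
the ordered output.

```python
from configparser import ConfigParser
from typing import List, Set

# Hypothetical config: the drivers/net/ixgbe entry and the "ixgbe" tag
# are illustrative only and do not appear in the patch.
EXAMPLE_CFG = """
[Paths]
drivers = driver
drivers/net/ixgbe = ixgbe

[Priority]
priority_list =
    driver,
    ixgbe
"""


def split_tags(value: str) -> List[str]:
    """Split a comma-delimited config value into stripped tag names."""
    return [item.strip() for item in value.split(",") if item.strip()]


def ordered_tags(changed_file: str, cfg_text: str = EXAMPLE_CFG) -> List[str]:
    """Return the tags matching one changed file, in priority order."""
    conf = ConfigParser()
    # ConfigParser lowercases option names by default; keep paths as written.
    conf.optionxform = str
    conf.read_string(cfg_text)
    found: Set[str] = set()
    for prefix, value in conf["Paths"].items():
        if changed_file.startswith(prefix):
            found.update(split_tags(value))
    priority = split_tags(conf["Priority"]["priority_list"])
    return [tag for tag in priority if tag in found]
```

With this config, a change under drivers/net/ixgbe/ yields both the
generic and the specific tag, while a change to any other driver
yields only "driver".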

Signed-off-by: Owen Hilyard <ohilyard at iol.unh.edu>
---
 config/patch_parser.cfg | 25 ++++++++++++++++
 tools/patch_parser.py   | 64 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 89 insertions(+)
 create mode 100644 config/patch_parser.cfg
 create mode 100755 tools/patch_parser.py

diff --git a/config/patch_parser.cfg b/config/patch_parser.cfg
new file mode 100644
index 0000000..5757f9a
--- /dev/null
+++ b/config/patch_parser.cfg
@@ -0,0 +1,25 @@
+# Description of the categories as initially designed
+
+[Paths]
+drivers =
+    driver,
+    core
+kernel = core
+doc = documentation
+lib = core
+meson_options.txt = core
+examples = application
+app = application
+license = documentation
+VERSION = documentation
+build = core
+
+# This is an ordered list of the importance of each patch classification.
+# It should be used to determine which classification to use on tools which
+# do not support multiple patch classifications.
+[Priority]
+priority_list =
+    core,
+    driver,
+    application,
+    documentation
diff --git a/tools/patch_parser.py b/tools/patch_parser.py
new file mode 100755
index 0000000..01fc55d
--- /dev/null
+++ b/tools/patch_parser.py
@@ -0,0 +1,64 @@
+#!/usr/bin/env python3
+
+import itertools
+import sys
+from configparser import ConfigParser
+from typing import List, Dict, Set
+
+
+def get_patch_files(patch_file: str) -> List[str]:
+    with open(patch_file, 'r') as f:
+        lines = list(itertools.takewhile(
+            lambda line: line.strip().endswith('+') or line.strip().endswith('-'),
+            itertools.dropwhile(
+                lambda line: not line.strip().startswith("---"),
+                f.readlines()
+            )
+        ))
+        filenames = map(lambda line: line.strip().split(' ')[0], lines)
+        # takewhile includes the --- which starts the filenames
+        return list(filenames)[1:]
+
+
+def get_all_files_from_patches(patch_files: List[str]) -> Set[str]:
+    return set(itertools.chain.from_iterable(map(get_patch_files, patch_files)))
+
+
+def parse_comma_delimited_list_from_string(mod_str: str) -> List[str]:
+    return list(map(str.strip, mod_str.split(',')))
+
+
+def get_dictionary_attributes_from_config_file(conf_obj: ConfigParser) -> Dict[str, Set[str]]:
+    return {
+        directory: set(parse_comma_delimited_list_from_string(module_string))
+        for directory, module_string in conf_obj['Paths'].items()
+    }
+
+
+def get_tags_for_patch_file(patch_file: str, dir_attrs: Dict[str, Set[str]]) -> Set[str]:
+    return set(itertools.chain.from_iterable(
+        tags for directory, tags in dir_attrs.items() if patch_file.startswith(directory)
+    ))
+
+
+def get_tags_for_patches(patch_files: Set[str], dir_attrs: Dict[str, Set[str]]) -> Set[str]:
+    return set(itertools.chain.from_iterable(
+        map(lambda patch_file: get_tags_for_patch_file(patch_file, dir_attrs), patch_files)
+    ))
+
+
+if len(sys.argv) < 3:
+    print("usage: patch_parser.py <path to patch_parser.cfg> <patch file>...")
+    sys.exit(1)
+
+conf_obj = ConfigParser()
+conf_obj.read(sys.argv[1])
+
+patch_files = get_all_files_from_patches(sys.argv[2:])
+dir_attrs = get_dictionary_attributes_from_config_file(conf_obj)
+priority_list = parse_comma_delimited_list_from_string(conf_obj['Priority']['priority_list'])
+
+unordered_tags: Set[str] = get_tags_for_patches(patch_files, dir_attrs)
+ordered_tags: List[str] = [tag for tag in priority_list if tag in unordered_tags]
+
+print("\n".join(ordered_tags))
