<div dir="ltr">A bit of fuzz testing found some edge cases where this script crashes or fails to properly parse the patch file. I am currently working on a rewrite using a dedicated library to avoid these and similar issues. </div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Dec 4, 2020 at 2:45 PM Owen Hilyard <<a href="mailto:ohilyard@iol.unh.edu">ohilyard@iol.unh.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">This commit contains a script, patch_parser.py, and a config file,<br>
patch_parser.cfg. These are tooling that the UNH CI team has been<br>
testing in order to reduce the number of tests that need to be run<br>
per patch. This resulted from our push to increase the number of<br>
functional tests running in the CI. While working on expanding test<br>
coverage, we found that DTS could easily take over 6 hours to run, so<br>
we decided to begin work on tagging patches and then only running the<br>
required tests.<br>
<br>
The script takes the path to the config file followed by a list of<br>
patch files. It parses the patches and prints the tags that apply to<br>
them according to the config file. The config file maps each base path<br>
to a set of tags. It also contains an ordered list of tag priorities so<br>
that it can be used by tools that accept only a single classification<br>
per patch.<br>
<br>
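As a rough illustration of the mapping and priority ordering described<br>
above, here is a minimal sketch. The tables are hard-coded stand-ins for<br>
a subset of the [Paths] and [Priority] sections, and tags_for_files is a<br>
hypothetical name. Unlike the script in this patch, which uses a plain<br>
prefix match, the sketch matches whole path components.<br>

```python
from typing import Dict, List, Set

# Hypothetical stand-ins for a subset of the [Paths] and [Priority] sections.
PATH_TAGS: Dict[str, List[str]] = {
    "drivers": ["driver", "core"],
    "doc": ["documentation"],
    "lib": ["core"],
    "app": ["application"],
}
PRIORITY: List[str] = ["core", "driver", "application", "documentation"]


def tags_for_files(files: List[str]) -> List[str]:
    """Return the tags matching any changed file, highest priority first."""
    found: Set[str] = set()
    for path in files:
        for base, tags in PATH_TAGS.items():
            # Assumption: match whole path components, so that "app"
            # does not also match an unrelated path such as "apple/x.c".
            if path == base or path.startswith(base + "/"):
                found.update(tags)
    # Keep only the tags that were found, in priority order.
    return [tag for tag in PRIORITY if tag in found]


print(tags_for_files(["doc/guides/index.rst", "app/test-pmd/testpmd.c"]))
```

A tool that supports only one classification per patch could take the<br>
first element of the returned list.<br>
<br>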
The intention of the UNH team with giving this tooling to the wider<br>
DPDK community is to have people more familiar with the internal<br>
functionality of DPDK provide most of the tagging. This would allow<br>
UNH to have a better turnaround time for testing by eliminating<br>
unnecessary tests, while still increasing the number of tests in the<br>
CI.<br>
<br>
The different patch tags are currently defined as follows:<br>
<br>
core:<br>
Core DPDK functionality. Examples include kernel modules and<br>
librte_eal. This tag should be used sparingly as it is intended<br>
to signal to automated test suites that it is necessary to<br>
run most of the tests for DPDK and as such will consume CI<br>
resources for a long period of time.<br>
<br>
driver:<br>
For NIC drivers and other hardware interface code. This should be<br>
used as a generic tag with each driver getting its own tag.<br>
<br>
application:<br>
Used in a similar manner to "driver". This tag is intended for<br>
code used only in applications that DPDK provides, such as<br>
testpmd or helloworld. This tag should be accompanied by a tag<br>
which denotes which application specifically has been changed.<br>
<br>
documentation:<br>
This is intended to be used as a tag for paths which only contain<br>
documentation, such as "doc/". Its intended use is as a way to<br>
trigger the automatic re-building of the documentation website.<br>
<br>
Signed-off-by: Owen Hilyard <<a href="mailto:ohilyard@iol.unh.edu" target="_blank">ohilyard@iol.unh.edu</a>><br>
---<br>
config/patch_parser.cfg | 25 ++++++++++++++++<br>
tools/patch_parser.py | 64 +++++++++++++++++++++++++++++++++++++++++<br>
2 files changed, 89 insertions(+)<br>
create mode 100644 config/patch_parser.cfg<br>
create mode 100755 tools/patch_parser.py<br>
<br>
diff --git a/config/patch_parser.cfg b/config/patch_parser.cfg<br>
new file mode 100644<br>
index 0000000..5757f9a<br>
--- /dev/null<br>
+++ b/config/patch_parser.cfg<br>
@@ -0,0 +1,25 @@<br>
+# Description of the categories as initially designed<br>
+<br>
+[Paths]<br>
+drivers =<br>
+ driver,<br>
+ core<br>
+kernel = core<br>
+doc = documentation<br>
+lib = core<br>
+meson_options.txt = core<br>
+examples = application<br>
+app = application<br>
+license = documentation<br>
+VERSION = documentation<br>
+build = core<br>
+<br>
+# This is an ordered list of the importance of each patch classification.<br>
+# It should be used to determine which classification to use on tools which<br>
+# do not support multiple patch classifications.<br>
+[Priority]<br>
+priority_list =<br>
+ core,<br>
+ driver,<br>
+ application,<br>
+ documentation<br>
diff --git a/tools/patch_parser.py b/tools/patch_parser.py<br>
new file mode 100755<br>
index 0000000..01fc55d<br>
--- /dev/null<br>
+++ b/tools/patch_parser.py<br>
@@ -0,0 +1,64 @@<br>
+#!/usr/bin/env python3<br>
+<br>
+import itertools<br>
+import sys<br>
+from configparser import ConfigParser<br>
+from typing import List, Dict, Set<br>
+<br>
+<br>
+def get_patch_files(patch_file: str) -> List[str]:<br>
+ with open(patch_file, 'r') as f:<br>
+ lines = list(itertools.takewhile(<br>
+ lambda line: line.strip().endswith('+') or line.strip().endswith('-'),<br>
+ itertools.dropwhile(<br>
+ lambda line: not line.strip().startswith("---"),<br>
+ f.readlines()<br>
+ )<br>
+ ))<br>
+ filenames = map(lambda line: line.strip().split(' ')[0], lines)<br>
+ # takewhile includes the --- which starts the filenames<br>
+ return list(filenames)[1:]<br>
+<br>
+<br>
+def get_all_files_from_patches(patch_files: List[str]) -> Set[str]:<br>
+ return set(itertools.chain.from_iterable(map(get_patch_files, patch_files)))<br>
+<br>
+<br>
+def parse_comma_delimited_list_from_string(mod_str: str) -> List[str]:<br>
+ return list(map(str.strip, mod_str.split(',')))<br>
+<br>
+<br>
+def get_dictionary_attributes_from_config_file(conf_obj: ConfigParser) -> Dict[str, Set[str]]:<br>
+ return {<br>
+ directory: set(parse_comma_delimited_list_from_string(module_string)) for directory, module_string in<br>
+ conf_obj['Paths'].items()<br>
+ }<br>
+<br>
+<br>
+def get_tags_for_patch_file(patch_file: str, dir_attrs: Dict[str, Set[str]]) -> Set[str]:<br>
+ return set(itertools.chain.from_iterable(<br>
+ tags for directory, tags in dir_attrs.items() if patch_file.startswith(directory)<br>
+ ))<br>
+<br>
+<br>
+def get_tags_for_patches(patch_files: Set[str], dir_attrs: Dict[str, Set[str]]) -> Set[str]:<br>
+ return set(itertools.chain.from_iterable(<br>
+ map(lambda patch_file: get_tags_for_patch_file(patch_file, dir_attrs), patch_files)<br>
+ ))<br>
+<br>
+<br>
+if len(sys.argv) < 3:<br>
+ print("usage: patch_parser.py <path to patch_parser.cfg> <patch file>...")<br>
+ sys.exit(1)<br>
+<br>
+conf_obj = ConfigParser()<br>
+conf_obj.read(sys.argv[1])<br>
+<br>
+patch_files = get_all_files_from_patches(sys.argv[2:])<br>
+dir_attrs = get_dictionary_attributes_from_config_file(conf_obj)<br>
+priority_list = parse_comma_delimited_list_from_string(conf_obj['Priority']['priority_list'])<br>
+<br>
+unordered_tags: Set[str] = get_tags_for_patches(patch_files, dir_attrs)<br>
+ordered_tags: List[str] = [tag for tag in priority_list if tag in unordered_tags]<br>
+<br>
+print("\n".join(ordered_tags))<br>
-- <br>
2.27.0<br>
<br>
</blockquote></div>
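<div dir="ltr">The quoted script collects file names from the diffstat summary, which is one place where unusual input can trip it up. The sketch below is not the rewrite mentioned above and does not use a dedicated library. It only illustrates, with the standard library, one alternative: reading the "+++ b/&lt;path&gt;" headers that git writes for each changed file. The names NEW_FILE_RE and changed_files are hypothetical.</div>

```python
import re
from typing import List

# Matches the "+++ b/<path>" header that git writes for each changed file.
NEW_FILE_RE = re.compile(r"^\+\+\+ b/(\S+)")


def changed_files(patch_text: str) -> List[str]:
    """Collect paths from "+++ b/..." headers, in order, without duplicates."""
    seen: List[str] = []
    for line in patch_text.splitlines():
        match = NEW_FILE_RE.match(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample = """\
--- /dev/null
+++ b/config/patch_parser.cfg
@@ -0,0 +1,25 @@
+[Paths]
--- /dev/null
+++ b/tools/patch_parser.py
@@ -0,0 +1,64 @@
+import sys
"""

print(changed_files(sample))
```

<div dir="ltr">This sketch has known gaps: deleted files appear as "+++ /dev/null" and are skipped, paths containing spaces are cut short, and an added content line that itself begins with "++ b/" would be mistaken for a header. Handling such cases is the kind of work a dedicated diff-parsing library would be expected to take on.</div>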