[dpdk-ci] [PATCH] ci: added patch parser for patch files

Owen Hilyard ohilyard at iol.unh.edu
Thu Jan 14 17:53:04 CET 2021


A bit of fuzz testing found some edge cases where this script crashes or
fails to properly parse the patch file. I am currently working on a
rewrite using a dedicated library to avoid these and similar issues.
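
For illustration only, and not the rewrite mentioned above: a minimal standard-library sketch that reads the changed paths from the "diff --git" headers instead of the diffstat, which git can abbreviate for long paths. The function name changed_paths is hypothetical. It assumes git-style patches with the default a/ and b/ prefixes and paths without spaces or quoting, and a commit message line that happens to look like a diff header would also be picked up.

```python
import re
from typing import Iterable, List

# Header that git writes before each file's hunks, e.g.
# "diff --git a/lib/foo.c b/lib/foo.c".
DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$")


def changed_paths(patch_lines: Iterable[str]) -> List[str]:
    """Return the post-change path of every file touched by a patch."""
    paths: List[str] = []
    for line in patch_lines:
        match = DIFF_HEADER.match(line.rstrip("\r\n"))
        # Keep first-seen order and skip duplicates.
        if match and match.group(2) not in paths:
            paths.append(match.group(2))
    return paths
```

It accepts any iterable of lines, so an open file object can be passed directly.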

On Fri, Dec 4, 2020 at 2:45 PM Owen Hilyard <ohilyard at iol.unh.edu> wrote:

> This commit contains a script, patch_parser.py, and a config file,
> patch_parser.cfg. These are tools that the UNH CI team has been
> testing in order to reduce the number of tests that need to be run
> per patch. This resulted from our push to increase the number of
> functional tests running in the CI. While working on expanding test
> coverage, we found that DTS could easily take over 6 hours to run, so
> we decided to begin work on tagging patches and then only running the
> required tests.
>
> The script takes the path to the config file followed by a list of
> patch files. It parses the patches and produces a list of tags for
> them based on the config file. The config file maps each base path to
> a set of tags. It also contains an ordered list of tag priorities so
> that hierarchical tools, not only modular ones, can use the output.
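
The flow described above can be sketched as follows. This is a simplified illustration, not the posted script: the helper names and the shortened in-memory config are assumptions made for the example.

```python
from configparser import ConfigParser
from typing import Iterable, List, Set

# Shortened, hypothetical config in the same shape as patch_parser.cfg.
EXAMPLE_CFG = """
[Paths]
drivers = driver, core
doc = documentation
examples = application

[Priority]
priority_list = core, driver, application, documentation
"""


def split_list(value: str) -> List[str]:
    """Split a comma-delimited config value into stripped items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def tags_for(paths: Iterable[str], conf: ConfigParser) -> Set[str]:
    """Collect the tags of every base path that prefixes a changed path."""
    paths = list(paths)
    tags: Set[str] = set()
    for base, value in conf["Paths"].items():
        if any(path.startswith(base) for path in paths):
            tags.update(split_list(value))
    return tags


def order_tags(tags: Set[str], conf: ConfigParser) -> List[str]:
    """Sort the tags by their position in the priority list."""
    priority = split_list(conf["Priority"]["priority_list"])
    return [tag for tag in priority if tag in tags]


conf = ConfigParser()
conf.read_string(EXAMPLE_CFG)
changed = ["doc/guides/index.rst", "examples/helloworld/main.c"]
ordered = order_tags(tags_for(changed, conf), conf)
print(ordered)     # ['application', 'documentation']
print(ordered[0])  # application
```

A tool that accepts only one classification would take the first entry; a tool that accepts several can use the whole list.
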
>
> By giving this tooling to the wider DPDK community, the UNH team
> intends for people more familiar with the internal functionality of
> DPDK to provide most of the tagging. This would give UNH a better
> turnaround time for testing by eliminating unnecessary tests, while
> still increasing the number of tests in the CI.
>
> The different patch tags are currently defined as follows:
>
> core:
>     Core DPDK functionality. Examples include kernel modules and
>     librte_eal. This tag should be used sparingly as it is intended
>     to signal to automated test suites that it is necessary to
>     run most of the tests for DPDK and as such will consume CI
>     resources for a long period of time.
>
> driver:
>     For NIC drivers and other hardware interface code. This should be
>     used as a generic tag, with each driver also getting its own tag.
>
> application:
>     Used in a similar manner to "driver". This tag is intended for
>     code used only in applications that DPDK provides, such as
>     testpmd or helloworld. This tag should be accompanied by a tag
>     which denotes which application specifically has been changed.
>
> documentation:
>     This is intended to be used as a tag for paths which only contain
>     documentation, such as "doc/". Its intended use is as a way to
>     trigger the automatic re-building of the documentation website.
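
As a sketch of how a generic tag and a per-driver tag could combine under the prefix matching in the script: the "drivers/net/ixgbe" entry and the "ixgbe" tag below are hypothetical and do not appear in the posted config.

```python
from configparser import ConfigParser

# Hypothetical config: a generic entry plus a driver-specific one.
EXAMPLE_CFG = """
[Paths]
drivers = driver
drivers/net/ixgbe = ixgbe
"""

conf = ConfigParser()
conf.read_string(EXAMPLE_CFG)

path = "drivers/net/ixgbe/ixgbe_ethdev.c"
# Every base path that prefixes the changed file contributes its tags.
tags = {
    tag.strip()
    for base, value in conf["Paths"].items()
    if path.startswith(base)
    for tag in value.split(",")
}
print(sorted(tags))  # ['driver', 'ixgbe']
```
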
>
> Signed-off-by: Owen Hilyard <ohilyard at iol.unh.edu>
> ---
>  config/patch_parser.cfg | 25 ++++++++++++++++
>  tools/patch_parser.py   | 64 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 89 insertions(+)
>  create mode 100644 config/patch_parser.cfg
>  create mode 100755 tools/patch_parser.py
>
> diff --git a/config/patch_parser.cfg b/config/patch_parser.cfg
> new file mode 100644
> index 0000000..5757f9a
> --- /dev/null
> +++ b/config/patch_parser.cfg
> @@ -0,0 +1,25 @@
> +# Description of the categories as initially designed
> +
> +[Paths]
> +drivers =
> +    driver,
> +    core
> +kernel = core
> +doc = documentation
> +lib = core
> +meson_options.txt = core
> +examples = application
> +app = application
> +license = documentation
> +VERSION = documentation
> +build = core
> +
> +# This is an ordered list of the importance of each patch classification.
> +# It should be used to determine which classification to use on tools which
> +# do not support multiple patch classifications.
> +[Priority]
> +priority_list =
> +    core,
> +    driver,
> +    application,
> +    documentation
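
One thing worth checking in this file: ConfigParser lowercases option names by default, so the VERSION key is stored as "version", and a startswith() test against a patch path of "VERSION" does not match it. A minimal sketch of the behaviour, with one possible workaround (overriding optionxform) that is not part of the posted patch:

```python
from configparser import ConfigParser

EXAMPLE_CFG = "[Paths]\nVERSION = documentation\n"

# Default behaviour: option names are lowercased when the file is read.
default_conf = ConfigParser()
default_conf.read_string(EXAMPLE_CFG)
print(list(default_conf["Paths"]))  # ['version']

# Overriding optionxform keeps option names exactly as written.
case_conf = ConfigParser()
case_conf.optionxform = str
case_conf.read_string(EXAMPLE_CFG)
print(list(case_conf["Paths"]))  # ['VERSION']

print("VERSION".startswith("version"))  # False
```
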
> diff --git a/tools/patch_parser.py b/tools/patch_parser.py
> new file mode 100755
> index 0000000..01fc55d
> --- /dev/null
> +++ b/tools/patch_parser.py
> @@ -0,0 +1,64 @@
> +#!/usr/bin/env python3
> +
> +import itertools
> +import sys
> +from configparser import ConfigParser
> +from typing import List, Dict, Set
> +
> +
> +def get_patch_files(patch_file: str) -> List[str]:
> +    with open(patch_file, 'r') as f:
> +        lines = list(itertools.takewhile(
> +            lambda line: line.strip().endswith('+') or line.strip().endswith('-'),
> +            itertools.dropwhile(
> +                lambda line: not line.strip().startswith("---"),
> +                f.readlines()
> +            )
> +        ))
> +        filenames = map(lambda line: line.strip().split(' ')[0], lines)
> +        # takewhile includes the --- which starts the filenames
> +        return list(filenames)[1:]
> +
> +
> +def get_all_files_from_patches(patch_files: List[str]) -> Set[str]:
> +    return set(itertools.chain.from_iterable(map(get_patch_files, patch_files)))
> +
> +
> +def parse_comma_delimited_list_from_string(mod_str: str) -> List[str]:
> +    return list(map(str.strip, mod_str.split(',')))
> +
> +
> +def get_dictionary_attributes_from_config_file(conf_obj: ConfigParser) -> Dict[str, Set[str]]:
> +    return {
> +        directory: parse_comma_delimited_list_from_string(module_string) for directory, module_string in
> +        conf_obj['Paths'].items()
> +    }
> +
> +
> +def get_tags_for_patch_file(patch_file: str, dir_attrs: Dict[str, Set[str]]) -> Set[str]:
> +    return set(itertools.chain.from_iterable(
> +        tags for directory, tags in dir_attrs.items() if patch_file.startswith(directory)
> +    ))
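
Note that str.startswith() compares characters, not path components, so a base path of "build" also matches a path such as "buildtools/meson.build". If that is not intended, a component-wise check is one option. A minimal sketch with pathlib; the helper name is_under is hypothetical:

```python
from pathlib import PurePosixPath


def is_under(path: str, base: str) -> bool:
    """True if `path` equals `base` or lies inside it, compared per component."""
    path_parts = PurePosixPath(path).parts
    base_parts = PurePosixPath(base).parts
    return path_parts[:len(base_parts)] == base_parts


print("buildtools/meson.build".startswith("build"))    # True
print(is_under("buildtools/meson.build", "build"))     # False
print(is_under("lib/librte_eal/meson.build", "lib"))   # True
print(is_under("meson_options.txt", "meson_options.txt"))  # True
```
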
> +
> +
> +def get_tags_for_patches(patch_files: Set[str], dir_attrs: Dict[str, Set[str]]) -> Set[str]:
> +    return set(itertools.chain.from_iterable(
> +        map(lambda patch_file: get_tags_for_patch_file(patch_file, dir_attrs), patch_files)
> +    ))
> +
> +
> +if len(sys.argv) < 3:
> +    print("usage: patch_parser.py <path to patch_parser.cfg> <patch file>...")
> +    exit(1)
> +
> +conf_obj = ConfigParser()
> +conf_obj.read(sys.argv[1])
> +
> +patch_files = get_all_files_from_patches(sys.argv[2:])
> +dir_attrs = get_dictionary_attributes_from_config_file(conf_obj)
> +priority_list = parse_comma_delimited_list_from_string(conf_obj['Priority']['priority_list'])
> +
> +unordered_tags: Set[str] = get_tags_for_patches(patch_files, dir_attrs)
> +ordered_tags: List[str] = [tag for tag in priority_list if tag in unordered_tags]
> +
> +print("\n".join(ordered_tags))
> --
> 2.27.0
>
>