[dpdk-ci] CI reliability
thomas at monjalon.net
Tue May 26 23:10:10 CEST 2020
26/05/2020 22:27, Lincoln Lavoie:
> On Sun, May 24, 2020 at 5:50 AM Thomas Monjalon <thomas at monjalon.net> wrote:
> > Hi all,
> > I think we have a CI reliability issue in general.
> > Perhaps we lack some alert mechanism warning test platform maintainers
> > when too many tests are failing.
> > Recent example: the community lab compilation test has been failing on
> > Fedora 31 for at least 2 weeks, and I don't see any action to fix it:
> > https://lab.dpdk.org/results/dashboard/patchsets/11040/
> > Because of such recurring errors, the whole CI becomes irrelevant.
> This has been fixed as of yesterday. The failure was caused by a commit to
> the SPDK repos that changed how they pull in their dependencies, in a
> way that is not compatible with docker. The team created a workaround so
> that case is fixed, but there is always a risk that other commits of
> that type could cause a failure in the containers.
Thanks for fixing it.
> I asked Brandon to change the scripts that run the testing in the
> containers to try to catch failures from docker separately, so they can be
> flagged as infrastructure failures, as opposed to failures of the build.
Yes, good idea.
When compiling external projects, we can see some errors which
are not due to the DPDK patch.
I guess we validate any upgrade of the external projects
before making them live?
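
To make the infrastructure/build split discussed above concrete, here is a
minimal sketch. It is not the lab's actual script; the Stage type, the
classify_run() helper, the stage names and the image name are all
hypothetical. Each stage is tagged with whether a failure there should be
blamed on the infrastructure or on the patch being built:

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Stage:
    """One step of a containerized test run (hypothetical structure)."""
    name: str
    command: Sequence[str]
    infrastructure: bool  # True if a failure here is not the patch's fault


def run_command(command: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(command).returncode


def classify_run(stages: Sequence[Stage],
                 runner: Callable[[Sequence[str]], int] = run_command) -> str:
    """Run stages in order and label the first failure by its origin."""
    for stage in stages:
        if runner(stage.command) != 0:
            kind = "infrastructure" if stage.infrastructure else "build"
            return f"{kind} failure in stage '{stage.name}'"
    return "pass"


# Example stage list; the image name and build command are assumptions.
EXAMPLE_STAGES = [
    Stage("pull image", ["docker", "pull", "example/fedora31-build"], True),
    Stage("compile", ["ninja", "-C", "build"], False),
]
```

Calling classify_run(EXAMPLE_STAGES) would then report either "pass", an
infrastructure failure, or a build failure, and only the last one would
need to be reported against the patch.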
> I'm also very surprised this was not raised during the CI meeting, or by
> anyone else. I'm wondering if this is caused by the actual error logs
> being a little abstracted from the emails, i.e. they are a link and a zip
> file away from the actual email text, so maybe folks are not really looking
> into the output as closely as they should be. Is this something we can
> make better by including more detail in the email text, so issues are
> caught more quickly?
I think the table in the report is already quite expressive.
As I proposed above, I think we need better monitoring.
If the same test is failing on many DPDK patches, it should raise an alarm.
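
As a rough illustration of such an alarm, here is a minimal sketch. The
FailureMonitor class, its parameters and the test name in the usage note
are hypothetical, and the thresholds are arbitrary. It keeps the most
recent results per test and raises a flag when the test fails on most of
the distinct patchsets seen in that window:

```python
from collections import defaultdict, deque


class FailureMonitor:
    """Flag tests that fail across many different patchsets (sketch)."""

    def __init__(self, window=20, min_patchsets=5, threshold=0.8):
        self.min_patchsets = min_patchsets  # distinct patchsets required
        self.threshold = threshold          # failing fraction that alarms
        # Keep only the most recent `window` results for each test.
        self._history = defaultdict(lambda: deque(maxlen=window))

    def record(self, test_name, patchset_id, passed):
        """Store one result and return True if the test should alarm."""
        self._history[test_name].append((patchset_id, passed))
        return self.is_alarming(test_name)

    def is_alarming(self, test_name):
        # Keep only the latest result per patchset, so that re-runs of
        # one patchset are not counted as several independent failures.
        latest = {}
        for patchset_id, passed in self._history[test_name]:
            latest[patchset_id] = passed
        failures = sum(1 for passed in latest.values() if not passed)
        return (len(latest) >= self.min_patchsets
                and failures / len(latest) >= self.threshold)
```

For example, with FailureMonitor(window=5, min_patchsets=3), recording a
failure of "fedora31-compile" on three different patchsets makes the third
record() call return True, which is the point where the platform
maintainers could be notified.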