[dpdk-dev] GitHub sandbox for the DPDK community

Thomas Monjalon thomas.monjalon at 6wind.com
Wed May 6 23:09:30 CEST 2015


Hello everyone,

I'm back from mini-holidays and it's good to see that there are
a lot of proposals trying to improve our workflow.
Most of the discussions are focused on process and tools; however,
we must keep in mind that submitting clean patches and doing more
reviews can greatly improve the life of the project.
The debate for/against GitHub raises several interesting questions
about different parts of the workflow which deserve some detailed
explanations (and context reminders).

Previously, there was a discussion about the contribution rules and tools:
	http://dpdk.org/ml/archives/dev/2015-March/015499.html
Then a coding rules discussion was started:
	http://dpdk.org/ml/archives/dev/2015-April/016243.html
And a more general thread brought some interesting opinions:
	http://dpdk.org/ml/archives/dev/2015-April/016551.html
As a consequence, we are now discussing the workflow and especially
how GitHub could help us.
Please note that the follow-up of some of these discussions may be done
by submitting & reviewing patches (e.g. guidelines documents,
tools integration, etc).
Now let's talk about the workflow.

When the dpdk.org project was started in 2013, it was decided to adopt
an email workflow. It is the most common model in projects which are
technically close to DPDK: Linux, Qemu, GLIBC, GCC. So it is likely to
attract contributors from these projects. Moreover, the number of comments
on this thread tends to prove that emails are not dead ;)
See also the number of contributors to previous versions:
	1.6: 25 (2014, April)
	1.7: 46 (2014, September)
	1.8: 54 (2014, December)
	2.0: 60 (2015, April)

Another choice was made about the number of mailing lists: most of the traffic
is in only one list (dev@) in order to avoid separation between patches and
discussions/reports leading to patches. It also allows user questions to be
read by skilled developers.

The portal to the documentation, git and mailing lists is the website, which
is managed with git so that it can be opened up when needed and mature enough.
Please find web traffic evolution in the attached file.
There is also a patchwork web interface to ease browsing patches submitted
to the mailing list. It provides a view of patch status and aggregates
discussions on specific patches. Some improvements are in progress:
	http://permalink.gmane.org/gmane.comp.version-control.patchwork/1162
	https://lists.ozlabs.org/pipermail/patchwork/2015-May/001310.html

There are 3 types of git repositories (http://dpdk.org/browse):
  - the main DPDK tree
  - subtrees, created on request or external, may help to scale by providing
    patches ready for merge in the main tree
  - side trees, created on request, e.g. dts or pktgen
Do not hesitate to request the creation of a new tree; it's open.
Intel has requested some small subtrees which do not seem very useful. We may
try to organize some new subtrees for bigger areas, which would take care
of many sections of the MAINTAINERS file. Maybe some dedicated mailing
lists should be created. These mailing lists and subtrees may be hosted on
dpdk.org or elsewhere if everybody agrees.

Initially, no bug tracker was installed, in order to avoid fragmentation with
mailing-list discussions. Now that traffic is becoming huge, a bug tracker
appears to be a new priority.

Last point in the workflow status: tests and continuous integration.
It's a complicated topic, especially because DPDK requires some expensive
infrastructure for the tests. Some people are working on it at Intel and
6WIND, so I guess we will have a public discussion in the coming weeks.

After carefully reading previous comments about GitHub hosting, I would like
to sort pros/cons below.
Invalidated pros:
- web pages system: already possible without GitHub
- popularity: why would being hosted on GitHub improve visibility?
Pros:
- less complicated command lines
- same view for everyone (independent of MUA features)
- more code context when reading patches
- integrated bug tracker
Cons:
- full feature usage implies everybody is forced to use it
- fragmentation between online data and mailing list
- discussions are not threaded, so long discussions are not clear
- editing in browser may be limited
- no offline access
- difficult to follow history as we rely on user repositories which may change
- GitHub (commercial service) is watching us
- how to leave and migrate data from GitHub?
- administration issues out of control (see snapshot of today's downtime)

I filed an abuse report for https://github.com/dpdk in case we want to use this
GitHub account.
My opinion is that GitHub offers some nice tools and toys but some people
won't be comfortable with it.
It may be reasonable to try some features without forcing everyone to migrate,
while keeping consistency between all contributors.
Making some tests in a sandbox seems to be a good approach.

Thanks for reading
-------------- next part --------------
A non-text attachment was scrubbed...
Name: web-sessions-per-month.png
Type: image/png
Size: 22531 bytes
Desc: not available
URL: <http://dpdk.org/ml/archives/dev/attachments/20150506/96a5ea41/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: github-20150506.png
Type: image/png
Size: 68220 bytes
Desc: not available
URL: <http://dpdk.org/ml/archives/dev/attachments/20150506/96a5ea41/attachment-0003.png>

