[dpdk-dev] GitHub sandbox for the DPDK community

Wiles, Keith keith.wiles at intel.com
Thu May 7 01:49:42 CEST 2015


Hi Thomas, (sorry about the length)

On 5/6/15, 2:37 PM, "Marc Sune" <marc.sune at bisdn.de> wrote:

>
>
>On 06/05/15 23:09, Thomas Monjalon wrote:
>> Hello everyone,
>>
>> I'm back from mini-holidays and it's good to see that there are
>> a lot of proposals trying to improve our workflow.
>> Most of the discussions are focused on process and tools, however
>> we must keep in mind that submitting clean patches and doing more
>> reviews can greatly improve the life of the project.
>> The debate for/against GitHub raises several interesting questions
>> about different parts of the workflow which deserves some detailed
>> explanations (and context reminders).
>>
>> Previously, there was a discussion about the contribution rules and
>>tools:
>> 	http://dpdk.org/ml/archives/dev/2015-March/015499.html
>> Then a coding rules discussion was started:
>> 	http://dpdk.org/ml/archives/dev/2015-April/016243.html
>> And a more general thread brought some interesting opinions:
>> 	http://dpdk.org/ml/archives/dev/2015-April/016551.html
>> As a consequence, we are now discussing the workflow and especially
>> how GitHub could help us.

The emails above show one thing: we cannot make a decision on how to
proceed. We have no method to decide on a topic. Look at coding style: we
have yet to make any headway, and it is unclear how we can decide on a
path. We cannot vote, and we do not have a king of the repo to make those
decisions, so a topic just dies without being resolved.

I was hoping that moving to GitHub would give multiple persons/companies
equal access to the repos, web pages and other functions on a third-party
site. With this move we would put processes in place to start fixing these
problems. I know we can do this now, but the move IMO was how we would get
it started. We should start now anyway.

We are all over the world and it would be good to have a neutral worldwide
site to give everyone an equal foothold in DPDK. I was hoping it would
reduce some cost and time for 6WIND, but maybe it is considered just the
cost of doing business for 6WIND.

>> Please note that the follow-up of some of these discussions may be done
>> by submitting & reviewing patches (e.g. guidelines documents,
>> tools integration, etc).
>> Now let's talk about the workflow.
>>
>> When the dpdk.org project was started in 2013, it has been decided to
>>adopt
>> an email workflow. It is the most common model in projects which are
>> technically close to DPDK: Linux, Qemu, GLIBC, GCC. So it is a promise
>>to
>> attract contributors from these projects. Moreover, the number of
>>comments
>> to this thread tends to prove that emails are not dead ;)
>> See also the number of contributors of previous versions:
>> 	1.6: 25 (2014, April)
>> 	1.7: 46 (2014, September)
>> 	1.8: 54 (2014, December)
>> 	2.0: 60 (2015, April)
>>
>> Another choice was done about the number of mailing lists: most of the
>>traffic
>> is in only one list (dev@) in order to avoid separation between patches
>>and
>> discussions/reports leading to patches. It also allows user questions
>>to be
>> read by skilled developers.
>>
>> The portal to doc, git and mailing list is the website which is managed
>>with
>> git in order to open it when needed and mature enough.
>> Please find web traffic evolution in the attached file.
>> There is also a patchwork web interface to ease browsing patches
>>submitted
>> to the mailing list. It provides a view on patches status and aggregate
>> discussions on specific patches. Some improvements are in progress:
>> 	http://permalink.gmane.org/gmane.comp.version-control.patchwork/1162
>> 	https://lists.ozlabs.org/pipermail/patchwork/2015-May/001310.html

The patchwork site would not be required with GitHub, as you can review and
see all of the pull requests there. Pull requests are also quick to access,
sort and manage, IMO better than in patchwork. The feature is built into
GitHub, so we would not need to maintain that site or tool. Pull requests
can also be placed into given states, just like in patchwork.
The patchwork interface feels clunky to me for managing patches; maybe
they can fix the usability issues. The filter button is not very visible,
and when you need to change a set of patches you have to do a lot of
clicks and back pages to change them all. Maybe I do not know how to use
the site, but I do not think that is the problem IMO. The GitHub one works
today without having to fix anything.

>>
>> There are 3 types of git repositories (http://dpdk.org/browse):
>>    - the main DPDK tree
>>    - subtrees, created on request or external, may help to scale by
>>providing
>>      patches ready for merge in the main tree
>>    - side trees, created on request, e.g. dts or pktgen

I like the idea of going to the GitHub page and being able to scroll down
the page to see all of the repos at the same time. This way people notice
the other tools and subtrees quickly. I know you can modify the web page
to make it easier to see, but GitHub already has it done.

I seem to have to tell people where Pktgen is located on the site just
about every time I talk about it. Being on the GitHub list of repos seems
much more obvious to me. I want to make it easy for someone to be added to
the team to help improve the code, and a GitHub team seems like the easiest
way.

>> Do not hesitate to request creation of a new tree, it's open.
>> Intel has requested some small subtrees which seems not very useful. We
>>may
>> try to organize some new subtrees for bigger areas, which would take
>>care
>> of many sections of the MAINTAINERS file. Maybe that some dedicated
>>mailing
>> lists should be created. These mailing lists and subtrees may be hosted
>>on
>> dpdk.org or elsewhere if everybody agree.

I agree with Neil on a few more repos for subtrees or submodules; in
GitHub this allows us to have different teams and members on those repos
as committers. Adding or modifying a team member is quick and easy for
anyone on the team. The owners have full power over all teams:
adding/removing contributors, creating or deleting repos, and other
functions.

>>
>> There was no bug tracker initially installed to avoid fragmentation with
>> mailing-list discussions. Now that traffic is becoming huge, it appears
>>to be
>> a new priority.
>>
>> Last point in the workflow status: tests and continuous integration.
>> It's a complicated topic, especially because DPDK requires some
>>expensive
>> infrastructure for the tests. Some people are working on it at Intel and
>> 6WIND, so I guess we will have a public discussion in the coming weeks.

Neither GitHub nor the current system addresses this concern, but I do not
see that GitHub would restrict anything. I am not saying you made that
point, just pointing out that it needs to be addressed and is not a
pro/con.

>>
>> After carefully reading previous comments about github hosting, I would
>>like
>> to sort pros/cons below.
>> Invalidated Pro:
>> - web pages system: already possible without GitHub

With GitHub Pages anyone can modify the pages; it does not have to be you
or someone at 6WIND. The GitHub web page support is already present and is
contained in the repo as a branch for each repo, if we need it. To me it
just seems easier for someone on the "web-page" team to modify quickly.

>> - popularity: why being hosted on GitHub would improve the visibility?
>> Pros:
>> - less complicated command lines
>> - same view for everyone (independent of MUA features)
>> - more code context when reading patches

GitHub has two modes: side-by-side diff or standard inline diff.

>> - integrated bug tracker

The bug tracker is something we need now that we have more patches and
users.

>> Cons:
>> - full feature usage implies everybody is forced to use it

I have used GitHub for Pktgen for a long time and the only time I really
needed to touch the

>> - fragmentation between online data and mailing list

The fragmentation is something I want to solve. I have seen some comments
about integrating the two systems with some open source code, which I
would hope will solve that problem. More investigation needs to be done.

>> - discussions are not threaded, long discussions not clear
>> - editing in browser may be limited

Personally, I have only used the web-based editing of files to update the
version number in the readme when I forgot. I would not suggest you do all
of your coding in the web browser, but rather in your local repo copy with
your editor of choice.

>> - no offline access

You have the local repo as your offline access, is this what you mean? The
site may go down, but I expect it would be the same for any site. If
GitHub is causing the downtime, this is a different problem IMO.

>> - difficult to follow history as we rely on user repositories which may
>>change

The pull requests have the history just like patches do today, so it
should not be any different. How do you think we will lose history?

>> - GitHub (commercial service) is watching us

If you have not figured it out yet, everyone is out to get everyone else
now in this global Internet Fad :-)

>> - how to leave and migrate data from GitHub?

I think that is kind of easy: just clone the repo, is that not the case? I
can see we would lose some of the comments, but I am not sure they are
that worthwhile. Besides, we can always keep the site as a mirror if we
decide to move.

>> - administration issues out of control (see snapshot of today's
>>downtime)
>>
>> I did an abuse report for https://github.com/dpdk in case we want to
>>use this
>> GitHub account.

OK, great, you are the one that created that account. I created the
https://github.com/dpdk-org one not knowing who created it.

>> My opinion is that GitHub offers some nice tools and toys but some
>>people
>> won't be comfortable with it.

I think the same holds for others who are not very comfortable with
creating patches and emailing them, so this one is a tie IMO.

>> It may be reasonable to try some features without forcing everyone to
>>migrate,
>> while keeping consistency between every contributors.
>> Making some tests in a sandbox seems to be a good approach.

The sandbox I created was for that purpose and anyone is welcome to play
with the site, just let me know. https://github.com/dpdk-org

Thomas, just send me a GitHub login name and I can add you as an owner to
the dpdk-org site or anyone else you want to have as an owner. I have been
adding most as contributors.

>
>Hi Thomas,
>
>Thanks for the detailed explanation. As the official "maintainer" of
>DPDK, and I think strongly in favour of the current mail-based workflow,
>I would like to know how would you see a hybrid approach like:
>
>http://dpdk.org/ml/archives/dev/2015-May/017283.html
>
>if we would manage to make it work reliably.

+1, I too believe we can make this stable, or the other open source
GitHub project may be the place to start:

https://github.com/google/pull-request-mailer

https://github.com/rust-lang/rust/pull/25058#discussion_r29548050

>
>
>
>Best
>Marc
>
>>
>> Thanks for reading
>
