[dpdk-dev] GitHub sandbox for the DPDK community

Wiles, Keith keith.wiles at intel.com
Thu May 7 05:37:33 CEST 2015


I did not finish a thought for some reason.

On 5/6/15, 4:49 PM, "Wiles, Keith" <keith.wiles at intel.com> wrote:

>Hi Thomas, (sorry about the length)
>
>On 5/6/15, 2:37 PM, "Marc Sune" <marc.sune at bisdn.de> wrote:
>
>>
>>
>>On 06/05/15 23:09, Thomas Monjalon wrote:
>>> Hello everyone,
>>>
>>> I'm back from mini-holidays and it's good to see that there are
>>> a lot of proposals trying to improve our workflow.
>>> Most of the discussions are focused on process and tools, however
>>> we must keep in mind that submitting clean patches and doing more
>>> reviews can greatly improve the life of the project.
>>> The debate for/against GitHub raises several interesting questions
>>> about different parts of the workflow which deserve some detailed
>>> explanations (and context reminders).
>>>
>>> Previously, there was a discussion about the contribution rules and tools:
>>> 	http://dpdk.org/ml/archives/dev/2015-March/015499.html
>>> Then a coding rules discussion was started:
>>> 	http://dpdk.org/ml/archives/dev/2015-April/016243.html
>>> And a more general thread brought some interesting opinions:
>>> 	http://dpdk.org/ml/archives/dev/2015-April/016551.html
>>> As a consequence, we are now discussing the workflow and especially
>>> how GitHub could help us.
>
>The emails above show one thing: we cannot make a decision on how to
>proceed. We have no method to decide on a topic. Look at coding style: we
>have yet to make any headway, and it is unclear how we can decide on a
>path. We cannot vote and we do not have a king of the repo to make those
>decisions, so the topic just dies without being resolved.
>
>I was hoping that moving to GitHub would give multiple
>persons/companies equal access to the repos, web pages and other functions
>on a third-party site. With this move we would put processes in place to
>start fixing these problems. I know we can do this now, but the move IMO
>was how we would get it started. We should start now anyway.
>
>We are all over the world and it would be good to have a neutral worldwide
>site to give everyone an equal foothold in DPDK. I was hoping it would
>reduce some cost and time for 6Wind, but maybe it is considered just the
>cost of doing business for 6Wind.
>
>>> Please note that the follow-up of some of these discussions may be done
>>> by submitting & reviewing patches (e.g. guidelines documents,
>>> tools integration, etc).
>>> Now let's talk about the workflow.
>>>
>>> When the dpdk.org project was started in 2013, it was decided to adopt
>>> an email workflow. It is the most common model in projects which are
>>> technically close to DPDK: Linux, Qemu, GLIBC, GCC. So it promises to
>>> attract contributors from these projects. Moreover, the number of comments
>>> to this thread tends to prove that emails are not dead ;)
>>> See also the number of contributors of previous versions:
>>> 	1.6: 25 (2014, April)
>>> 	1.7: 46 (2014, September)
>>> 	1.8: 54 (2014, December)
>>> 	2.0: 60 (2015, April)
>>>
>>> Another choice was made about the number of mailing lists: most of the
>>> traffic is in only one list (dev@) in order to avoid separation between
>>> patches and discussions/reports leading to patches. It also allows user
>>> questions to be read by skilled developers.
>>>
>>> The portal to doc, git and mailing list is the website, which is managed
>>> with git in order to open it when needed and mature enough.
>>> Please find web traffic evolution in the attached file.
>>> There is also a patchwork web interface to ease browsing patches submitted
>>> to the mailing list. It provides a view on patch status and aggregates
>>> discussions on specific patches. Some improvements are in progress:
>>> 	http://permalink.gmane.org/gmane.comp.version-control.patchwork/1162
>>> 	https://lists.ozlabs.org/pipermail/patchwork/2015-May/001310.html
>
>The patchwork site would not be required with GitHub, as you can review and
>see all of the pull requests. Pull requests can also be quickly sorted
>and managed, IMO better than in patchwork. The feature is
>built into GitHub and we do not need to maintain that site or tool. The
>pull requests can also be placed into given states just like in patchwork.
>The patchwork interface seems clunky to me as a way to manage
>patches; maybe they can fix the usability issues. The filter button is not
>very visible, and when you need to change a set of patches you have to do a
>lot of clicks and back pages to change them all. Maybe I do not know how
>to use the site, but I do not think that is the problem IMO. The GitHub
>one works today without having to fix anything.
>
>>>
>>> There are 3 types of git repositories (http://dpdk.org/browse):
>>>    - the main DPDK tree
>>>    - subtrees, created on request or external, may help to scale by
>>>      providing patches ready for merge in the main tree
>>>    - side trees, created on request, e.g. dts or pktgen
>
>I like the idea of going to the GitHub page and being able to scroll down
>the page to see all of the repos at the same time. This way people notice
>the other tools and subtrees quickly. I know you can modify the web page
>to make it easier to see, but GitHub already has it done.
>
>I seem to have to tell people where Pktgen is located on the site just
>about every time I talk about it; being on the GitHub list of repos seems
>much more obvious to me. I want to make it easy for someone to be added to
>the team to help improve the code, and a GitHub team seems like the easiest
>way.
>
>>> Do not hesitate to request creation of a new tree, it's open.
>>> Intel has requested some small subtrees which do not seem very useful. We
>>> may try to organize some new subtrees for bigger areas, which would take
>>> care of many sections of the MAINTAINERS file. Maybe some dedicated
>>> mailing lists should be created. These mailing lists and subtrees may be
>>> hosted on dpdk.org or elsewhere if everybody agrees.
>
>I agree with Neil on a few more repos for subtrees or submodules; this
>allows us in GitHub to have different teams and members on those repos as
>committers. Adding new persons to a team is quick and easy, and anyone on
>the team can add or modify someone on the team. The owners have full power
>over all teams, including adding/removing contributors, creating or deleting
>repos and other functions.
>
>>>
>>> There was no bug tracker initially installed to avoid fragmentation with
>>> mailing-list discussions. Now that traffic is becoming huge, it appears
>>> to be a new priority.
>>>
>>> Last point in the workflow status: tests and continuous integration.
>>> It's a complicated topic, especially because DPDK requires some expensive
>>> infrastructure for the tests. Some people are working on it at Intel and
>>> 6WIND, so I guess we will have a public discussion in the coming weeks.
>
>Neither GitHub nor the current system addresses this concern, but I do not
>see that GitHub would restrict anything. I am not saying you made that
>point, just pointing out that it needs to be addressed and is not a pro/con.
>
>>>
>>> After carefully reading previous comments about GitHub hosting, I would
>>> like to sort pros/cons below.
>>> Invalidated Pro:
>>> - web pages system: already possible without GitHub
>
>With GitHub Pages anyone can modify the pages; it does not
>have to be you or someone at 6Wind. The GitHub web page support is already
>present and is contained in the repo as a branch for each repo, if we need
>it. To me it just seems easier for someone in the "web-page" team to
>modify quickly.
>
>>> - popularity: why would being hosted on GitHub improve the visibility?
>>> Pros:
>>> - less complicated command lines
>>> - same view for everyone (independent of MUA features)
>>> - more code context when reading patches
>
>GitHub has two modes: side-by-side diff or standard inline diff.
>
>>> - integrated bug tracker
>
>The bug tracker is something we need now that we have more patches and
>users.
>
>>> Cons:
>>> - full feature usage implies everybody is forced to use it

I have used GitHub for Pktgen for a long time, and the only time I really
needed to touch the GitHub web page was when I wanted to verify that the
README or something was correct after I pushed the commits. Normally I just
used command-line pull/commit to update my local repo, did all of my
testing/development, then committed and pushed the changes to the GitHub repo.
At this point everyone who was following would get a notice. To me
GitHub seemed like just a remote repo; my day-to-day work looks like
what I believe everyone is doing now. Maybe someone can explain what is so
different in the day-to-day workflow.

> 
>
>>> - fragmentation between online data and mailing list
>
>The fragmentation is something I want to solve. I have seen some comments
>about integrating the two systems with some open source code, which I
>would hope will solve that problem. More investigation needs to be done.
>
>>> - discussions are not threaded, long discussions not clear
>>> - editing in browser may be limited
>
>Personally I only used the web-based editing of files to update the
>version number in the README when I forgot. I would not suggest you do all
>of your coding in the web browser, but rather in your local repo copy with
>the editor of your choice.
>
>>> - no offline access
>
>You have the local repo as your offline access; is this what you mean? The
>site may go down, but I expect it would be the same for any site. If
>GitHub is causing the downtime, this is a different problem IMO.
>
>>> - difficult to follow history as we rely on user repositories which may
>>>   change
>
>The pull requests have the history just like patches do today, so it should
>not be any different. How do you think we will lose history?
>
>>> - GitHub (commercial service) is watching us
>
>If you have not figured it out yet, everyone is out to get everyone else
>now in this global Internet Fad :-)
>
>>> - how to leave and migrate data from GitHub?
>
>I think that is fairly easy: just clone the repo, is that not the case? I
>can see we would lose some of the comments, but I am not sure they are
>that worthwhile. Besides, we can always keep the site as a mirror if we
>decide to move.
>
>>> - administration issues out of control (see snapshot of today's downtime)
>>>
>>> I did an abuse report for https://github.com/dpdk in case we want to use
>>> this GitHub account.
>
>OK, great, you are the one that created that account. I created the
>https://github.com/dpdk-org one, not knowing who created it.
>
>>> My opinion is that GitHub offers some nice tools and toys but some people
>>> won't be comfortable with it.
>
>I think the same goes for others not being very comfortable with creating
>patches and emailing them as well, so this one is a tie IMO.
>
>>> It may be reasonable to try some features without forcing everyone to
>>> migrate, while keeping consistency between all contributors.
>>> Making some tests in a sandbox seems to be a good approach.
>
>The sandbox I created was for that purpose and anyone is welcome to play
>with the site, just let me know. https://github.com/dpdk-org
>
>Thomas, just send me a GitHub login name and I can add you as an owner to
>the dpdk-org site or anyone else you want to have as an owner. I have been
>adding most as contributors.
>
>>
>>Hi Thomas,
>>
>>Thanks for the detailed explanation. As you are the official "maintainer" of
>>DPDK and, I think, strongly in favour of the current mail-based workflow,
>>I would like to know how you would see a hybrid approach like:
>>
>>http://dpdk.org/ml/archives/dev/2015-May/017283.html
>>
>>if we would manage to make it work reliably.
>
>+1, I too believe we can make this stable; otherwise the other open source
>GitHub project may be the place to start.
>
>https://github.com/google/pull-request-mailer
>
>https://github.com/rust-lang/rust/pull/25058#discussion_r29548050
>
>>
>>
>>
>>Best
>>Marc
>>
>>>
>>> Thanks for reading
>>
>


