[dpdk-dev] GitHub sandbox for the DPDK community

Wiles, Keith keith.wiles at intel.com
Fri May 1 20:01:42 CEST 2015



On 5/1/15, 11:45 AM, "Neil Horman" <nhorman at tuxdriver.com> wrote:

>On Fri, May 01, 2015 at 03:56:32PM +0000, Wiles, Keith wrote:
>> Hi Everyone,
>> 
>> I believe the DPDK community would benefit from moving to GitHub as the
>> primary DPDK site. http://github.com
>> 
>I'm not explicitly opposed to this, but I'm having trouble matching up the
>technical and governance issues raised on the list as of late with the
>benefits
>you indicate github provides.  Thoughts inline.
>
>> I believe the DPDK community can benefit from being at a very well known
>> worldwide site. GitHub seems to have the most eyes of any of the open
>> source Git repos today and it appears they have more than twice as many
>> developers. GitHub has a number of features I see as good additions to
>> our community using the GitHub organization account type.
>> 
>
>Do you think that the current site dpdk.org lacks visibility?  Do we have
>analytics on the site, or anecdotal evidence to suggest that more
>visibility can
>be had by moving to github?  It seems to me that people in search of a
>dataplane
>library google it, and dpdk is in the top 10 results, along with its
>wikipedia
>page, etc:
>https://www.google.com/#q=dataplane+library
>
>Not sure how using github brings on additional visibility.

Google is a great tool; you can find anything in the world with it, and it
will continue to be how most people find items on the web. But whether
DPDK can be found with Google is not the right question to be asking here.

The question is how do we promote and get higher visibility for the DPDK
community? I believe moving DPDK.org to a well-known open source location
should be a benefit for the DPDK community as a whole. If you are using
GitHub today then I think you understand this point already.

>
>
>> The cost for an organization account is $0 as long as we do not need
>> more than 5 private repos. 10 private repos is $25/month, and there are
>> other plans for more. I do not see us needing more than 5 private repos
>> today, and the only reason I can see for having a private repo is to do
>> some prep work on the repo before making it public. Every contributor
>> would need to create a GitHub personal account, which is at no cost
>> unless you need more than 5 private repos. In both accounts you can
>> have unlimited public repos.
>> 
>
>Given that dpdk is a public project, why would we need _any_ private
>repositories?  They should all be public, no?

Five private repos come for free, and I pointed out the only reason I
thought we might need one: as a temporary place for a repo before making
it public. I agree we do not really need them and all repos should be
public.

>
>> 
>>https://help.github.com/articles/where-can-i-find-open-source-projects-to
>>-w
>> ork-on/
>> 
>> http://www.sitepoint.com/using-git-open-source-projects/
>> 
>> - Adding more committers can lead to a security problems for 6Wind (I
>> assume).
>In what way?  Are you advocating for a single committer here, and how does
>Github provide that?  FWIW, I think subtree maintainers is an excellent
>strategy for more efficient workflow (getting patches accepted faster has
>been an identified problem), and allowing subtree maintainers with a
>committer for each is a good way to solve that.  That's implementable with
>github or any other git based solution, mind you, so it's neither an
>argument for or against github.

Maybe you misread this point. I am not suggesting only one committer; what
I am suggesting is that adding committers and logins to a 6Wind-controlled
machine could be a security issue for 6Wind. Maybe not, but I believe
moving to GitHub removes any possible hacks against 6Wind, along with
possible liability issues for 6Wind. This was a very minor point.


As for multiple subtrees, they can quickly and easily be added to GitHub,
and you could even make this happen if you want to be one of the people
helping build the GitHub site. On other GitHub sites I see a lot of repos
and sub-repos alongside the primary tree, and personally I agree that
having a few more subtrees will not affect DPDK and could possibly help
define teams around these subtrees.

>
>> - 6Wind appearing to own DPDK.org is not a good message to the
>>community.
>Why not?  They're graciously hosting the site, and not advertising on it
>(at least they shouldn't be, and I don't see it egregiously displayed).
>Netcraft will show you lots of open source projects that host their site
>on a server operated by a participating company.  Care has to be taken
>about bias, but it's not uncommon.

I do not believe the point is whether 6Wind loans us machines, storage and
internet bandwidth. My point is that GitHub is a big company with a lot of
resources to make sure everyone remains connected to the repo(s): backups,
support, tools and any number of other items to make the DPDK community
better in the long run. I believe it comes down to resources, and to
freeing up resources for 6Wind by moving to a bigger company whose sole
job is to host sites like this one.

>
>>   - Not assuming 6Wind's dpdk.org site will disappear only where the
>> community stores the master repos and how the community interacts with
>>the
>> master.
>That just sounds like going back to the situation we had between dpdk.org
>and
>01.org, where there was confusion over the canonical location to go to
>for dpdk
>information, I think we want to avoid that.

I agree, but I cannot tell 6Wind to discontinue its site; that would not
be right. All I can do is make sure we promote the correct open source
location for the DPDK community.

>
>> - Permission and access levels in dpdk.org is only one level and we can
>> benefit from having 4 levels and teams as well.
>Not sure what you mean by this.  What access levels are you envisioning,
>and how
>is it they are not achievable with what we have today?

GitHub has four permission levels and any number of teams, which we can
use to manage access to the repos/site according to which team you belong
to. Currently 6Wind is the sole owner of the site, but with GitHub we can
have as many owners of the site as we need. Placing more than one
person/company at this level seems reasonable, as this is a community of
people/companies. Using the other permission levels gives us a few more
options. Personally, I think having everyone or only one as the owner is
not how we will help the community.

I personally do not want to be an owner of the DPDK repo, but a
contributor to that repo. The owner has some permissions that we would not
want everyone to be able to change, but if that is what everyone wants I
am OK with it. Being on a team for Pktgen with admin permissions seems
reasonable along with anyone else that would like admin rights.

> 
>
>> - The patch process today suffers from timely reviews, which will not be
>> fixed by moving.
>>   - GitHub has a per pull request discussions area, which gives a clean
>> way to review all discussions on a specific change.
>>     - The current patch model is clone/modify/commit/send patch set
>>     - The model with GitHub is fork on GitHub/modify/commit/send pull
>> request
>> - The patchwork web site is reasonable, but has some draw backs in
>> maintaining the site.
>
>Can you enumerate?

I could enumerate my personal issues with patchwork, but the point is that
we would not require patchwork when using GitHub, and the process/tools on
GitHub are maintained by GitHub.

>
>>   - GitHub manages the patches via pull requests and can be easily seen
>> via a web browser.
>>   - The down side is you do have to use a web browser to do some work,
>>but
>> the bulk of the everyday work would be done as it is today.
>>     - I think we all have a web browser now :-)
>Yes, but as you said above, using a web browser doesn't make reviewing
>patches
>faster.  In fact, I would assert that it slows the process down, as it
>prevents
>quick, easy command line access to patch review (as you have with a
>properly
>configured MUA).  That seems like we're going in the opposite direction
>of at
>least one problem we would like to solve.

I was playing with the patch process on GitHub and it does provide a good
way to manage discussions, with comments placed either inline or on the
patch as a whole.

I understand moving from an email-based solution to GitHub may be a
change, but it affords a lot more IMO.

I just read Matthew’s response and he stated the issues for me very well. +1

―――――――― Pasted from Matthew’s email
Normally I'm a big command-line supporter. However I have found reviewing
patches by email for me is about the most painful workflow.

The emails are pages and pages.

The replies from commenters are buried in the walls of text.

Replies to replies keep shifting farther off the edge of the screen. The
code 
gets weirder and weirder to try to read.

Quickly reading over the patchset by scrolling through to get the flavor
of 
it, to see if I'm qualified to review it, and look at the parts I actually
know about is much harder.

I can go to one place to see every candidate patchset out there, the GH
Pull 
Request page. Then I can just sync up the branch and test it on my own
systems 
to see if it works, not just try to read it.

Github automatically minimizes old comments that are already fixed, so
they 
don't keep consuming space and mental bandwidth from the review.

All in all, I'd be able to review more DPDK patches faster with the GH
interface than having them in the mailing list.

Matthew.
――――――――

Sometimes people can snip and modify emails as they are sent/replied and
to me that can lead to re-writing history or points of view. Not a big
concern here on this list.

>
>> - GitHub has team support and gives a group better control plus
>> collaboration is much easier as we have a external location to work.
>I don't understand what you mean by an external location to work.  Why is
>that
>beneficial, and why can you not just do that today if you find it
>beneficial.

A well-known public site, managed by a company whose sole reason to exist
is running it, makes it easier for two or more people to find each other
and collaborate IMO.

>
>>   - Most companies have some pretty high security level and being to
>> collaborate between two or more companies is very difficult if one
>>company
>> is hosting the repo behind a firewall.
>If one company is hosting a git repo behind a firewall, that seems like
>their
>problem to fix.  Not sure how dpdk moving to github helps that.

On GitHub you can create a team and set the permissions for the members of
that team to better control a repo, while still making it publicly
available to others as read-only. The others do not have to find the
email address, join, then search wildly in the history to find commits,
patches and discussions around that repo or patch.

GitHub makes this process easier. It is not perfect, but it is better than
an email thread and will help the community, I believe.

>
>>   - Using GitHub and teams would make collaboration a lot easier or
>> collaboration between two or more user accounts as well.
>You mentioned that above already, and it still seems like an unfinished
>thought.
>What is github providing here in terms of collaboration tools that we
>don't
>already have?  We have git, we have email, we can send pull requests, we
>have a
>canonical location to discuss change.  Whats missing?

Please visit the GitHub docs for more details; I do not need to list them
here, and doing so would just deflect the discussion in the wrong
direction.
https://help.github.com/


>
>
>> - GitHub has a Web Page system, which can be customized for the
>>community
>> needs via a public or private repo.
>This is a fair point.  It might be nice to have a wiki, and github gives
>you
>that for free.  Though we could easily set one up on dpdk.org.
>
>> - We still need a dpdk.org email list I believe as I did not find one at
>> GitHub.
>>   - We can also forward GitHub emails to the list.
>>   - I believe you can reply to an email from GitHub and the email will
>>get
>And that sort of undoes the advantages of using github, as it means
>people need
>to check multiple locations for dpdk development information.  They need
>to use
>the web site to get information about pull requests so they can review
>patches
>(github never sends patches via email), but you still have to check the
>email
>list for discussions not pertaining to patches.

I agree, but again I cannot tell 6Wind what to do with its email list, and
the email list may still be reasonable. Maybe someone else can suggest a
solution to this issue and how it was solved in other GitHub open source
projects.

>
>As noted above, I'm not explicitly opposed to using github, I use it for
>several
>projects myself, and it does provide some nice features, but I'm not
>seeing how
>those features address the concerns that have been brought up on the list
>here.

I believe if you look at it from the community point of view it may make
more sense to you; at least it does to me.

>
>Neil


