[dpdk-dev] Beyond DPDK 2.0

Marc Sune marc.sune at bisdn.de
Mon Apr 27 11:52:24 CEST 2015



On 27/04/15 03:41, Wiles, Keith wrote:
>
> On 4/26/15, 4:56 PM, "Neil Horman" <nhorman at tuxdriver.com> wrote:
>
>> On Sat, Apr 25, 2015 at 04:08:23PM +0000, Wiles, Keith wrote:
>>>
>>> On 4/25/15, 8:30 AM, "Marc Sune" <marc.sune at bisdn.de> wrote:
>>>
>>>>
>>>> On 24/04/15 19:51, Matthew Hall wrote:
>>>>> On Fri, Apr 24, 2015 at 12:39:47PM -0500, Jay Rolette wrote:
>>>>>> I can tell you that if DPDK were GPL-based, my company wouldn't be
>>>>>> using
>>>>>> it. I suspect we wouldn't be the only ones...
>>>>>>
>>>>>> Jay
>>>>> I could second this, from the past employer where I used it. Right
>>> now
>>>>> I am
>>>>> using it in an open source app, I have a bit of GPL here and there
>>> but
>>>>> I'm
>>>>> trying to get rid of it or confine it to separate address spaces,
>>> where
>>>>> it
>>>>> won't impact the core code written around DPDK, as I don't want to
>>> cause
>>>>> headaches for any downstream users I attract someday.
>>>>>
>>>>> Hard-core GPL would not be possible for most. LGPL could be possible,
>>>>> but I
>>>>> don't think it could be worth the relicensing headache for that small
>>>>> change.
>>>>>
>>>>> Instead we should make the patch process as easy as humanly possible
>>> so
>>>>> people
>>>>> are encouraged to send us the fixes and not cart them around their
>>>>> companies
>>>>> constantly.
>>> +1 and besides the GPL or LGPL ship has sailed IMHO and we can not go
>>> back.
>> Actually, IANAL, but I think we can.  The BSD license allows us to fork
>> and
>> relicense the code I think, under GPL or any other license.  I'm not
>> advocating
>> for that mind you, just suggesting that it's possible should it ever become
>> needed.
>>
>>>> I agree. My feeling is that as the number of patches in the mailing
>>> list
>>>> grows, keeping track of them gets more and more complicated. Patchwork
>>>> website was a way to try to address this issue. I think it was an
>>>> improvement, but to be honest, patchwork lacks a lot of functionality,
>>>> such as properly tracking multiple versions of the patch (superseding
>>>> them automatically), and it lacks some filtering capabilities e.g. per
>>>> user, per tag/label or library, automatically track if it has been
>>>> merged, give an overall status of the pending vs merged patches, set
>>>> milestones... Is there any alternative tool or improved version for
>>> that?
>>>
>> Agreed, this has come up before, off list unfortunately.  The volume of
>> patches
>> seems to be increasing at such a rate that a single maintainer has
>> difficulty
>> keeping up.  I proposed that the workload be split out to multiple
>> subtrees,
>> with prefixes being added to patch subjects on the list for local
>> filtering to
>> stem the tide.  Specifically I had proposed that the PMD's be split into a
>> separate subtree, but that received pushback in favor of having each
>> library
>> having its own separate subtree, with a pilot program being made out of
>> the I40e
>> driver (which you might note sends pull requests to the list now).  I'd
>> still
>> like to see all PMD's come under a single subtree, but that's likely an
>> argument
>> for later.
>>
>> That said, do you think that this patch latency is really a contributor
>> to low
>> project participation?  It's definitely a problem, but it seems to me that
>> this
>> sort of issue would lead to people trying to participate, then giving up
>> (i.e.
>> we would see 1-2 emails from an individual, then not see them again).
>> I'd need
>> to look through the mailing list for such a pattern, but anecdotally I've
>> not
>> seen that happen.  The problem you describe above is definitely a
>> problem, but
>> it's one for those individuals who are participating, not for those who are
>> simply choosing not to.  And I think we need to address both.
>>
>>> I agree patchwork has some limitations, but I think the biggest issue is
>>> keeping up with the patches. Getting patches introduced into the main
>>> line
>>> is very slow. A patch submitted today may not get applied for weeks or
>>> months, so when another person submits a patch he runs a
>>> very high risk of having to redo that patch, because a previous patch
>>> makes his fail weeks/months later. I would love to see a better tool
>>> than
>>> patchwork, but the biggest issue is we have a huge backlog of patches.
>>> Personally I am not sure how Thomas or anyone is able to keep up with the
>>> patches.
>>>
>> This is absolutely a problem.  I'd like to think, more than a tool like
>> patchwork, a subtree organization to allow some modicum of parallel
>> review and
>> integration would really be a benefit here.
> Subtrees could work, but the real problem I think is the number of
> committers must be higher than one. Something like GitHub (and I assume
> Linux Foundation) has a method to add committers to a project. In the
> case of GitHub they just have to have a free GitHub account and they can
> become committers of the project by having the owner of the project enable
> them.
>
> On GitHub they have personal accounts and organization accounts. I know
> only about the personal accounts, but they allow for 5 private repos and
> any number of public repos. The organization account has a lot of extra
> features that seem better for a DPDK community IMO and should be the one
> we use if we decide it is the right direction. We can always give it a
> shot for a while and keep the dpdk.org and use dev at dpdk.org and its repo
> mirrored from GitHub as a transition phase. This way we can fall back to
> dpdk.org or move on to something else if we like.
>
> https://help.github.com/categories/organizations/
>
> The developers could still send patches via email list, but creating a
> repo and forking dpdk is easy, then send a pull request.

For the github "community" or free service, organization accounts just 
allow you to set teams, where each team can be assigned to one or more 
repositories. The differences are summarized here:

https://help.github.com/articles/what-s-the-difference-between-user-and-organization-accounts/

And the permission schema, per team, is summarized here:

https://help.github.com/articles/permission-levels-for-an-organization-repository/

Some limitations: i) you can only manage issues if the team has write 
permissions (IOW push permissions), and ii) there cannot be per-branch ACLs.

>
>
>>> The other problem I see is how patches are agreed on to be included in
>>> the
>>> mainline. Today it is just an ACK or a NAK on the mailing list. Then I
>>> see
>>> what I think to be only a few people ACKing or NAKing patches. This
>>> process has a lot of problems, from a patch being ignored for some reason
>>> or
>>> someone having negative feedback on a very minor detail, to there being
>>> no way to push a
>>> patch forward past a single NAK or comment.
>>>
>> So, this is an interesting issue in ideal meritocracies.  Currently
>> we are/should be
>> looking for ACKs/NAKs from the individuals listed in the MAINTAINER
>> files, and
>> those people should be the definitive subject matter experts on the code
>> they
>> cover.  As such, I would argue that they should be entitled to a modicum
>> of
>> stylistic/trivial leeway.  That is to say, if they choose to block a patch
>> around a very minor detail, then between them changing their position,
>> and the
>> patch author changing the code, the latter is likely the easier course of
>> action, especially if the author can't make an argument for their
>> position.
>> That said, if such patch blockage becomes so egregious that individuals
>> stop
>> contributing, that needs to be known as well.  If you as a patch author:
>>
>> 1) Have tried to submit patches
>> 2) Had them blocked for what you consider trivial reasons
>> 3) Plan to not contribute further because of this
>> 4) Still rely on the DPDK for your product
>>
>> Please, say something.  People in charge need to know when they're pushing
>> contributors away.
>>
>> FWIW, I've tried to do some correlation between the git history and the
>> mailing
>> list.  I need to do more searches, but I have a feeling that early on, the
>> majority of people who stopped contributing did so not because their patches
>> were expressly blocked, but rather because they were simply ignored.
>> No one
>> working on DPDK bothered to review those patches, and so they never got
>> merged.
>> Hopefully that problem has been addressed somewhat now.
I agree 100%
>>
>>> I would like to see some type of layering process to allow patches to be
>>> applied in a timely manner: a few weeks, not months or completely ignored.
>>> Maybe some type of voting is reasonable, but we need to do something to
>>> turn around the patches in a clean, reasonable manner.
>>>
>>> I think we need some type of group meeting every week to look at the
>>> patches
>>> and determine which ones get applied; this gives quick feedback to the
>>> submitter as to the status of the patch.
>>>
>> I think a group meeting is going to be way too much overhead to manage
>> properly.
>> You'll get different people every week with agendas that may not line up
>> with
>> code quality, which is really what the review is meant to provide.  I
>> think
> I was only suggesting the maintainers attend the meeting. Of course they
> have to attend or have someone attend for them, just to get the voting
> done. If you do not attend then you do not get to vote or something like
> that is reasonable. Not that we should try and define the process here.
>
>> perhaps a better approach would be to require that code owners from
>> the
>> maintainer file provide an ACK/NAK on their patches within 3-4 days, and
>> require a corresponding tree maintainer to apply the patch within 7 or
>> so.  That
>> would cap our patch latency.  Likewise, if a patch slips in creating a
>> regression, the author needs to be alerted and given a time window in
>> which to
>> fix the problem before the offending patch is reverted during the QE
>> cycle.
>>
>>
>>>> On the other side, since user questions, community discussions and
>>>> development happen in the same mailing list, things get really
>>>> complicated, especially for users seeking help. Even though I think
>>>> the average skill level of DPDK users is generally higher than in
>>>> other software projects, if DPDK wants to attract more users, having
>>>> better user support is key, IMHO.
>>>>
>>>> So I would welcome a separation between, at least, dpdk-user
>>>> and dpdk-dev.
>> I wouldn't argue with this separation, seems like a reasonable approach.
>>
>>> I do not remember seeing too many users on the list, and making a list
>>> just
>>> for them is OK if everyone is fine with a list that has very few emails.
>>>> If the number of patches keeps growing, splitting the "dev" mailing
>>>> lists into different categories (eal and common, pmds, higher level
>>>> abstractions...) could be an option. However, this last point opens a
>>>> lot of questions on how to minimize interference between the different
>>>> parts and API/ABI compatibility during the development.
>>> I believe if we just make sure we use tags in the subject line then we
>>> can
>>> have our email clients do the splitting of the emails instead of adding
>>> more email lists.
>>>
>> Agreed

I think it is a good idea too. Maybe we can standardize some format e.g. 
[TAG][PATCH vX], or something like that.
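
To make the idea concrete, here is a rough sketch of how a filter script 
could pick such subjects apart. The exact format (tag first, then PATCH, 
an optional version and an optional i/m series index) is only my 
assumption of what we could agree on, and the function name is made up; 
nothing like this exists on the list today.

```python
import re

# Hypothetical subject convention: "[TAG][PATCH vN i/m] summary".
# This format is an assumption for illustration, not an agreed standard.
SUBJECT_RE = re.compile(
    r"^\[(?P<tag>[^\]]+)\]"                     # [TAG], e.g. [pmd] or [eal]
    r"\[PATCH(?:\s+v(?P<version>\d+))?"         # [PATCH or [PATCH vN
    r"(?:\s+(?P<index>\d+)/(?P<total>\d+))?\]"  # optional i/m for a series
    r"\s*(?P<summary>.*)$"
)


def parse_subject(subject):
    """Return the tag, version and summary, or None if there is no match."""
    m = SUBJECT_RE.match(subject.strip())
    if m is None:
        return None
    return {
        "tag": m.group("tag").lower(),
        # A patch without an explicit version is treated as v1.
        "version": int(m.group("version") or 1),
        "summary": m.group("summary"),
    }
```

A mail client rule or a small script could then file messages per tag, 
and treat anything that returns None as regular discussion.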

>>
>>>>> Perhaps it means having some ReviewBoard type of tools, a clone in
>>>>> Github or
>>>>> Bitbucket where the less hardcore kernel-workflow types could send
>>> back
>>>>> their
>>>>> small bug fixes a bit more easily, this kind of stuff. Google has
>>> been
>>>>> getting
>>>>> good uptake since they moved most of their open source across to
>>> Github,
>>>>> because the contribution workflow was more convenient than Google
>>> Code
>>>>> was.
>>> I like GitHub; it is a much better designed tool than patchwork, plus it
>>> could get more eyes as it is very well known to the developer community
>>> in
>>> general. I feel GitHub has many advantages over the current systems in
>>> place, but it does not solve all the patch issues.
>>>
>> Github is actually a bit irritating for this sort of thing, as it
>> presumes a web
>> based interface for discussion.  They have some modicum of email
>> forwarding
>> enabled, but it has never quite worked right, or integrated properly.

An alternative to githubs and bitbuckets is a self-hosted forge, like 
gitlab:

https://about.gitlab.com/

To be honest, I mostly work on open-source repositories, and in our 
organization we use gitlab only for private repositories, so I haven't 
played that much with it. But it seems to do its job and has almost all 
of the features of the "community" github, if not more. I don't know 
whether it can be integrated with github accounts somehow, so that 
contributors don't have to register.

However, one of the important points of using github/bitbucket is 
visibility and easing the contribution process. By using a self-hosted 
solution, even if it is similar to github and well advertised on DPDK's 
website, you kind of lose part of that advantage.

> Email forwarding has seemed to work for me and in one case it took a bit
> to have GitHub stop sending me emails on a repo I did not want anymore :-)
>>> The only way we can get patch issues resolved is to put a bit more
>>> process
>>> in place.
>>>> Although I agree, we have to be careful on how github or bitbucket is
>>>> used. Having issues or even (e.g. github) pull requests *in addition*
>>> to
>>>> the normal contribution workflow can be a nightmare to deal with, in
>>>> terms of synchronization and preventing double work. So I guess setting
>>>> up an official github or bitbucket mirror would be fine, via some
>>> simple
>>>> cronjob, but I guess it would end-up not using PRs or issues in github
>>>> like the Linux kernel does.
>> 100% agree, we can't be split about this.  Allowing contributions from n
>> channels just means most developers will only see/review 1/nth of the
>> patches
>> of interest to them.
> If we set up a GitHub or some other site, we would need to make GitHub the
> primary site to remove this type of problem IMO.

You mean changing the workflow from email-based to github issues and 
pull requests? Do you really think this is possible?

>>>  From what I can tell GitHub seems to be a better solution for a free
>>> open
>>> environment. Bitbucket I have never used and GitHub seems more popular
>>> from one article I read.
>>>
>>>
>>> https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=bitbucket%20vs%20github
>>>
>>>
>>>> Btw, is this github organization already registered by Intel or some
>>>> other company of the community?
>>>>
>>>> https://github.com/dpdk
>>>>
> I was hoping someone would own up to the GitHub dpdk site.

Just wanted to know if this was the case. But even if it is not, I 
*guess* that, as happens with other services like twitter, 
facebook..., Intel could claim the user name, since it holds the 
registered trademark.

marc

>
>>>> Marc
>>> If we can use the above that would be great, but a name like
>>> 'dpdk-community' or something could work too.
>>>
>>> We can host the web site here and have many sub-projects like
>>> Pktgen-DPDK
>>> :-) under the same page. Not to say anything bad about our current web
>>> pages, but I sometimes find them difficult to use and to find things like
>>> the patchwork link. Maintaining a web site is a full time job and GitHub
>>> does
>>> maintain the site, plus we can collaborate on hosting web pages on the
>>> GitHub
>>> site more easily.
>>>
>>> Moving to the Linux Foundation is an option as well, as it is very well
>>> known and has some nice ways to get your project promoted. It does have a
>>> few drawbacks, in process handling and cost to name a few. The process
>>> model is already defined, which is good and bad; it just depends on
>>> your
>>> needs IMO.
>>>
>>> Regards,
>>> ++Keith
>>>
>>>>> Matthew.
>>>


