[dpdk-dev] git trees organization

Ferruh Yigit ferruh.yigit at intel.com
Wed Sep 13 15:21:00 CEST 2017

On 9/13/2017 1:25 PM, Adrien Mazarguil wrote:
> On Wed, Sep 13, 2017 at 12:38:37PM +0100, Ferruh Yigit wrote:
>> On 9/13/2017 8:58 AM, Adrien Mazarguil wrote:
>>> Hi,
>>> On Tue, Sep 12, 2017 at 09:32:07AM +0100, Bruce Richardson wrote:
>>>> On Tue, Sep 12, 2017 at 12:03:30AM +0200, Thomas Monjalon wrote:
>>>>> Hi all,
>>>>> As you know I am currently the only maintainer of the master tree.
>>>>> It is very convenient because I need to synchronize with others
>>>>> only when pulling "next-*" trees.
>>>>> But the drawback is that I should be available very often to
>>>>> avoid stalled patches waiting in patchwork backlog.
>>>>> I feel it is a good time to move to a slightly different organization.
>>>>> I have been working closely with Ferruh Yigit for almost one year, as
>>>>> next-net maintainer, and I think it would be very efficient to delegate
>>>>> some work for the master tree to him.
>>>> I think Ferruh has been doing an excellent job on the net tree, and
>>>> would be an excellent candidate to help with the workload on the master
>>>> tree.
>>>>> I mean that I would use the patchwork delegation to explicitly divide
>>>>> the workload given our different experiences.
>>>>> Ferruh, do you agree to take on this new responsibility?
>>>>> At the same time, we can think about how to add more git sub-trees:
>>>> In principle, I'm in favour, but I think that the subtrees of the master
>>>> tree should be at a fairly coarse granularity, and not be too many of
>>>> them. The more subtrees, the more likely we are to have issues with
>>>> patchsets needing to be split across trees, or having to take bits from
>>>> multiple trees in order to test if everything is working.
>>> <snip>
>>> About that, how about we start allowing true merge commits instead of
>>> rebasing (rewriting history) in order to ease things for maintainers?
>>> This approach makes pull requests show up as merge commits that contain
>>> the (ideally trivial) changes needed to resolve any conflicts; this has the
>>> following benefits:
>>> - The work done by a maintainer during that merge is tracked, not silently
>>>   ignored or lost. The merge commit itself is signed-off by its author.
>>> - This allows tracing mistakes or bugs to the conflict resolution itself.
>>> - Upstream can reject a pull request on the basis that merging it is not
>>>   trivial enough (i.e. downstream must merge upstream changes first).
>>> - Sub-trees can merge among themselves in case they need features that
>>>   encompass several trees, not necessarily always against the master
>>>   tree. Everything is tracked.
>>> - Maintainers do not ever modify the commits they get from other trees,
>>>   which keep their SHAs unmodified as part of the history. A given commit ID
>>>   is truly unique among all trees (back-port trees remain the only exception
>>>   since commits are cherry-picked).
>>> - It shifts the entire responsibility to the maintainers of sub-trees.
>>> The only downside is that commits have several parents, history becomes a
>>> graph that developers need to get used to (some might call it a mess),
>>> however that's probably not an issue for those already used to Linux kernel
>>> development and other large projects.
>>> I know this was already discussed in the past, however I think adding more
>>> sub-trees will make rebasing too complex otherwise.
>>> Thoughts?
>> Using git merge looks like more proper git usage, but I have one question /
>> concern:
>> For next-net, patches sometimes depend on changes in the main tree, and
>> what I do is rebase the sub-tree on top of the latest main tree.
>> After switching to the merge method, how can those dependencies get into
>> the sub-tree? By merging from the main tree into the sub-tree?
> Yes, that's the idea. On the other hand, as a maintainer, you are not
> responsible for the contents of what's merged from other official
> trees. Commits are taken as they are, this implies trust between tree
> maintainers.
>> Won't this bidirectional merging be confusing?
> Probably at first, this certainly needs some getting used to. We can attempt
> to avoid such merges as much as possible with proper coordination. Avoiding
> them is likely not possible though if we want to keep history intact.

Let's assume I need to merge from the main tree into next-net three times
before the integration, because of dependencies.
When Thomas merges next-net into the main tree for rc1, will those three
merge commits be visible in the main tree?

For Linux, I guess sub-trees do not merge from Linus' tree because of
dependencies; I assume they only merge after a release, to get new
commits that are not in their sub-tree.
But for DPDK, the main tree is also getting new patches on its own.
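
To make the three-merge question concrete, here is a rough sketch using a toy
model of the commit graph, not git itself. All commit names (m*, n*, s*, rc1)
are hypothetical, and the model assumes true merge commits with no rebasing or
squashing. It suggests the three merges would be reachable from the main tree
after the rc1 merge, while a first-parent walk (what `git log --first-parent`
follows) would show only the rc1 merge.

```python
# Toy commit graph: each commit maps to its parents (first parent listed first).
# Names are hypothetical: m* = main tree, n* = next-net, s* = main -> next-net merge.
parents = {
    "m0": [],
    "n1": ["m0"],
    "m1": ["m0"],
    "s1": ["n1", "m1"],   # 1st merge of main into next-net
    "m2": ["m1"],
    "n2": ["s1"],
    "s2": ["n2", "m2"],   # 2nd merge of main into next-net
    "m3": ["m2"],
    "n3": ["s2"],
    "s3": ["n3", "m3"],   # 3rd merge of main into next-net
    "m4": ["m3"],         # main keeps getting patches of its own
    "rc1": ["m4", "s3"],  # main merges next-net for rc1
}


def is_merge(commit):
    """A merge commit has more than one parent."""
    return len(parents[commit]) > 1


def ancestors(tip):
    """Every commit reachable from tip, tip included (a full history walk)."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def first_parent_chain(tip):
    """Commits reached by following only the first parent of each commit."""
    chain, commit = [], tip
    while commit is not None:
        chain.append(commit)
        commit = parents[commit][0] if parents[commit] else None
    return chain


merges_visible = sorted(c for c in ancestors("rc1") if is_merge(c))
merges_first_parent = [c for c in first_parent_chain("rc1") if is_merge(c)]

print(merges_visible)        # merges reachable from the main tree tip
print(merges_first_parent)   # merges seen when following first parents only
```

In this model the full walk lists rc1 plus the three next-net merges, and the
first-parent walk lists only rc1, so how noisy the main tree looks would depend
on how the history is viewed.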

>> And following are notes from my current experience:
>> - Having re-writable history gives some flexibility to sub-trees.
>> It is possible to update commit logs and amend patches even after they
>> are pushed.
> This is both a good and a bad thing. Thanks to that, history is currently
> linear and extremely clean, I think we haven't had a single patch that
> doesn't compile or a bad merge artifact for a very long time.
> On the other hand, if you look at the effort required to maintain a single
> sub-tree that way, it likely becomes exponential for several. All of them
> need to constantly rebase their stuff, with conflicts to address. The more
> people, the more mistakes and so on.
> Once part of an official tree, a commit cannot be amended anymore, it's too
> late; revert commits will be more common. We need to accept this first.
>> - It is hard to confirm which commits were pulled into the main tree; I
>> guess merge commits will make this easier.
>> - To track the main tree I rebase continuously, re-writing history almost
>> daily; this may be hard for people working on top of next-net.
> Yes that's one of the "bad" things with the current approach. Consider
> automatic non-regression testing against disappearing commit IDs. Tracking
> their status is difficult when history is changing. For instance we usually
> add local tags to track CI successes/failures. All of them now point to
> nonexistent commits.
>> - Conflicts are resolved by sub-trees during rebase, instead of by the
>> main tree during merge. So the effort may be more distributed.
> It's not too different actually. With merge you can reject PRs on the basis
> that there are too many conflicts to take care of, not unlike a request for
> rebase.
> On the plus side if you make any changes yourself in order to solve them,
> they are made part of the merge commit, not in the original commits which
> remain unmodified.

You are right, we are losing the original code when there is a merge
conflict, and I think there is no way to trace back to the original
commit in the repo, only perhaps from the mailing list and patchwork.

And it is very hard to find out what was changed to resolve a conflict,
and whether something went wrong!
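
One reason the original commit cannot be found by its ID after a rebase: in a
SHA-1 repository a commit ID is a hash over the commit object, which includes
the parent ID. A minimal sketch of that computation follows; the tree ID, base
IDs, author line and commit message are all made-up placeholders, and a real
rebase usually changes the tree and committer date too.

```python
import hashlib


def commit_id(tree, parent, author, message):
    """Hash a commit object the way git does for SHA-1 repositories:
    sha1(b"commit <size>\\0" + body)."""
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {author}\n"
        f"committer {author}\n"
        f"\n{message}\n"
    ).encode()
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


# Placeholder values, for illustration only.
tree = hashlib.sha1(b"placeholder tree").hexdigest()
old_base = hashlib.sha1(b"main tree last week").hexdigest()
new_base = hashlib.sha1(b"main tree today").hexdigest()
author = "A Developer <dev@example.com> 1505300000 +0200"
message = "net/foo: fix bar"  # hypothetical commit title

original = commit_id(tree, old_base, author, message)
rebased = commit_id(tree, new_base, author, message)

# Same change, same author, same message: only the parent differs.
print(original)
print(rebased)
print(original == rebased)
```

Even with identical content the two IDs differ, so anything that recorded the
original ID (tags, CI results, references in mails) no longer points into the
rebased history.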

>> - Rebasing gives a more straightforward history in the main repo; merge
>> commits look more confusing, although I would expect it won't be as
>> complex as the Linux tree, so it may not be a problem.
> Right, that's the main drawback. I think there's no way to know the impact
> unless we attempt it as an experiment.
