[dpdk-dev] git trees organization

Adrien Mazarguil adrien.mazarguil at 6wind.com
Wed Sep 13 16:54:02 CEST 2017


On Wed, Sep 13, 2017 at 02:21:00PM +0100, Ferruh Yigit wrote:
> On 9/13/2017 1:25 PM, Adrien Mazarguil wrote:
> > On Wed, Sep 13, 2017 at 12:38:37PM +0100, Ferruh Yigit wrote:
> >> On 9/13/2017 8:58 AM, Adrien Mazarguil wrote:
> >>> Hi,
> >>>
> >>> On Tue, Sep 12, 2017 at 09:32:07AM +0100, Bruce Richardson wrote:
> >>>> On Tue, Sep 12, 2017 at 12:03:30AM +0200, Thomas Monjalon wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> As you know I am currently the only maintainer of the master tree.
> >>>>> It is very convenient because I need to synchronize with others
> >>>>> only when pulling "next-*" trees.
> >>>>> But the drawback is that I should be available very often to
> >>>>> avoid stalled patches waiting in patchwork backlog.
> >>>>>
> > >>>>> I feel it is a good time to move to a slightly different organization.
> > >>>>> I have been working closely with Ferruh Yigit for almost one year, as
> > >>>>> next-net maintainer, and I think it would be very efficient to delegate
> > >>>>> some work for the master tree to him.
> >>>>
> >>>> I think Ferruh has been doing an excellent job on the net tree, and
> >>>> would be an excellent candidate to help with the workload on the master
> >>>> tree.
> >>>>
> >>>>> I mean that I would use the patchwork delegation to explicitly divide
> >>>>> the workload given our different experiences.
> > >>>>> Ferruh, do you agree to take on this new responsibility?
> >>>>>
> >>>>> At the same time, we can think how to add more git sub-trees:
> >>>>
> >>>> In principle, I'm in favour, but I think that the subtrees of the master
> >>>> tree should be at a fairly coarse granularity, and not be too many of
> >>>> them. The more subtrees, the more likely we are to have issues with
> >>>> patchsets needing to be split across trees, or having to take bits from
> >>>> multiple trees in order to test if everything is working.
> >>> <snip>
> >>>
> >>> About that, how about we start allowing true merge commits instead of
> >>> rebasing (rewriting history) in order to ease things for maintainers?
> >>>
> > >>> This approach makes pull requests show up as merge commits that contain
> >>> the (ideally trivial) changes needed to resolve any conflicts; this has the
> >>> following benefits:
> >>>
> >>> - The work done by a maintainer during that merge is tracked, not silently
> >>>   ignored or lost. The merge commit itself is signed-off by its author.
> >>>
> >>> - This allows tracing mistakes or bugs to the conflict resolution itself.
> >>>
> >>> - Upstream can reject pull requests on the basis that merging it is not
> >>>   trivial enough (i.e. downstream must merge upstream changes first).
> >>>
> >>> - Sub-trees can merge among themselves in case they need features that
> >>>   encompass several trees, not necessarily always against the master
> >>>   tree. Everything is tracked.
> >>>
> >>> - Maintainers do not ever modify the commits they get from other trees,
> >>>   which keep their SHAs unmodified as part of the history. A given commit ID
> >>>   is truly unique among all trees (back-port trees remain the only exception
> >>>   since commits are cherry-picked).
> >>>
> >>> - It shifts the entire responsibility to the maintainers of sub-trees.
> >>>
> >>> The only downside is that commits have several parents, history becomes a
> >>> graph that developers need to get used to (some might call it a mess),
> >>> however that's probably not an issue for those already used to Linux kernel
> >>> development and other large projects.
> >>>
> >>> I know this was already discussed in the past, however I think adding more
> > >>> sub-trees will make rebasing too complex otherwise.
> >>> Thoughts?
> >>>
> >>
> > >> Using git merge looks like more proper git usage, but I have one question /
> >> concern:
> >>
> >> For next-net, sometimes there are dependent patches in main tree, and
> >> what I am doing is rebasing sub-tree on top of latest main tree.
> >>
> > >> When switched to the merge method, how can dependent patches get into the
> > >> sub-tree? By merging from main tree to sub-tree?
> > 
> > Yes, that's the idea. On the other hand, as a maintainer, you are not
> > responsible for the contents of what's merged from other official
> > trees. Commits are taken as they are, this implies trust between tree
> > maintainers.
> > 
> > >> Won't this bidirectional merging be confusing?
> > 
> > Probably at first, this certainly needs some getting used to. We can attempt
> > to avoid such merges as much as possible with proper coordination. Avoiding
> > them is likely not possible though if we want to keep history intact.
> 
> Let's assume I need to merge from main tree to next-net three times
> before the integration, because of dependencies.
> When Thomas merges next-net to main tree for rc1, will those three merge
> commits be visible in main tree?

Yes, all merge commits will remain visible and part of history. There's only
one case when such commits are optional: fast-forwards. For instance
assuming the HEAD of your current branch is part of upstream's history, you
may not get a merge commit if you pull from upstream before applying a
series instead of doing the reverse.

Whoever gets the fast-forward wins, therefore merging often is better.
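
To illustrate the fast-forward case, here is a rough sketch in Python. It is
only a toy model of a commit graph (a dict mapping commit IDs to parent IDs),
not how Git stores anything, and the commit names A, B, S and M are made up
for the example. It shows that the order of "pull" and "apply" decides
whether a merge commit appears:

```python
# Toy model of a commit graph: each commit ID maps to its parent IDs.
# This is NOT how git stores objects; it only mirrors the ancestry rules.

def is_ancestor(graph, ancestor, commit):
    """Return True if `ancestor` is reachable from `commit` via parent links."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return False

def merge(graph, head, other, merge_id):
    """Merge `other` into `head`; return the new head of the branch."""
    if is_ancestor(graph, other, head):
        return head                      # already up to date, nothing to do
    if is_ancestor(graph, head, other):
        return other                     # fast-forward: no merge commit
    graph[merge_id] = [head, other]      # true merge: commit with two parents
    return merge_id

def demo():
    # Upstream history: A <- B. The local branch starts at A.
    base = {"A": [], "B": ["A"]}

    # Order 1: pull from upstream first, then apply the series (S).
    g1 = dict(base)
    head1 = merge(g1, "A", "B", "M")     # A is an ancestor of B
    g1["S"] = [head1]
    head1 = "S"

    # Order 2: apply the series first, then pull from upstream.
    g2 = dict(base)
    g2["S"] = ["A"]
    head2 = merge(g2, "S", "B", "M")     # histories diverged

    return g1, head1, g2, head2

if __name__ == "__main__":
    g1, head1, g2, head2 = demo()
    print("pull then apply:", head1, "merge commit present:", "M" in g1)
    print("apply then pull:", head2, "merge commit present:", "M" in g2)
```

In the first order the branch simply moves forward to B and history stays
linear; in the second the two histories have diverged, so a commit with two
parents is needed to join them.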

Even with many subsequent merges, it's not all that bad. Graph views such as
"git log --oneline --graph" help a lot in clarifying things (once you get
used to them).

> For Linux I guess sub-trees are not merging from Linus' tree because of
> dependencies; I assume they are only merging after release to get new
> commits not in their sub-tree.

Linux does that all the time and not necessarily after a release, even among
sub-trees. We don't plan to have as many different trees as Linux so it
should remain much simpler for us in any case.

> But for DPDK, the main tree is also getting new patches on its own.

Same for Linux, even if those are a minority. I think it's not a problem
either way; Git is really good at merging histories.

Note that a maintainer can also add glue commits of his own on top of his
tree in order to simplify a subsequent merge. This is better than addressing
everything in the merge commit itself in non-trivial cases (although it
should be the job of the requester).

> >> And following are notes from my current experience:
> >>
> >> - Having re-writable history gives some flexibility to sub-trees.
> >> Possible to update commit logs and amend patches even after pushed.
> > 
> > This is both a good and a bad thing. Thanks to that, history is currently
> > linear and extremely clean, I think we haven't had a single patch that
> > doesn't compile or a bad merge artifact for a very long time.
> > 
> > On the other hand, if you look at the effort required to maintain a single
> > sub-tree that way, it likely becomes exponential for several. All of them
> > need to constantly rebase their stuff, with conflicts to address. The more
> > people, the more mistakes and so on.
> > 
> > Once part of an official tree, a commit cannot be amended anymore, it's too
> > late; revert commits will be more common. We need to accept this first.
> > 
> >> - It is hard to confirm pulled commits in main tree, I guess merge
> >> commit will make this easier.
> >>
> >> - To track main tree, continuously rebasing and continuously re-writing
> >> history, I am doing this almost daily, this may be hard for people
> >> working on top of next-net.
> > 
> > Yes that's one of the "bad" things with the current approach. Consider
> > automatic non-regression testing against disappearing commit IDs. Tracking
> > their status is difficult when history is changing. For instance we usually
> > add local tags to track CI successes/failures. All of them now point to
> > nonexistent commits.
> > 
> >> - Conflict resolving done by sub-trees during rebase, instead of done by
> >> main tree during merge. So this may be more distributed effort.
> > 
> > It's not too different actually. With merge you can reject PRs on the basis
> > that there are too many conflicts to take care of, not unlike a request for
> > rebase.
> > 
> > On the plus side if you make any changes yourself in order to solve them,
> > they are made part of the merge commit, not in the original commits which
> > remain unmodified.
> 
> You are right, we are losing the original code when there is a merge
> conflict, and I think there is no way to trace back to the original
> commit in the repo, except perhaps from the mailing list and patchwork.
> 
> And it is very hard to find out what has been changed for conflict
> resolution, and whether something went wrong!
> 
> > 
> > >> - Rebasing gives a more straightforward history in main repo; merge
> > >> commits look more confusing, although I would expect it won't be as
> > >> complex as Linux tree, so may not be a problem.
> > 
> > Right, that's the main drawback. I think there's no way to know the impact
> > unless we attempt it as an experiment.
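
As a footnote to the points above about rewritten history and CI tags, here
is a small sketch of why rebasing makes commit IDs disappear while merging
keeps them. It is a simplified model under stated assumptions: the ID is a
hash over the parent IDs and the patch text only (real Git hashes more
fields, such as the tree, author and dates), and the patch titles are
invented for the example.

```python
import hashlib

def commit_id(parents, patch):
    """Toy commit ID: a hash over the parent IDs and the patch text.
    Real git hashes more fields (tree, author, dates, message)."""
    data = "\n".join(list(parents) + [patch]).encode()
    return hashlib.sha1(data).hexdigest()[:8]

def apply_series(base, patches):
    """Stack patches on top of `base`; return the resulting commit IDs."""
    ids, parent = [], base
    for patch in patches:
        parent = commit_id([parent], patch)
        ids.append(parent)
    return ids

def demo():
    root = commit_id([], "initial import")
    series = ["net: fix A", "net: add B"]          # hypothetical sub-tree patches

    subtree = apply_series(root, series)           # sub-tree history on old base
    ci_tag = subtree[-1]                           # local tag recording a CI pass

    upstream = commit_id([root], "eal: change C")  # master moved on meanwhile

    # Rebase: replay the same patches on the new upstream head.
    rebased = apply_series(upstream, series)

    # Merge: one new commit with two parents; the series itself is untouched.
    merge = commit_id([subtree[-1], upstream], "merge master into next-net")
    merged_history = subtree + [upstream, merge]

    return ci_tag, rebased, merged_history

if __name__ == "__main__":
    ci_tag, rebased, merged_history = demo()
    print("tag still valid after rebase:", ci_tag in rebased)
    print("tag still valid after merge: ", ci_tag in merged_history)
```

Because each ID depends on its parent, changing the base changes every ID in
the series, so the tag no longer points into the rebased history. With a
merge the original IDs are kept and only one new commit is added.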

-- 
Adrien Mazarguil
6WIND

