[dpdk-dev] Should we disallow running secondaries after primary has died?
Burakov, Anatoly
anatoly.burakov at intel.com
Fri Jul 26 11:53:58 CEST 2019
On 26-Jul-19 10:50 AM, Ananyev, Konstantin wrote:
>
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
>> Sent: Friday, July 26, 2019 10:40 AM
>> To: Burakov, Anatoly <anatoly.burakov at intel.com>
>> Cc: dev at dpdk.org; Thomas Monjalon <thomas at monjalon.net>
>> Subject: Re: [dpdk-dev] Should we disallow running secondaries after primary has died?
>>
>> On Fri, Jul 26, 2019 at 10:05:02AM +0100, Burakov, Anatoly wrote:
>>> Hi all,
>>>
>>> While investigating this bug:
>>>
>>> https://bugs.dpdk.org/show_bug.cgi?id=284
>>>
>>> I came across a realization that, when primary process dies, very little
>>> actually works. There are some documented issues that are already present
>>> when secondary processes keep running, like memory map becoming static, and
>>> hotplug not working any more.
>>>
>>> What is less known (and less documented) is that VFIO also completely
>>> stops working for newly initializing processes, because at some point in
>>> the 18.xx releases we fixed a long-standing VFIO-related bug that had to
>>> do with creating new containers every time a secondary is run - secondary
>>> processes will now reuse the primary process's container instead.
>>>
>>> Meaning, for VFIO devices, secondary process *initialization* will fail
>>> after primary process has died, because there is no longer a process from
>>> which we can get the VFIO container. Things will still sort-of work
>>> with igb_uio or in vfio-noiommu mode, but again - no memory map updates, no
>>> hotplug, potentially other things that I don't even know about.
>>>
>>> Therefore, while ideally we would like people to have primary process always
>>> running, the least we can do to avoid documenting a complex matrix of "what
>>> is supported in which case" is to disallow secondary process initialization
>>> after primary process has died.
>>>
>>> ("disallow" as in "explicitly document it as unsupported", although we can
>>> also outright prevent it if we want - rte_eal_primary_proc_alive() will tell
>>> us that)
>>>
>> Documenting this limitation seems a good thing to do. I'm not sure that
>> it's worthwhile trying to make the scenario (of running a secondary after a
>> primary has terminated) supported.
>>
>> /Bruce
>
> NP to disallow it.
> In fact, I think it would be easier for everyone just to drop the current DPDK MP model,
> and keep just standalone DPDK instances.
That's the dream, but I don't think it'll ever come to fruition, at
least not without a huge push from the community.
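
For reference, the "outright prevent it" option quoted above could look
roughly like the sketch below. It only shows the decision logic and is not
how EAL does or will do it. The names check_secondary_allowed, proc_role,
primary_alive_fn and the two stub probes are hypothetical, made up for
illustration. The probe is passed in as a function pointer so the logic can
be tried without DPDK; in a real application the probe would presumably be
rte_eal_primary_proc_alive(), which takes a config file path (NULL for the
default location) and returns non-zero while the primary is alive.

```c
#include <stddef.h>

/* Same shape as rte_eal_primary_proc_alive(const char *config_file_path). */
typedef int (*primary_alive_fn)(const char *config_file_path);

/* Hypothetical stand-in for the EAL process type. */
enum proc_role { ROLE_PRIMARY, ROLE_SECONDARY };

/*
 * Returns 0 if initialization may proceed, -1 if this is a secondary
 * process and no live primary could be confirmed.
 */
static int
check_secondary_allowed(enum proc_role role, primary_alive_fn probe,
			const char *cfg_path)
{
	/* A primary never depends on another process being alive. */
	if (role != ROLE_SECONDARY)
		return 0;

	/* Without a probe we cannot confirm the primary, so refuse. */
	if (probe == NULL)
		return -1;

	/* Non-zero from the probe means the primary is still running. */
	return probe(cfg_path) ? 0 : -1;
}

/* Stub probes, only for exercising the logic without DPDK. */
static int probe_alive(const char *p) { (void)p; return 1; }
static int probe_dead(const char *p) { (void)p; return 0; }
```

With the stubs, check_secondary_allowed(ROLE_SECONDARY, probe_dead, NULL)
returns -1, while the same call with probe_alive returns 0. A secondary
would run such a check early in initialization and bail out with a clear
error instead of failing later inside VFIO setup.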
--
Thanks,
Anatoly