[PATCH v4 1/2] eal: support dmabuf

Burakov, Anatoly anatoly.burakov at intel.com
Thu Feb 12 14:57:41 CET 2026


On 2/4/2026 4:50 PM, Cliff Burdick wrote:
> dmabuf is a modern Linux kernel feature to allow DMA transfers between
> two drivers. Common examples of usage are streaming video devices and
> NIC to GPU transfers. Prior to dmabuf, users had to load proprietary
> drivers to expose the DMA mappings. With dmabuf the proprietary drivers
> are no longer required.
> 
> A new API function rte_extmem_register_dmabuf is introduced to create
> the mapping from a dmabuf file descriptor. dmabuf uses a file descriptor
> and an offset that has been pre-opened with the kernel. The kernel uses
> the file descriptor to map to a VA pointer. To avoid ABI changes, a
> static struct is used inside of eal_common_memory.c, and lookups are
> done on this struct rather than from the rte_memseg_list.
> 
> Ideally we would like to add both the dmabuf file descriptor and offset
> to rte_memseg_list, but it's not clear if we can reuse existing fields
> when using the dmabuf API.
> 
> We could rename the external flag to a more generic "properties" flag
> where "external" is the lowest bit, then we can use the second bit to
> indicate the presence of dmabuf. In the presence of the flag for
> dmabuf we could reuse the base_va address field for the dmabuf offset,
> and the socket_id for the file descriptor.
> 
> Signed-off-by: Cliff Burdick <cburdick at nvidia.com>
> ---

Hi,

A few random thoughts about the patchset.

For one, this API is obviously Linux-only. This in itself is not a
problem (we do have the VFIO API...), but I would really like to avoid
adding more OS-specific API if possible.

For another, I don't see any support for secondary processes - the
dmabuf array is process-local, and calling register() from a secondary
process would presumably either fail or create a duplicate segment,
depending on exactly what you pass into the register call. If this
scenario isn't supported, it should at least be explicitly disallowed
and documented as such.

My biggest concern is that this is creating another type of external
memory segment and thus fragmenting the API, but isn't doing it in a way
that is generic. I can see a valid use case for this, but what we're
essentially doing here is storing some metadata together with the
segment. So, perhaps, this is what we should do? That would seem like
the cleanest solution to me, and it would extend the usefulness of the
API to other use cases where there may be a requirement to store some
metadata/fd/whatever with the segment.
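
To make the idea concrete, here is a rough sketch of generic per-segment
metadata in plain C, with no DPDK dependency. All names in it
(seg_meta_set, seg_meta_get, seg_meta_clear, struct dmabuf_meta) and both
size limits are hypothetical and not part of any existing DPDK API; it
only illustrates storing an opaque blob keyed by the segment's base
address, which the caller alone interprets.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEG_META_MAX_SEGS 16 /* hypothetical table size */
#define SEG_META_MAX_LEN  32 /* hypothetical cap on metadata bytes */

/* One slot of opaque, caller-defined metadata keyed by segment address. */
struct seg_meta {
	const void *va_addr; /* segment base address; NULL marks a free slot */
	size_t len;          /* number of valid bytes in data[] */
	uint8_t data[SEG_META_MAX_LEN];
};

static struct seg_meta seg_meta_tbl[SEG_META_MAX_SEGS];

/* Example payload a dmabuf user could store; the table never parses it. */
struct dmabuf_meta {
	int fd;
	uint64_t offset;
};

/* Linear lookup; passing NULL returns the first free slot, if any. */
static struct seg_meta *seg_meta_find(const void *va_addr)
{
	size_t i;

	for (i = 0; i < SEG_META_MAX_SEGS; i++)
		if (seg_meta_tbl[i].va_addr == va_addr)
			return &seg_meta_tbl[i];
	return NULL;
}

/* Attach metadata to a segment. Returns 0 or a negative errno value. */
int seg_meta_set(const void *va_addr, const void *data, size_t len)
{
	struct seg_meta *slot;

	if (va_addr == NULL || data == NULL ||
	    len == 0 || len > SEG_META_MAX_LEN)
		return -EINVAL;
	if (seg_meta_find(va_addr) != NULL)
		return -EEXIST;
	slot = seg_meta_find(NULL);
	if (slot == NULL)
		return -ENOSPC;
	slot->va_addr = va_addr;
	slot->len = len;
	memcpy(slot->data, data, len);
	return 0;
}

/* Copy metadata out. Returns the stored length or a negative errno value. */
int seg_meta_get(const void *va_addr, void *data, size_t len)
{
	const struct seg_meta *slot;

	if (va_addr == NULL || data == NULL)
		return -EINVAL;
	slot = seg_meta_find(va_addr);
	if (slot == NULL)
		return -ENOENT;
	if (len < slot->len)
		return -ENOBUFS;
	memcpy(data, slot->data, slot->len);
	return (int)slot->len;
}

/* Drop the metadata, e.g. when the segment is unregistered. */
int seg_meta_clear(const void *va_addr)
{
	struct seg_meta *slot;

	if (va_addr == NULL)
		return -EINVAL;
	slot = seg_meta_find(va_addr);
	if (slot == NULL)
		return -ENOENT;
	slot->va_addr = NULL;
	slot->len = 0;
	return 0;
}
```

A dmabuf user would fill a struct dmabuf_meta with the fd and offset and
pass it to seg_meta_set() right after registering the segment. Like the
static array in the patch, this table is process-local, so it does not
by itself address the secondary process case.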

You could then build another API on top of this (a library?) that would
handle things like secondary process synchronization with IPC, so that
all fds are valid in all processes.
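
For reference, the underlying Linux mechanism for making an fd valid in
another process is SCM_RIGHTS ancillary data on an AF_UNIX socket. The
sketch below shows only that mechanism; send_fd() and recv_fd() are
hypothetical helper names, and sending the offset as the message payload
is an assumption made for illustration. As far as I recall, DPDK's
multi-process IPC messages (struct rte_mp_msg) can also carry fds, so a
real library would more likely build on that than on raw sockets.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send one fd plus its offset over a connected AF_UNIX socket. */
int send_fd(int sock, int fd, uint64_t offset)
{
	struct iovec iov = { .iov_base = &offset, .iov_len = sizeof(offset) };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align; /* forces correct alignment of buf */
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	/* The fd travels as ancillary data, not as payload bytes. */
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == (ssize_t)sizeof(offset) ? 0 : -1;
}

/* Receive the fd and offset. Returns a new local fd, or -1 on error. */
int recv_fd(int sock, uint64_t *offset)
{
	struct iovec iov = { .iov_base = offset, .iov_len = sizeof(*offset) };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(*offset))
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	/* The kernel installed a new descriptor in this process. */
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The fd number returned by recv_fd() will generally differ from the one
the sender passed in, which is why each process would need to keep its
own fd alongside the segment rather than share a single value.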

Thoughts?
-- 
Thanks,
Anatoly

