[dpdk-dev] [PATCH] vfio: fix expanding DMA area in ppc64le

Takeshi T Yoshimura TYOS at jp.ibm.com
Fri Jun 28 13:38:56 CEST 2019

>To: "Mo, YufengX" <yufengx.mo at intel.com>, dev at dpdk.org
>From: "Burakov, Anatoly" <anatoly.burakov at intel.com>
>Date: 06/26/2019 06:43PM
>Cc: drc at ibm.com, pradeep at us.ibm.com, Takeshi Yoshimura
><tyos at jp.ibm.com>
>Subject: [EXTERNAL] Re: [dpdk-dev] [PATCH] vfio: fix expanding DMA
>area in ppc64le
>On 18-Jun-19 3:37 AM, Mo, YufengX wrote:
>> From: Takeshi Yoshimura <tyos at jp.ibm.com>
>> In ppc64le, expanding DMA areas always fail because we cannot
>> a DMA window. As a result, we cannot allocate more than one memseg
>> ppc64le. This is because vfio_spapr_dma_mem_map() doesn't unmap all
>> the mapped DMA before removing the window. This patch fixes this
>> incorrect behavior.
>> I added a global variable to track current window size since we do
>> not have better ways to get exact size of it than doing so. sPAPR
>> IOMMU seems not to provide any ways to get window size with ioctl
>> interfaces. rte_memseg_walk*() is currently used to calculate
>> size, but it walks memsegs that are marked as used, not mapped. So,
>> we need to determine if a given memseg is mapped or not, otherwise
>> the ioctl reports errors due to attempting to unregister memory
>> addresses that are not registered. The global variable is excluded
>> in non-ppc64le binaries.
>> Similar problems happen in user maps. We need to avoid attempting
>> unmap the address that is given as the function's parameter. The
>> compaction of user maps prevents us from passing correct length for
>> unmapping DMA at the window recreation. So, I removed it in
>> I also fixed the order of ioctl for unregister and unmap. The ioctl
>> for unregister sometimes report device busy errors due to the
>> existence of mapped area.
>> Signed-off-by: Takeshi Yoshimura <tyos at jp.ibm.com>
>> ---
>OK there are three patches, and two v1's with two different authors
>reply to the same original patch. There's too much going on here, i 
>can't review this. Needs splitting.
>Also, #ifdef-ing out the map merging seems highly suspect.
>With regards to "walking used memsegs, not mapped", unless i'm 
>misunderstanding something, these are the same - whenever a segment
>mapped, it is marked as used, and whenever it is unmapped, it is
>as free. Could you please explain what is the difference and why is
>Is the point of contention here being the fact that whenever the
>callback arrives, the segments still appear used when iterating over
>map? If that's the case, then i think it would be OK to mark them as 
>unused *before* triggering callbacks, and chances are some of this
>wouldn't be needed. That would require a deprecation notice though, 
>because the API behavior will change (even if this fact is not 
>documented properly).

I am the author of this patch. We should ignore a patch from YufengX Mo.

>From my code reading, a memsg is at first marked as used when it is allocated. Then, the memseg is passed to vfio_spapr_dma_mem_map(). The callback iterates all the used (i.e., allocated) memsegs and call ioctl for mapping VA to IOVA. So, when vfio_spapr_dma_mem_map() is called, passed memsegs can be non-mapped but marked as used. As a result, an attempt to unmap non-mapped area happens during DMA window expansion. This is the difference and why this fix was needed.

> i think it would be OK to mark them as unused *before* triggering callbacks

Yes, my first idea was the same as yours, but I was also worried that it might cause inconsistent API behavior as you also pointed out. If you think so, I think I can rewrite the patch without ugly #ifdef. 

Unfortunately, I don't have enough time for writing code next week and next next week. So, I will resubmit the revised patch weeks later.


More information about the dev mailing list