[dpdk-dev] [EXT] Re: [PATCH v6 4/4] kernel/linux/kni: add IOVA support in kni module

Ferruh Yigit ferruh.yigit at intel.com
Fri Jul 12 13:10:03 CEST 2019


On 7/12/2019 11:38 AM, Vamsi Krishna Attunuru wrote:
> 
> 
> 
> --------------------------------------------------------------------------------
> *From:* Ferruh Yigit <ferruh.yigit at intel.com>
> *Sent:* Thursday, July 11, 2019 10:00 PM
> *To:* Vamsi Krishna Attunuru; dev at dpdk.org
> *Cc:* olivier.matz at 6wind.com; arybchenko at solarflare.com; Kiran Kumar Kokkilagadda
> *Subject:* [EXT] Re: [PATCH v6 4/4] kernel/linux/kni: add IOVA support in kni
> module
>  
> External Email
> 
> ----------------------------------------------------------------------
> On 6/25/2019 4:57 AM, vattunuru at marvell.com wrote:
>> From: Kiran Kumar K <kirankumark at marvell.com>
>> 
>> Patch adds support for kernel module to work in IOVA = VA mode,
>> the idea is to get physical address from iova address using
>> iommu_iova_to_phys API and later use phys_to_virt API to
>> convert the physical address to kernel virtual address.
>> 
>> When compared with IOVA = PA mode, there is no performance
>> drop with this approach.
>> 
>> This approach does not work with the kernel versions less
>> than 4.4.0 because of API compatibility issues.
>> 
>> Signed-off-by: Kiran Kumar K <kirankumark at marvell.com>
>> Signed-off-by: Vamsi Attunuru <vattunuru at marvell.com>
> 
> <...>
> 
>> @@ -351,15 +354,56 @@ kni_ioctl_create(struct net *net, uint32_t ioctl_num,
>>        strncpy(kni->name, dev_info.name, RTE_KNI_NAMESIZE);
>>  
>>        /* Translate user space info into kernel space info */
>> -     kni->tx_q = phys_to_virt(dev_info.tx_phys);
>> -     kni->rx_q = phys_to_virt(dev_info.rx_phys);
>> -     kni->alloc_q = phys_to_virt(dev_info.alloc_phys);
>> -     kni->free_q = phys_to_virt(dev_info.free_phys);
>> -
>> -     kni->req_q = phys_to_virt(dev_info.req_phys);
>> -     kni->resp_q = phys_to_virt(dev_info.resp_phys);
>> -     kni->sync_va = dev_info.sync_va;
>> -     kni->sync_kva = phys_to_virt(dev_info.sync_phys);
>> +     if (dev_info.iova_mode) {
>> +#if KERNEL_VERSION(4, 4, 0) > LINUX_VERSION_CODE
> 
> We have "kni/compat.h" to put the version checks, please use abstracted feature
> checks only in the code.
> From experience this goes ugly quickly with the addition to distro kernels and
> their specific versioning, so better to hide these all from the source code.
> 
> And this version requirement needs to be documented in kni doc.
> 
> ack
> 
>> +             (void)pci;
>> +             pr_err("Kernel version is not supported\n");
> 
> Can you please include 'iova_mode' condition into the message log, because this
> kernel version is supported if user wants to use via 'iova_mode == 0' condition.
> 
> ack
> 
>> +             return -EINVAL;
>> +#else
>> +             pci = pci_get_device(dev_info.vendor_id,
>> +                                  dev_info.device_id, NULL);
>> +             while (pci) {
>> +                     if ((pci->bus->number == dev_info.bus) &&
>> +                         (PCI_SLOT(pci->devfn) == dev_info.devid) &&
>> +                         (PCI_FUNC(pci->devfn) == dev_info.function)) {
>> +                             domain = iommu_get_domain_for_dev(&pci->dev);
>> +                             break;
>> +                     }
>> +                     pci = pci_get_device(dev_info.vendor_id,
>> +                                          dev_info.device_id, pci);
>> +             }
> 
> What if 'pci' is NULL here?
> In kni it is not required to provide a device at all.
> 
> Ack, will add a NULL check.
> other point is not clear to me, device info is absolutely required at least
> for  IOVA=VA mode, since it requires to procure iommu domain details.

"device info is absolutely required" *only* for IOVA=VA mode, so user may skip
to provide it.

> Any thoughts or ways to address this without device.?

Return error if 'iova_mode' requested but device info not?

But you didn't replied to passing 'iova_mode' from application, I would like
hear what you are thinking about it..

> 
> <...>
> 
>> @@ -186,7 +202,10 @@ kni_fifo_trans_pa2va(struct kni_dev *kni,
>>                        return;
>>  
>>                for (i = 0; i < num_rx; i++) {
>> -                     kva = pa2kva(kni->pa[i]);
>> +                     if (likely(kni->iova_mode == 1))
>> +                             kva = iova2kva(kni, kni->pa[i]);
>> +                     else
>> +                             kva = pa2kva(kni->pa[i]);
> 
> To reduce the churn, what about updating the 'pa2kva()' and put the
> "(kni->iova_mode == 1)" check there? Does it help? (not only 'pa2kva()' but its
> friends also, and if it makes more sense agree to rename the functions)
> 
> No, in VA mode, kni->pa[i] points to iova address, pa2kva() of iova address might
> crash, hence the if..else check is added.

I understand that part.
What I am suggestion is something like this:

kva = common_function(kni, kni->pa[i]);

---

common_function() {
	if (unlikely(kni->iova_mode == 1))
		return iova2kva(kni, kni->pa[i]);
	return pa2kva(kni->pa[i]);
}

To hide the check in the function and make code more readable

> 
> And btw, why 'likely' case is "kni->iova_mode == 1"?
> 
> no specific case other than branch predict, will remove this if it's really
> harmful to PA mode.



More information about the dev mailing list