[dpdk-users] Older DPDK guest performance with 4.14+ host kernels
    Tim Shearer 
    TShearer at advaoptical.com
       
    Wed May 16 18:39:21 CEST 2018
    
    
  
All,
I have found that running older DPDK applications (prior to approx 16.11) as a KVM guest on systems with a 4.14 or later host kernel may result in a significant performance penalty. For example, the l2fwd application is unable to reliably pass traffic at any rate despite running on a dedicated, pinned VCPU, isolated from Linux tasks with isolcpus on both the guest and host. 
Tracing with perf on the host showed that KVM was regularly exiting to userspace. These exits are expensive and resulted in packet drops:
             CPU-8929  [005]  5403.733805: kvm_exit:             reason IO_INSTRUCTION rip 0x483215 info 800040 0
             CPU-8929  [005]  5403.733806: kvm_pio:              pio_write at 0x80 size 1 count 1 val 0x0 
             CPU-8929  [005]  5403.733808: kvm_userspace_exit:   reason KVM_EXIT_IO (2)
The root cause of this is a kernel patch submitted a few months ago, which disables VMX handling of I/O port 0x80 writes, forcing them to be emulated by QEMU instead: https://patchwork.kernel.org/patch/10087713/. So, the DPDK guest app is generating a lot of 0x80 writes, and KVM isn't handling them in the kernel anymore.
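To gauge how often a guest hits this path, trace lines of the form shown above can be tallied per port. The following is only a rough sketch: the function name count_pio and the regular expression are my own, and it assumes the kvm_pio lines look like the ones in the excerpt above.

```python
import re
from collections import Counter

# Matches kvm_pio trace lines such as:
#   CPU-8929  [005]  5403.733806: kvm_pio:  pio_write at 0x80 size 1 count 1 val 0x0
# The pattern is an assumption based on the trace excerpt in this email.
PIO_RE = re.compile(r"kvm_pio:\s+pio_(read|write) at (0x[0-9a-fA-F]+)")


def count_pio(lines):
    """Return a Counter keyed by (direction, port) for each kvm_pio event seen."""
    counts = Counter()
    for line in lines:
        match = PIO_RE.search(line)
        if match:
            direction = match.group(1)
            port = int(match.group(2), 16)
            counts[(direction, port)] += 1
    return counts


# Small demonstration using the three trace lines quoted in this email.
SAMPLE = [
    "CPU-8929  [005]  5403.733805: kvm_exit:             reason IO_INSTRUCTION rip 0x483215 info 800040 0",
    "CPU-8929  [005]  5403.733806: kvm_pio:              pio_write at 0x80 size 1 count 1 val 0x0",
    "CPU-8929  [005]  5403.733808: kvm_userspace_exit:   reason KVM_EXIT_IO (2)",
]

for (direction, port), n in count_pio(SAMPLE).most_common():
    print(f"{direction} port {port:#x}: {n}")
```

Run against a saved trace (one line per event), a count dominated by writes to port 0x80 would suggest the guest is affected by the issue described here.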
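The source of the 0x80 writes is glibc's outw_p function, the "pausing" variant of outw, which follows each port write with a dummy write to port 0x80 as a short delay. It was called extensively by the virtio_pci driver prior to this change: http://www.dpdk.org/ml/archives/dev/2016-February/032782.html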
I've attempted to push for a kernel fix to the KVM/VMX module to allow port 0x80 to be handled in hardware, perhaps as a configurable parameter passed in by QEMU. This isn't getting any traction, hence this email. Basically, if you're using a 4.14 or later host kernel, you'll need to use relatively modern VNFs (based on DPDK 16.11 or later), or alternatively revert the kernel patch linked above. Be advised that the patch addresses a potential DoS issue, so reverting it would only be advisable for trusted guests.
Feel free to message me if you want further details.
Thanks,
Tim Shearer
    
    