[dpdk-users] Query on handling packets

Wiles, Keith keith.wiles at intel.com
Sun Dec 30 01:19:51 CET 2018



> On Dec 29, 2018, at 4:03 PM, Harsh Patel <thadodaharsh10 at gmail.com> wrote:
> 
> Hello,
> As suggested, we tried profiling the application using Intel VTune Amplifier. We aren't sure how to use these results, so we are attaching them to this email.
> 
> The things we understood were 'Top Hotspots' and 'Effective CPU utilization'. Following are some of our understandings:
> 
> Top Hotspots
> 
> Function                         Module                               CPU Time
> rte_delay_us_block               librte_eal.so.6.1                    15.042s
> eth_em_recv_pkts                 librte_pmd_e1000.so                  9.544s
> ns3::DpdkNetDevice::Read         libns3.28.1-fd-net-device-debug.so   3.522s
> ns3::DpdkNetDeviceReader::DoRead libns3.28.1-fd-net-device-debug.so   2.470s
> rte_eth_rx_burst                 libns3.28.1-fd-net-device-debug.so   2.456s
> [Others]                                                              6.656s
> 
> We recognized all of these functions except `rte_delay_us_block`, so we investigated its callers:
> 
> Callers Effective Time  Spin Time       Overhead Time   Effective Time  Spin Time       Overhead Time   Wait Time: Total        Wait Time: Self
> e1000_enable_ulp_lpt_lp 45.6%   0.0%    0.0%    6.860s  0usec   0usec
> e1000_write_phy_reg_mdic        32.7%   0.0%    0.0%    4.916s  0usec   0usec
> e1000_read_phy_reg_mdic 19.4%   0.0%    0.0%    2.922s  0usec   0usec
> e1000_reset_hw_ich8lan  1.0%    0.0%    0.0%    0.143s  0usec   0usec
> eth_em_link_update      0.7%    0.0%    0.0%    0.100s  0usec   0usec
> e1000_post_phy_reset_ich8lan.part.18    0.4%    0.0%    0.0%    0.064s  0usec   0usec
> e1000_get_cfg_done_generic      0.2%    0.0%    0.0%    0.037s  0usec   0usec
> 
> We lack sufficient knowledge to investigate more than this.
> 
> Effective CPU utilization
> 
> Interestingly, the effective CPU utilization was 20.8% (0.832 out of 4 logical CPUs). We thought this was low, so we compared it with the raw-socket version of the code. That version was even lower, at 8.0% (0.318 out of 4 logical CPUs), and yet it performs far better.
> 
> It would be helpful if you give us insights on how to use these results or point us to some resources to do so. 

I tracked the rte_delay_us_block time down to the SendFrom() function calling the IsLinkUp() function. It appears to call that routine on every SendFrom() call, which for the e1000 must be a very expensive call. So rework your code to call IsLinkUp() only every so often. I believe you can enable the link status interrupt in DPDK to take an interrupt on link status change, which would be better than calling this routine. How you do that I am not sure, but it should be in the docs someplace.
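
To make "every so often" concrete, here is a minimal sketch of a throttled link check. It is not taken from your code: the class name ThrottledLinkCheck, the query callable, and the interval are all hypothetical, and the real (expensive) DPDK link query is replaced by a stand-in callable so the sketch compiles without DPDK.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical helper: caches the result of an expensive link-status query
// and re-runs the query only after `interval` has elapsed since the last
// real check.
class ThrottledLinkCheck {
public:
    using Clock = std::chrono::steady_clock;

    ThrottledLinkCheck(std::function<bool()> query, Clock::duration interval)
        : query_(std::move(query)), interval_(interval) {}

    // Fast path: one clock read and a comparison, no driver call.
    bool IsLinkUp() {
        const Clock::time_point now = Clock::now();
        if (!valid_ || now - last_check_ >= interval_) {
            cached_ = query_();  // the expensive call goes here
            last_check_ = now;
            valid_ = true;
            ++real_queries_;
        }
        return cached_;
    }

    // Number of times the expensive query actually ran.
    unsigned RealQueries() const { return real_queries_; }

private:
    std::function<bool()> query_;
    Clock::duration interval_;
    Clock::time_point last_check_{};
    bool cached_ = false;
    bool valid_ = false;
    unsigned real_queries_ = 0;
};

// Demo: many "sends" inside one interval trigger a single real query.
inline unsigned DemoThrottledSends() {
    ThrottledLinkCheck link([] { return true; }, std::chrono::hours(1));
    for (int i = 0; i < 1000; ++i) {
        const bool up = link.IsLinkUp();
        assert(up);
        (void)up;
    }
    return link.RealQueries();
}
```

In SendFrom() you would call the throttled IsLinkUp() instead of querying the driver directly. The interval is a trade-off: a longer one costs less CPU but reacts more slowly to a link going down.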

For now I would remove the IsLinkUp() call and just assume the link is up after you check it the first time in the Setup function.
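
A rough sketch of that idea, with hypothetical names (LinkStateCache, OnLinkChange) that are not from your code: query the link once at setup, keep the answer in a flag, and let the per-packet path read only the flag. If you later enable the link status interrupt, its callback could update the same flag. I believe the DPDK side of that is rte_eth_dev_callback_register() with RTE_ETH_EVENT_INTR_LSC plus intr_conf.lsc set in the port configuration, but that wiring is an assumption and is not shown here.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical helper: holds the last known link state so the send path
// never has to touch the driver.
class LinkStateCache {
public:
    // Call once from setup: runs the expensive query a single time.
    void Setup(const std::function<bool()>& query) {
        up_.store(query(), std::memory_order_release);
    }

    // Call from a link-status-change notification, if one is available.
    void OnLinkChange(bool up) {
        up_.store(up, std::memory_order_release);
    }

    // Per-packet check: a single atomic load.
    bool IsLinkUp() const {
        return up_.load(std::memory_order_acquire);
    }

private:
    std::atomic<bool> up_{false};
};
```

The flag is atomic because a notification callback may run on a different thread than the one calling SendFrom().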

> 
> Thank you 
> 
> Regards
> Harsh & Hrishikesh
> 

Regards,
Keith


