[dpdk-dev] Rx Can't receive anymore packet after received 1.5 billion packet.

Dumitrescu, Cristian cristian.dumitrescu at intel.com
Wed Jul 19 20:43:37 CEST 2017



> -----Original Message-----
> From: vuonglv at viettel.com.vn [mailto:vuonglv at viettel.com.vn]
> Sent: Tuesday, July 18, 2017 2:37 AM
> To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>
> Cc: users at dpdk.org; dev at dpdk.org
> Subject: Re: [dpdk-dev] Rx Can't receive anymore packet after received 1.5
> billion packet.
> 
> 
> 
> On 07/17/2017 05:31 PM, cristian.dumitrescu at intel.com wrote:
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of
> >> vuonglv at viettel.com.vn
> >> Sent: Monday, July 17, 2017 3:04 AM
> >> Cc: users at dpdk.org; dev at dpdk.org
> >> Subject: [dpdk-dev] Rx Can't receive anymore packet after received 1.5
> >> billion packet.
> >>
> >> Hi DPDK team,
> >> Sorry for sending this email to both the users and dev lists, but I have
> >> a big problem: the Rx core in my application cannot receive any more
> >> packets after a stress test (in ~1 day the Rx core received ~1.5 billion
> >> packets). The Rx core is still alive, but it does not receive any packets
> >> and does not generate any log output. Below is my system configuration:
> >> - OS: CentOS 7
> >> - Kernel: 3.10.0-514.16.1.el7.x86_64
> >> - Huge pages: 32G (16384 pages of 2M)
> >> - NIC card: Intel 82599
> >> - DPDK version: 16.11
> >> - Architecture: Rx (lcore 1) receives packets and enqueues them to the
> >> ring; the Worker (lcore 2) dequeues packets from the ring and frees them
> >> (using the rte_pktmbuf_free() function).
> >> - Mempool creation:
> >>     rte_pktmbuf_pool_create(
> >>         "rx_pool",                 /* name */
> >>         8192,                      /* number of elements in the mbuf pool */
> >>         256,                       /* size of the per-core object cache */
> >>         0,                         /* size of the application private area
> >>                                       between the rte_mbuf struct and the
> >>                                       data buffer */
> >>         RTE_MBUF_DEFAULT_BUF_SIZE, /* size of the data buffer in each mbuf
> >>                                       (2048 + 128) */
> >>         0                          /* socket id */
> >>     );
> >> If I change "number of elements in the mbuf pool" from 8192 to 512,
> >> Rx has the same problem after a shorter time (~30 s).
> >>
> >> Please tell me if you need more information. I am looking forward to
> >> hearing from you.
> >>
> >>
> >> Many thanks,
> >> Vuong Le
> > Hi Vuong,
> >
> > This is likely to be a buffer leakage problem. You might have a path in your
> > code where you are not freeing a buffer, so this buffer gets "lost": the
> > application can no longer use it because it is never returned to the pool.
> > The pool of free buffers shrinks over time until it eventually becomes
> > empty, at which point no more packets can be received.
> >
> > You might want to periodically monitor the number of free buffers in your
> > pool. If this is the root cause, you should see this number decrease
> > steadily until it flattens at zero; otherwise, you should see the number
> > of free buffers oscillating around an equilibrium point.
> >
> > Since it takes a relatively large number of packets to hit this issue, it is
> > likely that the code path with this problem is not executed very
> > frequently: it might be a control plane packet that is not freed, or an ARP
> > request/reply packet, etc.
> >
> > Regards,
> > Cristian
> Hi Cristian,
> Thanks for your response, I am trying your idea. But let me show you
> another case I tested before. I changed the architecture of my
> application as below:
> - Architecture: Rx (lcore 1) receives packets and enqueues them to the ring;
> after that, Rx (lcore 1) itself dequeues the packets from the ring and frees
> them immediately.
> (The old architecture is as described above.)
> With the new architecture, Rx still receives packets after 2 days and
> everything looks good. Unfortunately, my application must run with the old
> architecture.
> 
> Any ideas for me?
> 
> 
> Many thanks,
> Vuong Le

I am not sure I understand the old and new architectures you are referring to. Can you please clarify them?
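
As a rough illustration of the monitoring suggested earlier in this thread, here is a minimal sketch in plain C of how a rarely taken path that forgets to free a buffer drains a fixed-size pool. It is a toy model, not DPDK code: pool_model, process_packet(), run() and the leak_every parameter are hypothetical names invented for the example, and the leak pattern is an assumption, not a diagnosis of the application. In a real application the free count would be read from the mempool itself (I believe rte_mempool_avail_count() is the relevant call in DPDK 16.07 and later).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an mbuf pool: only the free count is modelled. */
struct pool_model {
    unsigned free_count;
};

/*
 * Process one packet: take a buffer, then return it unless this packet hits
 * the (assumed) rare path that forgets to free. Returns false when the pool
 * is empty, i.e. nothing more can be received.
 */
static bool process_packet(struct pool_model *p, bool leaky_path)
{
    assert(p != NULL);
    if (p->free_count == 0)
        return false;
    p->free_count--;            /* Rx takes a buffer from the pool */
    if (!leaky_path)
        p->free_count++;        /* normal path: the buffer is freed again */
    return true;
}

/*
 * Run until the pool is empty or max_packets have been processed.
 * Every leak_every-th packet takes the leaky path (0 disables the leak).
 * The free count is printed every sample_period packets (0 disables
 * printing). Returns the number of packets received.
 */
static unsigned long run(unsigned pool_size, unsigned long leak_every,
                         unsigned long max_packets,
                         unsigned long sample_period)
{
    struct pool_model p = { pool_size };
    unsigned long n = 0;

    while (n < max_packets) {
        bool leaky = leak_every != 0 && (n + 1) % leak_every == 0;

        if (!process_packet(&p, leaky))
            break;              /* pool exhausted: Rx stalls here */
        n++;
        if (sample_period != 0 && n % sample_period == 0)
            printf("after %lu packets: %u free\n", n, p.free_count);
    }
    return n;
}
```

For example, run(8192, 1000, 100000000UL, 1000000UL) prints a steadily shrinking free count and stops after 8192000 packets, while the same call with leak_every set to 0 keeps the free count at 8192 for the whole run. The first pattern is what a leak looks like; the second is the healthy case.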

Regards,
Cristian

