[dpdk-dev] CPU hog & memory leak on FreeBSD
Lewis Donzis
lew at perftech.com
Tue Aug 4 01:51:50 CEST 2020
Sure, that would be great. This is from 18.11.9.
Also, the entire modified file is attached.
Thanks very much!
lew
84,86c84,86
< struct rte_intr_callback *callback = NULL;
< struct rte_intr_source *src = NULL;
< int ret, add_event;
---
> struct rte_intr_callback *callback;
> struct rte_intr_source *src;
> int ret, add_event = 0;
99,106c99
< /* allocate a new interrupt callback entity */
< callback = calloc(1, sizeof(*callback));
< if (callback == NULL) {
< RTE_LOG(ERR, EAL, "Can not allocate memory\n");
< return -ENOMEM;
< }
< callback->cb_fn = cb;
< callback->cb_arg = cb_arg;
---
> rte_spinlock_lock(&intr_lock);
108,110c101
< rte_spinlock_lock(&intr_lock);
<
< /* check if there is at least one callback registered for the fd */
---
> /* find the source for this intr_handle */
112,115c103,105
< if (src->intr_handle.fd == intr_handle->fd) {
< /* we had no interrupts for this */
< if (TAILQ_EMPTY(&src->callbacks))
< add_event = 1;
---
> if (src->intr_handle.fd == intr_handle->fd)
> break;
> }
117,121c107,121
< TAILQ_INSERT_TAIL(&(src->callbacks), callback, next);
< ret = 0;
< break;
< }
< }
---
> /* If this is an alarm interrupt and it already has a callback, then we don't want to create a new callback
> * because the only thing on the list should be eal_alarm_callback() and we may be called just to reset the timer.
> */
> if (src != NULL && src->intr_handle.type == RTE_INTR_HANDLE_ALARM && !TAILQ_EMPTY(&src->callbacks)) {
> callback = NULL;
> } else {
> /* allocate a new interrupt callback entity */
> callback = calloc(1, sizeof(*callback));
> if (callback == NULL) {
> RTE_LOG(ERR, EAL, "Can not allocate memory\n");
> ret = -ENOMEM;
> goto fail;
> }
> callback->cb_fn = cb;
> callback->cb_arg = cb_arg;
123,134c123,137
< /* no existing callbacks for this - add new source */
< if (src == NULL) {
< src = calloc(1, sizeof(*src));
< if (src == NULL) {
< RTE_LOG(ERR, EAL, "Can not allocate memory\n");
< ret = -ENOMEM;
< goto fail;
< } else {
< src->intr_handle = *intr_handle;
< TAILQ_INIT(&src->callbacks);
< TAILQ_INSERT_TAIL(&(src->callbacks), callback, next);
< TAILQ_INSERT_TAIL(&intr_sources, src, next);
---
> if (src == NULL) {
> src = calloc(1, sizeof(*src));
> if (src == NULL) {
> RTE_LOG(ERR, EAL, "Can not allocate memory\n");
> ret = -ENOMEM;
> goto fail;
> } else {
> src->intr_handle = *intr_handle;
> TAILQ_INIT(&src->callbacks);
> TAILQ_INSERT_TAIL(&intr_sources, src, next);
> }
> }
>
> /* we had no interrupts for this */
> if (TAILQ_EMPTY(&src->callbacks))
136,137c139,140
< ret = 0;
< }
---
>
> TAILQ_INSERT_TAIL(&(src->callbacks), callback, next);
177c180
< return ret;
---
> return 0;
181c184,185
< TAILQ_REMOVE(&(src->callbacks), callback, next);
---
> if (callback != NULL)
> TAILQ_REMOVE(&(src->callbacks), callback, next);
----- On Aug 3, 2020, at 6:22 PM, Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com wrote:
> Hello,
> I can take a look if you can post the patch.
>
>> -----Original Message-----
>> From: dev <dev-bounces at dpdk.org> On Behalf Of Lewis Donzis
>> Sent: Monday, August 3, 2020 2:43 PM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] CPU hog & memory leak on FreeBSD
>>
>> Hello.
>>
>> We've posted about this before, but I'm hoping to find someone willing to
>> commit a patched version of lib/librte_eal/bsdapp/eal/eal_interrupts.c that
>> corrects a memory leak and 100% CPU hog that is immediately noticeable
>> with the i40e driver, at a minimum. We have modified this file to correct the
>> problem, and would be happy to provide it back to whomever is a committer
>> for this.
> If you can send a patch, I can take a look.
>
>>
>> The detailed explanation is:
>>
>> Currently, s etting an alarm with rte_eal_alarm_set() registers an alarm
>> interrupt by calling rte_intr_callback_register(), which links the callback
>> function (eal_alarm_callback) onto a list for that source and sets up a one-
>> shot timer via kevent. Setting additional alarms links them on to the
>> alarm_list, but also calls rte_eal_alarm_set() again, which links the callback
>> function onto the source callback list again.
>>
>> When the alarm triggers and eal_alarm_callback() gets called, it goes down
>> the list of pending alarms, calling as many callback functions as possible and
>> removing each one from the list until it reaches one which has not yet expired.
>> Once it's done, if alarm_list is not empty, it calls
>> rte_intr_callback_register(),
>> which then links yet another callback onto the interrupt source's list, thus
>> creating an infinite loop.
>>
>> The problem is that the source callback list grows infinitely under this
>> condition (essentially, when more than one alarm is queued). However, the
>> call is necessary in order to reset the kevent timer.
>>
>> The proposed fix recognizes and leverages the fact that an alarm interrupt in
>> FreeBSD should never have more than one callback on its list, so if
> Is your fix applicable only for FreeBSD?
>
>> rte_intr_callback_register() is called with an interrupt handle type of
>> RTE_INTR_HANDLE_ALARM, and if such an interrupt type already has a non-
>> empty list, then a new callback is not created, but the kevent timer is
>> restarted properly.
>>
>> A much simpler change is possible if we don't mind the overhead of allocating,
>> filling-in, linking, de-linking, and freeing a callback unnecessarily. This
> > proposed change makes every attempt to avoid that overhead.
More information about the dev
mailing list