[dpdk-dev] [PATCH v2] timer: fix resource leak in finalize
anatoly.burakov at intel.com
Wed May 8 10:49:58 CEST 2019
On 07-May-19 11:04 PM, Carrillo, Erik G wrote:
> Hi Anatoly,
> Thanks for the review. Comments in-line:
>>> #define RTE_MAX_DATA_ELS 64
>>> +static const struct rte_memzone *rte_timer_data_mz; static
>>> +rte_atomic16_t *rte_timer_mz_refcnt;
>>> static struct rte_timer_data *rte_timer_data_arr;
>>> static const uint32_t default_data_id;
>>> static uint32_t rte_timer_subsystem_initialized; @@ -155,6 +157,7 @@
>>> struct rte_timer_data *data;
>>> int i, lcore_id;
>>> static const char *mz_name = "rte_timer_mz";
>>> + size_t data_arr_size = RTE_MAX_DATA_ELS *
>> nitpicking, but... const?
> No problem - I'll make this change if this line persists into the next version.
>>> @@ -205,8 +216,11 @@
>> BIND_DEFAULT_SYMBOL(rte_timer_subsystem_init, _v1905, 19.05);
>>> void __rte_experimental
>>> - if (rte_timer_data_arr)
>>> - rte_free(rte_timer_data_arr);
>>> + if (!rte_timer_subsystem_initialized)
>>> + return;
>>> + if (rte_atomic16_dec_and_test(rte_timer_mz_refcnt))
>>> + rte_memzone_free(rte_timer_data_mz);
>> I think there's a race here. You may get preempted after test but before
>> free, where another secondary could initialize. As far as i know, we also
> Indeed, thanks for catching this.
>> support a case when secondary initializes after primary stops running.
>> Let's even suppose that we allow secondary processes to initialize the timer
>> subsystem by reserving memzone and checking rte_errno. You would still
>> have a chance of two init/deinit conflicting, because there's a hole between
>> memzone allocation and atomic increment.
>> I don't think this race can be resolved in a safe way, so we might just have to
>> settle for a memory leak.
> I don't see a solution here currently either. I'll look at removing the memzone_free()
> call and possibly the rte_timer_subsystem_finalize() API, since it seems like
> there's no reason for it to exist if it can't free the allocations.
I wonder if there are other places in DPDK where this pattern is used.
Technically, this kind of thing /could/ be resolved by having something
in our multiprocess shared memory outside of DPDK heap. I.e. store
something in rte_eal_memconfig like some other things do. This change,
however, would require an ABI break, so while changing this particular
API won't need a deprecation notice, the change itself would.
>>> rte_timer_subsystem_initialized = 0;
More information about the dev