[dpdk-dev] [PATCH v2] eal: add madvise to avoid dump memory

Li Feng fengli at smartx.com
Fri Apr 24 14:03:15 CEST 2020


Thanks,

Feng Li

Burakov, Anatoly <anatoly.burakov at intel.com> wrote on Fri, Apr 24, 2020 at 7:00 PM:
>
> On 24-Apr-20 10:33 AM, Feng Li wrote:
> > Bruce Richardson <bruce.richardson at intel.com> wrote on Fri, Apr 24, 2020 at 5:14 PM:
> >>
> >> On Fri, Apr 24, 2020 at 10:12:10AM +0100, Burakov, Anatoly wrote:
> >>> On 23-Apr-20 9:04 PM, David Marchand wrote:
> >>>> On Thu, Apr 23, 2020 at 6:34 PM Burakov, Anatoly
> >>>> <anatoly.burakov at intel.com> wrote:
> >>>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
> >>>>>> index cc7d54e0c..2d9564b28 100644
> >>>>>> --- a/lib/librte_eal/common/eal_common_memory.c
> >>>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
> >>>>>> @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
> >>>>>>                 after_len = RTE_PTR_DIFF(map_end, aligned_end);
> >>>>>>                 if (after_len > 0)
> >>>>>>                         munmap(aligned_end, after_len);
> >>>>>> +
> >>>>>> +             /*
> >>>>>> +              * Exclude these pages from a core dump.
> >>>>>> +              */
> >>>>>> +             if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
> >>>>>> +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> >>>>>> +                             strerror(errno));
> >>>>>> +     } else {
> >>>>>> +             /*
> >>>>>> +              * Exclude these pages from a core dump.
> >>>>>> +              */
> >>>>>> +             if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
> >>>>>> +                     RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: %s\n",
> >>>>>> +                             strerror(errno));
> >>>>>>         }
> >>>>>>
> >>>>>>         return aligned_addr;
> >>>>>>
> >>>>>
> >>>>> For the contents of this patch,
> >>>>
> >>>> MADV_DONTDUMP does not seem POSIX, but as I said [1], there seems to
> >>>> be a MADV_NOCORE option on FreeBSD.
> >>>> 1: http://inbox.dpdk.org/dev/CAJFAV8y9YtT-7njUz+mD6U8+3XUqYrgp28KD7jy2923EpAcXrg@mail.gmail.com/
> >>>>
> >>>>
> >>>
> >>> Oh, right, so this would probably not compile on FreeBSD. Perhaps this
> >>> function would have to be OS-specific after all (or call into an OS-specific
> >>> madvise() after reserving the memory area).
> >>>
> >>
> >> Is it just a differently named flag? If so, I think a single #ifdef macro
> >> won't kill us in the common code.
> >>
> > Just the flag name is different.
> > I should use RTE_EXEC_ENV_FREEBSD and RTE_EXEC_ENV_LINUX, right?
>
> Yes, but we need this in two places, so a function call is still necessary.
>
> >
> > Another question: in `eal_memalloc.c:alloc_seg`, should I undo the
> > DONTDUMP on the memory region?
> > @Anatoly
>
> I don't think it's necessary. When you map different memory into that
> region, madvise() flags no longer apply. To be sure, i just tested this
> by adding another mmap() call after madvise() (in your test app) and
> remapping the same memory with MAP_FIXED, and the core dump was back to
> 1GB of size. So, no, i don't think you should undo anything - the system
> does so automatically.
Got it.
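
For reference, a minimal sketch of the experiment described above: mark an anonymous region with MADV_DONTDUMP, then map fresh memory over it with MAP_FIXED. It assumes Linux; the function name remap_after_dontdump is hypothetical and is not part of the patch or the test app. The code only performs the calls. Whether the region reappears in the dump still has to be checked by looking at the core file size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MADV_DONTDUMP and MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (Linux-only sketch): returns 0 if every step
 * succeeded, -1 otherwise.
 */
int
remap_after_dontdump(size_t len)
{
	void *area, *remapped;

	/* Reserve an anonymous region, as the test app would. */
	area = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (area == MAP_FAILED)
		return -1;

	/* Ask the kernel to leave this region out of core dumps. */
	if (madvise(area, len, MADV_DONTDUMP) != 0) {
		munmap(area, len);
		return -1;
	}

	/*
	 * Map new memory over the same range. MAP_FIXED replaces the old
	 * mapping; per the test described in this thread, the earlier
	 * advice does not carry over to the new mapping.
	 */
	remapped = mmap(area, len, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (remapped == MAP_FAILED) {
		munmap(area, len);
		return -1;
	}

	/* With MAP_FIXED, a successful mmap() returns the requested address. */
	munmap(remapped, len);
	return 0;
}
```

To repeat the dump-size check, keep the final mapping alive instead of unmapping it and abort() the process with core dumps enabled.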
>
> >
> > In just a few minutes, I have prepared a patch for the OS-specific code:
> > --- a/lib/librte_eal/common/eal_private.h
> > +++ b/lib/librte_eal/common/eal_private.h
> > @@ -443,4 +443,20 @@ rte_option_usage(void);
> >   uint64_t
> >   eal_get_baseaddr(void);
> >
> > +/**
> > + * @internal
> > + * Exclude these pages from a core dump.
> > + *
> > + * @param addr
> > + *  Start address of the memory region.
> > + *
> > + * @param len
> > + *  Length of the memory region in bytes.
> > + *
> > + * @return
> > + * 0 on success, -1 on failure with errno set by madvise()
> > + */
> > +int
> > +eal_madvise_dontdump(void* addr, size_t len);
> > +
> >   #endif /* _EAL_PRIVATE_H_ */
> > diff --git a/lib/librte_eal/freebsd/eal_memory.c
> > b/lib/librte_eal/freebsd/eal_memory.c
> > index a97d8f0f0..585042dde 100644
> > --- a/lib/librte_eal/freebsd/eal_memory.c
> > +++ b/lib/librte_eal/freebsd/eal_memory.c
> > @@ -534,3 +534,9 @@ rte_eal_memseg_init(void)
> >    memseg_primary_init() :
> >    memseg_secondary_init();
> >   }
> > +
> > +int
> > +eal_madvise_dontdump(void* addr, size_t len)
> > +{
> > + return madvise(addr, len, MADV_NOCORE);
> > +}
> > diff --git a/lib/librte_eal/linux/eal_memory.c
> > b/lib/librte_eal/linux/eal_memory.c
> > index 7a9c97ff8..cfdbfccfe 100644
> > --- a/lib/librte_eal/linux/eal_memory.c
> > +++ b/lib/librte_eal/linux/eal_memory.c
> > @@ -2479,3 +2479,9 @@ rte_eal_memseg_init(void)
> >   #endif
> >    memseg_secondary_init();
> >   }
> > +
> > +int
> > +eal_madvise_dontdump(void* addr, size_t len)
> > +{
> > + return madvise(addr, len, MADV_DONTDUMP);
> > +}
> >
>
> That would work as well (with added FreeBSD code of course), however if
> everyone else is OK with it, i'll settle for an #ifdef in common code.
>
> --
> Thanks,
> Anatoly
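
To make the #ifdef option concrete, here is a rough sketch of what a single helper in common code might look like. The helper name eal_mem_exclude_from_dump and the macro EAL_DONTDUMP_ADVICE are hypothetical and do not come from any posted patch; __FreeBSD__ is checked alongside RTE_EXEC_ENV_FREEBSD only so that the snippet builds outside the DPDK build system.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MADV_DONTDUMP and MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Only the flag name differs between the two supported OSes. */
#if defined(RTE_EXEC_ENV_FREEBSD) || defined(__FreeBSD__)
#define EAL_DONTDUMP_ADVICE MADV_NOCORE
#else
#define EAL_DONTDUMP_ADVICE MADV_DONTDUMP
#endif

/*
 * Hypothetical helper: ask the kernel to leave [addr, addr + len) out of
 * core dumps. Returns what madvise() returns: 0 on success, -1 on failure
 * with errno set.
 */
int
eal_mem_exclude_from_dump(void *addr, size_t len)
{
	return madvise(addr, len, EAL_DONTDUMP_ADVICE);
}
```

Both places in eal_get_virtual_area() would then call this one function and log a warning on failure, as the v2 patch does, so the #ifdef appears only once.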
