[dpdk-users] Larger number of hugepages causes bus error.

Sushil Adhikari sushil446 at gmail.com
Thu Feb 23 18:02:59 CET 2017


Thank you, Keith and Monroy. With your help I was able to track down the
problem: my /var/run was too small to hold the hugepage information, so when
I increased its size, it worked. Thank you so much.

On Thu, Feb 23, 2017 at 10:35 AM, Sergio Gonzalez Monroy <
sergio.gonzalez.monroy at intel.com> wrote:

> As Keith suggested, gdb is probably your best bet now.
> You could also do 'strace' to see if something shows up there.
>
> If you are running as root, the application is opening a file in /var/run
> to store some hugepage information, then it memsets to 0.
>
> What distro and kernel are you running on?
>
>
>
> On 23/02/2017 16:19, Sushil Adhikari wrote:
>
>> I didn't understand what you mean by hugepage value. If you mean the number
>> of hugepages, here's what it looks like:
>> [~]$ grep -ri hugepages /proc/meminfo
>> AnonHugePages:         0 kB
>> HugePages_Total:     512
>> HugePages_Free:      512
>> HugePages_Rsvd:        0
>> HugePages_Surp:        0
>> Hugepagesize:       2048 kB
>>
>> And the linux version is 4.4.20.
>>
>> On Thu, Feb 23, 2017 at 9:17 AM, Wiles, Keith <keith.wiles at intel.com>
>> wrote:
>>
>>> On Feb 22, 2017, at 7:18 PM, Sushil Adhikari <sushil446 at gmail.com> wrote:
>>>>
>>>> Thank you Keith for the response,
>>>>
>>>> Yes it should be line 1142 not 1405, I was using 16.11 and now I'm using
>>>> 17.02 and still getting the same error.
>>>
>>> Not sure what to say here; it looks like some type of system
>>> configuration issue, as I do not see it on my machine.
>>>
>>> Can you tell if the hugepage pointer has a value, and is it sane? The
>>> next thing is to see where in that memory it is failing: start, end,
>>> or someplace in the middle. Use GDB and compile the code with
>>> 'make install T=x86_64-native-linuxapp-gcc EXTRA_CFLAGS="-g -O0"',
>>> then set a breakpoint with 'b eal_memory.c:1142' and inspect the
>>> memory pointer hugepage. I do not think it is an overrun error,
>>> meaning the size for memset is different than what was allocated and
>>> it is just stepping off the end.
>>>
>>> Also you did not tell me the linux version you are using?
>>>
>>>> On Wed, Feb 22, 2017 at 8:46 PM, Wiles, Keith <keith.wiles at intel.com> wrote:
>>>>
>>>>> On Feb 22, 2017, at 6:43 PM, Wiles, Keith <keith.wiles at intel.com> wrote:
>>>>>
>>>>>> On Feb 22, 2017, at 6:30 PM, Sushil Adhikari <sushil446 at gmail.com> wrote:
>>>>>>
>>>>>> I used the basic command line option "dpdkTimer -c 0xf -n 4"
>>>>>> And to update on my findings so far I have narrowed down to this line(1405)
>>>>>> memset(hugepage, 0, nr_hugefiles * sizeof(struct hugepage_file));
>>>>>> of function rte_eal_hugepage_init() in file
>>>>>> dpdk\lib\librte_eal\linuxapp\eal\eal_memory.c
>>>>>
>>>>> What version of DPDK are you using? I was looking at the file at 1405
>>>>> and I do not see a memset() call.
>>>>
>>>> I found the memset call at 1142 in my 17.05-rc0 code. Please try the
>>>> latest version and see if you get the same problem.
>>>>
>>>>>> Yes I have the hugepages of size 2MB(2048) and when I calculate the
>>>>>> memory this memset function is trying to set, it comes out to
>>>>>> 512(nr_hugefiles) * 4144 ( sizeof(struct hugepage_file) ) = 2121728 which
>>>>>> is larger than 2MB, so my doubt is that the hugepages I have
>>>>>> allocated(512*2MB) are not contiguous 1GB memory and it's trying to access
>>>>>> memory that's not part of hugepage. Is that a possibility, even though I am
>>>>>> setting up hugepages during boot time by providing it through kernel option?
>>>
>>>>
>>>>>> On Wed, Feb 22, 2017 at 8:05 PM, Wiles, Keith <keith.wiles at intel.com> wrote:
>>>>>>
>>>>>>> On Feb 22, 2017, at 3:05 PM, Sushil Adhikari <sushil446 at gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I was trying to run dpdk timer app by setting 512 2MB hugepages but the
>>>>>>> application crashed with following error
>>>>>>> EAL: Detected 4 lcore(s)
>>>>>>> EAL: Probing VFIO support...
>>>>>>> Bus error (core dumped)
>>>>>>>
>>>>>>> If I reduce the number of hugepages to 256 it works fine. I wondering what
>>>>>>> could be the problem here. Here's my cpu info
>>>>>>
>>>>>> I normally run with 2048 x 2 or 2048 per socket on my machine. What
>>>>>> is the command line you are using to start the application?
>>>>>>
>>>>>>> processor       : 0
>>>>>>> vendor_id       : GenuineIntel
>>>>>>> cpu family      : 6
>>>>>>> model           : 26
>>>>>>> model name      : Intel(R) Core(TM) i7 CPU         950  @ 3.07GHz
>>>>>>> stepping        : 5
>>>>>>> microcode       : 0x11
>>>>>>> cpu MHz         : 2794.000
>>>>>>> cache size      : 8192 KB
>>>>>>> physical id     : 0
>>>>>>> siblings        : 4
>>>>>>> core id         : 0
>>>>>>> cpu cores       : 4
>>>>>>> apicid          : 0
>>>>>>> initial apicid  : 0
>>>>>>> fpu             : yes
>>>>>>> fpu_exception   : yes
>>>>>>> cpuid level     : 11
>>>>>>> wp              : yes
>>>>>>> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>>>>>>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
>>>>>>> rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
>>>>>>> nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16
>>>>>>> xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi
>>>>>>> flexpriority ept vpid
>>>>>>> bugs            :
>>>>>>> bogomips        : 5600.00
>>>>>>> clflush size    : 64
>>>>>>> cache_alignment : 64
>>>>>>> address sizes   : 36 bits physical, 48 bits virtual
>>>>>>> power management:
>>>>>>>
>>>>>>> processor       : 1
>>>>>>> vendor_id       : GenuineIntel
>>>>>>> cpu family      : 6
>>>>>>> model           : 26
>>>>>>> model name      : Intel(R) Core(TM) i7 CPU         950  @ 3.07GHz
>>>>>>> stepping        : 5
>>>>>>> microcode       : 0x11
>>>>>>> cpu MHz         : 2794.000
>>>>>>> cache size      : 8192 KB
>>>>>>> physical id     : 0
>>>>>>> siblings        : 4
>>>>>>> core id         : 1
>>>>>>> cpu cores       : 4
>>>>>>> apicid          : 2
>>>>>>> initial apicid  : 2
>>>>>>> fpu             : yes
>>>>>>> fpu_exception   : yes
>>>>>>> cpuid level     : 11
>>>>>>> wp              : yes
>>>>>>> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>>>>>>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
>>>>>>> rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
>>>>>>> nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16
>>>>>>> xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida dtherm tpr_shadow vnmi
>>>>>>> flexpriority ept vpid
>>>>>>> bugs            :
>>>>>>> bogomips        : 5600.00
>>>>>>> clflush size    : 64
>>>>>>> cache_alignment : 64
>>>>>>> address sizes   : 36 bits physical, 48 bits virtual
>>>>>>> power management:......
>>>>>>>
>>>>>>> And Here's my meminfo
>>>>>>>
>>>>>>> MemTotal:       24679608 kB
>>>>>>> MemFree:        24014156 kB
>>>>>>> MemAvailable:   23950600 kB
>>>>>>> Buffers:            3540 kB
>>>>>>> Cached:            31436 kB
>>>>>>> SwapCached:            0 kB
>>>>>>> Active:            21980 kB
>>>>>>> Inactive:          22256 kB
>>>>>>> Active(anon):      10760 kB
>>>>>>> Inactive(anon):     2940 kB
>>>>>>> Active(file):      11220 kB
>>>>>>> Inactive(file):    19316 kB
>>>>>>> Unevictable:           0 kB
>>>>>>> Mlocked:               0 kB
>>>>>>> SwapTotal:             0 kB
>>>>>>> SwapFree:              0 kB
>>>>>>> Dirty:                32 kB
>>>>>>> Writeback:             0 kB
>>>>>>> AnonPages:          9252 kB
>>>>>>> Mapped:            11912 kB
>>>>>>> Shmem:              4448 kB
>>>>>>> Slab:              27712 kB
>>>>>>> SReclaimable:      11276 kB
>>>>>>> SUnreclaim:        16436 kB
>>>>>>> KernelStack:        2672 kB
>>>>>>> PageTables:         1000 kB
>>>>>>> NFS_Unstable:          0 kB
>>>>>>> Bounce:                0 kB
>>>>>>> WritebackTmp:          0 kB
>>>>>>> CommitLimit:    12077660 kB
>>>>>>> Committed_AS:     137792 kB
>>>>>>> VmallocTotal:   34359738367 kB
>>>>>>> VmallocUsed:           0 kB
>>>>>>> VmallocChunk:          0 kB
>>>>>>> HardwareCorrupted:     0 kB
>>>>>>> AnonHugePages:      2048 kB
>>>>>>> CmaTotal:              0 kB
>>>>>>> CmaFree:               0 kB
>>>>>>> HugePages_Total:     256
>>>>>>> HugePages_Free:        0
>>>>>>> HugePages_Rsvd:        0
>>>>>>> HugePages_Surp:        0
>>>>>>> Hugepagesize:       2048 kB
>>>>>>> DirectMap4k:       22000 kB
>>>>>>> DirectMap2M:    25133056 kB
>>>>>>>
>>>>>> Regards,
>>>>>> Keith
>>>>>
>>>>> Regards,
>>>>> Keith
>>>>
>>>> Regards,
>>>> Keith
>>>
>>> Regards,
>>> Keith
>>>
>

