[dpdk-dev] [PATCH v2 23/41] mempool: add support for the new allocation methods

Olivier Matz olivier.matz at 6wind.com
Mon Mar 19 18:11:31 CET 2018


Hi Anatoly,

Please find some comments below.

On Wed, Mar 07, 2018 at 04:56:51PM +0000, Anatoly Burakov wrote:
> If a user has specified that the zone should have contiguous memory,
> use the new _contig allocation APIs instead of normal ones.
> Otherwise, account for the fact that unless we're in IOVA_AS_VA
> mode, we cannot guarantee that the pages would be physically
> contiguous, so we calculate the memzone size and alignments as if
> we were getting the smallest page size available.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov at intel.com>

[...]

> @@ -563,10 +585,46 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>  	/* update mempool capabilities */
>  	mp->flags |= mp_flags;
>  
> -	if (rte_eal_has_hugepages()) {
> -		pg_shift = 0; /* not needed, zone is physically contiguous */
> +	no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
> +	force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
> +
> +	/*
> +	 * there are several considerations for page size and page shift here.

I would add a little word here to describe what page size and page shift
are used for:

  These values impact the result of rte_mempool_xmem_size() (*), which
  returns the amount of memory that should be allocated to store the
  desired number of objects. When the page shift is not zero, extra
  memory is accounted for the padding between objects, so that an
  object never crosses a page boundary.

(*) it is renamed in Andrew's patchset about mempool_ops API, but it
seems the memory rework may be pushed before.

> +	 *
> +	 * if we don't need our mempools to have physically contiguous objects,
> +	 * then just set page shift and page size to 0, because the user has
> +	 * indicated that there's no need to care about anything.
> +	 *
> +	 * if we do need contiguous objects, there is also an option to reserve
> +	 * the entire mempool memory as one contiguous block of memory, in
> +	 * which case the page shift and alignment wouldn't matter as well.
> +	 *
> +	 * if we require contiguous objects, but not necessarily the entire
> +	 * mempool reserved space to be contiguous, then there are two options.
> +	 *
> +	 * if our IO addresses are virtual, not actual physical (IOVA as VA
> +	 * case), then no page shift needed - our memory allocation will give us
> +	 * contiguous physical memory as far as the hardware is concerned, so
> +	 * act as if we're getting contiguous memory.
> +	 *
> +	 * if our IO addresses are physical, we may get memory from bigger
> +	 * pages, or we might get memory from smaller pages, and how much of it
> +	 * we require depends on whether we want bigger or smaller pages.
> +	 * However, requesting each and every memory size is too much work, so
> +	 * what we'll do instead is walk through the page sizes available, pick
> +	 * the smallest one and set up page shift to match that one. We will be
> +	 * wasting some space this way, but it's much nicer than looping around
> +	 * trying to reserve each and every page size.
> +	 */

This comment is helpful to understand, thanks.

(by the way, reading it makes me think we should rename
MEMPOOL_F_*_PHYS_CONTIG as MEMPOOL_F_*_IOVA_CONTIG)


> +
> +	if (no_contig || force_contig || rte_eal_iova_mode() == RTE_IOVA_VA) {
>  		pg_sz = 0;
> +		pg_shift = 0;
>  		align = RTE_CACHE_LINE_SIZE;
> +	} else if (rte_eal_has_hugepages()) {
> +		pg_sz = get_min_page_size();
> +		pg_shift = rte_bsf32(pg_sz);
> +		align = pg_sz;
>  	} else {
>  		pg_sz = getpagesize();
>  		pg_shift = rte_bsf32(pg_sz);
> @@ -585,23 +643,34 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>  			goto fail;
>  		}
>  
> -		mz = rte_memzone_reserve_aligned(mz_name, size,
> -			mp->socket_id, mz_flags, align);
> -		/* not enough memory, retry with the biggest zone we have */
> -		if (mz == NULL)
> -			mz = rte_memzone_reserve_aligned(mz_name, 0,
> +		if (force_contig) {
> +			/*
> +			 * if contiguous memory for entire mempool memory was
> +			 * requested, don't try reserving again if we fail.
> +			 */
> +			mz = rte_memzone_reserve_aligned_contig(mz_name, size,
> +				mp->socket_id, mz_flags, align);
> +		} else {
> +			mz = rte_memzone_reserve_aligned(mz_name, size,
>  				mp->socket_id, mz_flags, align);
> +			/* not enough memory, retry with the biggest zone we
> +			 * have
> +			 */
> +			if (mz == NULL)
> +				mz = rte_memzone_reserve_aligned(mz_name, 0,
> +					mp->socket_id, mz_flags, align);
> +		}

This is not wrong, but at first glance I think it is not required,
because we have this in populate_iova():

	/* Detect pool area has sufficient space for elements */
	if (mp_capa_flags & MEMPOOL_F_CAPA_PHYS_CONTIG) {
		if (len < total_elt_sz * mp->size) {
			RTE_LOG(ERR, MEMPOOL,
				"pool area %" PRIx64 " not enough\n",
				(uint64_t)len);
			return -ENOSPC;
		}
	}



Thanks,
Olivier

