[dpdk-dev] [PATCH v2 10/10] eal/windows: implement basic memory management

Dmitry Kozlyuk dmitry.kozliuk at gmail.com
Sat Apr 11 03:16:50 CEST 2020


[snip]
> > +static int
> > +alloc_seg(struct rte_memseg *ms, void *requested_addr, int socket_id,
> > +	struct hugepage_info *hi)
> > +{
> > +	HANDLE current_process;
> > +	unsigned int numa_node;
> > +	size_t alloc_sz;
> > +	void *addr;
> > +	rte_iova_t iova = RTE_BAD_IOVA;
> > +	PSAPI_WORKING_SET_EX_INFORMATION info;
> > +	PSAPI_WORKING_SET_EX_BLOCK *page;
> > +
> > +	if (ms->len > 0) {
> > +		/* If a segment is already allocated as needed, return it. */
> > +		if ((ms->addr == requested_addr) &&
> > +			(ms->socket_id == socket_id) &&
> > +			(ms->hugepage_sz == hi->hugepage_sz)) {
> > +			return 0;
> > +		}
> > +
> > +		/* Bugcheck, should not happen. */
> > +		RTE_LOG(DEBUG, EAL, "Attempted to reallocate segment %p "
> > +			"(size %zu) on socket %d", ms->addr,
> > +			ms->len, ms->socket_id);
> > +		return -1;
> > +	}
> > +
> > +	current_process = GetCurrentProcess();
> > +	numa_node = eal_socket_numa_node(socket_id);
> > +	alloc_sz = hi->hugepage_sz;
> > +
> > +	if (requested_addr == NULL) {
> > +		/* Request a new chunk of memory and enforce address hint. */  
> 
> Does requested_addr being NULL means that no hint was provided? It also looks like eal_mem_alloc_socket
> ignores the address hint anyway and just calls VirtualAllocExNuma with NULL for lpAddress. Maybe remove
> the second part of the comment.

Correct, also see below.

> 
> > +		addr = eal_mem_alloc_socket(alloc_sz, socket_id);
> > +		if (addr == NULL) {
> > +			RTE_LOG(DEBUG, EAL, "Cannot allocate %zu bytes "
> > +				"on socket %d\n", alloc_sz, socket_id);
> > +			return -1;
> > +		}
> > +
> > +		if (addr != requested_addr) {  
> 
> requested_addr is NULL on this branch and we confirmed with the previous 'if' that addr is not NULL.
> Should this branch be removed, since requested_addr is NULL, so there is no hint provided?

Good catch, this check doesn't belong here. In fact, apart from an allocation
failure, address hint might not be respected iff the hint is not aligned to
hugepage boundary, which would be an allocator bug (use of an incorrect MSL).

> > +			RTE_LOG(DEBUG, EAL, "Address hint %p not respected, "
> > +				"got %p\n", requested_addr, addr);
> > +			goto error;
> > +		}
> > +	} else {
> > +		/* Requested address is already reserved, commit memory. */
> > +		addr = eal_mem_commit(requested_addr, alloc_sz, socket_id);
> > +		if (addr == NULL) {
> > +			RTE_LOG(DEBUG, EAL, "Cannot commit reserved memory %p "
> > +				"(size %zu)\n", requested_addr, alloc_sz);
> > +			goto error;  
> 
> Execution jumps to 'error' with an invalid addr, so it will try to call eal_mem_decommit with NULL as parameter.
> Instead of 'goto error', maybe we should return here.

Will fix in v3.

I'd like to also clarify the following. On Windows, eal_mem_commit() is not
atomic: it may first split a reserved region, then commit it. If splitting
succeeds and commit fails, the reserved region remains split into parts.
eal_mem_decommit() does not coalesce these parts intentionally: they're
hugepage-sized, so the next eal_mem_commit() on the same region just doesn't
have to split it.

> 
> > +		}
> > +	}
> > +
> > +	/* Force OS to allocate a physical page and select a NUMA node.
> > +	 * Hugepages are not pageable in Windows, so there's no race
> > +	 * for physical address.
> > +	 */
> > +	*(volatile int *)addr = *(volatile int *)addr;
> > +
> > +	/* Only try to obtain IOVA if it's available, so that applications
> > +	 * that do not need IOVA can use this allocator.
> > +	 */
> > +	if (rte_eal_using_phys_addrs()) {
> > +		iova = rte_mem_virt2iova(addr);
> > +		if (iova == RTE_BAD_IOVA) {
> > +			RTE_LOG(DEBUG, EAL,
> > +				"Cannot get IOVA of allocated segment\n");
> > +			goto error;
> > +		}
> > +	}
> > +
> > +	/* Only "Ex" function can handle hugepages. */
> > +	info.VirtualAddress = addr;
> > +	if (!QueryWorkingSetEx(current_process, &info, sizeof(info))) {
> > +		RTE_LOG_WIN32_ERR("QueryWorkingSetEx()");
> > +		goto error;
> > +	}
> > +
> > +	page = &info.VirtualAttributes;
> > +	if (!page->Valid || !page->LargePage) {
> > +		RTE_LOG(DEBUG, EAL, "Got regular page instead of hugepage\n");
> > +		goto error;
> > +	}
> > +	if (page->Node != numa_node) {
> > +		RTE_LOG(DEBUG, EAL,
> > +			"NUMA node hint %u (socket %d) not respected, got %u\n",
> > +			numa_node, socket_id, page->Node);
> > +		goto error;
> > +	}
> > +
> > +	ms->addr = addr;
> > +	ms->hugepage_sz = hi->hugepage_sz;
> > +	ms->len = alloc_sz;
> > +	ms->nchannel = rte_memory_get_nchannel();
> > +	ms->nrank = rte_memory_get_nrank();
> > +	ms->iova = iova;
> > +	ms->socket_id = socket_id;
> > +
> > +	return 0;
> > +
> > +error:
> > +	/* Only jump here when `addr` and `alloc_sz` are valid. */
> > +	eal_mem_decommit(addr, alloc_sz);
> > +	return -1;
> > +}
[snip]

-- 
Dmitry Kozlyuk


More information about the dev mailing list