[dpdk-dev] [PATCH v1 2/3] net/hyperv: implement core functionality

Adrien Mazarguil adrien.mazarguil at 6wind.com
Mon Dec 18 18:59:00 CET 2017


On Mon, Dec 18, 2017 at 05:04:23PM +0000, Wiles, Keith wrote:
> > On Dec 18, 2017, at 10:46 AM, Adrien Mazarguil <adrien.mazarguil at 6wind.com> wrote:
> > 
> > As described in more details in the attached documentation (see patch
> > contents), this virtual device driver manages NetVSC interfaces in virtual
> > machines hosted by Hyper-V/Azure platforms.
> > 
> > This driver does not manage traffic nor Ethernet devices directly; it acts
> > as a thin configuration layer that automatically instantiates and controls
> > fail-safe PMD instances combining tap and PCI sub-devices, so that each
> > NetVSC interface is exposed as a single consolidated port to DPDK
> > applications.
> > 
> > PCI sub-devices being hot-pluggable (e.g. during VM migration),
> > applications automatically benefit from increased throughput when present
> > and automatic fallback on NetVSC otherwise without interruption thanks to
> > fail-safe's hot-plug handling.
> > 
> > Once initialized, the sole job of the hyperv driver is to regularly scan
> > for PCI devices to associate with NetVSC interfaces and feed their
> > addresses to corresponding fail-safe instances.
> > 
> > Signed-off-by: Adrien Mazarguil <adrien.mazarguil at 6wind.com>
> > ---
> > doc/guides/nics/hyperv.rst  |  65 ++++
> > drivers/net/hyperv/Makefile |   4 +
> > drivers/net/hyperv/hyperv.c | 654 ++++++++++++++++++++++++++++++++++++++-
> > 3 files changed, 722 insertions(+), 1 deletion(-)
<snip>
> > diff --git a/drivers/net/hyperv/hyperv.c b/drivers/net/hyperv/hyperv.c
> > index 2f940c76f..bad224be9 100644
> > --- a/drivers/net/hyperv/hyperv.c
> > +++ b/drivers/net/hyperv/hyperv.c
<snip>
> > +/**
> > + * Convert a MAC address string to binary form.
> > + *
> > + * Note: this function should be exposed by rte_ether.h as the reverse of
> > + * ether_format_addr().
> > + *
> > + * Several MAC string formats are supported on input for convenience:
> > + *
> > + * 1. "12:34:56:78:9a:bc"
> > + * 2. "12-34-56-78-9a-bc"
> > + * 3. "123456789abc"
> > + * 4. Upper/lowercase hexadecimal.
> > + * 5. Any combination of the above, e.g. "12:34-5678-9aBC".
> > + * 6. Partial addresses are allowed, with low-order bytes filled first:
> > + *    - "5:6:78c" translates to "00:00:05:06:07:8c",
> > + *    - "5678c" translates to "00:00:00:05:67:8c".
> > + *
> > + * Non-hexadecimal characters, unknown separators and strings specifying
> > + * more than 6 bytes are not allowed.
> > + *
> > + * @param[out] eth_addr
> > + *   Pointer to conversion result buffer.
> > + * @param[in] str
> > + *   MAC address string to convert.
> > + *
> > + * @return
> > + *   0 on success, -EINVAL in case of unsupported format.
> > + */
> > +static int
> > +ether_addr_from_str(struct ether_addr *eth_addr, const char *str)
> > +{
> > +	static const uint8_t conv[0x100] = {
> > +		['0'] = 0x80, ['1'] = 0x81, ['2'] = 0x82, ['3'] = 0x83,
> > +		['4'] = 0x84, ['5'] = 0x85, ['6'] = 0x86, ['7'] = 0x87,
> > +		['8'] = 0x88, ['9'] = 0x89, ['a'] = 0x8a, ['b'] = 0x8b,
> > +		['c'] = 0x8c, ['d'] = 0x8d, ['e'] = 0x8e, ['f'] = 0x8f,
> > +		['A'] = 0x8a, ['B'] = 0x8b, ['C'] = 0x8c, ['D'] = 0x8d,
> > +		['E'] = 0x8e, ['F'] = 0x8f, [':'] = 0x40, ['-'] = 0x40,
> > +		['\0'] = 0x60,
> > +	};
> > +	uint64_t addr = 0;
> > +	uint64_t buf = 0;
> > +	unsigned int i = 0;
> > +	unsigned int n = 0;
> > +	uint8_t tmp;
> > +
> > +	do {
> > +		tmp = conv[(int)*(str++)];
> > +		if (!tmp)
> > +			return -EINVAL;
> > +		if (tmp & 0x40) {
> > +			i += (i & 1) + (!i << 1);
> > +			addr = (addr << (i << 2)) | buf;
> > +			n += i;
> > +			buf = 0;
> > +			i = 0;
> > +		} else {
> > +			buf = (buf << 4) | (tmp & 0xf);
> > +			++i;
> > +		}
> > +	} while (!(tmp & 0x20));
> > +	if (n > 12)
> > +		return -EINVAL;
> > +	i = RTE_DIM(eth_addr->addr_bytes);
> > +	while (i) {
> > +		eth_addr->addr_bytes[--i] = addr & 0xff;
> > +		addr >>= 8;
> > +	}
> > +	return 0;
> > +}
> 
> You already called this out above, why not just push this into rte_ether.h file. I know I could use it if it were public.

Hehe, that was to highlight how this driver didn't require any modifications
in public APIs. I planned to do just that in v2 or in a subsequent patch.

<snip>
> > +/**
> > + * Retrieve the last component of a path.
> > + *
> > + * This is a simplified basename() that does not modify its input buffer to
> > + * handle trailing backslashes.
> > + *
> > + * @param[in] path
> > + *    Path to retrieve the last component from.
> > + *
> > + * @return
> > + *    Pointer to the last component.
> > + */
> > +static const char *
> > +hyperv_basename(const char *path)
> > +{
> > +	const char *tmp = path;
> > +
> > +	while (*tmp)
> > +		if (*(tmp++) == '/')
> > +			path = tmp;
> > +	return path;
> > +}
> 
> Why not just user rindex() to find the last ‘/‘ instead of this routine? I know it is not performance critical.

Right, however both rindex() and strrchr() return NULL when no '/' is
present. strchrnul() works but is GNU-specific (i.e. probably not found on
BSD), I didn't want to perform an additional check for that, so actually
given the size of that function I didn't give it a second thought. I can
modify that if needed.

<snip>
> > +/**
> > + * Probe a network interface to associate with hyperv context.
> > + *
> > + * This function determines if the network device matches the properties of
> > + * the NetVSC interface associated with the hyperv context and communicates
> > + * its bus address to the fail-safe PMD instance if so.
> > + *
> > + * It is normally used with hyperv_foreach_iface().
> > + *
> > + * @param[in] iface
> > + *   Pointer to netdevice description structure (name and index).
> > + * @param[in] eth_addr
> > + *   MAC address associated with @p iface.
> > + * @param ap
> > + *   Variable arguments list comprising:
> > + *
> > + *   - struct hyperv_ctx *ctx:
> > + *     Context to associate network interface with.
> > + *
> > + * @return
> > + *   A nonzero value when interface matches, 0 otherwise or in case of
> > + *   error.
> > + */
> > +static int
> > +hyperv_device_probe(const struct if_nameindex *iface,
> > +		    const struct ether_addr *eth_addr,
> > +		    va_list ap)
> > +{
> > +	struct hyperv_ctx *ctx = va_arg(ap, struct hyperv_ctx *);
> > +	char buf[RTE_MAX(sizeof(ctx->yield), 256u)];
> > +	const char *addr;
> > +	size_t len;
> > +	int ret;
> > +
> > +	/* Skip non-matching or unwanted NetVSC interfaces. */
> > +	if (ctx->if_index == iface->if_index) {
> > +		if (!strcmp(ctx->if_name, iface->if_name))
> > +			return 0;
> > +		DEBUG("NetVSC interface \"%s\" (index %u) renamed \"%s\"",
> > +		      ctx->if_name, ctx->if_index, iface->if_name);
> > +		strncpy(ctx->if_name, iface->if_name, sizeof(ctx->if_name));
> > +		return 0;
> > +	}
> > +	if (hyperv_iface_is_netvsc(iface))
> > +		return 0;
> > +	if (!is_same_ether_addr(eth_addr, &ctx->if_addr))
> > +		return 0;
> > +	/* Look for associated PCI device. */
> > +	ret = hyperv_sysfs_readlink(buf, sizeof(buf), iface->if_name,
> > +				    "device/subsystem");
> > +	if (ret)
> > +		return 0;
> > +	if (strcmp(hyperv_basename(buf), "pci"))
> > +		return 0;
> > +	ret = hyperv_sysfs_readlink(buf, sizeof(buf), iface->if_name,
> > +				    "device");
> > +	if (ret)
> > +		return 0;
> > +	addr = hyperv_basename(buf);
> > +	len = strlen(addr);
> > +	if (!len)
> > +		return 0;
> > +	/* Send PCI device argument to fail-safe PMD instance if updated. */
> > +	if (!strcmp(addr, ctx->yield))
> > +		return 1;
> > +	DEBUG("associating PCI device \"%s\" with NetVSC interface \"%s\""
> > +	      " (index %u)",
> > +	      addr, ctx->if_name, ctx->if_index);
> > +	memmove(buf, addr, len + 1);
> > +	addr = buf;
> > +	buf[len] = '\n';
> > +	ret = write(ctx->pipe[1], addr, len + 1);
> > +	buf[len] = '\0';
> > +	if (ret == -1) {
> > +		if (errno == EINTR || errno == EAGAIN)
> > +			return 1;
> > +		WARN("cannot associate PCI device name \"%s\" with interface"
> > +		     " \"%s\": %s",
> > +		     addr, ctx->if_name, rte_strerror(errno));
> > +		return 1;
> > +	}
> > +	if ((size_t)ret != len + 1) {
> > +		/*
> > +		 * Attempt to override previous partial write, no need to
> > +		 * recover if that fails.
> > +		 */
> > +		ret = write(ctx->pipe[1], "\n", 1);
> > +		(void)ret;
> > +		return 1;
> > +	}
> > +	fsync(ctx->pipe[1]);
> > +	memcpy(ctx->yield, addr, len + 1);
> > +	return 1;
> > +}
> 
> Not to criticize style, but a few blank lines could help in readability for these files IMHO. Unless blank lines are illegal :-)

It's a matter of taste, I think people tend to add random blank lines where
they think doing so clarifies things for themselves, resulting in
inconsistent coding style not much clearer for everyone after several
iterations.

As a maintainer I've grown tired of discussions related to blank lines while
reviewing patches. That's why except for a few special cases, I now enforce
exactly the bare minimum of one blank line between variable declarations and
the rest of the code inside each block.

If doing so makes a function unreadable then perhaps it needs to be split :)
I'm sure you'll understand!

Regards,

-- 
Adrien Mazarguil
6WIND


More information about the dev mailing list