[dpdk-dev] [PATCH v6 00/10] Virtual Machine Power Management

Thomas Monjalon thomas.monjalon at 6wind.com
Wed Nov 26 17:41:17 CET 2014

Hi Pablo and Alan,

2014-11-25 16:18, Pablo de Lara:
> Virtual Machine Power Management.
> The following patches add two DPDK sample applications and an alternate
> implementation of librte_power for use in virtualized environments.
> The idea is to provide librte_power functionality from within a VM to address
> the lack of MSRs to facilitate frequency changes from within a VM.
> It is ideally suited for Haswell which provides per core frequency scaling.
> The current librte_power affects frequency changes via the acpi-cpufreq
> 'userspace' power governor, accessed via sysfs.
> General Overview:(more information in each patch that follows).
> The VM Power Management solution provides two components:
>  1)VM: Allows for the a DPDK application in a VM to reuse the librte_power
>  interface. Each lcore opens a Virto-Serial endpoint channel to the host,
>  where the re-implementation of librte_power simply forwards the requests for
>  frequency change to a host based monitor. The host monitor itself uses
>  librte_power.
>  Each lcore channel corresponds to a
>  serial device '/dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>'
>  which is opened in non-blocking mode.
>  While each Virtual CPU can be mapped to multiple physical CPUs it is
>  recommended that each vCPU should be mapped to a single core only.
>  2)Host: The host monitor is managed by a CLI, it allows for adding qemu/KVM
>  virtual machines and associated channels to the monitor, manually changing
>  CPU frequency, inspecting the state of VMs, vCPU to pCPU pinning and managing
>  channels.
>  Host channel endpoints are Virto-Serial endpoints configured as AF_UNIX file
>  sockets which follow a specific naming convention
>  i.e /tmp/powermonitor/<vm_name>.<channel_number>,
>  each channel has an 1:1 mapping to a VM endpoint
>  i.e. /dev/virtio-ports/virtio.serial.port.poweragent.<lcore_num>
>  Host channel endpoints are opened in non-blocking mode and are monitored via epoll.
>  Requests over each channel to change frequency are forwarded to the original
>  librte_power.
> Channels must be manually configured as qemu-kvm command line arguments or
> libvirt domain definition(xml) e.g.
> <controller type='virtio-serial' index='0'>
>  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
> </controller>
> <channel type='unix'>
>   <source mode='bind' path='/tmp/powermonitor/<vm_name>.<channel_num>'/>
>   <target type='virtio' name='virtio.serial.port.poweragent.<channel_num>/>
>   <address type='virtio-serial' controller='0' bus='0' port='<N>'/>
> </channel>
> Where multiple channels can be configured by specifying multiple <channel>
> elements, by replacing <vm_name>, <channel_num>.
> <N>(port number) should be incremented by 1 for each new channel element.
> More information on Virtio-Serial can be found here:
> http://fedoraproject.org/wiki/Features/VirtioSerial
> To enable the Hypervisor creation of channels, the host endpoint directory
> must be created with qemu permissions:
> mkdir /tmp/powermonitor
> chown qemu:qemu /tmp/powermonitor
> The host application runs on two separate lcores:
> Core N) CLI: For management of Virtual Machines adding channels to Monitor thread,
>  inspecting state and manually setting CPU frequency [PATCH 02/09]
> Core N+1) Monitor Thread: An epoll based infinite loop that waits on channel events
>  from VMs and calls the corresponding librte_power functions.
> A sample application is also provided to run on Virtual Machines, this
> application provides a CLI to manually set the frequency of a 
> vCPU[PATCH 08/09]
> The current l3fwd-power sample application can also be run on a VM.
> Changes in V6:
>  Fixed typos and missing some identations and blank lines
> Changes in V5:
>  Fixed default target in sample app Makefiles
> Changes in V4:
>  Fixed double free of channel during VM shutdown.
> Changes in V3:
>  Fixed crash in Guest CLI when host application is not running.
>  Renamed #defines to be more specific to the module they belong
>  Added vCPU pinning via CLI
> Changes in V2:
>  Runtime selection of librte_power implementations.
>  Updated Unit tests to cover librte_power changes.
>  PATCH[0/3] was sent twice, again as PATCH[0/4]
>  Miscellaneous fixes.
> Alan Carew (10):
>   Channel Manager and Monitor for VM Power Management(Host).
>   VM Power Management CLI(Host).
>   CPU Frequency Power Management(Host).
>   VM Power Management application and Makefile.
>   VM Power Management CLI(Guest).
>   VM communication channels for VM Power Management(Guest).
>   librte_power common interface for Guest and Host
>   Packet format for VM Power Management(Host and Guest).
>   Build system integration for VM Power Management(Guest and Host)
>   VM Power Management Unit Tests

Thanks to my shiny updated checkpatch, I was able to fix these 2 typos:

WARNING:MISSING_SPACE: break quoted strings at a space character
#831: FILE: examples/vm_power_manager/channel_manager.c:722:
+               RTE_LOG(ERR, CHANNEL_MANAGER, "Error connecting to %s, connection"
+                               "already established\n", path);

WARNING:MISSING_SPACE: break quoted strings at a space character
#1424: FILE: examples/vm_power_manager/channel_monitor.c:181:
+               RTE_LOG(ERR, CHANNEL_MONITOR, "Unable to rte_malloc for"
+                               "epoll events\n");

This codebase is really too big to be properly reviewed.

As discussed earlier, it's a workaround for a missing feature in Qemu/KVM.
It's now applied in DPDK but it would be really more convenient for everyone if
it could be fixed upstream. I hope you'll be able to sustain this work for the
goodness of every implied communities.

Thank you

More information about the dev mailing list