[dpdk-dev] [RFC] Virtual Machine Power Management

Carew, Alan alan.carew at intel.com
Thu Sep 11 17:53:52 CEST 2014


Hi folks,

I am currently working on a Power Management example application for a Virtual Machine environment running on qemu/KVM and would appreciate any feedback (I will share code shortly).

The basic idea is to provide librte_power functionality from within a VM, to address the lack (for good reason) of MSR access needed to change frequency from within a VM.
For those unfamiliar, librte_power effects frequency changes via the cpufreq "userspace" governor (with the "acpi-cpufreq" driver), accessed via sysfs.

The VM implementation allows DPDK applications to request frequency changes via the librte_power API. Instead of being applied locally, the requests are forwarded over a message bus to a host monitor daemon, which manages frequency changes for any number of VMs. The daemon itself then uses librte_power to honour the VM requests.

VM: rte_power_freq_max ----> guest_channel_send_msg(pkt) ----> HOST
    HOST: epoll_wait() ----> read(pkt) ----> validate_and_process_request() ----> get_pcpus_mask(vCPU) ----> power_manager_scale_core_max(pCPU_mask);

The architecture requires a number of components to achieve this:

Message Bus:
A means of forwarding frequency change requests to the host. I am using Virtio-Serial, which gives us a secure channel that can be configured in a number of ways. Each lcore in the VM has exclusive access to a channel. Each channel is configured as a serial device on the VM and as an AF_UNIX socket on the host. Both endpoints support poll/select/epoll. More information on Virtio-Serial is here: http://fedoraproject.org/wiki/Features/VirtioSerial
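
For illustration, the host end of one channel could be opened roughly as below. This is only a sketch: it assumes QEMU created the AF_UNIX socket in server mode so that the monitor connects as a client, and the function name channel_connect and any socket path passed to it are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Connect to the host-side AF_UNIX socket that backs one Virtio-Serial
 * channel. Returns a connected fd, or -1 on failure.
 * Assumption: the socket already exists and is listening.
 */
static int channel_connect(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);

    /* sun_path is a fixed-size buffer; reject paths that do not fit. */
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path); /* length checked above */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned descriptor can then be handed to poll/select/epoll like any other socket.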

VM Application:
For each lcore, a channel is opened in non-blocking mode and frequency changes are just packets sent via "write" to the channel. The existing l3fwd-power application can be reused. Each packet has the format: command (Power), resource (core) and amount (min/max/up/down).
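
A minimal sketch of what such a packet and the guest-side send might look like follows. The struct layout, field widths, enum values and the pm_* names are assumptions made for illustration, not the actual format; only the three fields (command, resource, amount) and the non-blocking "write" come from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical wire format: names, values and widths are illustrative. */
enum pm_command { PM_CMD_POWER = 1 };
enum pm_amount { PM_SCALE_MIN = 1, PM_SCALE_MAX, PM_SCALE_UP, PM_SCALE_DOWN };

struct pm_packet {
    uint32_t command;  /* kind of request, e.g. PM_CMD_POWER */
    uint32_t resource; /* lcore (vCPU) the request applies to */
    uint32_t amount;   /* one of enum pm_amount */
};

/* Both ends must agree on the size, so make sure there is no padding. */
static_assert(sizeof(struct pm_packet) == 12, "unexpected struct padding");

/* Switch an already-open channel fd to non-blocking mode. */
static int pm_channel_set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Send one frequency request for the given lcore.
 * Returns 0 on success, -EAGAIN if the channel is currently full,
 * or another negative errno value on failure.
 */
static int pm_channel_send(int fd, uint32_t lcore, enum pm_amount amount)
{
    struct pm_packet pkt = { PM_CMD_POWER, lcore, (uint32_t)amount };
    ssize_t n;

    do {
        n = write(fd, &pkt, sizeof(pkt));
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return -errno;
    return n == (ssize_t)sizeof(pkt) ? 0 : -EIO;
}
```

Because the channel is non-blocking, a caller on the fast path can treat -EAGAIN as "try again later" rather than stalling packet processing.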

Host Monitor:
An epoll-based monitor that manages channel requests: frequency changes (after conversion of vCPU to pCPU), VM shutdown and error events.
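
One pass of such a monitor loop could be sketched as below, under a few assumptions: the packet struct is the same hypothetical three-field layout, a closed or failed channel is taken to mean the VM has gone away, and all monitor_* names are invented for this example. Validation and the vCPU to pCPU conversion are left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical request layout; field names are illustrative. */
struct pm_packet {
    uint32_t command;
    uint32_t resource;
    uint32_t amount;
};
static_assert(sizeof(struct pm_packet) == 12, "unexpected struct padding");

/* A received request together with the channel it arrived on. */
struct pm_request {
    int fd;
    struct pm_packet pkt;
};

/* Register one channel fd with the epoll instance. */
static int monitor_add_channel(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN };

    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/*
 * Wait up to timeout_ms and collect at most max_out complete requests.
 * Channels that report EOF, hangup or an error are removed and closed.
 * Returns the number of requests stored in out, or -1 on failure.
 */
static int monitor_poll_once(int epfd, int timeout_ms,
                             struct pm_request *out, int max_out)
{
    struct epoll_event events[16];
    int max = max_out < 16 ? max_out : 16;
    int count = 0;
    int n, i;

    if (max <= 0)
        return -1;

    n = epoll_wait(epfd, events, max, timeout_ms);
    if (n < 0)
        return errno == EINTR ? 0 : -1;

    for (i = 0; i < n; i++) {
        int fd = events[i].data.fd;

        if (events[i].events & EPOLLIN) {
            struct pm_packet pkt;
            ssize_t len = read(fd, &pkt, sizeof(pkt));

            if (len == (ssize_t)sizeof(pkt)) {
                out[count].fd = fd;
                out[count].pkt = pkt;
                count++;
                continue;
            }
            if (len < 0 && (errno == EAGAIN || errno == EINTR))
                continue;
            /* EOF or short read: treat the channel as dead. */
        }
        /* Hangup or error: stop monitoring this channel. */
        epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
        close(fd);
    }
    return count;
}
```

A daemon would call monitor_poll_once() in a loop and pass each returned request on to its validation and scaling code.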

Management CLI:
For channel management: adding channels to the host monitor, disabling/re-enabling VM requests to allow for manual core frequency management (via the CLI), and inspecting vCPU to physical CPU pinning.

Power Management:
A wrapper around librte_power to enable frequency changes for a mask of cores. Running a virtual CPU on multiple physical CPUs is not ideal, but it is supported. Sharing a physical CPU between multiple VMs is not supported: it can be attempted, but there is no coordination of requests from different VMs.
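
The mask-based wrapper could look something like the sketch below. It assumes a 64-bit pCPU mask and a per-core scaling callback; stub_freq_max is a stand-in used here in place of a real librte_power call, and its return convention (negative on error) is an assumption. The overlap helper only illustrates how the unsupported shared-pCPU case could be detected.

```c
#include <assert.h>
#include <stdint.h>

#define PM_MAX_PCPUS 64
static_assert(PM_MAX_PCPUS == 8 * sizeof(uint64_t),
              "mask must have one bit per pCPU");

/* Per-core scaling operation; assumed to return < 0 on error. */
typedef int (*pm_scale_fn)(unsigned int pcpu);

/* Stand-in for a real per-core call: it only counts invocations. */
static unsigned int stub_scaled[PM_MAX_PCPUS];

static int stub_freq_max(unsigned int pcpu)
{
    stub_scaled[pcpu]++;
    return 1;
}

/*
 * Apply scale_fn to every pCPU whose bit is set in pcpu_mask.
 * Returns the number of cores scaled, or -1 as soon as one call fails.
 */
static int pm_scale_mask(uint64_t pcpu_mask, pm_scale_fn scale_fn)
{
    unsigned int pcpu;
    int scaled = 0;

    for (pcpu = 0; pcpu < PM_MAX_PCPUS; pcpu++) {
        if (!(pcpu_mask & (UINT64_C(1) << pcpu)))
            continue;
        if (scale_fn(pcpu) < 0)
            return -1;
        scaled++;
    }
    return scaled;
}

/* Non-zero if two VMs' pCPU masks share a core (the unsupported case). */
static int pm_masks_overlap(uint64_t a, uint64_t b)
{
    return (a & b) != 0;
}
```

A vCPU pinned to pCPUs 0 and 2 would be scaled with pm_scale_mask(0x5, stub_freq_max), which touches both cores in turn.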

Thanks,
Alan


