13 April 2020

VMmanager 6 — modern virtualization platform

Konstantin Slastnoy

Software developer

ISPSystem
Since last year's meeting in Moscow, VMmanager has gone from open beta testing to commercial release. Our developer Konstantin Slastnoy spoke in detail about the first user requests, the product's advantages and further development plans.

Advantages of VMmanager 6

The design and implementation of the sixth generation of VMmanager, and the transition to it, were based solely on pragmatic considerations. These considerations became the foundation for the following advantages:

Common interface to manage all clusters

Separate installation of VMmanager on each cluster is no longer required. It is now possible to control all sites from a single interface.

Fault-tolerant infrastructure

The system components are as independent of each other as possible. For example, if the monitoring service fails, the virtual machine provisioning will not stop and other business-critical services will continue to work.

A convenient REST API

We have moved from XML to JSON, which is a more common data format. All modern development tools support it by default and provide convenient means for client-server interaction.

Stability at high loads

Thanks to the redesigned architecture, all operations in VMmanager now run in parallel and do not block the interface.

Fast delivery of features

The new architecture is service-oriented, so we can respond to your business requests faster, and you can bring new services to market for your customers sooner.

How does the 6th platform improve the client experience?

Fast server provisioning

Provisioning speed is the first thing that clients evaluate. They used to wait tens of minutes, but now they get a ready-to-go virtual machine much faster:

  1. a Linux-based VM – 1 minute;
  2. a Windows-based VM – 5 minutes.

Convenient user interface

We use a modern approach to UX design, so users feel comfortable working with VMmanager 6: it is easy to find the right section, and operations take less time.

How does the 6th platform simplify system maintenance?

Prediction of node capacity

There is no need to manually calculate how many more virtual machines can be hosted on a node. We have added a forecast tool that checks how many VMs the node already hosts and what parameters they have, and reports how many similar machines can still be created on it.
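
One way such a forecast could be computed is sketched below. This is only an illustration, not VMmanager's actual algorithm: the `Resources` type, the `forecast_similar_vms` function, the use of the average existing VM as the "similar machine" and the absence of overcommit are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource set: CPU cores, RAM in MiB, disk in GiB."""
    cpu: int
    ram_mib: int
    disk_gib: int


def forecast_similar_vms(node: Resources, vms: list[Resources]) -> int:
    """Estimate how many more "typical" VMs fit on a node.

    Assumptions: the typical VM is the per-resource average of the
    existing VMs, and resources are not overcommitted.
    """
    if not vms:
        return 0  # nothing to extrapolate from

    estimates = []
    for name in ("cpu", "ram_mib", "disk_gib"):
        used = sum(getattr(vm, name) for vm in vms)
        typical = used / len(vms)            # average consumption per VM
        if typical <= 0:
            continue                         # resource is not a constraint
        free = max(getattr(node, name) - used, 0)
        estimates.append(int(free // typical))

    # The scarcest resource limits how many VMs can still be created.
    return min(estimates, default=0)
```

For example, a node with 32 cores, 64 GiB of RAM and 1000 GiB of disk that hosts one 2-core/4 GiB/50 GiB VM and one 4-core/8 GiB/100 GiB VM is limited by CPU and RAM in this model, not by disk.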

Gathering information on client tasks

Manual log analysis is no longer required. Information about operations on client VMs, errors and task execution parameters is displayed in the platform interface.

Monitoring, statistics and analysis

The VMmanager 6 dashboard shows your entire infrastructure on one page. The system presents the information in four widgets.

1. Cluster widget

This widget provides summary information about the cluster: storage type, number of nodes and virtual machines, total RAM and disk space. The forecast tool is also located here and shows how many more virtual machines can be created.

2. Nodes widget

All nodes connected to the cluster are displayed here. You can see changes in the critical indicators: CPU load, RAM, disk and network load. The widget provides a chart with the indicators in real time and their trends over the last day. Each indicator can be sorted and viewed separately.

Detailed statistics are available for each node. If you click on a node, you will see the task status and a list of the virtual machines with the highest load, sorted by disk, CPU or RAM. You can select any machine and set limits for it, for example, a network limit.

3. Self-diagnostics widget

Here you can see the details of the server where VMmanager is installed: CPU, RAM, traffic resources and license status.

4. Tasks widget

This widget shows the list and status of tasks: which have completed successfully and which require the administrator's attention.

To help administrators investigate each incident, we have added integration with Grafana, the most popular data visualization tool.
We understand that each hosting provider monitors its own metrics. Therefore, we collect data from nodes and virtual machines and store it in a Graphite database, to which Grafana is connected. This way, providers can build and analyze the metrics they need on an unlimited number of charts.
You can also set up error notifications in Grafana for selected indicators. However, this requires your own notification service (for example, a Telegram bot), which you have to connect to Grafana and configure, and that is not very convenient. That is why we plan to provide a handy notification setup tool that works "out of the box".
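
To illustrate what storing a data point in Graphite involves, here is a minimal sketch that uses Graphite's plaintext protocol (one `path value timestamp` line per data point, by default on TCP port 2003). The metric path `vmmanager.node1.cpu_load` and the helper names are made up for the example; the text above does not say how VMmanager itself names or ships its metrics.

```python
import socket
import time
from typing import Optional


def graphite_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one data point in Graphite's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else timestamp
    return f"{path} {value} {ts}\n"


def send_metrics(lines: list[str], host: str, port: int = 2003) -> None:
    """Send pre-formatted lines to a Carbon plaintext listener over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


# Hypothetical metric name, for illustration only.
line = graphite_line("vmmanager.node1.cpu_load", 0.42, timestamp=1586736000)
```

Once such points are stored, any Grafana dashboard connected to the same Graphite instance can chart the series by its path.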

Improvements: addressing requests from the first users

After the release of the commercial version, we conducted a survey and implemented the features that users needed in order to migrate their equipment to VMmanager 6:
  1. Two network operation schemes to support Hetzner and OVH data centers.
  2. SystemRescueCD for virtual machine recovery. We took the most popular rescue image and added the ability to connect it to a VM in one click.
  3. Support for multiple network interfaces on nodes and virtual machines. This feature allows you to manage private networks: add IP addresses to virtual machines and merge them into private networks (UPD: version 4.1.3, 12.03.2020).
  4. Issuing IPv6 addresses by subnets, because network equipment cannot cope with issuing individual IPv6 addresses.
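
Issuing IPv6 by subnets, as in item 4, can be pictured with Python's standard `ipaddress` module. The sketch below carves a pool into equal subnets, one per virtual machine. The pool `2001:db8::/48` (a documentation prefix), the /64 subnet size and the function name are assumptions chosen for the example, not VMmanager settings.

```python
import ipaddress
from itertools import islice


def allocate_subnets(pool: str, new_prefix: int, count: int) -> list[str]:
    """Return the first `count` subnets of size /new_prefix from the pool."""
    network = ipaddress.ip_network(pool)
    # subnets() yields the subnets lazily, in address order.
    return [str(net) for net in islice(network.subnets(new_prefix=new_prefix), count)]


# Three /64 subnets for three hypothetical virtual machines.
subnets = allocate_subnets("2001:db8::/48", 64, 3)
```

With this approach the network equipment only needs one route per subnet rather than an entry per individual address.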

VMmanager development plans

VMmanager now works with tens of thousands of virtual machines and thousands of nodes for 200+ clients from different countries. We expect the product to keep growing at a good rate.
Figure: Predicted growth of VMmanager

Planned features

A powerful error notification system for business

The system will allow the administrator to conveniently configure notifications for any indicator of a node, virtual machine or task, delivered via e-mail, Slack, Telegram or Mattermost. We are also planning to add automatic actions in response to events. For example, when a node runs out of space, a script that you have assigned to this event will run automatically to clean it up.
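
The idea of a notification with an attached automatic action can be sketched as a simple threshold rule. Since this feature is only planned, everything below is hypothetical: the `Rule` structure, the metric name `node.disk_used_percent` and the callbacks merely illustrate the concept and do not describe a future VMmanager interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    """Hypothetical rule: notify, and optionally act, when a metric is too high."""
    metric: str
    threshold: float
    notify: Callable[[str], None]                 # e.g. post to a chat or send e-mail
    action: Optional[Callable[[], None]] = None   # e.g. run a cleanup script


def evaluate(rules: list[Rule], metrics: dict[str, float]) -> list[str]:
    """Check current metric values against the rules; return the metrics that fired."""
    fired = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None or value < rule.threshold:
            continue
        rule.notify(f"{rule.metric} = {value} (threshold {rule.threshold})")
        if rule.action is not None:
            rule.action()  # automatic response to the event
        fired.append(rule.metric)
    return fired
```

In use, `notify` would wrap whichever channel the administrator chose, and `action` would launch the script assigned to the event.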

Easy-to-use error notification system for clients

We are planning to create a simpler system for end users, which will let them configure notifications for different VM indicators: CPU, RAM, HDD. Users will be informed promptly when their resource limits are exceeded. We believe that such functionality is in keeping with the times and is attractive to clients.

SDN and network storages

At the outset, software-defined networking will be a tool for administrators; in the future, it will be possible to offer the "virtual network" as a service for clients.

Advanced user

Modern clients do not think in terms of network equipment, physical servers, routing or data center architecture. Therefore, we plan to provide even more abstraction in our platform. An advanced user is a user who can create virtual machines, merge them into networks, create images and add IP addresses within the limits purchased in the billing system: RAM, CPU, storage, images. Everything is configured automatically, "under the hood", so the client can manage their virtual infrastructure with maximum ease and without restrictions. Eventually, this will become a role in a full "Self-service portal" within the product.

More platform customization options

This includes working with your own operating systems and automatically running scripts on nodes when they are added to the cluster, for example to configure the firewall or NAT. It also includes the ability to use GPUs or solve other specific tasks by customizing the libvirt domain.
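
As an illustration of what customizing a libvirt domain for a GPU can mean, the sketch below adds a PCI passthrough `<hostdev>` element to a domain XML definition, following the libvirt domain XML format. How VMmanager will expose such customization is not described here; the function name, the sample domain and the PCI address are assumptions.

```python
import xml.etree.ElementTree as ET


def add_pci_hostdev(domain_xml: str, bus: int, slot: int,
                    function: int, pci_domain: int = 0) -> str:
    """Return domain XML with a managed PCI passthrough device added."""
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    if devices is None:
        devices = ET.SubElement(root, "devices")

    # <hostdev mode='subsystem' type='pci' managed='yes'> hands a host
    # PCI device (for example, a GPU) over to the guest.
    hostdev = ET.SubElement(devices, "hostdev",
                            mode="subsystem", type="pci", managed="yes")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(source, "address",
                  domain=f"0x{pci_domain:04x}", bus=f"0x{bus:02x}",
                  slot=f"0x{slot:02x}", function=f"0x{function:x}")
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal domain and PCI address 0000:01:00.0.
sample = "<domain type='kvm'><name>demo</name><devices/></domain>"
customized = add_pci_hostdev(sample, bus=1, slot=0, function=0)
```

The host also has to be prepared for passthrough (IOMMU enabled, the device not in use by the host), which is outside the scope of this sketch.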

Fault tolerance at the virtual machine level

The first mechanism, disaster recovery, is for manual VM recovery: when a data center becomes unavailable, for example because of severe weather, you need to manually move the clients' virtual machines to another DC.

The second mechanism, high availability, is for automatic VM recovery. The system automatically detects that a node has left the quorum and moves its virtual machines to another running node.
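
The high availability logic can be pictured roughly as follows. This is a simplified sketch under stated assumptions, not the planned implementation: quorum is taken to be a strict majority of nodes, VMs from failed nodes go to the reachable node that currently hosts the fewest VMs, and all names are hypothetical.

```python
def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """Assumed rule: quorum means a strict majority of nodes is reachable."""
    return reachable_nodes > total_nodes // 2


def plan_recovery(placement: dict[str, list[str]], reachable: set[str]) -> dict[str, str]:
    """Map each VM on an unreachable node to a reachable target node.

    `placement` maps node name -> names of the VMs it hosts.
    """
    alive = [node for node in placement if node in reachable]
    if not has_quorum(len(placement), len(alive)):
        return {}  # no majority: do nothing rather than risk split-brain

    load = {node: len(placement[node]) for node in alive}
    plan = {}
    for node in sorted(placement):
        if node in reachable:
            continue
        for vm in placement[node]:
            # Pick the least loaded reachable node; break ties by name.
            target = min(load, key=lambda n: (load[n], n))
            plan[vm] = target
            load[target] += 1
    return plan
```

A real mechanism also has to fence the failed node before restarting its VMs elsewhere, so that the same VM never runs twice.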

Migration from VMmanager 5 to version 6

We have developed a semi-automatic migration tool that our team uses to move a client's infrastructure to the new platform. The tool checks the current infrastructure for compatibility with VMmanager 6. Our team migrates the compatible nodes; for incompatible nodes, the tool indicates which functions are not yet available in platform 6. We pass this information on to the client and discuss how critical those features are.

Q&A (from the audience)

Question 1. How does monitoring work? Through sensors? Can I measure Load Average at the hypervisor level?

All monitoring data is collected from the node by means of libvirt; no sensors are installed or used inside the VMs. Load Average can be measured, but at the node level, not at the hypervisor level.

Question 2. Is it possible to create an unlimited number of private virtual networks with the same IP, without interference of the addresses between clients?

So far, software-defined networks are only in our plans, but we will soon release the ability to assign private and public IP addresses to a virtual machine simultaneously.

Question 3. Clarification regarding migration between 5th and 6th generation: do I need to power off the virtual machines?

No, we do not power the VMs off. The migration is "live", invisible to the client.

Question 4. I would like to know the difference between the fifth and sixth generations when working with mass operations. How many cluster nodes can I connect to VMmanager 6? In my experience, only 3 nodes could be connected in the fifth generation.

In platform 6, a separate service is responsible for long-running operations, and we use it to determine which operations can run simultaneously and which cannot. The platform interface remains responsive and does not freeze. As for the number of nodes that can be connected: during migrations from platform 5 to 6, we saw an infrastructure on VMmanager 5 with 17 nodes, and its interface was very slow. In VMmanager 6 with the same number of nodes, there is no interface slowdown.
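
The principle of deciding which operations may run simultaneously can be illustrated with a small `asyncio` sketch. It assumes one possible rule, namely that operations on the same resource are serialized while operations on different resources run in parallel. The rule, the task tuples and the function name are illustrative assumptions, not a description of the actual service.

```python
import asyncio
from collections import defaultdict


async def run_tasks(tasks: list[tuple[str, str, float]]) -> list[str]:
    """Run (resource, name, duration) tasks and return names in completion order.

    Tasks on the same resource run one at a time; tasks on different
    resources run concurrently.
    """
    locks: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)
    finished: list[str] = []

    async def worker(resource: str, name: str, duration: float) -> None:
        async with locks[resource]:          # one operation per resource at a time
            await asyncio.sleep(duration)    # stands in for the long operation
            finished.append(name)

    await asyncio.gather(*(worker(*task) for task in tasks))
    return finished


order = asyncio.run(run_tasks([
    ("vm1", "resize", 0.05),
    ("vm1", "reboot", 0.01),     # waits for "resize" on the same VM
    ("vm2", "snapshot", 0.02),   # runs in parallel with the vm1 operations
]))
```

Because the waiting happens inside the service, whatever submits the tasks returns immediately, which is what keeps an interface responsive.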

Question 5. Are you planning to add a convenient tool to update the software (e.g. OS) on the cluster in order to avoid having to update the cluster manually before migrating from VMmanager version 5 to 6?

There is no separate tool, but you can easily update the cluster through scripts in the platform.

Question 6. When migrating from the 5th to the 6th generation, you currently cannot migrate virtual machines with multiple drives. Are you planning to add this feature?

We are collecting use cases and reviewing our internal statistics and requests. The most popular cases will be the first to be implemented.