A VM virtual disk is an image of the virtual machine's hard drive. Virtual disks are kept on local or network devices called storages.
This article describes the types and functionality of network storages supported by VMmanager.
Supported network storages:
- iSCSI;
- NFS;
- network LVM storage;
- RBD;
- GlusterFS.
RBD and GlusterFS can be considered fault-tolerant network storages.
iSCSI
A remote server is used as the storage. Access to it is established over iSCSI, a block-level protocol. iSCSI describes the transfer of SCSI packets over the TCP/IP protocol stack.
SCSI (Small Computer System Interface) is a set of protocols for working with input/output devices, primarily data storage systems. SCSI has a client-server architecture: clients (initiators) issue SCSI commands to send requests to components (logical units) of servers called targets, or target devices.
VMmanager uses Open-iSCSI, an implementation of the iSCSI protocol for GNU/Linux.
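As an illustration of the initiator side, here is a minimal sketch that discovers and logs in to a target with the Open-iSCSI iscsiadm utility called from Python. The portal address and target IQN are hypothetical placeholders, not values used by VMmanager.

```python
import subprocess

# Hypothetical iSCSI portal and target name, used only for illustration.
PORTAL = "192.0.2.10:3260"          # default iSCSI port is TCP 3260
TARGET = "iqn.2024-01.com.example:storage1"

# Discover targets announced by the portal (sendtargets discovery).
subprocess.run(
    ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", PORTAL],
    check=True,
)

# Log in to the discovered target; the remote block device then appears
# on the initiator as a local disk (e.g. /dev/sdX).
subprocess.run(
    ["iscsiadm", "-m", "node", "-T", TARGET, "-p", PORTAL, "--login"],
    check=True,
)
```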
iSCSI documentation is available on the official website.
iSCSI storage settings are described in our documentation: iSCSI-storage.
NFS
A remote server is used as the storage. Access to it is established over NFS (Network File System), a file-level protocol.
NFS relies on ONC RPC (Open Network Computing Remote Procedure Call), a remote procedure call system based on the RPC (Remote Procedure Call) model.
The remote procedure call model can be described as follows. The control flow passes between the client process and the server process. The client process sends a call message containing the procedure parameters to the server process and then waits for a reply message. On the server side, a process stays idle waiting for calls; when a call arrives, it extracts the procedure parameters, computes the results, sends them back in the reply message, and then waits for the next call. When the reply message is received, the client extracts the procedure results and continues execution.
ONC RPC messages are transferred over TCP or UDP and encoded with XDR (External Data Representation).
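ONC RPC itself is not available in the Python standard library, so the sketch below uses the built-in xmlrpc module purely as a stand-in to illustrate the call-and-return flow described above: the client sends the parameters, the server computes the result, and the reply message carries it back.

```python
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy
import threading

def add(a, b):
    # The "remote procedure": extracts parameters, computes the result.
    return a + b

# Server process: waits for calls, answers them, then waits again.
server = SimpleXMLRPCServer(("127.0.0.1", 8000), logRequests=False)
server.register_function(add)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client process: sends the call message with the parameters and blocks
# until the reply message with the result comes back.
client = ServerProxy("http://127.0.0.1:8000")
print(client.add(2, 3))  # -> 5
server.shutdown()
```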
The current version of NFS is NFSv4.
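For illustration, here is a minimal sketch of how an NFSv4 export could be mounted on a node so that virtual disk images become accessible as ordinary files. The server address, export path, and mount point are hypothetical placeholders.

```python
import subprocess

# Hypothetical NFS server, export, and mount point, used only for illustration.
SERVER_EXPORT = "192.0.2.20:/export/vm-disks"
MOUNT_POINT = "/mnt/nfs-storage"

# Mount the export over NFSv4; disk images on it are then regular files.
subprocess.run(
    ["mount", "-t", "nfs", "-o", "vers=4", SERVER_EXPORT, MOUNT_POINT],
    check=True,
)
```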
NFS documentation is available on the official website.
You can find more information about installing and configuring an NFS storage in our documentation: NFS-storage.
Network LVM storage
A network LVM storage is a variation of standard LVM in which the physical volume that holds the volume group resides on a network device.
Access to the remote server is established over the iSCSI protocol.
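As an illustration of the layering, here is a minimal sketch of the usual LVM commands run on top of an iSCSI-attached block device. The device path, volume group name, and logical volume name are hypothetical.

```python
import subprocess

# Hypothetical block device exported by the storage server over iSCSI.
DEVICE = "/dev/sdb"

def run(*cmd):
    subprocess.run(cmd, check=True)

# Typical LVM layering on top of the network block device:
run("pvcreate", DEVICE)                          # mark the device as a physical volume
run("vgcreate", "vmmanager_vg", DEVICE)          # create a volume group on it
run("lvcreate", "-L", "10G", "-n", "vm_disk_1",  # carve out a logical volume
    "vmmanager_vg")                              # that will back a VM virtual disk
```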
Installation and configuration of a network LVM storage are described in our documentation: Network LVM-storage.
Fault-tolerant network storages
Fault-tolerant storages are distributed file systems that allow you to build a scalable cluster of servers with different roles for data storage, data replication, and load distribution, which ensures high availability and reliability.
A fault-tolerant storage performs the following functions:
- data storage;
- data replication;
- load distribution between cluster nodes when reading and writing data.
If a disk, a node, or a group of nodes fails, the fault-tolerant storage keeps the data available and automatically restores the lost copies on the remaining nodes until the failed nodes or disks are replaced. The cluster stays online without a second of downtime or any issues for the customers.
VMmanager KVM supports Ceph and GlusterFS as fault-tolerant storages.
Ceph
Ceph offers several ways to access data: as a block device, a file system, or an object storage. VMmanager supports RBD, a distributed block device with a kernel client and a QEMU/KVM driver. With RBD, a virtual disk is split into several objects that are stored in the distributed Ceph object store (RADOS).
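As an illustration of how RBD images end up in RADOS, here is a minimal sketch that creates an image through the Ceph Python bindings (python3-rados and python3-rbd), assuming a reachable cluster and the standard configuration file. The pool and image names are hypothetical, and this is not how VMmanager itself provisions disks.

```python
import rados
import rbd

# Connect to the cluster using the standard configuration file.
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
try:
    # "rbd" is a hypothetical pool name used only for illustration.
    ioctx = cluster.open_ioctx("rbd")
    try:
        # Create a 10 GiB RBD image; RADOS stores it as a set of objects
        # spread across the cluster nodes.
        rbd.RBD().create(ioctx, "vm-disk-1", 10 * 1024**3)
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```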
Ceph RBD provides two types of data storage: replication and erasure coding. The number of copies and the amount of space consumed depend on the selected type.
In the case of replication, several replicas of the incoming data are saved on different cluster nodes.
In the case of erasure coding, the incoming data is divided into K parts of equal size, and the system creates M additional coding parts used to restore the data. The M parts have the same size as the K parts. All the parts are distributed among K+M cluster nodes, one part per node. The cluster remains operational without data loss if up to M nodes fail.
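A quick back-of-the-envelope comparison of the two types, with illustrative figures only (3 replicas vs. an erasure-coded profile with K=4, M=2):

```python
def replication_overhead(copies: int) -> float:
    # Raw space consumed per unit of user data with N full copies.
    return float(copies)

def erasure_overhead(k: int, m: int) -> float:
    # K data parts plus M coding parts, each 1/K of the original size.
    return (k + m) / k

# Illustrative figures: 3 replicas vs. erasure coding with K=4, M=2.
print(replication_overhead(3))  # 3.0x raw space, tolerates 2 lost copies
print(erasure_overhead(4, 2))   # 1.5x raw space, tolerates M=2 failed nodes
```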
Functionality and operation of a Ceph cluster are provided by the following Ceph services:
- Monitors (MON);
- Managers (MGR);
- Object Storage Daemon (OSD);
- Metadata Server (MDS).
Ceph RBD doesn't require MDS.
In small clusters, one server can combine two roles, e.g. act as both a storage node and a monitor. In large-scale clusters, it is recommended to run these services on separate machines.
A VMmanager node can also be used as a Ceph cluster node. However, we do not recommend it: it can lead to high server load, and if such a server goes out of service, both the storage and the VMmanager node will have to be restored.
An RBD storage supports only the RAW format of virtual disks.
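If an existing disk image is in another format, it can be converted to RAW before being placed into an RBD storage, for example with qemu-img. The sketch below assumes qemu-img is installed and uses hypothetical file names.

```python
import subprocess

# Convert a hypothetical Qcow2 image to RAW with qemu-img.
subprocess.run(
    ["qemu-img", "convert", "-f", "qcow2", "-O", "raw",
     "vm-disk.qcow2", "vm-disk.raw"],
    check=True,
)
```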
You can find Ceph documentation on the official website.
Read more in the article Ceph RBD.
GlusterFS
With GlusterFS, a virtual disk is split into several objects that are stored in the distributed storage. The distributed storage consists of volumes that are physically located on different servers. There are several ways to write data to the volumes (a sketch follows the list):
- Distributed — data is distributed evenly across all servers without duplication. Such volumes are not protected by GlusterFS: if one of the servers or its disks fails, its data becomes unavailable;
- Replicated — data is written to at least two servers;
- Striped — the incoming data is divided into parts that are written to different servers. When a request comes, the data is read back from these servers. If one of the servers or its disks goes out of service, the volume will be unavailable until the failed node is restored;
- Distributed Striped — data is distributed evenly between subvolumes. Inside each subvolume, the data is divided into parts and written to different servers in parallel;
- Distributed Replicated — data is distributed evenly between subvolumes. Inside each subvolume, the data is written to at least two servers. This option is the most reliable.
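As an illustration of the volume types above, here is a minimal sketch that creates and starts a Replicated volume with the gluster CLI called from Python. The server and brick names are hypothetical; VMmanager's own setup steps are described in the documentation linked below.

```python
import subprocess

# Hypothetical bricks on three servers, used only for illustration.
BRICKS = [
    "server1:/data/brick1",
    "server2:/data/brick1",
    "server3:/data/brick1",
]

def run(*cmd):
    subprocess.run(cmd, check=True)

# Create a Replicated volume: every file is written to all three bricks,
# so the volume survives the loss of a single server.
run("gluster", "volume", "create", "vm_volume", "replica", "3", *BRICKS)
run("gluster", "volume", "start", "vm_volume")
```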
GlusterFS does not need a centralized metadata server.
GlusterFS supports both the RAW and Qcow2 virtual disk formats.
You can find more information about GlusterFS settings in the article GlusterFS-storage.
Find documentation on GlusterFS on the official website.