======KVM hypervisor host======
Instructions to set up the host system (the iron, i.e. the bare metal).
=====Distribution: Debian 6 (Squeeze), 7 (Wheezy) or 8 (Jessie)=====
===preparation===
Install a minimal Debian system with a static IP address.
Configure the locale (for Perl and the apt tooling):
export LANGUAGE=en_US.UTF-8
export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8
locale-gen en_US.UTF-8
dpkg-reconfigure locales
Install the software:
apt-get install qemu-kvm libvirt-bin virtinst virt-top
For multiple network cards:
apt-get install bridge-utils vlan
Install useful tooling:
apt-get install htop iftop iotop dnsutils tcpdump kpartx mc xfsprogs
===network config===
Add to /etc/network/interfaces
auto br0
iface br0 inet static
bridge_ports eth0
bridge_stp off
bridge_maxwait 0
address 10.11.12.10
netmask 255.255.255.0
broadcast 10.11.12.255
gateway 10.11.12.1
dns-nameservers 10.11.12.66 10.110.12.66
dns-search intra.example.com
auto br1
iface br1 inet manual
bridge_ports eth1
bridge_stp off
bridge_maxwait 0
auto br2
iface br2 inet manual
bridge_ports eth2
bridge_stp off
bridge_maxwait 1
## dot1q Trunk
iface eth3.11 inet manual
vlan-raw-device eth3
iface eth3.12 inet manual
vlan-raw-device eth3
auto br3-vlan11
iface br3-vlan11 inet manual
bridge_ports eth3.11
bridge_stp off
bridge_maxwait 0
bridge_fd 9
bridge_hello 2
bridge_maxage 12
up /sbin/ifconfig $IFACE up || /bin/true
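To activate and check the new bridge configuration (interface and bridge names as in the example above):
service networking restart   # or: ifdown br0 && ifup br0
brctl show                   # list the bridges and their member ports
ip addr show br0             # verify the static address on br0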
===automatic suspend/resume===
==Debian 7==
Enable automatic save-to-disk at shutdown and resume at startup in /etc/default/libvirt-guests by setting:
ON_BOOT=start
ON_SHUTDOWN=suspend
==Debian 6==
Configure automatic suspend/resume for the guests:
cd /etc/init.d
cp /usr/share/doc/libvirt-bin/examples/libvirt-suspendonreboot libvirt-suspendonreboot
chmod +x libvirt-suspendonreboot
update-rc.d libvirt-suspendonreboot defaults 29 71
mkdir /var/lib/libvirt/autosuspend
===real text-mode console===
==Debian 7 (Grub 2)==
Edit /etc/default/grub:
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX="nomodeset"
GRUB_TERMINAL=console
GRUB_GFXMODE=800x600
GRUB_GFXPAYLOAD_LINUX=keep
GRUB_INIT_TUNE="480 440 1"
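Afterwards, apply the change and reboot:
update-grub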
===Power===
For APC equipment, ready-to-use software is available in the Debian repository, which is a good reason to choose this brand if you need uninterrupted power. After connecting the equipment (power and USB cabling), install the UPS daemon software:
apt-get install apcupsd apcupsd-doc apcupsd-cgi
Edit the config file /etc/apcupsd/apcupsd.conf:
## apcupsd.conf v1.1 ##
# for apcupsd release 3.14.12 (29 March 2014) - debian
UPSNAME apcups01
UPSCABLE usb
UPSTYPE usb
DEVICE
LOCKFILE /var/lock
SCRIPTDIR /etc/apcupsd
PWRFAILDIR /etc/apcupsd
NOLOGINDIR /etc
ONBATTERYDELAY 6
BATTERYLEVEL 90
MINUTES 10
TIMEOUT 0
ANNOY 300
ANNOYDELAY 60
NOLOGON disable
KILLDELAY 0
NETSERVER off
NISIP 127.0.0.1
NISPORT 3551
EVENTSFILE /var/log/apcupsd.events
EVENTSFILEMAX 10
UPSCLASS standalone
UPSMODE disable
STATTIME 0
STATFILE /var/log/apcupsd.status
LOGSTATS off
DATATIME 0
And set the parameter ISCONFIGURED to yes in /etc/default/apcupsd.
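Then restart the daemon so the new configuration is read, and check the event log:
/etc/init.d/apcupsd restart
tail /var/log/apcupsd.events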
===docs===
* http://wiki.kartbuilding.net/index.php/KVM_Setup_on_Debian_Squeeze
* http://blog.frosty-geek.net/2011/02/ubuntu-tagged-vlan-interfaces-and.html
* http://blog.snijders-it.nl/
* http://wiki.libvirt.org/page/Networking
* http://wiki.andreas-duffner.de/index.php/Virtual_machines_with_kvm
=====Distribution: Ubuntu 12.04 / 14.04 LTS =====
Use the Debian 6 howto.
Below are some Ubuntu-specific actions.
===real text-mode console===
Edit /etc/default/grub:
GRUB_DEFAULT=0
#GRUB_HIDDEN_TIMEOUT=0
#GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
#GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
#GRUB_CMDLINE_LINUX_DEFAULT="text"
#GRUB_CMDLINE_LINUX=""
#GRUB_CMDLINE_LINUX="init=/sbin/init -v noplymouth INIT_VERBOSE=yes"
#GRUB_CMDLINE_LINUX="vga=769"
GRUB_CMDLINE_LINUX="nomodeset"
GRUB_TERMINAL=console
#GRUB_GFXMODE=640x480
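Afterwards, apply the change and reboot:
update-grub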
===locales===
locale-gen nl_NL.UTF-8
dpkg-reconfigure locales
=====Distribution: Ubuntu 18.04 LTS =====
Install the networking software:
apt-get install bridge-utils vlan
===network config===
This uses netplan.
Remove all .yaml files in /etc/netplan/ (or rename them to *.disabled)
==One network, one bridge==
A simple configuration for a simple network. The server has one bridge, with a static IP on the bridge.
Add to /etc/netplan/10-netconfig-bridge-static.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eno1:
      dhcp4: no
  bridges:
    br0:
      interfaces:
        - eno1
      addresses:
        - 192.168.2.203/24
      gateway4: 192.168.2.1
      parameters:
        stp: false
        forward-delay: 0
      nameservers:
        addresses:
          - 194.109.6.66
          - 194.109.9.99
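Apply and check the configuration:
netplan try      # test with automatic rollback if the connection is lost
netplan apply    # apply permanently
networkctl list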
==One bridge per VLAN==
A more complex setup. The system has one physical NIC, connected to a switch port that is in trunk mode and carries four VLANs.
On the server, the four VLANs are split out and a bridge is created for every VLAN. A static IP address is configured on one bridge, to access the server.
Remove all files in /etc/netplan/ and create the file /etc/netplan/10-netconfig-bridge-per-vlan.yaml with the following:
network:
  version: 2
  renderer: networkd
  ethernets:
    eno1:
      dhcp4: no
      dhcp6: no
  bridges:
    br0010:
      interfaces:
        - vlan0010
      parameters:
        stp: false
        forward-delay: 0
      addresses:
        - 192.168.10.42/24
      gateway4: 192.168.10.1
      nameservers:
        addresses:
          - 1.1.1.1
          - 8.8.8.8
    br0011:
      interfaces:
        - vlan0011
      parameters:
        stp: false
        forward-delay: 0
      dhcp4: no
      dhcp6: no
    br0012:
      interfaces:
        - vlan0012
      parameters:
        stp: false
        forward-delay: 0
      dhcp4: no
      dhcp6: no
    br0013:
      interfaces:
        - vlan0013
      parameters:
        stp: false
        forward-delay: 0
      dhcp4: no
      dhcp6: no
  vlans:
    vlan0010:
      accept-ra: no
      id: 10
      link: eno1
    vlan0011:
      accept-ra: no
      id: 11
      link: eno1
    vlan0012:
      accept-ra: no
      id: 12
      link: eno1
    vlan0013:
      accept-ra: no
      id: 13
      link: eno1
And add the following file: /etc/systemd/network/10-netplan-brUp.network
[Match]
Name=br00*
[Network]
LinkLocalAddressing=no
ConfigureWithoutCarrier=true
Explanation: this brings up the anonymous bridges (the bridges that have no IP address configured on them) automatically after boot. Due to a bug in the combination of netplan and networkd, anonymous bridges otherwise keep operational status 'off' after boot.
This can be checked with:
networkctl list
For the netplan example above, this can be solved manually with:
ip link set dev br0011 up
ip link set dev br0012 up
ip link set dev br0013 up
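After applying the configuration, the VLAN-to-bridge mapping can be checked with, for example:
bridge link show   # iproute2 view of which VLAN interface sits in which bridge
brctl show         # the same information from bridge-utils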
===KVM software===
Install the KVM server software:
apt-get install qemu-kvm libvirt-daemon-system virt-top
And the CLI administration tools:
apt-get install libvirt-clients
=====Distribution: CentOS =====
===preparation===
Install a minimal CentOS system with a static IP address.
===network config CentOS-8===
Configuration is done with nmcli; the resulting connection profiles end up under /etc/sysconfig/network-scripts/.
Bonding:
nmcli con add type bond con-name bond0 ifname bond0 autoconnect yes \
ipv4.method disabled \
ipv6.method ignore
nmcli con add type ethernet ifname eno1 con-name bond0-sl1 master bond0
nmcli con add type ethernet ifname eno2 con-name bond0-sl2 master bond0
Split the trunk data stream into VLANs:
nmcli con add type vlan ifname vlan20 con-name vlan20 vlan.id 20 \
vlan.parent bond0 \
ipv4.method disabled \
ipv6.method ignore
# repeat per VLAN
Create a bridge per VLAN:
BR_NAME="br20"
BR_INT="vlan20"
SUBNET_IP="192.168.103.32/24"
GW="192.168.103.1"
DNS1="192.168.102.144"
DNS2="192.168.102.146"
nmcli connection add type bridge con-name ${BR_NAME} ifname ${BR_NAME} autoconnect yes
nmcli connection modify ${BR_NAME} ipv4.method manual ipv4.addresses ${SUBNET_IP}
nmcli connection modify ${BR_NAME} ipv4.gateway ${GW}
nmcli connection modify ${BR_NAME} ipv4.dns ${DNS1} +ipv4.dns ${DNS2}
nmcli connection up ${BR_NAME}
nmcli connection add type bridge-slave con-name ${BR_INT} ifname ${BR_INT} master ${BR_NAME} autoconnect yes
nmcli connection up ifname ${BR_INT}
# add the default route manually if it is not active yet:
ip r add default via 192.168.103.1
===hypervisor kvm===
Install the software:
yum install qemu-kvm virt-manager libvirt
=====Check and performance tuning=====
Do a final check on the host with:
virt-host-validate
QEMU: Checking for hardware virtualization : PASS
QEMU: Checking if device /dev/kvm exists : PASS
QEMU: Checking if device /dev/kvm is accessible : PASS
QEMU: Checking if device /dev/vhost-net exists : PASS
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup 'memory' controller support : PASS
QEMU: Checking for cgroup 'memory' controller mount-point : PASS
QEMU: Checking for cgroup 'cpu' controller support : PASS
QEMU: Checking for cgroup 'cpu' controller mount-point : PASS
QEMU: Checking for cgroup 'cpuacct' controller support : PASS
QEMU: Checking for cgroup 'cpuacct' controller mount-point : PASS
QEMU: Checking for cgroup 'devices' controller support : PASS
QEMU: Checking for cgroup 'devices' controller mount-point : PASS
QEMU: Checking for cgroup 'net_cls' controller support : PASS
QEMU: Checking for cgroup 'net_cls' controller mount-point : PASS
QEMU: Checking for cgroup 'blkio' controller support : PASS
QEMU: Checking for cgroup 'blkio' controller mount-point : PASS
QEMU: Checking for device assignment IOMMU support : PASS
QEMU: Checking if IOMMU is enabled by kernel : PASS
LXC: Checking for Linux >= 2.6.26 : PASS
LXC: Checking for namespace ipc : PASS
LXC: Checking for namespace mnt : PASS
LXC: Checking for namespace pid : PASS
LXC: Checking for namespace uts : PASS
LXC: Checking for namespace net : PASS
LXC: Checking for namespace user : PASS
LXC: Checking for cgroup 'memory' controller support : PASS
LXC: Checking for cgroup 'memory' controller mount-point : PASS
LXC: Checking for cgroup 'cpu' controller support : PASS
LXC: Checking for cgroup 'cpu' controller mount-point : PASS
LXC: Checking for cgroup 'cpuacct' controller support : PASS
LXC: Checking for cgroup 'cpuacct' controller mount-point : PASS
LXC: Checking for cgroup 'devices' controller support : PASS
LXC: Checking for cgroup 'devices' controller mount-point : PASS
LXC: Checking for cgroup 'net_cls' controller support : PASS
LXC: Checking for cgroup 'net_cls' controller mount-point : PASS
LXC: Checking for cgroup 'freezer' controller support : PASS
LXC: Checking for cgroup 'freezer' controller mount-point : PASS
====tuning====
===PCI passthrough===
Verify that your system has IOMMU support (VT-d):
dmesg | grep -e DMAR -e IOMMU
or for AMD-machines:
dmesg | grep AMD-Vi
If the hardware supports it, add one of the following options as a kernel parameter:
intel_iommu=on # Intel only
iommu=pt iommu=1 # AMD only
For example in /etc/default/grub or /etc/sysconfig/grub, in the line:
GRUB_CMDLINE_LINUX_DEFAULT="...."
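Afterwards, regenerate the grub configuration and reboot; the exact command depends on the distribution, for example:
update-grub                              # Debian/Ubuntu
grub2-mkconfig -o /boot/grub2/grub.cfg   # CentOS/RHEL (BIOS boot)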
=====Nested KVM=====
(this feature is meant for testing only, not for production)
A first check:
egrep '(vmx|svm)' /proc/cpuinfo
will print one or more lines when the CPU offers hardware virtualization, i.e. virtual machines can be created on this host.
Check the CPU-architecture of the physical system with:
lscpu
Check the current status of the host/hypervisor (the physical system) with:
cat /sys/module/kvm_intel/parameters/nested
To activate KVM nesting, create or edit /etc/modprobe.d/kvm.conf (or /etc/modprobe.d/qemu-system-x86.conf on Ubuntu) on the host and add:
options kvm_intel nested=1
#options kvm_amd nested=1
Reboot the system for the setting to take effect.
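Alternatively, when no VMs are running, the module can be reloaded without a full reboot (Intel shown; use kvm_amd on AMD):
modprobe -r kvm_intel
modprobe kvm_intel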
Create a VM, and use "Copy host CPU configuration" in the CPU section of the VM definition.
In this VM you can check the kvm nesting feature with:
cat /sys/module/kvm_intel/parameters/nested
Also:
egrep '(vmx|svm)' /proc/cpuinfo
will give one or more lines.
======KVM guest actions======
=====Useful tooling for a VM=====
====ACPI====
Required on Debian, so the guest responds to ACPI signals (e.g. virsh shutdown) from the host:
apt-get install acpid
====disk changes====
Required on Debian:
apt-get install parted
Probe the changed partition table:
partprobe
=====enlarge a vdisk=====
First enlarge the vdisk on the KVM host (with LVM or whatever is in use).
In case the (v)disk is larger than 2 TB, use parted instead of fdisk.
Below is an example of a 3 TB disk that is enlarged to 3.5 TB. The KVM guest is rebooted, and the filesystem on the disk is unmounted.
The trick is to delete the partition (rm) and to recreate it directly afterwards, with the same starting point and a larger number as the end point.
parted -l /dev/vdc
# parted /dev/vdc
(parted) p
Model: Virtio Block Device (virtblk)
Disk /dev/vdc: 3848GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start   End     Size    File system  Name  Flags
 1      1049kB  3299GB  3299GB  xfs
# parted /dev/vdc
(parted) rm 1
# parted /dev/vdc
(parted) mkpart
Partition name? []?
File system type? [ext2]? xfs
Start? 1049kB
End? 3848GB
Instead of a number as an answer to the question "End", it is also possible to use -1. This uses the largest possible value, i.e. the end of the disk.
The action above can also be done in a one-liner:
# parted /dev/vdc
(parted) mkpart primary xfs 1049kB -1
View the result:
# parted /dev/vdc
(parted) p
Model: Virtio Block Device (virtblk)
Disk /dev/vdc: 3848GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number  Start   End     Size    File system  Name  Flags
 1      1049kB  3848GB  3848GB  xfs
(parted) q
The partitioning part is done now.
Resize the filesystem on the partition, for example with resize2fs (ext2/3/4) or xfs_growfs (XFS).
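For the XFS example above, growing the filesystem could look like this (the mount point /data is just an example):
partprobe /dev/vdc      # make the kernel re-read the new partition table
mount /dev/vdc1 /data   # xfs_growfs operates on a mounted filesystem
xfs_growfs /data        # grow XFS to fill the enlarged partition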
======KVM administration workstation======
====Ubuntu/Debian====
For an Ubuntu/Debian-based workstation with a graphical display:
apt-get install ssh virt-manager
Only if you also need to run KVM guests on the workstation itself, install the kvm package as well.
Start virt-manager and add a connection to the KVM host (File > Add Connection).
====MS-windows====
Using Virt-manager via XMing and putty:
http://blog.allanglesit.com/2011/03/linux-kvm-managing-kvm-guests-using-virt-manager-on-windows/
======KVM administration======
Instructions on how to administer the KVM host system (the iron, i.e. the bare metal).
====disk administration of the VPSes====
===creation of diskspace for a VPS===
A wide range of storage formats are available to choose from. A few thoughts:
* for the best performance, use a raw partition or a logical volume on the host as the storage backend for a VPS; this is assigned directly to the guest (see the sketch after this list).
* if you are using raw volumes or partitions, it is best to set the caching option to 'none' (to avoid the cache completely, which reduces data copies and bus traffic).
* if you are using files, and you are not in the need of storage guarantees, for example for testing purposes, you can choose 'writeback caching' which offers maximum speed.
* if your guest supports it, use the 'virtio' interface.
* don't use the Linux filesystem btrfs on the host for image files; it will result in low I/O performance.
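A minimal sketch of the raw logical-volume approach with virtio and cache=none (the volume group vg0, the VM name and the installer ISO path are examples):
lvcreate -L 20G -n vpsfoo01 vg0
virt-install --name vpsfoo01 --memory 2048 --vcpus 2 \
  --disk path=/dev/vg0/vpsfoo01,bus=virtio,cache=none \
  --cdrom /var/lib/libvirt/images/debian-installer.iso \
  --network bridge=br0 --graphics vnc
On older virt-install versions, use --ram instead of --memory.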
===enlarge diskspace inside a VPS===
To enlarge a disk of a VPS, use the following procedure:
On the VM-host:
In case of raw devices, enlarge the logical volume:
lvresize -L +2G /dev/vg01/VPSfoo01
to grow the disk/LUN of VPSfoo01 by 2G.
In case of file devices:
cd /var/lib/libvirt/images
dd if=/dev/null of=VPSfoo.img bs=1 count=0 seek=10G
to grow the disk/LUN of VPSfoo to 10G.
After this, stop and start the VPS. This is needed because the new disk geometry has to be read by the kernel.
In the VPS:
Now start fdisk, and it will detect the larger disk. Add a partition, or enlarge the partition at the end of the disk (you did have this procedure in mind when designing the disk layout, didn't you? :-)
Now run pvresize on this partition; after this, the volume group has free extents. Use these to grow one or more logical volumes (a sketch follows below).
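A sketch of these in-guest LVM steps, assuming the enlarged partition is the physical volume /dev/vda2 and the names vg_guest/lv_data are examples:
pvresize /dev/vda2                          # let LVM see the enlarged partition
lvextend -l +100%FREE /dev/vg_guest/lv_data # give all free extents to one LV
resize2fs /dev/vg_guest/lv_data             # for ext2/3/4; use xfs_growfs for XFS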
===add a lun to a VPS===
To attach a storage LUN to a running VPS, use the following procedure:
On the VM-host:
Create a new raw-device (and optionally make a symlink).
Attach the raw device to the VPS (in this case, the VM name is vpsfoo2 and it is the second virtio disk, aka vdb):
virsh attach-disk vpsfoo2 /var/lib/libvirt/images/vpsfoo2data.img vdb --targetbus virtio --live
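The --live flag makes a runtime-only change; using --persistent instead also updates the stored domain definition, so the disk survives a guest restart. Verify the attachment with:
virsh domblklist vpsfoo2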
======Migration of VMs to another host ======
Instructions on how to migrate VMs to another hypervisor host.
====Offline====
Create a destination KVM hypervisor system, including bridges on the required networks and VLANs. Try to use the same names for bridges, filesystems and logical volumes. Otherwise use "virsh edit" to make the modifications before starting the VM on the destination hypervisor.
===On the source-hypervisor===
Create a definition file:
virsh list --all
virsh dumpxml --security-info vpstest2 > /var/lib/libvirt/images/vpstest2.xml
virsh shutdown vpstest2
virsh destroy vpstest2 # if needed
===On the destination-hypervisor===
Create the required logical volumes and symlinks:
lvcreate -L 4G -n vpstest2 vg0
ln -s /dev/mapper/vg0-vpstest2 /var/lib/libvirt/images/vpstest2.img
And copy the raw logical volume with dd piped through ssh:
ssh root@sourcehyp "dd if=/dev/mapper/vg0-vpstest2" | dd of=/dev/mapper/vg0-vpstest2
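A variant with a larger block size, which is usually much faster over the network (the block size is an example):
ssh root@sourcehyp "dd if=/dev/mapper/vg0-vpstest2 bs=64M" | dd of=/dev/mapper/vg0-vpstest2 bs=64M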
And copy the definition file:
scp root@sourcehyp:/var/lib/libvirt/images/vpstest2.xml /var/lib/libvirt/images/vpstest2.xml
And define the VM:
virsh define /var/lib/libvirt/images/vpstest2.xml
And start the VM:
virsh start vpstest2