======KVM hypervisor host======

Instructions to set up the host system (the iron, or the frame).

=====Distribution: Debian 6 (Squeeze), 7 (Wheezy) or 8 (Jessie)=====

===preparation===

Install a minimal Debian system with a static IP address.

Configure the locale (for perl and the apt-get tooling):

  export LANGUAGE=en_US.UTF-8
  export LANG=en_US.UTF-8
  export LC_ALL=en_US.UTF-8
  locale-gen en_US.UTF-8
  dpkg-reconfigure locales

Install the software:

  apt-get install qemu-kvm libvirt-bin virtinst virt-top

For multiple network cards:

  apt-get install bridge-utils vlan

Install useful tooling:

  apt-get install htop iftop iotop dnsutils tcpdump kpartx mc xfsprogs

===network config===

Add to /etc/network/interfaces:

  auto br0
  iface br0 inet static
          bridge_ports eth0
          bridge_stp off
          bridge_maxwait 0
          address 10.11.12.10
          netmask 255.255.255.0
          broadcast 10.11.12.255
          gateway 10.11.12.1
          dns-nameservers 10.11.12.66 10.110.12.66
          dns-search intra.example.com

  auto br1
  iface br1 inet manual
          bridge_ports eth1
          bridge_stp off
          bridge_maxwait 0

  auto br2
  iface br2 inet manual
          bridge_ports eth2
          bridge_stp off
          bridge_maxwait 1

  ## dot1q Trunk
  iface eth3.11 inet manual
          vlan-raw-device eth3

  iface eth3.12 inet manual
          vlan-raw-device eth3

  auto br3-vlan11
  iface br3-vlan11 inet manual
          bridge_ports eth3.11
          bridge_stp off
          bridge_maxwait 0
          bridge_fd 9
          bridge_hello 2
          bridge_maxage 12
          up /sbin/ifconfig $IFACE up || /bin/true

===automatic suspend/resume===

==Debian 7==

Enable automatic save-to-disk at shutdown and resume at startup in /etc/default/libvirt-guests by setting:

  ON_BOOT=start
  ON_SHUTDOWN=suspend

==Debian 6==

Configure automatic suspend/resume for the guests:

  cd /etc/init.d
  cp /usr/share/doc/libvirt-bin/examples/libvirt-suspendonreboot virsh-suspend
  update-rc.d libvirt-suspendonreboot defaults 29 71
  mkdir /var/lib/libvirt/autosuspend

===real text-mode console===

==Debian 7 (Grub 2)==

/etc/default/grub:

  GRUB_DEFAULT=0
  GRUB_TIMEOUT=5
  GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
  GRUB_CMDLINE_LINUX_DEFAULT=""
  GRUB_CMDLINE_LINUX="nomodeset"
  GRUB_TERMINAL=console
  GRUB_GFXMODE=800x600
  GRUB_GFXPAYLOAD_LINUX=keep
  GRUB_INIT_TUNE="480 440 1"

===Power===

For APC equipment, there is ready-made software in the Debian repository; a good reason to buy this brand if you need uninterrupted power. After connecting the equipment (power and USB cabling), install the UPS daemon software:

  apt-get install apcupsd apcupsd-doc apcupsd-cgi

Edit the config file /etc/apcupsd/apcupsd.conf:

  ## apcupsd.conf v1.1 ##
  # for apcupsd release 3.14.12 (29 March 2014) - debian
  UPSNAME apcups01
  UPSCABLE usb
  UPSTYPE usb
  DEVICE
  LOCKFILE /var/lock
  SCRIPTDIR /etc/apcupsd
  PWRFAILDIR /etc/apcupsd
  NOLOGINDIR /etc
  ONBATTERYDELAY 6
  BATTERYLEVEL 90
  MINUTES 10
  TIMEOUT 0
  ANNOY 300
  ANNOYDELAY 60
  NOLOGON disable
  KILLDELAY 0
  NETSERVER off
  NISIP 127.0.0.1
  NISPORT 3551
  EVENTSFILE /var/log/apcupsd.events
  EVENTSFILEMAX 10
  UPSCLASS standalone
  UPSMODE disable
  STATTIME 0
  STATFILE /var/log/apcupsd.status
  LOGSTATS off
  DATATIME 0

And set the parameter ISCONFIGURED in /etc/default/apcupsd to yes.

===docs===

  * http://wiki.kartbuilding.net/index.php/KVM_Setup_on_Debian_Squeeze
  * http://blog.frosty-geek.net/2011/02/ubuntu-tagged-vlan-interfaces-and.html
  * http://blog.snijders-it.nl/
  * http://wiki.libvirt.org/page/Networking
  * http://wiki.andreas-duffner.de/index.php/Virtual_machines_with_kvm

=====Distribution: Ubuntu 12.04 / 14.04 LTS=====

Use the Debian 6 howto. Below are some Ubuntu-specific actions.
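As an extra Ubuntu-specific check (not part of the original Debian howto), the cpu-checker package provides the kvm-ok tool, which reports whether the CPU and BIOS settings allow KVM acceleration:

  apt-get install cpu-checker
  kvm-ok

On a capable machine, kvm-ok reports that /dev/kvm exists and that KVM acceleration can be used.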
===real text-mode console===

/etc/default/grub:

  GRUB_DEFAULT=0
  #GRUB_HIDDEN_TIMEOUT=0
  #GRUB_HIDDEN_TIMEOUT_QUIET=true
  GRUB_TIMEOUT=5
  GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
  #GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
  #GRUB_CMDLINE_LINUX_DEFAULT="text"
  #GRUB_CMDLINE_LINUX=""
  #GRUB_CMDLINE_LINUX="init=/sbin/init -v noplymouth INIT_VERBOSE=yes"
  #GRUB_CMDLINE_LINUX="vga=769"
  GRUB_CMDLINE_LINUX="nomodeset"
  GRUB_TERMINAL=console
  #GRUB_GFXMODE=640x480

===locales===

  locale-gen nl_NL.UTF-8
  dpkg-reconfigure locales

=====Distribution: Ubuntu 18.04 LTS=====

Install the networking software:

  apt-get install bridge-utils vlan

===network config===

Ubuntu 18.04 uses netplan. Remove all .yaml files in /etc/netplan/ (or rename them to *.disabled).

==One network, one bridge==

A simple configuration for a simple network: the server has one bridge, with a static IP on the bridge.

Add to /etc/netplan/10-netconfig-bridge-static.yaml:

  network:
    version: 2
    renderer: networkd
    ethernets:
      eno1:
        dhcp4: no
    bridges:
      br0:
        interfaces:
          - eno1
        addresses:
          - 192.168.2.203/24
        gateway4: 192.168.2.1
        parameters:
          stp: false
          forward-delay: 0
        nameservers:
          addresses:
            - 194.109.6.66
            - 194.109.9.99

==One bridge per VLAN==

A more complex setup. The system has one physical NIC, connected to a switch. The switch port is in trunk mode and carries four VLANs. On the server, the four VLANs are split out and a bridge is created for every VLAN. A static IP number is configured on one bridge, to access the server.

Remove all files in /etc/netplan/ and create the file /etc/netplan/10-netconfig-bridge-per-vlan.yaml with the following:

  network:
    version: 2
    renderer: networkd
    ethernets:
      eno1:
        dhcp4: no
        dhcp6: no
    bridges:
      br0010:
        interfaces:
          - vlan0010
        parameters:
          stp: false
          forward-delay: 0
        addresses:
          - 192.168.10.42/24
        gateway4: 192.168.10.1
        nameservers:
          addresses:
            - 1.1.1.1
            - 8.8.8.8
      br0011:
        interfaces:
          - vlan0011
        parameters:
          stp: false
          forward-delay: 0
        dhcp4: no
        dhcp6: no
      br0012:
        interfaces:
          - vlan0012
        parameters:
          stp: false
          forward-delay: 0
        dhcp4: no
        dhcp6: no
      br0013:
        interfaces:
          - vlan0013
        parameters:
          stp: false
          forward-delay: 0
        dhcp4: no
        dhcp6: no
    vlans:
      vlan0010:
        accept-ra: no
        id: 10
        link: eno1
      vlan0011:
        accept-ra: no
        id: 11
        link: eno1
      vlan0012:
        accept-ra: no
        id: 12
        link: eno1
      vlan0013:
        accept-ra: no
        id: 13
        link: eno1

And add the following file /etc/systemd/network/10-netplan-brUp.network:

  [Match]
  Name=br00*

  [Network]
  LinkLocalAddressing=no
  ConfigureWithoutCarrier=true

Explanation: this brings up the anonymous bridges (the bridges which have no IP address configured on them) automatically after boot. Due to a bug in the combination of netplan and networkd, anonymous bridges have operational status 'off' after boot. This can be checked with:

  networkctl list

For the netplan example above, it can also be fixed manually with:

  ip link set dev br0011 up
  ip link set dev br0012 up
  ip link set dev br0013 up

===KVM software===

Install the KVM server software:

  apt-get install qemu-kvm libvirt-daemon-system virt-top

And the cli administration tools:

  apt-get install libvirt-clients

=====Distribution: CentOS=====

===preparation===

Install a minimal CentOS system with a static IP address.
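A minimal sketch of setting that static IP with nmcli, assuming the NIC and its connection are both named eno1 (the addresses are examples; adjust to your environment):

  # example addresses only - adjust to your network
  nmcli con mod eno1 ipv4.method manual \
      ipv4.addresses 192.168.2.210/24 \
      ipv4.gateway 192.168.2.1 \
      ipv4.dns "194.109.6.66 194.109.9.99"
  nmcli con up eno1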
===network config CentOS-8===

The network is configured with nmcli (NetworkManager stores the resulting ifcfg files under /etc/sysconfig/network-scripts/).

Bonding:

  nmcli con add type bond con-name bond0 ifname bond0 autoconnect yes \
    ipv4.method disabled \
    ipv6.method ignore
  nmcli con add type ethernet ifname eno1 con-name bond0-sl1 master bond0
  nmcli con add type ethernet ifname eno2 con-name bond0-sl2 master bond0

Split the trunk data stream into VLANs:

  nmcli con add type vlan ifname vlan20 con-name vlan20 vlan.id 20 \
    vlan.parent bond0 \
    ipv4.method disabled \
    ipv6.method ignore
  # repeat per VLAN

Create a bridge per VLAN:

  BR_NAME="br20"
  BR_INT="vlan20"
  SUBNET_IP="192.168.103.32/24"
  GW="192.168.103.1"
  DNS1="192.168.102.144"
  DNS2="192.168.102.146"

  nmcli connection add type bridge con-name ${BR_NAME} ifname ${BR_NAME} autoconnect yes
  nmcli connection modify ${BR_NAME} ipv4.method manual ipv4.addresses ${SUBNET_IP}
  nmcli connection modify ${BR_NAME} ipv4.gateway ${GW}
  nmcli connection modify ${BR_NAME} ipv4.dns ${DNS1} +ipv4.dns ${DNS2}
  nmcli connection up ${BR_NAME}
  nmcli connection add type bridge-slave con-name ${BR_INT} ifname ${BR_INT} master ${BR_NAME} autoconnect yes
  nmcli connection up ifname ${BR_INT}

  # ip r add default via 192.168.103.1 #

===hypervisor kvm===

Install the software:

  yum install kvm virt-manager libvirt

=====Check and performance tuning=====

Do a final check on the host with:

  virt-host-validate
    QEMU: Checking for hardware virtualization                 : PASS
    QEMU: Checking if device /dev/kvm exists                   : PASS
    QEMU: Checking if device /dev/kvm is accessible            : PASS
    QEMU: Checking if device /dev/vhost-net exists             : PASS
    QEMU: Checking if device /dev/net/tun exists               : PASS
    QEMU: Checking for cgroup 'memory' controller support      : PASS
    QEMU: Checking for cgroup 'memory' controller mount-point  : PASS
    QEMU: Checking for cgroup 'cpu' controller support         : PASS
    QEMU: Checking for cgroup 'cpu' controller mount-point     : PASS
    QEMU: Checking for cgroup 'cpuacct' controller support     : PASS
    QEMU: Checking for cgroup 'cpuacct' controller mount-point : PASS
    QEMU: Checking for cgroup 'devices' controller support     : PASS
    QEMU: Checking for cgroup 'devices' controller mount-point : PASS
    QEMU: Checking for cgroup 'net_cls' controller support     : PASS
    QEMU: Checking for cgroup 'net_cls' controller mount-point : PASS
    QEMU: Checking for cgroup 'blkio' controller support       : PASS
    QEMU: Checking for cgroup 'blkio' controller mount-point   : PASS
    QEMU: Checking for device assignment IOMMU support         : PASS
    QEMU: Checking if IOMMU is enabled by kernel               : PASS
     LXC: Checking for Linux >= 2.6.26                         : PASS
     LXC: Checking for namespace ipc                           : PASS
     LXC: Checking for namespace mnt                           : PASS
     LXC: Checking for namespace pid                           : PASS
     LXC: Checking for namespace uts                           : PASS
     LXC: Checking for namespace net                           : PASS
     LXC: Checking for namespace user                          : PASS
     LXC: Checking for cgroup 'memory' controller support      : PASS
     LXC: Checking for cgroup 'memory' controller mount-point  : PASS
     LXC: Checking for cgroup 'cpu' controller support         : PASS
     LXC: Checking for cgroup 'cpu' controller mount-point     : PASS
     LXC: Checking for cgroup 'cpuacct' controller support     : PASS
     LXC: Checking for cgroup 'cpuacct' controller mount-point : PASS
     LXC: Checking for cgroup 'devices' controller support     : PASS
     LXC: Checking for cgroup 'devices' controller mount-point : PASS
     LXC: Checking for cgroup 'net_cls' controller support     : PASS
     LXC: Checking for cgroup 'net_cls' controller mount-point : PASS
     LXC: Checking for cgroup 'freezer' controller support     : PASS
     LXC: Checking for cgroup 'freezer' controller mount-point : PASS
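As an additional sanity check (not part of the original checklist), confirm that the KVM kernel modules are loaded and that the /dev/kvm device node is present:

  lsmod | grep kvm     # expect kvm plus kvm_intel or kvm_amd
  ls -l /dev/kvm       # the device node used by QEMU/libvirt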
====tuning====

===PCI passthrough===

Verify that your system has IOMMU support (VT-d):

  dmesg | grep -e DMAR -e IOMMU

or for AMD machines:

  dmesg | grep AMD-Vi

If the hardware supports it, pass one of the following options as a kernel parameter:

  intel_iommu=on    # Intel only
  iommu=pt iommu=1  # AMD only

for example in /etc/default/grub or /etc/sysconfig/grub, in the line GRUB_CMDLINE_LINUX_DEFAULT="....".

=====Nested KVM=====

(this feature is only for testing purposes, not for production)

A first check:

  egrep '(vmx|svm)' /proc/cpuinfo

will give one or more lines when virtual machines can be created on this host.

Check the CPU architecture of the physical system with:

  lscpu

Check the current status of the host/hypervisor (the physical system) with:

  cat /sys/module/kvm_intel/parameters/nested

To activate KVM nesting, create or edit /etc/modprobe.d/kvm.conf (or /etc/modprobe.d/qemu-system-x86.conf on Ubuntu) on the host and add:

  options kvm_intel nested=1
  #options kvm_amd nested=1

Reboot the system to put the setting into effect. Create a VM, and use "copy host CPU configuration" in the CPU section of the VM definition. In this VM you can check the KVM nesting feature with:

  cat /sys/module/kvm_intel/parameters/nested

Also:

  egrep '(vmx|svm)' /proc/cpuinfo

will give one or more lines.

======KVM guest actions======

=====Useful tooling for a VM=====

====ACPI-stuff====

Requirements Debian:

  apt-get install acpid

====disk changes====

Requirements Debian:

  apt-get install parted

Probe the changed partition table:

  partprobe

=====enlarge a vdisk=====

First enlarge the vdisk on the KVM host (with lvm or whatever). In case the (v)disk is larger than 2 TB, use parted instead of fdisk.

Below an example of a 3 TB disk, which is enlarged to 3.5 TB. The KVM guest is rebooted, and the filesystem on it is unmounted. The trick is to delete the partition (rm) and to recreate it directly afterwards, with the same starting point and with a larger value as the endpoint.

  parted -l /dev/vdc

  # parted /dev/vdc
  (parted) p
  Model: Virtio Block Device (virtblk)
  Disk /dev/vdc: 3848GB
  Sector size (logical/physical): 512B/512B
  Partition Table: gpt

  Number  Start   End     Size    File system  Name  Flags
   1      1049kB  3299GB  3299GB  xfs

  # parted /dev/vdc
  (parted) rm 1

  # parted /dev/vdc
  (parted) mkpart
  Partition name?  []?
  File system type?  [ext2]? xfs
  Start? 1049kB
  End? 3848GB

Instead of a number as the answer to the question "End", it is also possible to use -1. This will use the largest possible value, i.e. the end of the disk.

The action above can also be done as a one-liner:

  # parted /dev/vdc
  (parted) mkpart primary xfs 1049kB -1

View the result:

  # parted /dev/vdc
  (parted) p
  Model: Virtio Block Device (virtblk)
  Disk /dev/vdc: 3848GB
  Sector size (logical/physical): 512B/512B
  Partition Table: gpt

  Number  Start   End     Size    File system  Name  Flags
   1      1049kB  3848GB  3848GB  xfs

  (parted) q

The partitioning part is done now. Resize the filesystem on the partition, for example with resize2fs or xfs_growfs.

======KVM administration workstation======

====Ubuntu/Debian====

For an Ubuntu/Debian based workstation with a graphical display:

  apt-get install ssh virt-manager

Only when you also need to run KVM VPSes on the workstation itself, install the package kvm as well.

Start virt-manager, and click on 'add connection'.

====MS-Windows====

Using virt-manager via Xming and PuTTY: http://blog.allanglesit.com/2011/03/linux-kvm-managing-kvm-guests-using-virt-manager-on-windows/

======KVM administration======

Instructions how to administer the KVM host system (the iron, or the frame).
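A short quick reference of everyday virsh commands (the VM name vpstest2 is only an example); the sections below cover the storage-specific tasks:

  virsh list --all           # all defined VMs and their state
  virsh start vpstest2       # boot a VM
  virsh shutdown vpstest2    # clean ACPI shutdown (requires acpid in the guest)
  virsh dominfo vpstest2     # memory, vcpus and autostart setting
  virsh console vpstest2     # serial console, if one is configured in the guest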
====disk administration of the VPSes====

===creation of disk space for a VPS===

A wide range of storage formats is available to choose from. A few thoughts:

  * For the best performance, use a raw partition as the storage backend for a VPS. Create a partition or a logical volume on the host; this is then assigned to the guest.
  * If you are using raw volumes or partitions, it is best to set the caching option to 'none' (this bypasses the cache completely, which reduces data copies and bus traffic).
  * If you are using files, and you do not need storage guarantees, for example for testing purposes, you can choose 'writeback' caching, which offers maximum speed.
  * If your guest supports it, use the 'virtio' interface.
  * Don't use the Linux filesystem btrfs on the host for image files. It will result in low I/O performance.

===enlarge diskspace inside a VPS===

To enlarge a disk of a VPS, use the following procedure.

On the VM host:

In case of raw devices, enlarge the logical volume:

  lvresize -L +2G /dev/vg01/VPSfoo01

to grow the disk/LUN of VPSfoo by 2G.

In case of file devices:

  cd /var/lib/libvirt/images
  dd if=/dev/null of=VPSfoo.img bs=1 count=0 seek=10G

to grow the disk/LUN of VPSfoo to 10G.

After this, stop and start the VPS. This is needed because the new disk geometry has to be read by the kernel.

In the VPS:

Now start fdisk, and it will detect a larger disk. Add a partition, or repartition the partition at the end of the disk (you did have this procedure in mind when designing the disk layout, didn't you :-). Now run pvresize on this partition, and after this the volume group has free extents. Use these to grow one or more logical volumes.

===add a lun to a VPS===

To attach a storage LUN to a running VPS, use the following procedure.

On the VM host:

Create a new raw device (and optionally make a symlink). Attach the raw device to the VPS (in this case, the name is vpsfoo2 and it is the second virtio disk, aka vdb):

  virsh attach-disk vpsfoo2 /var/lib/libvirt/images/vpsfoo2data.img vdb --targetbus virtio --live

======Migration of VMs to another host======

Instructions how to migrate VMs to another hypervisor host.

====Offline====

Create a destination KVM hypervisor system, including bridges on the required networks and VLANs. Try to use the same names for bridges, filesystems and logical volumes. Otherwise use "virsh edit" to make the modifications before starting the VM on the destination hypervisor.

===On the source hypervisor===

Create a definition file:

  virsh list --all
  virsh dumpxml --security-info vpstest2 > /var/lib/libvirt/images/vpstest2.xml
  virsh shutdown vpstest2
  virsh destroy vpstest2    # if needed

===On the destination hypervisor===

Create the required logical volumes and symlinks:

  lvcreate -L 4G -n vpstest2 vg0
  ln -s /dev/mapper/vg0-vpstest2 /var/lib/libvirt/images/vpstest2.img

And transfer the raw logical volume with a dd piped through ssh:

  ssh root@sourcehyp "dd if=/dev/mapper/vg0-vpstest2" | dd of=/dev/mapper/vg0-vpstest2

And get the config definition file:

  scp root@sourcehyp:/var/lib/libvirt/images/vpstest2.xml /var/lib/libvirt/images/vpstest2.xml

And create the VM:

  virsh define /var/lib/libvirt/images/vpstest2.xml

And start the VM:

  virsh start vpstest2
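As an optional final check after the migration (using the VM name vpstest2 from the example above):

  virsh list --all             # the VM should be listed as 'running'
  virsh domblklist vpstest2    # the disk path should point at the new logical volume
  virsh console vpstest2       # log in and verify networking from inside the guest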