This guide explains how to reinstall a compute node from Centos8 Stream-Yoga to AlmaLinux9-Yoga
In the following instructions the node cld-np-19, an INFN node, is the one to be reinstalled
First of all take note of the IP addresses used by the VM for the management and data networks (192.168.60.x and 192.168.61.x) ) and the relevant interfaces
[root@cld-np-19 ~]# nmcli eno1: connected to System eno1 "Intel X710" ethernet (i40e), 34:E6:D7:F5:53:0D, hw, port e4434b3a8890, mtu 1500 ip4 default inet4 192.168.60.129/24 route4 192.168.60.0/24 metric 100 route4 default via 192.168.60.254 metric 100 inet6 fe80::36e6:d7ff:fef5:530d/64 route6 fe80::/64 metric 1024 eno3: connected to eno3 "Intel X710" ethernet (i40e), 34:E6:D7:F5:53:0F, hw, port e4434b3a8892, mtu 9000 inet4 192.168.61.129/24 route4 192.168.61.0/24 metric 101 inet6 fe80::65:ec7c:7325:3bec/64 route6 fe80::/64 metric 1024 ... ... |
Then disable the compute node, so that no new VMs will be instantiated on this compute node:
openstack compute service set --disable cld-np-19.cloud.pd.infn.it nova-compute |
To check if the compute node was actually disabled:
openstack compute service list |
Then finds all the virtual machines instantiated on this compute node:
openstack server list --all --host cld-np-19.cloud.pd.infn.it |
Then migrate each VM instantiated on this compute node to other hypervisors.
To avoid the migration of the same virtual machine multiple times, please migrate the VM on a compute node that was already migrated to AlmaLinux9
The migration of a virtual machine usually requires a downtime (~ 15') of that VM, unless the VM was created using a volume. So before migrating the VM you must agree this operation with the relevant owner.
E.g. to find the owner of a VM with UUID be832766-7996-4ccf-a9c1-5eec2e5f8fd3:
[root@cld-ctrl-01 ~]# openstack server show be832766-7996-4ccf-a9c1-5eec2e5f8fd3 | grep user_id | user_id | 7681252c430040e7bb96c7c4f7c88464 | [root@cld-ctrl-01 ~]# openstack user show 7681252c430040e7bb96c7c4f7c88464 +---------------------+----------------------------------+ | Field | Value | +---------------------+----------------------------------+ | domain_id | default | | email | Ysabella.Ong@lnl.infn.it | | enabled | True | | id | 7681252c430040e7bb96c7c4f7c88464 | | name | ysaong@infn.it | | options | {} | | password_expires_at | None | +---------------------+----------------------------------+ |
To migrate a virtual machine to another compute node, there are two possible procedures:
Once there are no more VMs on that compute node, reinstall the node with AlmaLinux9 using foreman
When the node restarts after the update, make sure that SELinux is disabled:
[root@cld-np-19 ~]# getenforce Disabled |
Then do an update of the packages (probabily only puppet will be updated):
[root@cld-np-19 ~]# yum clean all [root@cld-np-19 ~]# yum update -y |
Once the node has been reinstalled with AlmaLinux9, configure the data network
TBC
The address to be used must be the original one
MTU must be 9000
If this is a DELL host, reinstall DELL OpenManage, as explained here
Check on Nagios that the relevant checks get green
Then configure the node as Openstack compute node using puppet
Stop puppet:
systemctl stop puppet |
In foreman move the host under the ComputeNode-Prod_Yoga hostgroup
Run puppet manually:
puppet agent -t |
If the configuration fails reporting
error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key error: "net.bridge.bridge-nf-call-iptables" is an unknown key error: "net.bridge.bridge-nf-call-arptables" is an unknown key |
then please issue:
modprobe br_netfilter |
and then rerun puppet
If the procedure terminated without error, enable puppet and reboot the host
systemctl enable puppet shutdown -r now |
Enable the nagios check for lv
Log on cld-nagios.cloud.pd.infn.it
cd /etc/nagios/objects/ |
Edit cloudcomputenodes.cfg (or the other file where this compute node is defined) and add a passive check:
define service{ use LVS ; Name of service template to use host_name cld-np-19 service_description LVS freshness_threshold 28800 } |
Make sure there are no problems in the configuration:
[root@cld-nagios objects]# nagios -v /etc/nagios/nagios.cfg |
and restart nagios to have applied the new check:
[root@cld-nagios objects]# systemctl restart nagios |
openstack compute service set --enable cld-np-19.cloud.pd.infn.it nova-compute |
Within 24 hours, the "VM network" nagios check should get green