Tentative procedure

Broadly, the procedure will be:

Actions to perform before starting the installation of the release

systemctl stop puppet
systemctl disable puppet

Detach all routers from the L3 agent of controller-01

Find the routers:
openstack router list
+--------------------------------------+-------------+--------+-------+----------------------------------+-------------+------+
| ID                                   | Name        | Status | State | Project                          | Distributed | HA   |
+--------------------------------------+-------------+--------+-------+----------------------------------+-------------+------+
| 92e8b080-f3aa-4d9f-b3d4-613e0dbfd099 | Lan         | ACTIVE | UP    | 56c3f5c047e74a78a71438c4412e6e13 | False       | True |
| 9e31c216-0635-4d21-b7aa-63fe4aee875e | ext-to-vos  | ACTIVE | UP    | 56c3f5c047e74a78a71438c4412e6e13 | False       | True |
| eaa80135-6b79-44e0-b637-cef88d09b85c | CloudVeneto | ACTIVE | UP    | 56c3f5c047e74a78a71438c4412e6e13 | False       | True |
+--------------------------------------+-------------+--------+-------+----------------------------------+-------------+------+

For each router, find the IP address in the external_gateway_info and check which controller it is attached to:

openstack router show 92e8b080-f3aa-4d9f-b3d4-613e0dbfd099

+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                   | Value                                                                                                                                           |
+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| admin_state_up          | UP                                                                                                                                              |
| availability_zone_hints |                                                                                                                                                 |
| availability_zones      | nova                                                                                                                                            |
| created_at              | 2018-11-28T16:06:24Z                                                                                                                            |
| description             |                                                                                                                                                 |
| distributed             | False                                                                                                                                           |
| enable_ndp_proxy        | None                                                                                                                                            |
| external_gateway_info   | {"network_id": "38356cfc-d83a-40f0-8604-09ddea12aa20", "external_fixed_ips": [{"subnet_id": "ec498b88-cbda-45d3-8f9f-174d335c6670",             |
|                         | "ip_address": "172.25.27.180"}], "enable_snat": false}                                                                                          |
...

[root@controller-02 ~]# ip netns exec qrouter-92e8b080-f3aa-4d9f-b3d4-613e0dbfd099 ip addr show | grep 172.25.27.180
    inet 172.25.27.180/24 scope global qg-fcc0f7ca-b4

The same command on controller-01 returns nothing, so this router is on controller-02.

Proceed in the same way for the other routers.
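The per-router check can be sketched as a small shell loop, to be run on each controller in turn. This is a sketch, assuming admin OpenStack credentials are sourced on the controller; the grep/cut parsing of the JSON output is a convenience, not part of the official procedure:

```shell
# Sketch: for each router, extract the external gateway IP and check whether
# the qrouter namespace on THIS host holds it. Run on each controller in turn.
for r in $(openstack router list -f value -c ID); do
  ip=$(openstack router show "$r" -f json -c external_gateway_info \
        | grep -o '"ip_address": "[0-9.]*"' | cut -d'"' -f4)
  if ip netns exec "qrouter-$r" ip addr show 2>/dev/null | grep -q "$ip"; then
    echo "router $r (gateway $ip) is hosted on $(hostname)"
  fi
done
```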

If a router must be detached from controller-01 because it is active there, the commands to run are, for example:

# for i in $(openstack router list -f value -c ID); do echo $i; openstack network agent list --agent-type l3 --sort-column Host --router $i --long; done 
92e8b080-f3aa-4d9f-b3d4-613e0dbfd099
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+
| ID                                   | Agent Type | Host                           | Availability Zone | Alive | State | Binary           | HA State |
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+
| aa34b512-89d8-4913-aee1-9f2d2fdf124c | L3 agent   | controller-01.cloud.pd.infn.it | nova              | :-)   | UP    | neutron-l3-agent | standby  |
| b91764b8-58a2-4ad6-a8fc-fd20aa664571 | L3 agent   | controller-02.cloud.pd.infn.it | nova              | :-)   | UP    | neutron-l3-agent | active   |
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+
9e31c216-0635-4d21-b7aa-63fe4aee875e
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+
| ID                                   | Agent Type | Host                           | Availability Zone | Alive | State | Binary           | HA State |
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+
| aa34b512-89d8-4913-aee1-9f2d2fdf124c | L3 agent   | controller-01.cloud.pd.infn.it | nova              | :-)   | UP    | neutron-l3-agent | standby  |
| b91764b8-58a2-4ad6-a8fc-fd20aa664571 | L3 agent   | controller-02.cloud.pd.infn.it | nova              | :-)   | UP    | neutron-l3-agent | active   |
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+
eaa80135-6b79-44e0-b637-cef88d09b85c
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+
| ID                                   | Agent Type | Host                           | Availability Zone | Alive | State | Binary           | HA State |
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+
| aa34b512-89d8-4913-aee1-9f2d2fdf124c | L3 agent   | controller-01.cloud.pd.infn.it | nova              | :-)   | UP    | neutron-l3-agent | active   |
| b91764b8-58a2-4ad6-a8fc-fd20aa664571 | L3 agent   | controller-02.cloud.pd.infn.it | nova              | :-)   | UP    | neutron-l3-agent | standby  |
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+


So, for example, to move the active router eaa80135-6b79-44e0-b637-cef88d09b85c off the L3 agent aa34b512-89d8-4913-aee1-9f2d2fdf124c of controller-01 and onto controller-02, we would run:
openstack network agent remove router --l3 aa34b512-89d8-4913-aee1-9f2d2fdf124c eaa80135-6b79-44e0-b637-cef88d09b85c

After a short while we will see:
eaa80135-6b79-44e0-b637-cef88d09b85c
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+
| ID                                   | Agent Type | Host                           | Availability Zone | Alive | State | Binary           | HA State |
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+
| aa34b512-89d8-4913-aee1-9f2d2fdf124c | L3 agent   | controller-01.cloud.pd.infn.it | nova              | :-)   | UP    | neutron-l3-agent | standby  |
| b91764b8-58a2-4ad6-a8fc-fd20aa664571 | L3 agent   | controller-02.cloud.pd.infn.it | nova              | :-)   | UP    | neutron-l3-agent | active   |
+--------------------------------------+------------+--------------------------------+-------------------+-------+-------+------------------+----------+


Epoxy installation on controller-01




In https://cld-config.cloud.pd.infn.it/hosts/controller-xx.cloud.pd.infn.it edit the host, replacing the hostgroup "hosts_all/ControllerNode-Test" with "hosts_all/ControllerNode_Test-Epoxy".

Then, on the controller, run
puppet agent -t 

If there are problems with the certificates (usually after restoring the clone), see the procedure at https://confluence.infn.it/x/kw5-B

At this point all services are configured on controller-01.


Sync the Cinder database:

su -s /bin/sh -c "cinder-manage db sync" cinder

2026-03-17 11:57:43.085 212882 INFO cinder.db.migration [-] Applying migration(s)
2026-03-17 11:57:43.088 212882 INFO alembic.runtime.migration [-] Context impl MySQLImpl.
2026-03-17 11:57:43.088 212882 INFO alembic.runtime.migration [-] Will assume non-transactional DDL.
2026-03-17 11:57:43.132 212882 INFO cinder.db.migration [-] Migration(s) applied

Start the services on controller-01:
systemctl start openstack-cinder-api.service openstack-cinder-scheduler.service openstack-cinder-volume.service

Modify HAProxy so that Cinder points to controller-01. On cld-config:
cp /etc/puppetlabs/code/environments/production/modules/cloudtest_haproxy/files/servizio_httpd_glance_nova_neutron_cinder_acceso01_spento02.cfg /etc/puppetlabs/code/environments/production/modules/cloudtest_haproxy/files/haproxy_el9.cfg
(check port 8776)

Run Puppet on the three HAProxy nodes:
puppet agent -t

Stop and disable the Cinder services on controller-02:
systemctl stop openstack-cinder-api.service openstack-cinder-scheduler.service openstack-cinder-volume.service
systemctl disable openstack-cinder-api.service openstack-cinder-scheduler.service openstack-cinder-volume.service

==============================================================================
When controller-02 has been upgraded, re-run the online data migrations:
su -s /bin/sh -c "cinder-manage db online_data_migrations" cinder


Sync the Heat database:

su -s /bin/sh -c "heat-manage db_sync" heat

2026-03-17 12:27:45.669 216268 INFO heat.db.migration [-] Applying migration(s)
2026-03-17 12:27:45.682 216268 INFO alembic.runtime.migration [-] Context impl MySQLImpl.
2026-03-17 12:27:45.682 216268 INFO alembic.runtime.migration [-] Will assume non-transactional DDL.
2026-03-17 12:27:45.689 216268 INFO alembic.runtime.migration [-] Context impl MySQLImpl.
2026-03-17 12:27:45.689 216268 INFO alembic.runtime.migration [-] Will assume non-transactional DDL.
2026-03-17 12:27:45.696 216268 INFO heat.db.migration [-] Migration(s) applied


Start the services on controller-01:
systemctl start openstack-heat-api.service \
  openstack-heat-api-cfn.service openstack-heat-engine.service

Modify HAProxy so that Heat points to controller-01. On cld-config:
cp /etc/puppetlabs/code/environments/production/modules/cloudtest_haproxy/files/servizio_httpd_glance_nova_neutron_cinder_heat_acceso01_spento02.cfg /etc/puppetlabs/code/environments/production/modules/cloudtest_haproxy/files/haproxy_el9.cfg
(check ports 8000 and 8004)

Run Puppet on the three HAProxy nodes:
puppet agent -t
then stop and disable the Heat services on controller-02:

systemctl stop openstack-heat-api.service \
  openstack-heat-api-cfn.service openstack-heat-engine.service

systemctl disable openstack-heat-api.service \
  openstack-heat-api-cfn.service openstack-heat-engine.service


At this point all services point to controller-01.

Epoxy installation on controller-02


    
-->> HERE 31/03 <--


Start all the MySQL databases of the Percona cluster, bringing them up in the reverse order in which they were shut down:

[root@cld-db-test-05 ~]# systemctl start mysql
[root@cld-db-test-06 ~]# systemctl start mysql



Compute

Put the nodes in drain, one at a time:

openstack compute service set --disable compute-01.cloud.pd.infn.it nova-compute

openstack compute service list


For the drained node, migrate its VMs with live migration when possible (otherwise shut them down and migrate them cold).
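The drain loop can be sketched as follows. This is a sketch, assuming admin credentials are sourced; compute-01.cloud.pd.infn.it is the node from the example above, and the scheduler picks the destination host:

```shell
# Sketch: live-migrate every VM off the drained compute node.
# VMs that refuse live migration must be shut down and cold-migrated instead.
HOST=compute-01.cloud.pd.infn.it
for vm in $(openstack server list --all-projects --host "$HOST" -f value -c ID); do
  echo "live-migrating $vm"
  openstack server migrate --live-migration "$vm"
done
```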

In Foreman, change the host class to the Epoxy one.

Run Puppet (puppet agent -t).


For nodes hosting VMs that cannot be migrated, perform the update as was done in the past.




----HERE---

Configure the controller via Puppet.

Install the openstack-heat-ui and python3-osc-placement packages:


[root@controller-01 yum.repos.d]# yum install openstack-heat-ui
Last metadata expiration check: 1:28:25 ago on Thu 23 Jan 2025 04:37:45 PM CET.
Dependencies resolved.
==============================================================================================================================================================================================
 Package                                                Architecture                     Version                                     Repository                                          Size
==============================================================================================================================================================================================
Installing:
 openstack-heat-ui                                      noarch                           11.0.0-2.el9s                               centos-openstack-caracal                           892 k
Installing dependencies:
 python3-XStatic-Angular-UUID                           noarch                           0.0.4.0-13.el9s                             centos-openstack-caracal                            13 k
 python3-XStatic-Angular-Vis                            noarch                           4.16.0.0-10.el9s                            centos-openstack-caracal                            13 k
 python3-XStatic-FileSaver                              noarch                           1.3.2.0-10.el9s                             centos-openstack-caracal                            13 k
 python3-XStatic-JS-Yaml                                noarch                           3.8.1.0-11.el9s                             centos-openstack-caracal                            13 k
 python3-XStatic-Json2yaml                              noarch                           0.1.1.0-10.el9s                             centos-openstack-caracal                            13 k
 xstatic-angular-uuid-common                            noarch                           0.0.4.0-13.el9s                             centos-openstack-caracal                            11 k
 xstatic-angular-vis-common                             noarch                           4.16.0.0-10.el9s                            centos-openstack-caracal                           9.6 k
 xstatic-filesaver-common                               noarch                           1.3.2.0-10.el9s                             centos-openstack-caracal                            11 k
 xstatic-js-yaml-common                                 noarch                           3.8.1.0-11.el9s                             centos-openstack-caracal                            30 k
 xstatic-json2yaml-common                               noarch                           0.1.1.0-10.el9s                             centos-openstack-caracal                           9.2 k

Transaction Summary
=====================================================


[root@controller-01 keystone]# yum install python3-osc-placement
Last metadata expiration check: 2:05:32 ago on Thu 23 Jan 2025 04:37:45 PM CET.
Dependencies resolved.
==============================================================================================================================================================================================
 Package                                            Architecture                        Version                                   Repository                                             Size
==============================================================================================================================================================================================
Installing:
 python3-osc-placement                              noarch                              4.3.0-1.el9s                              centos-openstack-caracal                               51 k

Transaction Summary
============================================================

RabbitMQ for Nova and Neutron

In Caracal we decided to use one dedicated RabbitMQ instance for the Nova service, one for the Neutron service, and one for all the other services. The cell must therefore be redefined.

The Nova service uses rabbit-03:
transport_url = rabbit://openstack:RABBIT_zzz@192.168.60.225:5672

The Neutron service uses rabbit-02:
transport_url = rabbit://openstack:RABBIT_zzz@192.168.60.224:5672

The other services (except Keystone) use rabbit-01.
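A quick way to verify the assignment is to grep the transport_url in each service's main config file. This is a sketch: the paths are the usual default locations, and the mapping of rabbit-01 to 192.168.60.223 is only inferred from the old cell1 transport URL shown below:

```shell
# Sketch: check which RabbitMQ each service points to.
# Config paths are the default locations; adjust if different.
grep -H '^transport_url' /etc/nova/nova.conf /etc/neutron/neutron.conf \
     /etc/cinder/cinder.conf /etc/heat/heat.conf 2>/dev/null || true
# Expected: nova -> 192.168.60.225 (rabbit-03), neutron -> 192.168.60.224
# (rabbit-02), the others presumably -> 192.168.60.223 (rabbit-01).
```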

[root@controller-01 etc]# nova-manage cell_v2 list_cells --verbose
+-------+--------------------------------------+----------------------------------------------------+----------------------------------------------------------------+----------+
|  Name |                 UUID                 |                   Transport URL                    |                      Database Connection                       | Disabled |
+-------+--------------------------------------+----------------------------------------------------+----------------------------------------------------------------+----------+
| cell0 | 00000000-0000-0000-0000-000000000000 |                     none://///                     | mysql+pymysql://nova:NOVA_xx_yyy@192.168.60.88:6306/nova_cell0 |  False   |
| cell1 | 8fc9fbbe-697a-4d92-9ff6-cba3feb50b8e | rabbit://openstack:RABBIT_zzz@192.168.60.223:5672 |    mysql+pymysql://nova:NOVA_xx_yyy@192.168.60.88:6306/nova    |  False   |
+-------+--------------------------------------+----------------------------------------------------+----------------------------------------------------------------+----------+


[root@controller-01 etc]# nova-manage cell_v2 update_cell --cell 8fc9fbbe-697a-4d92-9ff6-cba3feb50b8e --transport-url rabbit://openstack:RABBIT_zzz@192.168.60.225:5672 --database_connection mysql+pymysql://nova:NOVA_xx_yyy@192.168.60.88:6306/nova


[root@controller-01 etc]# nova-manage cell_v2 list_cells --verbose
+-------+--------------------------------------+----------------------------------------------------+----------------------------------------------------------------+----------+
|  Name |                 UUID                 |                   Transport URL                    |                      Database Connection                       | Disabled |
+-------+--------------------------------------+----------------------------------------------------+----------------------------------------------------------------+----------+
| cell0 | 00000000-0000-0000-0000-000000000000 |                     none://///                     | mysql+pymysql://nova:NOVA_xx_yyy@192.168.60.88:6306/nova_cell0 |  False   |
| cell1 | 8fc9fbbe-697a-4d92-9ff6-cba3feb50b8e | rabbit://openstack:RABBIT_zzz@192.168.60.225:5672 |    mysql+pymysql://nova:NOVA_xx_yyy@192.168.60.88:6306/nova    |  False   |
+-------+--------------------------------------+----------------------------------------------------+----------------------------------------------------------------+----------+

Update Ceph to Reef, enabling the EPEL repo:


[root@controller-01 log]# yum update \*ceph\* --enablerepo=epel
Last metadata expiration check: 1:23:23 ago on Mon 07 Apr 2025 12:52:12 PM CEST.
Dependencies resolved.
===================================================================================================================================================================================================================
 Package                                                  Architecture                              Version                                              Repository                                           Size
===================================================================================================================================================================================================================
Upgrading:
 abseil-cpp                                               x86_64                                    20211102.0-4.el9                                     epel                                                551 k
 ceph-common                                              x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                     18 M
 grpc-data                                                noarch                                    1.46.7-10.el9                                        epel                                                 19 k
 libarrow                                                 x86_64                                    9.0.0-13.el9                                         epel                                                4.4 M
 libarrow-doc                                             noarch                                    9.0.0-13.el9                                         epel                                                 25 k
 libcephfs2                                               x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                    691 k
 librados2                                                x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                    3.2 M
 libradosstriper1                                         x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                    457 k
 librbd1                                                  x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                    2.9 M
 librgw2                                                  x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                    4.4 M
 parquet-libs                                             x86_64                                    9.0.0-13.el9                                         epel                                                838 k
 python3-ceph-argparse                                    x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                     46 k
 python3-ceph-common                                      x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                    130 k
 python3-cephfs                                           x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                    163 k
 python3-grpcio                                           x86_64                                    1.46.7-10.el9                                        epel                                                2.0 M
 python3-rados                                            x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                    320 k
 python3-rbd                                              x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                    299 k
 python3-rgw                                              x86_64                                    2:18.2.4-2.el9s                                      centos-ceph-reef                                    100 k
 re2                                                      x86_64                                    1:20211101-20.el9                                    epel                                                191 k
 thrift                                                   x86_64                                    0.15.0-4.el9                                         epel                                                1.6 M

Transaction Summary
===================================================================================================================================================================================================================
Upgrade  20 Packages

Re-enable the puppet service:

systemctl enable puppet

Reboot the node:

shutdown -r now

Remember that the GPU calendar must be installed by hand.