...

Disable SELinux
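To make this persistent across reboots, set the mode in /etc/selinux/config (a sketch of the stock CentOS file; running "setenforce 0" additionally stops enforcement for the current session):

```shell
# /etc/selinux/config (relevant line only)
SELINUX=disabled
```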

Install ceph:


For C7:


Code Block
languagebash
rpm -Uvh https://download.ceph.com/rpm-nautilus/el7/noarch/ceph-release-1-1.el7.noarch.rpm
yum install yum-plugin-priorities
yum clean all
yum update
yum install ceph

For C8:


Code Block
languagebash
rpm -Uvh https://download.ceph.com/rpm-nautilus/el8/noarch/ceph-release-1-1.el8.noarch.rpm


Then:

Code Block
languagebash
yum clean all
yum update
yum install ceph


Add the following lines to /etc/security/limits.conf:

Code Block
languagebash
* soft nofile 65536
* hard nofile 65536
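After logging in again, the limit in effect can be checked (it should report 65536 on a node configured as above):

```shell
# Show the open-files limit for the current shell session
ulimit -n
```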


Add the following lines to /etc/sysctl.conf (to prevent "page allocation failure" errors, and to prevent swapping):

...

Code Block
languagebash
# cp /boot/grub2/grub.cfg ~
# grub2-mkconfig -o /boot/grub2/grub.cfg

...


Stop and disable puppet, and then reboot:

...


Move the host into the hosts_all/CephProd hostgroup (hosts_all/CephProd-C8 for CentOS 8).

Run puppet once:

Code Block
languagebash
puppet agent -t

Enable the Nagios sensors.

...

Copy the file /etc/ceph/ceph.client.admin.keyring from a ceph-mon-xx host and set its ownership and mode, which should be:

Code Block
languagebash
-rw-------. 1 ceph ceph 137 Feb 20 13:51 /etc/ceph/ceph.client.admin.keyring
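A small helper can apply that ownership and mode in one step (the function name is ours, not a ceph tool; the chown is skipped when the ceph user is absent, e.g. outside an OSD node):

```shell
# fix_keyring_perms: give a keyring file the ownership/mode shown above.
fix_keyring_perms() {
  # Only chown when the ceph user exists on this host
  if id ceph >/dev/null 2>&1; then chown ceph:ceph "$1"; fi
  chmod 0600 "$1"
}
```

Usage on the OSD node: fix_keyring_perms /etc/ceph/ceph.client.admin.keyring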


If it doesn't exist yet, create the rack in the ceph crush map:

Code Block
languagebash
ceph osd crush add-bucket Rack12-PianoAlto rack

...


ceph osd crush move Rack12-PianoAlto root=default


In the considered example, there are 10 SATA disks (/dev/sdc .. /dev/sdl) and 2 SSD disks (/dev/sda and /dev/sdb)

...

Prepare the disks for block and block.db:

Code Block
languagebash
# Block

...


echo "vgcreate on SATA disks..."

...


vgcreate ceph-block-50 /dev/sdc

...


vgcreate ceph-block-51 /dev/sdd

...


vgcreate ceph-block-52 /dev/sde

...


vgcreate ceph-block-53 /dev/sdf

...


vgcreate ceph-block-54 /dev/sdg

...


vgcreate ceph-block-55 /dev/sdh

...


vgcreate ceph-block-56 /dev/sdi

...


vgcreate ceph-block-57 /dev/sdj

...


vgcreate ceph-block-58 /dev/sdk

...


vgcreate ceph-block-59 /dev/sdl

...


echo "lvcreate on SATA disks..."

...


lvcreate -l 100%FREE -n block-50 ceph-block-50

...


lvcreate -l 100%FREE -n block-51 ceph-block-51

...


lvcreate -l 100%FREE -n block-52 ceph-block-52

...


lvcreate -l 100%FREE -n block-53 ceph-block-53

...


lvcreate -l 100%FREE -n block-54 ceph-block-54

...


lvcreate -l 100%FREE -n block-55 ceph-block-55

...


lvcreate -l 100%FREE -n block-56 ceph-block-56

...


lvcreate -l 100%FREE -n block-57 ceph-block-57

...


lvcreate -l 100%FREE -n block-58 ceph-block-58

...


lvcreate -l 100%FREE -n block-59 ceph-block-59

...


#

...


# Block.db

...


echo "vgcreate on SSD disks..."

...


vgcreate ceph-db-50-54 /dev/sda

...


vgcreate ceph-db-55-59 /dev/sdb

...


echo "lvcreate on SSD disks..."

...


lvcreate -L 89GB -n db-50 ceph-db-50-54

...


lvcreate -L 89GB -n db-51 ceph-db-50-54

...


lvcreate -L 89GB -n db-52 ceph-db-50-54

...


lvcreate -L 89GB -n db-53 ceph-db-50-54

...


lvcreate -L 89GB -n db-54 ceph-db-50-54

...


lvcreate -L 89GB -n db-55 ceph-db-55-59

...


lvcreate -L 89GB -n db-56 ceph-db-55-59

...


lvcreate -L 89GB -n db-57 ceph-db-55-59

...


lvcreate -L 89GB -n db-58 ceph-db-55-59

...


lvcreate -L 89GB -n db-59 ceph-db-55-59
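The repetitive vgcreate/lvcreate sequence above can also be generated with a short loop; this sketch only prints the commands (drop the echo to execute them) and assumes the same device names and OSD numbering as this example:

```shell
# SATA data disks for OSDs 50-59, in order
sata=(sdc sdd sde sdf sdg sdh sdi sdj sdk sdl)
for i in "${!sata[@]}"; do
  osd=$((50 + i))
  echo "vgcreate ceph-block-${osd} /dev/${sata[$i]}"
  echo "lvcreate -l 100%FREE -n block-${osd} ceph-block-${osd}"
done
# One 89 GB block.db LV per OSD, five per SSD volume group
for osd in $(seq 50 59); do
  [ "$osd" -le 54 ] && dbvg=ceph-db-50-54 || dbvg=ceph-db-55-59
  echo "lvcreate -L 89GB -n db-${osd} ${dbvg}"
done
```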


Possible error with vgcreate:

Code Block
languagebash
[root@c-osd-5 /]# vgcreate ceph-block-12 /dev/vdb

...


Device /dev/vdb excluded by a filter.


This is because the disk has a GPT. Let's delete it with gdisk:


Code Block
languagebash
[root@c-osd-5 /]# gdisk /dev/vdb

...

 
GPT fdisk (gdisk) version 0.8.10

...



Partition table scan:

...


MBR: protective

...


BSD: not present

...


APM: not present

...


GPT: present

...



Found valid GPT with protective MBR; using GPT.

...



Command (? for help): x

...



Expert command (? for help): ?

...


a set attributes

...


c change partition GUID

...


d display the sector alignment value

...


e relocate backup data structures to the end of the disk

...


g change disk GUID

...


h recompute CHS values in protective/hybrid MBR

...


i show detailed information on a partition

...


l set the sector alignment value

...


m return to main menu

...


n create a new protective MBR

...


o print protective MBR data

...


p print the partition table

...


q quit without saving changes

...


r recovery and transformation options (experts only)

...


s resize partition table

...


t transpose two partition table entries

...


u replicate partition table on new device

...


v verify disk

...


w write table to disk and exit

...


z zap (destroy) GPT data structures and exit

...


? print this menu

...



Expert command (? for help): z

...


About to wipe out GPT on /dev/vdb. Proceed? (Y/N): y

...


GPT data structures destroyed! You may now partition the disk using fdisk or

...


other utilities.

...


Blank out MBR? (Y/N):

...

 
Your option? (Y/N):

...

 Y



Then edit ceph.conf to enable the following line:

Code Block
languagebash
osd memory target = 3221225472
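The value is expressed in bytes; 3221225472 is 3 GiB. A quick sanity check of the arithmetic:

```shell
# 3 GiB = 3 * 1024^3 bytes
echo $((3 * 1024 * 1024 * 1024))   # prints 3221225472
```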
If it doesn't exist yet, create the file:

Code Block
languagebash
-rw------- 1 ceph ceph 71 Apr 28 12:18 /var/lib/ceph/bootstrap-osd/ceph.keyring


Code Block
languagebash
# cat /var/lib/ceph/bootstrap-osd/ceph.keyring

...


[client.bootstrap-osd]

...


key = AQA+Y6hYQTvEHRAAr4Q/mwHCByv/kokqnu6nCA==


It must match the 'client.bootstrap-osd' entry in the 'ceph auth export' output. You can copy the file from another OSD node.
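One way to compare the two is to extract the key from the local file and check it against the cluster's record (the extract_key helper is ours, not a ceph tool; 'ceph auth print-key' is a standard ceph subcommand):

```shell
# extract_key: print the base64 key value from a ceph keyring file
extract_key() { awk -F' = ' '/key = /{print $2}' "$1"; }
```

On a node with admin access, the check would then look like:
[ "$(extract_key /var/lib/ceph/bootstrap-osd/ceph.keyring)" = "$(ceph auth print-key client.bootstrap-osd)" ] && echo OK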

Add (via puppet) the new OSDs to the ceph.conf file:


Code Block
languagebash
...

...


...

...



[osd.50]

...


host = ceph-osd-06 #manual deployments only.

...


public addr = 192.168.61.235

...


cluster addr = 192.168.222.235

...


osd memory target = 3221225472

...




[osd.51]

...


host = ceph-osd-06 #manual deployments only.

...


public addr = 192.168.61.235

...


cluster addr = 192.168.222.235

...


osd memory target = 3221225472

...



...

...


...


Run puppet once to have the file updated on the new OSD node:

Code Block
languagebash
puppet agent -t

Disable data movements:


Code Block
languagebash
[root@c-osd-1 /]# ceph osd set norebalance

...


norebalance is set

...



[root@c-osd-1 /]# ceph osd set nobackfill

...


nobackfill is set

...



[root@c-osd-1 /]# ceph osd set noout
noout is set

...



Create a first OSD:

Code Block
languagebash
ceph-volume lvm create --bluestore --data ceph-block-50/block-50 --block.db ceph-db-50-54/db-50


The above command could trigger some data movement.

...

Then move this host (ceph-osd-06 in our example) in the relevant rack:

Code Block
languagebash
ceph osd crush move ceph-osd-06 rack=Rack12-PianoAlto


Re-verify with ceph osd df and ceph osd tree.

Verify that the OSD is using the right vgs:


Code Block
languagebash
[root@ceph-osd-06 ~]# ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-50

...


infering bluefs devices from bluestore path

...


{

...


"/var/lib/ceph/osd/ceph-50/block": {

...


"osd_uuid": "dc72b996-d035-4dcd-ba42-1a6433eb78f7",

...


"size": 10000827154432,

...


"btime": "2019-02-19 11:55:47.553215",

...


"description": "main",

...


"bluefs": "1",

...


"ceph_fsid": "8162f291-00b6-4b40-a8b4-1981a8c09b64",

...


"kv_backend": "rocksdb",

...


"magic": "ceph osd volume v026",

...


"mkfs_done": "yes",

...


"osd_key": "AQCu4Gtc+jKSJhAAKzaAAYuTKWZs9rjJlBXWww==",

...


"ready": "ready",

...


"whoami": "50"

...


},

...


"/var/lib/ceph/osd/ceph-50/block.db": {

...


"osd_uuid": "dc72b996-d035-4dcd-ba42-1a6433eb78f7",

...


"size": 95563022336,

...


"btime": "2019-02-19 11:55:47.573213",

...


"description": "bluefs db"

...


}

...


}

...


[root@ceph-osd-06 ~]# ls -l /var/lib/ceph/osd/ceph-50/block

...


lrwxrwxrwx 1 ceph ceph 27 Feb 19 12:23 /var/lib/ceph/osd/ceph-50/block -> /dev/ceph-block-50/block-50

...


[root@ceph-osd-06 ~]# ls -l /var/lib/ceph/osd/ceph-50/block.db

...


lrwxrwxrwx 1 ceph ceph 24 Feb 19 12:23 /var/lib/ceph/osd/ceph-50/block.db -> /dev/ceph-db-50-54/db-50

...


[root@ceph-osd-06 ~]# 


Create the other OSDs (also use --osd-id if needed, e.g. when migrating OSDs from filestore to bluestore):



Code Block
languagebash
ceph-volume lvm create --bluestore --data ceph-block-51/block-51 --block.db ceph-db-50-54/db-51

...


ceph-volume lvm create --bluestore --data ceph-block-52/block-52 --block.db ceph-db-50-54/db-52

...


ceph-volume lvm create --bluestore --data ceph-block-53/block-53 --block.db ceph-db-50-54/db-53

...


ceph-volume lvm create --bluestore --data ceph-block-54/block-54 --block.db ceph-db-50-54/db-54

...


ceph-volume lvm create --bluestore --data ceph-block-55/block-55 --block.db ceph-db-55-59/db-55

...


ceph-volume lvm create --bluestore --data ceph-block-56/block-56 --block.db ceph-db-55-59/db-56

...


ceph-volume lvm create --bluestore --data ceph-block-57/block-57 --block.db ceph-db-55-59/db-57

...


ceph-volume lvm create --bluestore --data ceph-block-58/block-58 --block.db ceph-db-55-59/db-58

...


ceph-volume lvm create --bluestore --data ceph-block-59/block-59 --block.db ceph-db-55-59/db-59
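These nine invocations follow the same pattern and can be scripted; this sketch only prints the commands (drop the echo to execute them) and assumes the volume group layout of this example:

```shell
# OSDs 51-54 take their block.db from sda's VG, 55-59 from sdb's VG
for osd in $(seq 51 59); do
  [ "$osd" -le 54 ] && dbvg=ceph-db-50-54 || dbvg=ceph-db-55-59
  echo "ceph-volume lvm create --bluestore --data ceph-block-${osd}/block-${osd} --block.db ${dbvg}/db-${osd}"
done
```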




Reboot the new OSD node:

Code Block
languagebash
shutdown -r now


Verify that the new OSDs are up.

...

Verify that all buckets are using straw2:

Code Block
languagebash
ceph osd getcrushmap -o crush.map; crushtool -d crush.map | grep straw; rm -f crush.map


If not (i.e. if some are using straw), run the following command:

Code Block
languagebash
ceph osd crush set-all-straw-buckets-to-straw2 

Warning: this could trigger a data rebalance

Enable and start puppet:

Code Block
languagebash
systemctl start puppet

...


systemctl enable puppet


Then, after a few minutes, check that "ceph status" doesn't report PGs in peering.

Then:

Code Block
languagebash
[root@c-osd-1 /]# ceph osd unset nobackfill

...


nobackfill is unset

...


[root@c-osd-1 /]# ceph osd unset norebalance

...


norebalance is unset

...


[root@c-osd-1 /]# ceph osd unset noout

...


noout is unset

...


[root@c-osd-1 /]# 


This should trigger a data movement.