
ZFS CoW and Ceph CoW and CSI (by quqi99)


Author: Zhang Hua    Published: 2021-03-08
Copyright notice: this article may be reposted freely, but please credit the original source and the author with a hyperlink and keep this copyright notice.

ZFS Basic Usage

sudo apt install zfsutils-linux
#sudo zpool create new-pool /dev/sdb /dev/sdc         #striped pool, raid-0
#sudo zpool create new-pool mirror /dev/sdb /dev/sdc  #mirrored pool, raid-1
sudo zpool create -m /usr/share/pool new-pool mirror /dev/sdb /dev/sdc #-m is mount point
sudo zpool add new-pool /dev/sdx
sudo zpool status
zfs list
sudo zpool destroy new-pool
sudo zfs snapshot <pool>/<dataset>@<snapshot-name>
sudo zfs rollback <pool>/<dataset>@<snapshot-name>
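
For a concrete snapshot/rollback walk-through, here is a minimal sketch assuming the mirrored pool created above (mounted at /usr/share/pool) still exists; the dataset name data and the file f.txt are placeholders:

sudo zfs create new-pool/data                        # dataset mounted at /usr/share/pool/data
echo hello | sudo tee /usr/share/pool/data/f.txt     # write a test file
sudo zfs snapshot new-pool/data@before               # snapshot the dataset
sudo rm /usr/share/pool/data/f.txt                   # "lose" the file
sudo zfs rollback new-pool/data@before               # roll the dataset back to the snapshot
ls /usr/share/pool/data/                             # f.txt is back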

ZFS CoW Test

Modifying a file on ZFS with vim or rsync causes the file's data to be rewritten to a new location via CoW (ZFS implements CoW at the filesystem level, Ceph at the block level).

sudo apt install zfsutils-linux -y && sudo /sbin/modprobe zfs
dd if=/dev/zero of=/tmp/disk0  bs=1M  count=100
sudo zpool create testpool /tmp/disk0
sudo zfs create testpool/sdc
sudo zfs get recordsize testpool/sdc
sudo cp ~/.vimrc /testpool/sdc/vimrc
$ ls -i /testpool/sdc/
256 vimrc
$ zdb -ddddd testpool/sdc 256
...
Indirect blocks:
               0 L0 0:4ae00:3800 3800L/3800P F=1 B=152/152 cksum=52e2222c06f:25cf8696e8501d:b54096cb07913e1f:d6eba6a36d42e991

zdb -R testpool 0:4ae00:3800:r > /tmp/file
$ diff /tmp/file /testpool/sdc/vimrc 
459d458
< 
\ No newline at end of file

echo 'modify vimrc' |sudo tee -a /testpool/sdc/vimrc
$ ls -i /testpool/sdc/vimrc 
256 /testpool/sdc/vimrc
$ zdb -ddddd testpool/sdc 256
0 L0 0:69e00:3800 3800L/3800P F=1 B=189/189 cksum=52f651a16b5:25cf8cebaf016b:b54096de7b2dc560:d6eba6d2669c6ef3

zdb -R testpool 0:4ae00:3800:r > /tmp/file_old
zdb -R testpool 0:69e00:3800:r > /tmp/file_new
#vimdiff /tmp/file_old /tmp/file_new
$ diff /tmp/file_old /tmp/file_new
459c459,462
< 
\ No newline at end of file
---
> modify vimrc

sudo cp /testpool/sdc/vimrc /tmp/vimrc
echo 'xxxxxxxxx' |sudo tee -a /tmp/vimrc
$ sudo rsync -av --inplace --no-whole-file /tmp/vimrc /testpool/sdc/vimrc
sending incremental file list
vimrc
sent 506 bytes  received 161 bytes  1,334.00 bytes/sec
total size is 14,335  speedup is 21.49
$ zdb -ddddd testpool/sdc 256
...
0 L0 0:84e00:3800 3800L/3800P F=1 B=318/318 cksum=53056157fa5:25cf8f4613d2d3:b54096e2b7747740:d6eba6d8fd3d794b
zdb -R testpool 0:84e00:3800:r > /tmp/file_after_rsync_replace
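
Note how the data block address in the zdb output moves from 0:4ae00 to 0:69e00 to 0:84e00 after each write: ZFS never overwrites the record in place. The same CoW behaviour can be observed at the dataset level with a snapshot (a minimal sketch; the snapshot name before is arbitrary):

sudo zfs snapshot testpool/sdc@before           # freeze the current block tree
echo 'one more change' | sudo tee -a /testpool/sdc/vimrc
sudo zfs diff testpool/sdc@before               # shows 'M /testpool/sdc/vimrc'
zfs get written,usedbysnapshots testpool/sdc    # superseded blocks are now charged to the snapshot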

Ceph CoW Test

A ceph snapshot is a read-only copy of a source volume made via CoW, which can later be used for recovery.
A ceph clone is created from a snapshot and is also CoW-based: the clone only records a logical mapping back to its source (the snapshot), and no real physical space is allocated to it. Although the snapshot is read-only, a clone created from it is both readable and writable; physical space is allocated to the clone image only when it is actually written to. It follows that an image cloned from a snapshot depends on that snapshot: once the snapshot is deleted, the clone is broken too, so the snapshot must be protected.

./generate-bundle.sh --ceph --name k8s --create-model --run
juju scp kubernetes-master/0:config ~/.kube/config
juju ssh ceph-mon/0 -- sudo -s
ceph -s
ceph df
ceph osd tree
ceph osd lspools  #rados lspools
rbd ls <pool>

sudo modprobe rbd
#pg_num=128(<5 OSDs), pg_num=512(5~10 OSDs), pg_num=4096(10~50 OSDs)
# pg_num(https://ceph.com/pgcalc/) (> 50 OSDs)
ceph osd pool create testpool 128
sudo rbd create --size 1 -p testpool test1
sudo rbd ls testpool
sudo rbd info testpool/test1

#create snap
sudo rbd snap create testpool/test1@test1-snap
# sudo rbd snap list testpool/test1
SNAPID  NAME        SIZE   PROTECTED  TIMESTAMP
     4  test1-snap  1 MiB             Mon Mar  8 02:57:24 2021

# create clone
sudo rbd snap protect testpool/test1@test1-snap
sudo rbd clone testpool/test1@test1-snap testpool/test1-snap-clone
sudo rbd snap unprotect testpool/test1@test1-snap
#sudo rbd snap rm testpool/test1@test1-snap
# sudo rbd ls testpool
test1
test1-snap-clone
# sudo rbd info testpool/test1-snap-clone |grep parent
parent: testpool/test1@test1-snap

# If you do not want the clone to depend on the snapshot, merge the clone with its snapshot (flatten):
sudo rbd flatten testpool/test1-snap-clone
# sudo rbd info testpool/test1-snap-clone |grep parent
#
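
To watch the clone's CoW allocation, compare its real usage before and after writing to it (a rough sketch; the /dev/rbd0 device name may differ, and the kernel client may require disabling some default image features first):

sudo rbd du testpool/test1-snap-clone           # used = 0, all data still shared with the parent snapshot
# disable features the kernel rbd client may not support, if 'rbd map' complains
sudo rbd feature disable testpool/test1-snap-clone object-map fast-diff deep-flatten
sudo rbd map testpool/test1-snap-clone          # e.g. /dev/rbd0
sudo dd if=/dev/urandom of=/dev/rbd0 bs=4K count=16 oflag=direct
sudo rbd du testpool/test1-snap-clone           # the clone now owns real data objects
sudo rbd unmap /dev/rbd0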

Using ceph-csi in k8s to consume RBD as persistent storage

#set up ceph k8s env
./generate-bundle.sh --ceph --name k8s --create-model --run
juju scp kubernetes-master/0:config ~/.kube/config

The charm bundle yaml already deploys ceph-csi (git clone https://github.com/ceph/ceph-csi.git); to build the setup from scratch, see:
K8S使用ceph-csi持久化存储之RBD - https://mp.weixin.qq.com/s?__biz=MzAxMjk0MTYzNw==&mid=2247484067&idx=1&sn=43bdeb61540c088035d798ee42bbd076
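
Once the ceph-csi RBD driver and its secret are in place, a StorageClass plus PVC is enough to get dynamically provisioned rbd images. This is only a sketch: the clusterID, secret names and pool below are placeholders that must match your ceph-csi deployment.

cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: <ceph-cluster-fsid>          # placeholder, from 'ceph fsid'
  pool: testpool
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
reclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-rbd-sc
EOF
kubectl get pvc rbd-pvc     # becomes Bound once an rbd image has been created in testpool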

Using ceph-csi in k8s for persistent storage with CephFS

Besides the block layer, k8s can also consume a filesystem (CephFS), see - https://www.cnblogs.com/wsjhk/p/13710577.html
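
The CephFS flavour looks almost the same, only the provisioner and parameters change (again a sketch with placeholder names; fsName must be an existing CephFS filesystem):

cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: <ceph-cluster-fsid>          # placeholder
  fsName: myfs                            # placeholder CephFS filesystem name
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
EOF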

k8s CSI plugin development

For an introduction to developing k8s CSI plugins, see - https://www.jianshu.com/p/88ec8cba7507

ZFS CSI Driver - https://github.com/openebs/zfs-localpv
Ceph CSI Driver - https://github.com/ceph/ceph-csi
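
Whichever driver is installed, its CSI registration can be inspected from the cluster side (assuming kubectl access; the snapshot resources only exist if the external-snapshotter CRDs are installed):

kubectl get csidrivers              # registered CSI drivers, e.g. rbd.csi.ceph.com
kubectl get csinodes                # per-node driver registration
kubectl get storageclasses
kubectl get volumesnapshotclasses   # requires the external-snapshotter CRDs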

ZFS implements CoW at the file (record) level, while Ceph implements CoW at the block level.

The ZFS CSI Driver (https://github.com/openebs/zfs-localpv/blob/master/pkg/driver/controller.go) implements the lifecycle-management methods for volumes and snapshots, for example (a StorageClass sketch follows the list):

  • CreateZFSVolume
  • CreateSnapClone
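
A minimal zfs-localpv StorageClass sketch (the poolname must be a zpool that already exists on the node; the names here are placeholders):

cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-zfspv
provisioner: zfs.csi.openebs.io
parameters:
  poolname: testpool        # placeholder: zpool on the node
  fstype: zfs
EOF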

The Ceph CSI Driver implements lifecycle management not only for volumes and snapshots but also for clones, so in theory the Ceph CSI Driver supports CoW as well (a CSI-level snapshot/clone sketch follows these links). See:
https://github.com/ceph/ceph-csi/blob/devel/internal/rbd/clone.go
https://github.com/ceph/ceph-csi/blob/devel/internal/rbd/driver.go
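
At the k8s API level the CoW path is exercised with a VolumeSnapshot and a PVC cloned from it via dataSource (a sketch, assuming the external-snapshotter CRDs, the csi-rbd-sc StorageClass from above and a VolumeSnapshotClass named csi-rbd-snapclass exist):

cat <<EOF | kubectl apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: rbd-pvc-snap
spec:
  volumeSnapshotClassName: csi-rbd-snapclass   # placeholder class for rbd.csi.ceph.com
  source:
    persistentVolumeClaimName: rbd-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc-clone
spec:
  storageClassName: csi-rbd-sc
  dataSource:
    name: rbd-pvc-snap
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 1Gi
EOF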
