In a Ceph cluster deployed with Rook, once CephFS is configured, reading and writing files under a fairly long directory path fails.
The symptoms are as follows.
With the ceph-fuse client:
root@ceph3:/mnt/test/volumes/kubernetes/kubernetes/kubernetes-dynamic-pvc-9adbb10c-f86a-11e8-96ff-9247c38478e0# cat fox
cat: fox: Input/output error
With the kernel client:
root@ceph3:/mnt/test/volumes/kubernetes/kubernetes/kubernetes-dynamic-pvc-9adbb10c-f86a-11e8-96ff-9247c38478e0# cat fox
cat: fox: File name too long
Problem analysis
Enable debug logging on the ceph-fuse client:
# ceph-fuse -m 10.10.15.89:6790,10.10.15.198:6790 /mnt/test/ -n client.admin --keyring=./yangguanjun/keyring --debug-client=20
Then check the log under /var/log/ceph/. It reports error -36 (rd_err = -36, wr_err = -36):
2018-12-05 18:42:04.908612 7fd5955bc700 3 client.5213 ll_read 0x565512bf8760 0x10000000007 0~4096
2018-12-05 18:42:04.910114 7fd599dc5700 10 client.5213 ms_handle_connect on 172.16.1.54:6800/33886
2018-12-05 18:42:04.910887 7fd5955bc700 10 client.5213 check_pool_perm on pool 2 ns fsvolumens_kubernetes-dynamic-pvc-9adbb10c-f86a-11e8-96ff-9247c38478e0 rd_err = -36 wr_err = -36
2018-12-05 18:42:04.910907 7fd5955bc700 10 client.5213 check_pool_perm on pool 2 ns fsvolumens_kubernetes-dynamic-pvc-9adbb10c-f86a-11e8-96ff-9247c38478e0 rd_err = -36 wr_err = -36
include/uapi/asm-generic/errno.h in the Linux source has the following definition:
#define ENAMETOOLONG 36 /* File name too long */
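The same lookup can be done without opening the kernel headers. Below is a minimal Python sketch that maps a logged error value to its symbolic name. It assumes it runs on Linux, where ENAMETOOLONG is 36, and the helper name describe_errno is made up for this illustration.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for an errno value.

    Ceph logs errors as negative numbers (rd_err = -36), so the sign is
    dropped before the lookup.
    """
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    # On Linux this should report ENAMETOOLONG; other platforms number
    # their errno values differently.
    print(describe_errno(-36))
```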
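Both ceph-fuse and the kernel client fail, so the error is not specific to either client. My guess was that it comes from the OSD side.
Searching the Ceph source, the OSD code has the following checks:
File: osd/PrimaryLogPG.cc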
/** do_op - do an op
 * pg lock will be held (if multithreaded)
 * osd_lock NOT held.
 */
void PrimaryLogPG::do_op(OpRequestRef& op)
{
  ...
  // object name too long?
  if (m->get_oid().name.size() > cct->_conf->osd_max_object_name_len) {
    dout(4) << "do_op name is longer than "
            << cct->_conf->osd_max_object_name_len
            << " bytes" << dendl;
    osd->reply_op_error(op, -ENAMETOOLONG);
    return;
  }
  if (m->get_hobj().get_key().size() > cct->_conf->osd_max_object_name_len) {
    dout(4) << "do_op locator is longer than "
            << cct->_conf->osd_max_object_name_len
            << " bytes" << dendl;
    osd->reply_op_error(op, -ENAMETOOLONG);
    return;
  }
  if (m->get_hobj().nspace.size() > cct->_conf->osd_max_object_namespace_len) {
    dout(4) << "do_op namespace is longer than "
            << cct->_conf->osd_max_object_namespace_len
            << " bytes" << dendl;
    osd->reply_op_error(op, -ENAMETOOLONG);
    return;
  }
  ...
}
These messages are logged at level 4, so raise the OSD debug level:
[root@rook-ceph-tools /]# ceph tell osd.0 config set debug_osd 5
Set debug_osd to 5/5
[root@rook-ceph-tools /]# ceph tell osd.1 config set debug_osd 5
Set debug_osd to 5/5
Then reproduce the problem and capture the OSD log:
# grep "longer than" *
rook-ceph-osd-0-85f5bf454f-64w7d-ceph3.log:2018-12-05 11:14:38.864707 7fbe34194700 4 osd.0 pg_epoch: 24 pg[2.15( empty local-lis/les=21/22 n=0 ec=20/20 lis/c 21/21 les/c/f 22/22/0 21/21/20) [0,1] r=0 lpr=21 crt=0'0 mlcod 0'0 active+clean] do_op namespace is longer than 64 bytes
rook-ceph-osd-0-85f5bf454f-64w7d-ceph3.log:2018-12-05 11:14:38.864853 7fbe3819c700 4 osd.0 pg_epoch: 24 pg[2.15( empty local-lis/les=21/22 n=0 ec=20/20 lis/c 21/21 les/c/f 22/22/0 21/21/20) [0,1] r=0 lpr=21 crt=0'0 mlcod 0'0 active+clean] do_op namespace is longer than 64 bytes
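When there are many OSD logs, the grep can be replaced by a small parser that reports which field was rejected and against which limit. The following Python sketch is one way to do it. The regular expression is written from the three dout(4) messages in do_op quoted above, the function name is hypothetical, and the sample line is abbreviated (the pg[...] details are cut) for readability.

```python
import re

# Matches the three "longer than" messages emitted by PrimaryLogPG::do_op.
TOO_LONG = re.compile(r"do_op (name|locator|namespace) is longer than (\d+) bytes")


def find_length_rejections(lines):
    """Yield (field, limit) for every 'longer than' message in the lines."""
    for line in lines:
        match = TOO_LONG.search(line)
        if match:
            yield match.group(1), int(match.group(2))


# Abbreviated copy of one log line from the OSD log.
sample = (
    "2018-12-05 11:14:38.864707 7fbe34194700 4 osd.0 pg_epoch: 24 "
    "pg[2.15( ... ) [0,1] r=0 lpr=21 crt=0'0 mlcod 0'0 active+clean] "
    "do_op namespace is longer than 64 bytes"
)

if __name__ == "__main__":
    for field, limit in find_length_rejections([sample]):
        print(f"rejected field: {field}, limit: {limit} bytes")
```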
The corresponding check in the code is:
if (m->get_hobj().nspace.size() > cct->_conf->osd_max_object_namespace_len)
Check the OSD configuration:
[root@rook-ceph-osd-0-85f5bf454f-64w7d-ceph3 ceph]# cat ceph.conf
...
osd max object name len = 256
osd max object namespace len = 64
...
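To see concretely why the request is rejected, the three length checks from do_op can be replayed against the namespace that appears in the client log. The Python sketch below is only a model of those checks, not Ceph code. The object name "10000000007.00000000" is an assumption derived from the inode number in the ll_read line, and the function name is invented for this example.

```python
# Limits taken from the ceph.conf shown above.
OSD_MAX_OBJECT_NAME_LEN = 256
OSD_MAX_OBJECT_NAMESPACE_LEN = 64


def too_long_fields(name, locator, namespace,
                    max_name=OSD_MAX_OBJECT_NAME_LEN,
                    max_namespace=OSD_MAX_OBJECT_NAMESPACE_LEN):
    """Return the fields that would be rejected with -ENAMETOOLONG."""
    rejected = []
    if len(name.encode()) > max_name:
        rejected.append("name")
    # The locator key is compared against the object-name limit as well.
    if len(locator.encode()) > max_name:
        rejected.append("locator")
    if len(namespace.encode()) > max_namespace:
        rejected.append("namespace")
    return rejected


# Namespace copied from the check_pool_perm lines in the client log.
namespace = "fsvolumens_kubernetes-dynamic-pvc-9adbb10c-f86a-11e8-96ff-9247c38478e0"

if __name__ == "__main__":
    print(len(namespace))
    # Assumed object name; the locator is left empty.
    print(too_long_fields("10000000007.00000000", "", namespace))
```

The namespace is 70 bytes long, which is over the 64-byte limit, while the object name is nowhere near 256 bytes. Only the namespace check fires.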
(o゜▽゜)o☆[BINGO!] Found the cause. So why is this setting there at all?
Rook's code contains the following, in pkg/daemon/ceph/osd/device.go:
func writeConfigFile(cfg *osdConfig, context *clusterd.Context, cluster *cephconfig.ClusterInfo, location string) error {
    cephConfig := cephconfig.CreateDefaultCephConfig(context, cluster, cfg.rootPath)
    if isBluestore(cfg) {
        cephConfig.GlobalConfig.OsdObjectStore = config.Bluestore
    } else {
        cephConfig.GlobalConfig.OsdObjectStore = config.Filestore
    }
    cephConfig.CrushLocation = location

    if cfg.dir || isFilestoreDevice(cfg) {
        // using the local file system requires some config overrides
        // http://docs.ceph.com/docs/jewel/rados/configuration/filesystem-recommendations/#not-recommended
        cephConfig.GlobalConfig.OsdMaxObjectNameLen = 256
        cephConfig.GlobalConfig.OsdMaxObjectNamespaceLen = 64
    }
    ...
}
So these two settings are added whenever the OSD is configured to use a directory or to use FileStore.
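The condition is worth spelling out, because it does not depend on the store type once the OSD lives in a directory. The Python sketch below is a simplified model of the `cfg.dir || isFilestoreDevice(cfg)` test, not Rook's implementation. The OsdConfig class is a hypothetical stand-in, and the body of isFilestoreDevice is not shown in the excerpt, so treating it as "device-backed and FileStore" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class OsdConfig:
    """Hypothetical stand-in for Rook's osdConfig; only two fields are modeled."""
    is_dir: bool      # OSD backed by a directory rather than a raw device
    store_type: str   # "bluestore" or "filestore"


def applies_length_overrides(cfg):
    """Model of the `cfg.dir || isFilestoreDevice(cfg)` condition.

    Assumption: isFilestoreDevice is true for a device-backed FileStore OSD.
    """
    is_filestore_device = (not cfg.is_dir) and cfg.store_type == "filestore"
    return cfg.is_dir or is_filestore_device


if __name__ == "__main__":
    for cfg in (OsdConfig(True, "bluestore"),
                OsdConfig(False, "bluestore"),
                OsdConfig(False, "filestore")):
        print(cfg, "->", applies_length_overrides(cfg))
```

Under this model a directory-backed OSD gets the 256/64 limits even when storeType is bluestore. Only a BlueStore OSD on a raw device escapes them.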
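Ceph's documentation on FileStore explains that on an ext4 file system the limit on xattr length restricts the use of FileStore. Users who are certain that their object names stay short can set the following two options to run FileStore on ext4.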
osd max object name len = 256
osd max object namespace len = 64
Reference: http://docs.ceph.com/docs/jewel/rados/configuration/filesystem-recommendations/#not-recommended
Now look at our own configuration. The OSD section of cluster.yaml is:
...
storage:
  useAllNodes: false
  useAllDevices: false
  deviceFilter:
  config:
    storeType: bluestore
  nodes:
  - name: ceph2
    devices:
    - name: sdb1
  - name: ceph3
    devices:
    - name: sdb1
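However, Rook does not support partitions when configuring OSDs. If a partition is specified, Rook's device check skips every disk that has partitions and then falls back to creating the OSD in the directory /var/lib/rook/osd<id>/, as shown here: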
# ll /var/lib/rook/osd0/
total 3344
drwxr--r-- 3 root root 4096 Dec 5 16:44 ./
drwxr-xr-x 5 root root 4096 Dec 5 16:44 ../
lrwxrwxrwx 1 root root 34 Dec 5 16:44 block -> /var/lib/rook/osd0/bluestore-block
lrwxrwxrwx 1 root root 31 Dec 5 16:44 block.db -> /var/lib/rook/osd0/bluestore-db
lrwxrwxrwx 1 root root 32 Dec 5 16:44 block.wal -> /var/lib/rook/osd0/bluestore-wal
-rw-r--r-- 1 root root 2 Dec 5 16:44 bluefs
-rw-r--r-- 1 root root 472432779264 Dec 5 19:22 bluestore-block
-rw-r--r-- 1 root root 1073741824 Dec 5 16:44 bluestore-db
-rw-r--r-- 1 root root 603979776 Dec 5 19:27 bluestore-wal
Solution
Temporary fix
Set osd_max_object_namespace_len to a larger value.
[root@rook-ceph-tools /]# ceph tell osd.0 config set osd_max_object_namespace_len 256
Set osd_max_object_namespace_len to 256
[root@rook-ceph-tools /]# ceph tell osd.1 config set osd_max_object_namespace_len 256
Set osd_max_object_namespace_len to 256
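With more than a couple of OSDs, typing the command once per OSD gets tedious. The Python sketch below only builds the argument lists for the same `ceph tell` command shown above and prints them. It runs nothing, and the function name and the list of OSD ids are assumptions for illustration.

```python
def namespace_len_commands(osd_ids, new_limit=256):
    """Build one `ceph tell` command per OSD; nothing is executed here."""
    return [
        ["ceph", "tell", f"osd.{osd_id}", "config", "set",
         "osd_max_object_namespace_len", str(new_limit)]
        for osd_id in osd_ids
    ]


if __name__ == "__main__":
    # OSD ids 0 and 1 match the two OSDs in this cluster.
    for argv in namespace_len_commands([0, 1]):
        print(" ".join(argv))
```

Each list could be handed to subprocess.run from inside the rook-ceph-tools pod if the commands should actually be executed.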
Recommended fix
Have the Ceph OSDs use BlueStore, and in the cluster.yaml of the Rook Ceph cluster point each OSD at a whole disk.
...
storage:
  useAllNodes: false
  useAllDevices: false
  deviceFilter:
  config:
    storeType: bluestore
  nodes:
  - name: ceph2
    devices:
    - name: sdb
  - name: ceph3
    devices:
    - name: sdb
Note: the specified disk must have a GPT header created, and all of its partitions must be deleted.
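A mistake like sdb1 instead of sdb is easy to catch before the cluster is deployed. The Python sketch below flags device names in the storage section that look like partitions. It is a heuristic based on common Linux naming (sdb1, nvme0n1p1) and is not how Rook itself detects partitions. The storage section is written as a plain dict to avoid a YAML dependency, and all function names are hypothetical.

```python
import re

# Assumed naming patterns for partitions: sdb1, vdc2, nvme0n1p1, mmcblk0p1.
PARTITION_PATTERNS = (
    re.compile(r"^(?:sd|vd|hd|xvd)[a-z]+\d+$"),
    re.compile(r"^(?:nvme\d+n\d+|mmcblk\d+)p\d+$"),
)


def looks_like_partition(device_name):
    """Heuristic: True if the name follows a common partition pattern."""
    return any(p.match(device_name) for p in PARTITION_PATTERNS)


def partition_entries(storage):
    """Return (node, device) pairs whose device name looks like a partition."""
    return [
        (node["name"], dev["name"])
        for node in storage.get("nodes", [])
        for dev in node.get("devices", [])
        if looks_like_partition(dev["name"])
    ]


# The original, problematic storage section expressed as a dict.
storage = {
    "nodes": [
        {"name": "ceph2", "devices": [{"name": "sdb1"}]},
        {"name": "ceph3", "devices": [{"name": "sdb1"}]},
    ]
}

if __name__ == "__main__":
    for node, device in partition_entries(storage):
        print(f"{node}: {device} looks like a partition, use the whole disk")
```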