Install Lustre

Prepare Installation

Set SELinux to disabled on the MDS/MGS and OSS nodes. Lustre will not work if SELinux is enforcing or even permissive (see the error at the bottom of this page).
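
A minimal way to do that on each server (a sketch; it assumes a standard CentOS 6 layout of /etc/selinux/config):

sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
reboot
# after the reboot, getenforce should print "Disabled"
getenforce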

On all nodes, add the Intel repositories by creating /etc/yum.repos.d/lustreintel.repo:

[lustre-server]
name=CentOS-$releasever - Lustre
baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el6/server/
gpgcheck=0

[e2fsprogs]
name=CentOS-$releasever - Ldiskfs
baseurl=https://downloads.hpdd.intel.com/public/e2fsprogs/latest/el6/
gpgcheck=0

[lustre-client]
name=CentOS-$releasever - Lustre
baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el6/client/
gpgcheck=0
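
To check that yum sees the new repositories (assuming the node can reach downloads.hpdd.intel.com):

yum clean all
yum repolist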

Manual install

MDS and MGS

Install the needed RPMs, here for Lustre 2.7.0:

 yum install -y kernel-*lustre kernel-firmware*lustre lustre-modules libss libcom_err e2fsprogs e2fsprogs-libs lustre-osd-ldiskfs lustre-osd-ldiskfs-mount lustre
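
These packages include a Lustre-patched kernel (note the _lustre suffix in the version). Before going further, make sure the node is actually running it, rebooting first if needed (this assumes the patched kernel became the default boot entry):

uname -r
# expected: something like 2.6.32-504.8.1.el6_lustre.x86_64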

Then prepare the network and edit /etc/modprobe.d/lnet.conf. Here the network used for data is 172.16.0.x and it is reached through interface eth5:

options lnet ip2nets="tcp0(eth5) 172.16.0.*"

Note: if you are using InfiniBand and multirail with two IPs, you can use the following configuration (assuming the two IB networks are 172.20 and 172.21):

options lnet ip2nets="o2ib0(ib0,ib1) 172.[20-21].*"

Format the partition used for the MDT/MGS:

mkfs.lustre --reformat --fsname=toto --mgs --mdt --index=0 /dev/sdb
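
To double-check what was written to the target, you can dump its parameters; --dryrun only reports, it does not modify the disk:

tunefs.lustre --dryrun /dev/sdb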

Create mount point :

mkdir /mnt/mdt

Activate lnet :

modprobe -v lnet
lctl net up
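
To verify that LNet came up with the expected NID (it should list the 172.16.0.x address on tcp):

lctl list_nids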

Mount :

mount -t lustre /dev/sdb /mnt/mdt/
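
The MGS and MDT should now appear in the local Lustre device list:

lctl dl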

OSS

Install needed packages :

yum install -y kernel-*lustre kernel-firmware*lustre lustre-modules libss libcom_err e2fsprogs e2fsprogs-libs lustre-osd-ldiskfs lustre-osd-ldiskfs-mount lustre

Edit /etc/modprobe.d/lnet.conf

options lnet ip2nets="tcp0(eth5) 172.16.0.*"

Format the partition used for the OST:

mkfs.lustre --reformat --ost --fsname=toto --mgsnode=mds1@tcp0 --index=0 /dev/sdb
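
If the OSS serves several disks, format each one the same way with its own index, for example (sdc here is just an illustration):

mkfs.lustre --reformat --ost --fsname=toto --mgsnode=mds1@tcp0 --index=1 /dev/sdc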

Create mount point :

mkdir /ostoss_mount

Activate lnet :

modprobe -v lnet
lctl net up

Check that LNet can reach the MDS/MGS:

lctl ping mds1@tcp0

Mount :

mount -t lustre /dev/sdb /ostoss_mount

Client

Edit /etc/modprobe.d/lnet.conf

options lnet ip2nets="tcp0(eth5) 172.16.0.*"

We want our client to keep its kernel up to date, so we need to build the Lustre modules manually for the current kernel. Download lustre-client-<yourlustrerelease>.src.rpm:

wget https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el6/client/SRPMS/lustre-client-2.7.0-2.6.32_504.8.1.el6.x86_64.src.rpm

Install needed packages :

yum install kernel-devel rpm-build make
yum install libtool libselinux-devel

Try to build :

rpmbuild --rebuild --without servers lustre-client-2.7.0-2.6.32_504.8.1.el6.x86_64.src.rpm 
Installing lustre-client-2.7.0-2.6.32_504.8.1.el6.x86_64.src.rpm
warning: user jenkins does not exist - using root
warning: group jenkins does not exist - using root
warning: user jenkins does not exist - using root
warning: group jenkins does not exist - using root
sed: can't read /lib/modules/2.6.32-573.el6.x86_64/build/include/linux/version.h: No such file or directory
sed: can't read /lib/modules/2.6.32-573.el6.x86_64/build/include/linux/version.h: No such file or directory
sed: can't read /lib/modules/2.6.32-573.el6.x86_64/build/include/linux/version.h: No such file or directory
sed: can't read /lib/modules/2.6.32-573.el6.x86_64/build/include/linux/version.h: No such file or directory
error: line 98: Empty tag: Release:

If you get this error, you need to manually point the build at the kernel sources:

ln -s /usr/src/kernels/2.6.32-573.7.1.el6.x86_64 /usr/src/kernels/2.6.32-573.el6.x86_64

Now rebuild :

rpmbuild --rebuild --without servers lustre-client-2.7.0-2.6.32_504.8.1.el6.x86_64.src.rpm

The built RPMs are now available in ~/rpmbuild/RPMS/x86_64/.

Install them locally :

yum localinstall lustre-client-modules-2.7.0-2.6.32_573.7.1.el6.x86_64.x86_64.rpm lustre-client-2.7.0-2.6.32_573.7.1.el6.x86_64.x86_64.rpm

Reboot. Now you can mount the FS :

mkdir /mnt/lustre
mount -t lustre mds1@tcp0:/toto /mnt/lustre
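
To check that the client sees the MDT and OST, and to make the mount persistent across reboots, something like this works (the fstab line is a sketch; _netdev delays the mount until the network is up):

lfs df -h
echo "mds1@tcp0:/toto /mnt/lustre lustre defaults,_netdev 0 0" >> /etc/fstab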

Install using shine

Shine is a tool from the French CEA that is very useful to automate Lustre installation and management. Shine helps to format and configure the MGS/MDS/OSS. You only need to provide servers with the Lustre packages and kernel already installed, and LNet up. In this example, the MGS and MDT use separate disks (sdb and sdc) on the same node.

Install shine manually :

yum localinstall shine-1.4-1.el6.noarch.rpm --nogpgcheck
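
Shine is built on ClusterShell; if yum complains about a missing clustershell dependency, install it first (assuming EPEL or a local RPM provides it):

yum install -y clustershell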

Now edit /etc/shine/shine.conf as follows:

#
#   Configuration file for shine, a Lustre administration utility.
#   See "man shine.conf" for a complete description of this file.
#

#
# Default LMF files (.lmf) search directory.
#
lmf_dir=/etc/shine/models

#
# File describing Lustre tuning parameters
# (to enable, create the file and uncomment the following line)
#
#tuning_file=/etc/shine/tuning.conf

#
# Directory path where to store generated configuration files.
#
conf_dir=/var/cache/shine/conf


#
# The Lustre version managed by Shine.
#
#ie: lustre_version=1.8.5

lustre_version=2.7.0

#
# CLI
#

# Tell if colored output should be used on command line interface.
#
color=auto


#
# BACKEND 
#

#
# Backend system to use. Its primary goal is to centralize devices information,
# but it can also be used to store Lustre file systems states.
# Match internal backend module name:
# 
#   Shine.Configuration.Backend.*
# 
# Possible values are (case-sensitive):
#
#   None        No backend. Each model file provides required devices
#               information. Recommended for simple, small configs (default).
#
#   File        Built-in File backend: a single file (see storage_file below)
#               centralizes Lustre devices information (cluster-wide).
#               Highly recommended if you plan to install more than one Lustre
#               file system.
#
#   ClusterDB   Bull ClusterDB (proprietary external database backend).
#
backend=None

#
# The location of the target storage information file.
#
#storage_file=/etc/shine/storage.conf

#
# Directory used for cached status information.
#
#status_dir=/var/cache/shine/status


#
# TIMEOUTS and FANOUT
#

# Timeout in seconds given to ConnectTimeout parameter of ssh
#
#ssh_connect_timeout=30

# Maximum number of simultaneous local commands and remote connections.
# (default is ClusterShell default fanout).
#
#ssh_fanout=64


#
# COMMANDS
#

# Additional paths to look for Lustre or ldiskfs specific commands.
#
#command_path=/usr/lib/lustre

Now let's create the configuration file for our filesystem, starting from the example model:

cd /etc/shine/models
cp example.lmf myfs.lmf

And edit myfs.lmf as follows:

#
# This is an example of lustre model file. It contains a set of
# configuration directives to install a simple Lustre filesystem.
#
# $Id$

### Section 1 - Required directives

# fs_name
# The Lustre filesystem name (8 characters max).
fs_name: myfilesystemforsphen

# nid_map
# Hosts to Lnet NIDs mapping.
#
# Use multiple lines with the same nodes if you have several nids for
# the same machines.
nid_map: nodes=mds1 nids=mds1@tcp0
nid_map: nodes=oss3 nids=oss3@tcp0
nid_map: nodes=client1 nids=client1@tcp0

# mount_path
# Default clients mount point path.
# Some variables could be used in 'mount_path', see -Path Variables- below.
mount_path: /example


### Section 2 - Target definitions

# Defines your Lustre filesystem targets.
#
# mgt|mdt|ost: [ tag=<RegExp> ] [ node=<RegExp> ] [ dev=<RegExp> ] 
#              [ index=<RegExp> ] [ jdev=<RegExp> ] [ ha_node=<RegExp> ]
#              [ group=<RegExp> ] [ mode={external|managed} ]
#              [ network=<RegExp> ] [ active={yes|nocreate|no|manual} ]
#
# Here, we don't use any backend (no File nor ClusterDB), so we have to
# fully describe our targets (no RegExp accepted). For this simple
# example, only minimum target information is provided.

# mgt
# Management Target
mgt: node=mds1 dev=/dev/sdb

# mdt
# MetaData Target
mdt: node=mds1 dev=/dev/sdc

# ost
# Object Storage Target(s)
ost: node=oss3 dev=/dev/sdb

# client
# FS clients definition. Like targets, use multiple lines if you want.
client: node=client1

# Sometimes it is needed for some nodes to mount this FS on a different
# mount point (not the default mount_path). In that case, use the
# optional client parameter mount_path.
# Some variables could be used in 'mount_path', see -Path Variables- below.
# 

# Also, to override default client mount options, add the following
# mount_options inline option:


### Section 3 - Additional directives

# description
# Optional FS description
description: Example Lustre Filesystem

# stripe_size
# Specify the stripe size in bytes. Default is 1048576 (1M)
stripe_size: 1048576

# stripe_count
# Specify the number of OSTs each file should be striped over.
# If not defined, no explicit value is used and Lustre will apply its default behaviour.
#stripe_count: 1

#
# mgt_format_params:
# mdt_format_params:
# ost_format_params:
#
# Optional argument that will be used by mkfs.lustre for a target. Default is
# no option.
#
# ie: disable failover mode and enable failout instead
# mdt_format_params: failover.mode=failout
# ost_format_params: failover.mode=failout

#
# mgt_mkfs_options:
# mdt_mkfs_options:
# ost_mkfs_options:
#
# Optional argument for --mkfsoptions, by target type. You can use ext3 format
# options here. Defaults is no option.
# ie: do not reserve blocks for super-user.
mgt_mkfs_options: -m 0
mdt_mkfs_options: -m 0
ost_mkfs_options: -m 0

#
# mgt_mount_options:
# mdt_mount_options:
# ost_mount_options:
#
# Optional argument used when starting a target. Default is no options.

# ie: Enable ACL for MDT
mdt_mount_options: acl


# mount_options
# This define the default options to mount the filesystem on clients.
mount_options: 

#
# Quota
#
# Enable quota support.
# In lustre 2.4 and above, all quota options described here are ignored
# Possible values are yes or no (default is no).
quota: no

# Quota configuration options
# Describe options for quota support, if quota enabled.
#
# quota_type:     (default is 'ug')
# quota_iunit:    <number of inodes>
# quota_bunit:    <size in MB>
# quota_itune:    <percentage>
# quota_btune:    <percentage>

# Target Mount Path patterns
#
# -Path Variables-
# Several variables could be used within these paths:
#  $fs_name:  Filesystem name defined in 'fs_name:'
#  $label:    Component label (ie: foo-OST0002)
#  $type:     Component type ('mdt', 'mgt', 'ost', 'router', 'client')
#  $index:    Target index, in decimal (ie: 1, 2, 36, ...) or in hex (ie. 0x2, 0xa5, 0x00FA)
#  $dev:      Base name of target device path (ie: sdc)
#  $jdev:     Base name of target journal device path
mgt_mount_path: /mnt/$fs_name/mgt
mdt_mount_path: /mnt/$fs_name/mdt/$index
ost_mount_path: /mnt/$fs_name/ost/$index

#
# Routers
#

# nova7 and nova8 are declared as LNET routers.

Now deploy the configuration :

shine install -m /etc/shine/models/myfs.lmf 

If OK, format and mount :

shine format -f myfilesystemforsphen
shine mount -f myfilesystemforsphen
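
To check the state of all targets and clients afterwards:

shine status -f myfilesystemforsphen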

To stop it, unmount the clients and then stop the targets:

shine umount -f myfilesystemforsphen
shine stop -f myfilesystemforsphen

Errors encountered

From dmesg :

INFO: task mount.lustre:1316 blocked for more than 120 seconds.
      Tainted: G        W  ---------------    2.6.32-504.8.1.el6_lustre.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mount.lustre  D 0000000000000000     0  1316   1315 0x00000080
 ffff88001cc6fc70 0000000000000086 00000000d91c4366 0000000000000000
 00000000ffffffea ffff88001cc6fc78 0000000000000000 ffff88001cc6fc48
 0000000000000282 ffff88001cc6fc48 ffff88001fb2c5f8 ffff88001cc6ffd8
Call Trace:
 [<ffffffff8152d3a5>] rwsem_down_failed_common+0x95/0x1d0
 [<ffffffff8152d536>] rwsem_down_read_failed+0x26/0x30
 [<ffffffff81299334>] call_rwsem_down_read_failed+0x14/0x30
 [<ffffffff8152ca34>] ? down_read+0x24/0x30
 [<ffffffffa040594c>] ldiskfs_quota_off+0x1bc/0x1f0 [ldiskfs]
 [<ffffffff812398aa>] ? selinux_sb_copy_data+0x14a/0x1e0
 [<ffffffff81190d26>] deactivate_locked_super+0x46/0x90
 [<ffffffff81190e7d>] vfs_kern_mount+0x10d/0x1b0
 [<ffffffff81190f92>] do_kern_mount+0x52/0x130
 [<ffffffff811b2b9b>] do_mount+0x2fb/0x930
 [<ffffffff811b3260>] sys_mount+0x90/0xe0
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
[root@mds1 ~]# mount -t lustre /dev/sdb /mnt/mdt/
mount.lustre: Unable to mount /dev/sdb: Invalid argument

mount.lustre FATAL: failed to write local files: Invalid argument
mount.lustre: increased /sys/block/sdb/queue/max_sectors_kb from 1024 to 32767
mount.lustre: mount /dev/sdb at /mnt/mdt failed: Invalid argument
This may have multiple causes.
Are the mount options correct?
Check the syslog for more info.

In syslog :

LDISKFS-fs (sdb): Unrecognized mount option "context=unconfined_u:object_r:user_tmp_t:s0" or missing value
Lustre: Lustre: Build Version: 2.7.0-RC4--PRISTINE-2.6.32-504.8.1.el6_lustre.x86_64

You need to disable SELinux (not just permissive; it must be disabled). Edit /etc/selinux/config, change enforcing/permissive to disabled, and reboot. Then reformat the target and you can mount:

mkfs.lustre --reformat --fsname=mylustrefs --mgs --mdt --index=0 /dev/sdb
mount -t lustre /dev/sdb /mnt/mdt/