Prepare install

Useful commands

  • Load a kernel module : modprobe mymodule -v
  • Unload a kernel module : modprobe -r mymodule
  • List loaded kernel modules : lsmod
  • Check listening processes and ports used : netstat -aut
  • Get hardware information (use --help for more details) : inxi
  • Check network configuration : ip add
  • Open a screen : screen -S sphen
  • List screens : screen -ls
  • Join a screen : screen -x sphen
  • Detach a screen : press Ctrl+a, then d
  • Change keyboard language in current terminal : loadkeys fr (azerty), loadkeys us (qwerty)
  • Remount / when in read only (often in recovery mode) : mount -o remount,rw /
  • Apply a patch on a file : patch myfile.txt < mypatch.txt
  • Create a patch from the original and modified files : diff -Naur original.txt modified.txt > mypatch.txt
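The last two items work together: diff produces the patch and patch applies it. Below is a minimal sketch of the diff half, using throwaway files in a temporary directory (the file contents and the workdir variable are made up for illustration). Note that diff exits with status 1 when the files differ, which is expected here and not an error.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)

printf 'line one\nline two\n' > "$workdir/original.txt"
printf 'line one\nline two changed\n' > "$workdir/modified.txt"

# diff exits 1 when the files differ; record the status instead of failing.
status=0
diff -Naur "$workdir/original.txt" "$workdir/modified.txt" > "$workdir/mypatch.txt" || status=$?

echo "diff exit status: $status"
cat "$workdir/mypatch.txt"
```

The resulting mypatch.txt can then be applied to a copy of the original file with the patch command from the list above.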

IPMI commands for remote control :

  • Choose the boot device for the next boot, very useful on systems that are slow to boot (bios can be replaced with pxe, cdrom or disk) : ipmitool -I lanplus -H bmc5 -U user -P password chassis bootdev bios
  • Make boot persistent : ipmitool -I lanplus -H bmc5 -U user -P password chassis bootdev disk options=persistent
  • Control power (reset can be replaced with soft, cycle, off or on) : ipmitool -I lanplus -H bmc5 -U user -P password chassis power reset
  • Activate remote console (for BIOS for example, use Enter, then & then . to exit) : ipmitool -H bmc5 -U user -P password -I lanplus -e \& sol activate

More: https://support.pivotal.io/hc/en-us/articles/206396927-How-to-work-on-IPMI-and-IPMITOOL
Note: when using sol activate, if the keyboard does not work, try running the same command inside a screen session; this may solve the issue.
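Since every ipmitool call above repeats the same connection options, a small wrapper can save typing. The sketch below is only an illustration: the function name ipmi_cmd, the variables IPMI_USER and IPMI_PASSWORD, and the host names bmc6 and bmc7 are hypothetical. It prints the command line instead of running it, so it can be reviewed first.

```shell
# Hypothetical helper: prints the ipmitool command line rather than running it.
# User and password default to the placeholders used in this page and can be
# overridden through IPMI_USER / IPMI_PASSWORD (names chosen for this sketch).
ipmi_cmd() {
    ipmi_host="$1"
    shift
    printf 'ipmitool -I lanplus -H %s -U %s -P %s %s\n' \
        "$ipmi_host" "${IPMI_USER:-user}" "${IPMI_PASSWORD:-password}" "$*"
}

# Print the "chassis power status" query for three example BMC host names.
for bmc in bmc5 bmc6 bmc7; do
    ipmi_cmd "$bmc" chassis power status
done
```

Once the printed lines look right, they can be executed by piping them into sh.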

Clush usage :

  • To run a command on all nodes : clush -bw node1,node[4-5] "hostname"
  • To copy a file to all nodes : clush -w node1,node[4-5] --copy /root/slurm.conf --dest=/etc/slurm/slurm.conf
  • To replace a string in a file on all nodes : clush -bw compute1[34-67] 'sed -i "s/10.0.0.1/nfsserver/g" /etc/fstab'
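Before pushing a sed -i edit to every node, the expression can be tried on a local sample. Below is a minimal sketch, assuming an invented fstab line. It also escapes the dots in the IP address, since an unescaped dot matches any character in a sed pattern.

```shell
# Try the substitution on a throwaway file first (the fstab line is invented).
sample=$(mktemp)
printf '10.0.0.1:/home /home nfs defaults 0 0\n' > "$sample"

# Escaped dots match only literal dots.
sed -i 's/10\.0\.0\.1/nfsserver/g' "$sample"
cat "$sample"
```

If the output is what you expect, the same expression can be passed to clush in place of the one shown above.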

Download everything

Download the Centos 7 ISO images. You will need both the minimal ISO and the everything ISO.

Optional: if you want to test in VMs with an easy GUI, download VirtualBox from https://www.virtualbox.org/wiki/Linux_Downloads

Now we need to get some rpm: phpldapadmin (optional), nagios (optional), munge and slurm. You can build them manually using the method described below. (Note that building nagios is tricky. The build steps below make it possible to use the latest version, but you can also use the slightly older one provided in Centos or EPEL. If you choose that route, download the nagios rpm from the Centos/EPEL repositories using the same method as the one used here to download the phpldapadmin rpm.)


I also provide all needed rpm on my personal repository, and I hope you can use them without trouble. Some are built directly from source tarballs, others from src.rpm files from the Fedora project. Note also that all rpm are built with the gcc (and gmp/mpc/mpfr) present in the repository (here 6.3.0), except gmp, mpc, mpfr and gcc itself, which are built with the Centos/RHEL default gcc. Binaries are built on an i5-4670K (Haswell). If you need a specific rpm (other prefix, other architecture), feel free to ask.

Go to sphenisc repository ヾ(o✪‿✪o)シ



Previously built rpms (until the repository is up):
Munge:

munge-0.5.12-1.el7.centos.x86_64.rpm
munge-devel-0.5.12-1.el7.centos.x86_64.rpm
munge-libs-0.5.12-1.el7.centos.x86_64.rpm

munge-0.5.11-1.el7.centos.x86_64.rpm
munge-devel-0.5.11-1.el7.centos.x86_64.rpm
munge-libs-0.5.11-1.el7.centos.x86_64.rpm

Slurm:

slurm-16.05.2-1.el7.centos.x86_64.rpm
slurm-devel-16.05.2-1.el7.centos.x86_64.rpm
slurm-munge-16.05.2-1.el7.centos.x86_64.rpm
slurm-plugins-16.05.2-1.el7.centos.x86_64.rpm

slurm-14.11.8-1.el7.centos.x86_64.rpm
slurm-devel-14.11.8-1.el7.centos.x86_64.rpm
slurm-munge-14.11.8-1.el7.centos.x86_64.rpm
slurm-plugins-14.11.8-1.el7.centos.x86_64.rpm

Nagios:

nagios-4.1.1-2.el7.centos.x86_64.rpm
nagios-contrib-4.1.1-2.el7.centos.x86_64.rpm
nagios-devel-4.1.1-2.el7.centos.x86_64.rpm
nagios-plugins-2.1.1-1.x86_64.rpm
nrpe-2.15-1.x86_64.rpm
nrpe-plugin-2.15-1.x86_64.rpm

Download phpldapadmin

First, we need to download phpldapadmin from the EPEL repository. Again, we assume the cluster cannot reach the Internet, so the rpm must be downloaded before the install.

Install a VM with Centos 7.2, and add the EPEL repository using:

yum install epel-release

Then download the phpldapadmin rpm and its dependency (the lighttpd rpm) using:

yum install --downloadonly --downloaddir=. phpldapadmin

Keep these RPM, we will use them later.
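Because these files will be carried to a machine without Internet access, it may be worth recording checksums so a damaged copy is noticed before the install. The sketch below is only a suggestion and not part of the original procedure. The function names write_manifest and check_manifest and the file name SHA256SUMS are chosen for illustration.

```shell
# Hypothetical helper: write a SHA256SUMS manifest for every rpm in a directory.
write_manifest() {
    ( cd "$1" && sha256sum -- *.rpm > SHA256SUMS )
}

# Hypothetical helper: re-check the files against the manifest after copying.
# Returns non-zero if any file is missing or has changed.
check_manifest() {
    ( cd "$1" && sha256sum -c SHA256SUMS )
}
```

For example, run write_manifest . in the download directory, copy the whole directory to the USB key, then run check_manifest on the copy.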

Build base

Because the cluster will be installed without Internet access, a few things need to be built first. Use another computer and install Centos 7.2 on it. Be careful: all of these packages must be compiled on the same Centos version (7.2) and architecture as the target cluster, here x86_64. You can use a VM with Centos 7.2 and the iso downloaded earlier to build the following packages.

To build rpm, install rpm-build along with the compiler and development libraries needed below:

yum install rpm-build wget make gcc zlib-devel bzip2-devel openssl-devel

munge

Download munge 0.5.11:

wget https://github.com/dun/munge/releases/download/munge-0.5.11/munge-0.5.11.tar.bz2

The spec file is broken: two files are missing from it, so you have to extract the archive, patch the spec, and repack:

tar xvjf munge-0.5.11.tar.bz2
cd munge-0.5.11

Then edit munge.spec by applying the following patch on it. To do so, create a patch.txt file containing the following:

--- munge.spec.old      2016-06-10 10:06:04.435748124 -0400
+++ munge.spec  2016-06-10 10:06:35.753430513 -0400
@@ -167,6 +167,8 @@
 %{_bindir}/*
 %{_sbindir}/*
 %{_mandir}/*[^3]/*
+%{_prefix}/lib/systemd/system/munge.service
+%{_prefix}/lib/tmpfiles.d/munge.conf

 %files devel
 %defattr(-,root,root,0755)

And patch spec file using :

patch munge.spec < patch.txt

Now repack and build rpm :

cd ../
tar cvzf mymunge.tar.gz munge-0.5.11
rpmbuild -ta mymunge.tar.gz

After this step, you will have your munge rpm in ./rpmbuild/RPMS/x86_64. Use this command to see them:

 find . -name "*.rpm"

This shows:

./rpmbuild/RPMS/x86_64/munge-0.5.11-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/munge-devel-0.5.11-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/munge-libs-0.5.11-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/munge-debuginfo-0.5.11-1.el7.centos.x86_64.rpm
./rpmbuild/SRPMS/munge-0.5.11-1.el7.centos.src.rpm

Keep these rpm, you will need them later. To build the slurm rpm, first install these munge rpm locally, since slurm needs them:

yum localinstall ./rpmbuild/RPMS/x86_64/munge*.rpm
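The rpmbuild -ta step used for munge looks for a spec file inside the tarball, so a repacked archive without one will not build. Below is a minimal sketch of a check that can be run before rpmbuild. The function name has_spec is hypothetical, and the demo archive is built from an empty placeholder spec in a temporary directory.

```shell
# Hypothetical helper: succeed only if the gzip-compressed tarball lists a .spec file.
has_spec() {
    tar tzf "$1" | grep -c '\.spec$' > /dev/null
}

# Demo on a throwaway archive that mimics the repacked munge tarball.
demo=$(mktemp -d)
mkdir "$demo/munge-0.5.11"
: > "$demo/munge-0.5.11/munge.spec"
tar czf "$demo/mymunge.tar.gz" -C "$demo" munge-0.5.11

if has_spec "$demo/mymunge.tar.gz"; then
    echo "spec file found, archive is ready for rpmbuild -ta"
fi
```

The same check applies to any modified archive that is repacked by hand before being given to rpmbuild -ta.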

slurm

Download slurm 14.11.8:

wget http://www.schedmd.com/download/archive/slurm-14.11.8.tar.bz2

As for munge, we will build the rpm. The build needs the munge packages installed above and a few others:

yum install mysql-devel perl-ExtUtils-MakeMaker perl pam-devel readline-devel

Now build the rpm:

rpmbuild -tb slurm-14.11.8.tar.bz2

Like before, slurm rpm can be found easily using:

find . -name "*.rpm"
./rpmbuild/RPMS/x86_64/slurm-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-perlapi-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-devel-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-munge-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-slurmdbd-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-sql-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-plugins-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-torque-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-sjobexit-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-slurmdb-direct-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-sjstat-14.11.8-1.el7.centos.x86_64.rpm
./rpmbuild/RPMS/x86_64/slurm-pam_slurm-14.11.8-1.el7.centos.x86_64.rpm

Keep these rpm.
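To gather the rpm to keep in one place, something like the following could be used. This is a sketch only: the function name collect_rpms and the destination directory are hypothetical, and it assumes you do not want the debuginfo or source rpm on the cluster.

```shell
# Hypothetical helper: copy binary rpms from a build tree into one directory,
# skipping debuginfo packages and source rpms (*.src.rpm).
# The destination should be outside the build tree.
collect_rpms() {
    rpm_src="$1"
    rpm_dest="$2"
    mkdir -p "$rpm_dest"
    find "$rpm_src" -type f -name '*.rpm' \
        ! -name '*-debuginfo-*' ! -name '*.src.rpm' \
        -exec cp {} "$rpm_dest"/ \;
}

# Example call (paths are illustrative):
#   collect_rpms ./rpmbuild ./cluster-rpms
```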

nagios (optional)

Nagios is made of 3 parts here: nagios, nagios-plugins, and nrpe. We will build an rpm for each. The aim is to use the latest version available. To avoid these compilation steps, you can use the slightly older versions provided in Centos or EPEL.

Nagios is tricky to build, so stay close to the instructions provided here.

First, download needed archives:

wget http://downloads.sourceforge.net/project/nagios/nagios-4.x/nagios-4.1.1/nagios-4.1.1.tar.gz
wget http://nagios-plugins.org/download/nagios-plugins-2.1.1.tar.gz
wget https://sourceforge.net/projects/nagios/files/nrpe-2.x/nrpe-2.15/nrpe-2.15.tar.gz

Install needed packages, we will need them to compile nagios.

yum install make wget httpd php gcc glibc glibc-common gd gd-devel php-gd net-snmp* openssl openssl-devel xinetd doxygen gperf

Extract nagios archive, we will edit spec file:

tar xvzf nagios-4.1.1.tar.gz 
cd nagios-4.1.1

Apply the following patch on nagios.spec, in order to also include webconf into the rpm, and to modify default command group. To do so, create a patch.txt file, and fill it with:

--- nagios-4.1.1/nagios.spec    2015-08-19 17:49:52.000000000 -0400
+++ nagios-4.1.1up/nagios.spec  2016-06-10 07:06:23.952849903 -0400
@@ -87,8 +87,7 @@
     --sbindir="%{_libdir}/nagios/cgi" \
     --sysconfdir="%{_sysconfdir}/nagios" \
     --with-cgiurl="/nagios/cgi-bin" \
-    --with-command-user="apache" \
-    --with-command-group="apache" \
+    --with-command-group="nagcmd" \
     --with-gd-lib="%{_libdir}" \
     --with-gd-inc="%{_includedir}" \
     --with-htmurl="/nagios" \
@@ -166,7 +165,13 @@
 find %{buildroot}/%{_libdir}/nagios/cgi -type f -print | sed s!'%{buildroot}'!!g | egrep -ve "($CGI)" > cgi.files
 find %{buildroot}/%{_libdir}/nagios/cgi -type f -print | sed s!'%{buildroot}'!!g | egrep "($CGI)" > contrib.files

+### Install webconf

+%{__make} install-webconf \
+    DESTDIR="%{buildroot}" \
+    INSTALL_OPTS="" \
+    COMMAND_OPTS="" \
+    INIT_OPTS=""

 %pre
 if ! /usr/bin/id nagios &>/dev/null; then

Then apply this patch on nagios spec file, using:

patch nagios.spec < patch.txt

Now create an archive with this, and build rpm:

cd ../
tar cvzf nagios-4.1.1-modified.tar.gz nagios-4.1.1
rpmbuild -ta nagios-4.1.1-modified.tar.gz

Then we will build nagios-plugins rpm.

Extract tarball

tar xvzf nagios-plugins-2.1.1.tar.gz
cd nagios-plugins-2.1.1

Again we will patch the spec file, nagios-plugins.spec, here to remove some features we don't need and that would require installing many other rpm, to modify the configure arguments, and to work around a compilation bug. Proceed with the same method as before, and use the following patch:

--- nagios-plugins-2.1.1/nagios-plugins.spec    2015-07-31 13:48:09.000000000 -0400
+++ nagios-plugins-2.1.1up/nagios-plugins.spec  2016-06-10 06:18:49.161898523 -0400
@@ -77,23 +77,23 @@
 Requires:      postgresql-libs
 Requires:      procps
 Requires:      python
-Requires:      samba-client
+#Requires:     samba-client
 Requires:      shadow-utils
 Requires:      traceroute
 Requires:      /usr/bin/mailq
 BuildRequires: bind-utils
 BuildRequires: coreutils
 BuildRequires: iputils
-BuildRequires: mysql-devel
+#BuildRequires:        mysql-devel
 BuildRequires: net-snmp-utils
 BuildRequires: net-tools
 BuildRequires: ntp
-BuildRequires: openldap-devel
+#BuildRequires:        openldap-devel
 BuildRequires: openssh-clients
 BuildRequires: openssl-devel
-BuildRequires: postgresql-devel
+#BuildRequires:        postgresql-devel
 BuildRequires: procps
-BuildRequires: samba-client
+#BuildRequires:        samba-client
 BuildRequires: /usr/bin/mailq
 %endif

@@ -120,6 +120,9 @@
 --libexecdir=%{_libexecdir} \
 --sysconfdir=%{_sysconfdir} \
 --datadir=%{_datadir} \
+--with-nagios-user=nagios \
+--with-nagios-group=nagios \
+--without-dbi --without-radius --without-ldap \
 --with-cgiurl=/nagios/cgi-bin
 ls -1 %{npdir}/plugins > %{npdir}/ls-plugins-before
 ls -1 %{npdir}/plugins-root > %{npdir}/ls-plugins-root-before
@@ -172,7 +175,6 @@
 comm -13 %{npdir}/ls-plugins-scripts-before %{npdir}/ls-plugins-scripts-after | egrep -v "\.o$|^\." | gawk -v libexecdir=%{_libexecdir} '{printf( "%s/%s\n", libexecdir, $0);}' >> %{name}.lang
 echo "%{_libexecdir}/utils.pm" >> %{name}.lang
 echo "%{_libexecdir}/utils.sh" >> %{name}.lang
-echo "%{_libexecdir}/check_ldaps" >> %{name}.lang

 sed -i '/libnpcommon/d' %{name}.lang
 sed -i '/nagios-plugins.mo/d' %{name}.lang
@@ -182,9 +184,8 @@


 %files -f %{name}.lang
-%config(missingok,noreplace) %{_sysconfdir}/command.cfg
 %doc CODING COPYING FAQ INSTALL LEGAL README REQUIREMENTS SUPPORT THANKS
-%doc ChangeLog command.cfg
+%doc ChangeLog
 %if ! %{isaix}
 %{_datadir}/locale/de/LC_MESSAGES/nagios-plugins.mo
 %{_datadir}/locale/fr/LC_MESSAGES/nagios-plugins.mo

Now redo tarball and build rpm:

cd ../
tar cvzf nagios-plugins-2.1.1-modified.tar.gz nagios-plugins-2.1.1
rpmbuild -ta nagios-plugins-2.1.1-modified.tar.gz

To finish, let’s build nrpe rpm.

tar xvzf nrpe-2.15.tar.gz
cd nrpe-2.15

Again, patch the spec file in order to be able to compile nrpe, because the nrpe spec file is badly broken. Thanks to Florin Andrei for this last spec patch (https://support.nagios.com/forum/viewtopic.php?f=7&t=21809).

--- nrpe-2.15/nrpe.spec 2013-09-06 11:27:13.000000000 -0400
+++ nrpe-2.15up/nrpe.spec       2016-06-10 09:49:16.392849799 -0400
@@ -9,16 +9,14 @@
 %endif
 %if %{islinux}
        %define _init_dir /etc/init.d
-       %define _exec_prefix %{_prefix}/sbin
-       %define _bindir %{_prefix}/sbin
-       %define _sbindir %{_prefix}/lib/nagios/cgi
-       %define _libexecdir %{_prefix}/lib/nagios/plugins
-       %define _datadir %{_prefix}/share/nagios
-       %define _localstatedir /var/log/nagios
-       %define nshome /var/log/nagios
+       %define _bindir_nrpe %{_prefix}/sbin
+       %define _sbindir_nrpe %{_libdir}/nagios/cgi
+       %define _libexecdir_nrpe %{_libdir}/nagios/plugins
+       %define _datadir_nrpe %{_datadir}/nagios
+       %define nshome %{_localstatedir}/log/nagios
        %define _make make
 %endif
-%define _sysconfdir /etc/nagios
+%define _sysconfdir_nrpe %{_sysconfdir}/nagios

 %define name nrpe
 %define version 2.15
@@ -141,7 +139,7 @@
 %post
 /usr/bin/lssrc -s nrpe > /dev/null 2> /dev/null
 if [ $? -eq 1 ] ; then
-       /usr/bin/mkssys -p %{_bindir}/nrpe -s nrpe -u 0 -a "-c %{_sysconfdir}/nrpe.cfg -d -s" -Q -R -S -n 15 -f 9
+       /usr/bin/mkssys -p %{_bindir_nrpe}/nrpe -s nrpe -u 0 -a "-c %{_sysconfdir_nrpe}/nrpe.cfg -d -s" -Q -R -S -n 15 -f 9
 fi
 /usr/bin/startsrc -s nrpe
 %endif
@@ -177,13 +175,11 @@
        --with-nrpe-user=%{nsusr} \
        --with-nrpe-group=%{nsgrp} \
        --prefix=%{_prefix} \
-       --exec-prefix=%{_exec_prefix} \
-       --bindir=%{_bindir} \
-       --sbindir=%{_sbindir} \
-       --libexecdir=%{_libexecdir} \
-       --datadir=%{_datadir} \
-       --sysconfdir=%{_sysconfdir} \
-       --localstatedir=%{_localstatedir} \
+       --bindir=%{_bindir_nrpe} \
+       --sbindir=%{_sbindir_nrpe} \
+       --libexecdir=%{_libexecdir_nrpe} \
+       --datadir=%{_datadir_nrpe} \
+       --sysconfdir=%{_sysconfdir_nrpe} \
        --enable-command-args
 %{_make} all

@@ -192,18 +188,18 @@
 %if %{islinux}
 install -d -m 0755 ${RPM_BUILD_ROOT}%{_init_dir}
 %endif
-DESTDIR=${RPM_BUILD_ROOT} %{_make} install install-daemon-config
-#install -d -m 0755 ${RPM_BUILD_ROOT}%{_sysconfdir}
-#install -d -m 0755 ${RPM_BUILD_ROOT}%{_bindir}
-#install -d -m 0755 ${RPM_BUILD_ROOT}%{_libexecdir}
+DESTDIR=${RPM_BUILD_ROOT} %{_make} install install-daemon install-daemon-config install-plugin install-xinetd
+#install -d -m 0755 ${RPM_BUILD_ROOT}%{_sysconfdir_nrpe}
+#install -d -m 0755 ${RPM_BUILD_ROOT}%{_bindir_nrpe}
+#install -d -m 0755 ${RPM_BUILD_ROOT}%{_libexecdir_nrpe}

 # install templated configuration files
-#cp sample-config/nrpe.cfg ${RPM_BUILD_ROOT}%{_sysconfdir}/nrpe.cfg
+cp sample-config/nrpe.cfg ${RPM_BUILD_ROOT}%{_sysconfdir_nrpe}/nrpe.cfg
 #%if %{isaix}
-#cp init-script ${RPM_BUILD_ROOT}%{_init_dir}/nrpe
+cp init-script ${RPM_BUILD_ROOT}%{_init_dir}/nrpe
 #%endif
-#cp src/nrpe ${RPM_BUILD_ROOT}%{_bindir}
-#cp src/check_nrpe ${RPM_BUILD_ROOT}%{_libexecdir}
+#cp src/nrpe ${RPM_BUILD_ROOT}%{_bindir_nrpe}
+#cp src/check_nrpe ${RPM_BUILD_ROOT}%{_libexecdir_nrpe}

 %clean
 rm -rf $RPM_BUILD_ROOT
@@ -214,20 +210,27 @@
 %defattr(755,root,root)
 /etc/init.d/nrpe
 %endif
-%{_bindir}/nrpe
-%dir %{_sysconfdir}
+%{_bindir_nrpe}/nrpe
+%dir %{_sysconfdir_nrpe}
 %defattr(600,%{nsusr},%{nsgrp})
-%config(noreplace) %{_sysconfdir}/*.cfg
+%config(noreplace) %{_sysconfdir_nrpe}/*.cfg
 %defattr(755,%{nsusr},%{nsgrp})
 %doc Changelog LEGAL README

 %files plugin
 %defattr(755,%{nsusr},%{nsgrp})
-%{_libexecdir}
+%{_libexecdir_nrpe}
 %defattr(644,%{nsusr},%{nsgrp})
 %doc Changelog LEGAL README

 %changelog
+* Wed Oct 23 2013 Florin Andrei <florin@andrei.myip.org> 2.15
+- fixed many, many instances where default RPM macros were over-written
+- re-enabled copying of the init.d file and the .cfg file to /etc
+- removed hardcoded /usr/lib location for plugins, now it uses the %{_libdir}
+  macro. This will match the Core package and be consistent with 32/64 bit
+  architecture.
+
 * Mon Mar 12 2012 Eric Stanley estanley<@>nagios.com
 - Created autoconf input file
 - Updated to support building on AIX
@@ -250,3 +253,4 @@

 * Sat Dec 28 2002 James 'Showkilr' Peterson <showkilr@showkilr.com>
 - First RPM build (1.5-1)
+

Redo archive, and build rpm:

cd ../
tar cvzf nrpe-2.15.modified.tar.gz nrpe-2.15
rpmbuild -ta nrpe-2.15.modified.tar.gz

All nagios RPM are now built:

~# ls /root/rpmbuild/RPMS/x86_64/
nagios-4.1.1-2.el7.centos.x86_64.rpm
nagios-contrib-4.1.1-2.el7.centos.x86_64.rpm
nagios-debuginfo-4.1.1-2.el7.centos.x86_64.rpm
nagios-devel-4.1.1-2.el7.centos.x86_64.rpm
nagios-plugins-2.1.1-1.x86_64.rpm
nagios-plugins-debuginfo-2.1.1-1.x86_64.rpm
nrpe-2.15-1.x86_64.rpm
nrpe-debuginfo-2.15-1.x86_64.rpm
nrpe-plugin-2.15-1.x86_64.rpm
~#

Ansible (optional)

If you plan to use Ansible to deploy nodes, you need to download it and build it locally. This step can be done on Centos or Debian/Ubuntu, just be sure your architecture is the same as the cluster (here x86_64).
If you are behind a proxy, export needed variables, if not, skip this step:

export http_proxy=http://proxy.sphenisc.com:80
export https_proxy=http://proxy.sphenisc.com:80

Now get the latest Ansible and the needed Python packages (in the pip command further down, replace both occurrences of /home/sphen with your own home directory):

mkdir ansible
cd ansible/
git clone https://github.com/ansible/ansible.git --recursive
cd ../
tar cvzf ansible.tar.gz ansible

To build the needed dependencies, you need pip and the Python development files.

From ubuntu:

sudo apt-get install python-pip
sudo apt-get install python-dev

From centos (you need to enable the EPEL repository first, as done above with yum install epel-release):

yum install python-pip
yum install python-devel

Now download libraries and compile them:

pip install --ignore-installed --target=/home/sphen/pip --install-option="--install-purelib=/home/sphen/pip" paramiko PyYAML Jinja2 httplib2 six
tar cvzf pip.tar.gz pip

On the server running Ansible, extract pip.tar.gz (here into /root) and tell Python to use the packages we built, using the PYTHONPATH variable:

export PYTHONPATH=/root/pip/:$PYTHONPATH
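When PYTHONPATH is not set beforehand, the export above leaves a trailing colon in the value. Below is a minimal sketch of a variant that adds the separator only when needed. The function name prepend_pythonpath is hypothetical.

```shell
# Hypothetical helper: put a directory at the front of PYTHONPATH.
# The ${VAR:+...} expansion adds ":<old value>" only if PYTHONPATH is already
# set and non-empty, so no stray colon is left behind.
prepend_pythonpath() {
    PYTHONPATH="$1${PYTHONPATH:+:$PYTHONPATH}"
    export PYTHONPATH
}

# Example call (kept as a comment so nothing changes when the block is sourced):
#   prepend_pythonpath /root/pip
```

Adding the call to root's shell profile on the Ansible server would make the setting persistent across logins.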

Prepare USB key

To prepare the USB key for Centos install, you can follow this link: http://wiki.centos.org/HowTos/InstallFromUSBkey


You are now ready to install the cluster. Go to the next part: Core (batman) setup