Details on Hardware

Seamicro chassis

hufty.ci.centos.org

The chassis is connected to an upstream Ethernet switch with an LACP port-channel (8 gigabit NICs). Each compute node (64 physical compute nodes per chassis) has:

Each compute node follows the same naming convention: n${srv_id}.${chassis}.ci.centos.org. For server 1 on hufty, that is n1.hufty.ci.centos.org

pufty.ci.centos.org

The chassis is connected to an upstream Ethernet switch with an LACP port-channel (8 gigabit NICs). Each compute node (64 physical compute nodes per chassis) has:

Each compute node follows the same naming convention: n${srv_id}.${chassis}.ci.centos.org. For server 64 on pufty, that is n64.pufty.ci.centos.org

crusty.ci.centos.org

Inband MM IP: 172.19.0.3

The chassis is connected to an upstream Ethernet switch with an LACP port-channel (2 x 10-gigabit NICs). Each compute node (64 physical compute nodes per chassis) has:

Each compute node follows the same naming convention: n${srv_id}.${chassis}.ci.centos.org. For server 64 on crusty, that is n64.crusty.ci.centos.org

dusty.ci.centos.org

Inband MM IP: 172.19.0.4

The chassis is connected to an upstream Ethernet switch with an LACP port-channel (2 x 10-gigabit NICs). Each compute node (64 physical compute nodes per chassis) has:

Each compute node follows the same naming convention: n${srv_id}.${chassis}.ci.centos.org. For server 64 on dusty, that is n64.dusty.ci.centos.org

gusty.ci.centos.org

Inband MM IP: 172.19.0.5

The chassis is connected to an upstream Ethernet switch with an LACP port-channel (2 x 10-gigabit NICs). Each compute node (64 physical compute nodes per chassis) has:

Each compute node follows the same naming convention: n${srv_id}.${chassis}.ci.centos.org. For server 64 on gusty, that is n64.gusty.ci.centos.org

Provisioning a compute node

It takes roughly 8 minutes from the time you launch a (re)provisioning (through PXE, automated) to the time the machine is reachable through sshd. The process goes like this:

NOTE: those Seamicro compute nodes are deployed by an Ansible playbook call, which is itself triggered by the Duffy middleware (see http://wiki.centos.org/QaWiki/CI/Duffy)
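As a rough illustration only (the playbook and inventory names below are assumptions, the real ones live in the CI infra repositories), the call made by Duffy boils down to something like:

# hypothetical playbook/inventory names, run against a single node
ansible-playbook -i ci-inventory deploy-seamicro-node.yml --limit n1.hufty.ci.centos.org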

Some interesting snippets used to speed up the install process:

Enabling multipath in active/active mode at install time (only working for CentOS 7 at the moment):

%pre
#!/bin/bash
# enable multipathd and generate a default /etc/multipath.conf
/sbin/mpathconf --enable --with_multipathd y
# flush the existing multipath device maps
/sbin/multipath -F
# reload the maps with the multibus (active/active) path grouping policy
/sbin/multipath -r -p multibus
# make the multibus policy persistent in /etc/multipath.conf
sed -i s/"user_friendly_names yes"/"user_friendly_names yes \n        path_grouping_policy multibus"/g /etc/multipath.conf
%end
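Once the node is reinstalled, a quick way to verify the result is to run multipath -ll on it and confirm that all paths end up in a single active/active (multibus) path group.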

Forcing the AMD CPUs to use the performance scaling_governor instead of ondemand:

%pre
# switch all 8 cores to the performance cpufreq governor
for i in {0..7} ; do echo performance > /sys/devices/system/cpu/cpu${i}/cpufreq/scaling_governor ; done
%end
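To double-check on a running node (just a verification hint, not part of the kickstart):

cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor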

Monitoring

We have a Zabbix server installed on the admin node, which allows agents to auto-register themselves and get a default monitoring template added. Zabbix agent packages are installed at kickstart time and minimally configured to point to admin.ci.centos.org, so that we can track uptime/cpu/memory/bandwidth usage for each node that is installed/reinstalled/etc.
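For reference, a minimal sketch of what the agent side looks like in /etc/zabbix/zabbix_agentd.conf (the exact values are assumptions, the real configuration is laid down at kickstart time):

# example values only; auto-registration needs active checks (ServerActive)
Server=admin.ci.centos.org
ServerActive=admin.ci.centos.org
Hostname=n1.hufty.ci.centos.org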

Remote Access

The only way to get access into that isolated setup is to use jump.ci.centos.org (publicly reachable) as an ssh jump-host. If you're a member of the CI Infra Team (the people managing the CI infra), you need to ask for your public key to be integrated.
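A minimal ~/.ssh/config sketch for that (the username is a placeholder, and this assumes an OpenSSH client with ProxyCommand -W support):

Host *.ci.centos.org !jump.ci.centos.org
    User your_login
    # go through the publicly reachable jump host
    ProxyCommand ssh -W %h:%p your_login@jump.ci.centos.org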

Reserved Nodes

Machine/VM    reserved for
n63.pufty     libguestfs hypervisor + slaves CI
n64.pufty     libvirt hypervisor + slaves CI

Reserved IP addresses
