Virtual OSCAR Cluster Headnode
Erich Focht <efocht at gmail dot com>
The Virtual OSCAR Cluster Headnode is based on the OSCAR (Open Source Cluster Application Resources) cluster infrastructure, a collaborative effort to make Beowulf-type clusters easy to use and manage. The author is a core OSCAR developer and uses VMware's VMplayer on a regular basis for developing and testing OSCAR. The virtual appliance has been created to facilitate the installation and management of clusters based on x86 nodes. The Virtual Headnode is ready built, configured and prepared for installing and controlling a cluster; most of the steps required for building an OSCAR cluster have already been done. This setup uses recent developments in OSCAR which reduced the size of the compressed virtual machine disk image to less than 1.5 GB, including one cluster node image. This makes it feasible to back up and copy the image, to use it for testing changes, or to implement highly available solutions based on VMware.
Right now OSCAR clusters are mainly used in High Performance Computing for running simulation codes in areas like Computational Fluid Dynamics (CFD), crash simulations (e.g. for the automotive industry), bioinformatics, chemistry, molecular dynamics, etc. One headnode is typically used to manage up to 200 cluster nodes, though this is not a limit imposed by the installed software. Users of OSCAR clusters are encouraged to register their clusters in the OSCAR cluster database, part of which is displayed in the OSCAR cluster map.
The VMware OSCAR virtual headnode appliance can be downloaded from http://www.vmware.com/vmtn/appliances/directory/341
The virtual appliance is a fully working OSCAR headnode prepared for the definition of client cluster nodes and their deployment. It is defined with two bridged Ethernet interfaces:
eth0: IP address 192.168.100.1, cluster internal interface, should not be modified. The cluster nodes should all be defined inside the subnet 192.168.100.0/24.
eth1: IP address 192.168.10.140, external interface of the headnode. This address can and should be modified so that the headnode is integrated into the user's LAN. Use it for connecting to the headnode from outside the cluster. When changing the IP address, don't forget to also adapt the default route and the nameserver!
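The addressing rule for the internal network can be checked mechanically before any nodes are defined. The following is a minimal sketch, not part of OSCAR: the function name valid_client_ip is hypothetical, and it assumes that client nodes may use any host address in 192.168.100.0/24 except 192.168.100.1, which belongs to the headnode's eth0.

```shell
#!/bin/sh
# Succeed if the argument is a usable client address in 192.168.100.0/24.
# Hypothetical helper for illustration; it is not an OSCAR command.
valid_client_ip() {
    # Strip the fixed network prefix; if nothing was stripped, the
    # address is outside the cluster subnet.
    host_part=${1#192.168.100.}
    [ "$host_part" != "$1" ] || return 1
    # The remainder must be one to three decimal digits.
    case "$host_part" in
        ''|*[!0-9]*|????*) return 1 ;;
    esac
    # .0 is the network address, .1 the headnode, .255 the broadcast address.
    [ "$host_part" -ge 2 ] && [ "$host_part" -le 254 ]
}

for ip in 192.168.100.2 192.168.100.1 192.168.10.140; do
    if valid_client_ip "$ip"; then
        echo "$ip: usable for a client node"
    else
        echo "$ip: not usable for a client node"
    fi
done
```

Only the first of the three example addresses passes: the second is the headnode's own internal address and the third is its external address.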
A client node image based on Fedora Core 5 is predefined and ready built for deployment to cluster nodes. It is called fc5image. You can build additional images or import node images from other distributions.
Image-based installation / node deployment: built with the System Installation Suite: SystemImager, SystemInstaller-OSCAR, SystemConfigurator.
Resource Manager: the virtual headnode has the resource manager Torque installed and preconfigured with one queue. The resource scheduling is controlled by Maui. Use the command qsub for submitting jobs, qstat for checking their status, qdel for deleting them, and pbsnodes for finding out about the availability of nodes.
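As an illustration of the Torque workflow, a small job script might look like the sketch below. The job name, the resource request (two nodes, one processor each, five minutes) and the helper function unique_nodes are example choices, not values taken from the appliance. The variables PBS_O_WORKDIR and PBS_NODEFILE are set by Torque inside a running job; the fallbacks only exist so that the script can also be run by hand.

```shell
#!/bin/sh
#PBS -N hello_oscar
#PBS -l nodes=2:ppn=1
#PBS -l walltime=00:05:00
#PBS -j oe

# Print the distinct hosts listed in a Torque node file. The file holds
# one host name per line, repeated once per allocated processor.
unique_nodes() {
    sort -u "$1"
}

# Change to the directory the job was submitted from (current directory
# when the script is run outside the batch system).
cd "${PBS_O_WORKDIR:-.}" || exit 1

echo "Job started on $(uname -n)"
echo "Allocated nodes:"
unique_nodes "${PBS_NODEFILE:-/dev/null}"
```

Saved under a name such as hello.sh (a hypothetical file name), the script would be submitted with "qsub hello.sh", watched with "qstat" and, if necessary, removed with "qdel" followed by the job ID that qsub printed.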
Parallel message passing libraries: MPI, PVM:
MPICH 1.2.7
OpenMPI 1.0.2
LAM MPI 7.1.2
PVM 3.4.5+4
Environment switcher: allows switching easily between environments like different MPI versions, compilers, etc. See the man pages of switcher for details. All delivered MPIs come with preconfigured switcher/modules.
Live cluster management tools: C3 (Cluster Command and Control) and SC3 (Scalable (Sub-)Cluster Command and Control). They allow parallel command execution across the cluster with commands like cexec, cpush, ckill, cget, scexec, scpush, scrpm, etc.
Cluster user database synchronization via opium/sync_files: pushes password databases from the master node to the cluster nodes periodically. If the master node is integrated into the enterprise LDAP or NIS+ environment, the client nodes will get the passwords assembled from the LDAP or NIS+ database on the master node. There is no need to integrate the cluster nodes into LDAP as well.
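A typical C3 session could look like the following sketch. Because cexec and cpush act on every node of the cluster at once, the commands are wrapped in a hypothetical helper named run that only prints them unless DRY_RUN=0 is set. The file /etc/motd is just an example; the sketch assumes the default cluster defined in the C3 configuration is the one to address.

```shell
#!/bin/sh
# Print commands instead of executing them unless DRY_RUN=0 is set.
# "run" is a helper for this example and is not part of C3.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Execute "uptime" in parallel on all nodes of the default cluster.
run cexec uptime

# Copy a file from the headnode to the same path on all nodes.
run cpush /etc/motd /etc/motd

# Verify that the copy arrived everywhere.
run cexec ls -l /etc/motd
```

Running the script as it stands only lists the three commands. Running it with DRY_RUN=0 on the headnode executes them.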
Shared /home filesystem across the cluster: the /home filesystem of the master node is exported to the client nodes via NFS. This allows them to share the SSH keys. SSH keys for accessing the cluster are automatically generated and allow accessing any node within the cluster without entering passwords.
Cluster performance monitoring system: Ganglia 3.0.3 is installed on the master node and made accessible over its web server. The generic URL is http://master_node_address/ganglia . This allows monitoring a huge set of metrics related to the machine's activity and performance, including the time history of the measured values.
Cluster health monitoring system: Nagios 1.3 is installed on the master node and automatically configured for monitoring services on the master node and the accessibility of client nodes, their filesystems and swap space. If certain services on the master node die, Nagios restarts them via an event handler, thus keeping them highly available.
Nagios is extremely versatile in its configuration and the way it notifies the administrators of any failures. The default configuration for OSCAR is to send notification emails to the mail alias nagios-admin on the headnode. Edit the /etc/aliases file if you want to change this behavior.
NetBootManager: manages how and what client nodes boot when they are configured in the BIOS to boot via PXE. This application manages symbolic links to PXE configuration files for each cluster node.
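The symbolic-link mechanism behind network boot management can be sketched as follows. PXELINUX looks up a per-host configuration file named after the client's IP address in uppercase hexadecimal, so switching the boot action of a node amounts to repointing one link. That NetBootManager uses exactly this naming is an assumption here, and the function names, the scratch directory and the file names install and localboot are made up for the example.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address into the 8-digit uppercase hex
# name that PXELINUX tries as a per-host configuration file.
ip_to_pxe_name() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

# Point the per-host link for IP at the configuration file ACTION,
# both located in directory DIR. Hypothetical helper, not OSCAR code.
set_boot_action() {
    dir=$1 ip=$2 action=$3
    ln -sf "$action" "$dir/$(ip_to_pxe_name "$ip")"
}

# Demonstration in a scratch directory instead of the real TFTP tree.
demo=$(mktemp -d)
: > "$demo/install"
: > "$demo/localboot"
set_boot_action "$demo" 192.168.100.2 install
set_boot_action "$demo" 192.168.100.2 localboot
ls -l "$demo"
rm -rf "$demo"
```

After the second call the link for 192.168.100.2 points to localboot, which mirrors what happens when a node is switched from "install" to "boot from the installed hard disk".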
Headnode Accessibility
The first step after booting a fresh Virtual OSCAR Headnode and logging in as user “oscar” (password: vmoscar) is to integrate its external interface (eth1) into the local network. Select System -> Administration -> Network, enter the root password (default: vmOSCAR) and edit the “eth1” setting according to your needs. Then open a shell and restart the network:
$ sudo service network restart
Check the accessibility of the machine from outside.
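For readers who prefer to review the settings as text, the sketch below prints what the three relevant settings (address, default route, nameserver) would look like in the configuration files of a Red Hat-style system such as Fedora Core 5. The function name render_eth1_settings is hypothetical, the addresses are placeholders from a documentation range, and nothing is written to disk. Whether the graphical tool stores its settings in exactly these lines on the appliance is an assumption.

```shell
#!/bin/sh
# Print the eth1-related settings for the given address, netmask,
# gateway and nameserver. Output only; no file is modified.
render_eth1_settings() {
    ip=$1 netmask=$2 gateway=$3 dns=$4
    cat <<EOF
# /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=static
ONBOOT=yes
IPADDR=$ip
NETMASK=$netmask

# /etc/sysconfig/network (default route, relevant line only)
GATEWAY=$gateway

# /etc/resolv.conf
nameserver $dns
EOF
}

# Placeholder values; replace them with addresses from your own LAN.
render_eth1_settings 192.0.2.50 255.255.255.0 192.0.2.1 192.0.2.53
```

Comparing this output with the files on the headnode after using the network tool is a quick way to confirm that the default route and the nameserver were changed along with the IP address.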
Management Interface
The main interface to the OSCAR system is the Management Interface, which can be started by clicking on the OSCAR icon located on the desktop of the user “oscar”. It offers the following buttons:
“Download Additional OSCAR Packages”: currently there are no downloadable packages for the OSCAR version included in the Virtual Appliance.
“Build OSCAR Client Image”: build additional client images. Can be used for building images of other distributions after a bit of tweaking. A Fedora Core 5 based image named fc5image is delivered with the Virtual Appliance.
“Add OSCAR Clients”: define additional OSCAR clients. The Virtual Appliance comes with no clients defined; use this button to define your cluster nodes. Their IP addresses must lie in the same subnet as the headnode's eth0 interface!
“Delete OSCAR Clients”: delete OSCAR clients from the OSCAR database.
“Install/Uninstall OSCAR Packages”: this step doesn't work currently.
“Monitor Cluster Deployment”: start a monitoring panel for visualizing the cluster node deployment status.
“Test Cluster Setup”: push this button to test the OSCAR installation after the deployment and the “Finish Cluster Setup” step.
“Network Boot Manager”: control the next boot action of client nodes which are booting from the network. Offers several configurable options: install, boot from installed harddisk, boot remotely offered kernel, run memory test.
“Ganglia Monitoring System”: open the Ganglia system performance monitoring tool in a web browser.
“Nagios Monitoring System”: open the Nagios cluster health monitoring tool in a web browser. Nagios requires authentication; the htpasswd file is located in /etc/nagios. The Virtual Appliance comes with Nagios configured for two users:
user: guest, password: guest : has only read access
user: nagiosadmin, password: testnagios : has read/write access.
Links for further documentation, information on the project, mailing lists:
The OSCAR project homepage: http://oscar.openclustergroup.org
Developer mailing list: oscar-devel@lists.sourceforge.net
Define and Deploy Nodes
The Virtual OSCAR Headnode appliance comes with no client nodes defined. Defining nodes by pressing the “Add OSCAR Clients” button in the management panel is therefore one of the first steps towards getting an OSCAR cluster up and running.
Adding new nodes to the cluster is a 3-step process. You will be guided through this process by the panel opened after pressing the “Add OSCAR Clients” button.
Define Client Nodes: define the name prefix, the number of nodes, the starting IP address, etc.
Setup Networking: mainly you will need to collect the Ethernet MAC addresses of the nodes you intend to add. This can be done by detecting them (i.e. press the “Detect MAC Addresses” button and let the new nodes try to PXE-boot), entering them one by one or loading them from a file.
After the network has been prepared, start client node deployment by network-booting the new nodes. They will get an IP address from the headnode and will install themselves by rsync-ing the node image to their local hard disk.
Of course, the client nodes can also be virtual machines, so that the entire cluster can be virtualized.
ATTENTION: MAKE SURE YOU COLLECT THE RIGHT MAC-ADDRESSES AND INSTALL THE RIGHT NODES! WHEN INSTALLING A NODE ITS HARDDISK WILL BE ERASED AND OVERWRITTEN! THE AUTHOR TAKES NO RESPONSIBILITY FOR NODES AND HARDDISKS WHICH HAVE BEEN ERRONEOUSLY ERASED!
Finish Cluster Setup: this “post-install” step is needed to finalize the configuration files of the cluster applications and bring them in sync with the changed cluster state. Push this button once all client nodes have been installed successfully.
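Before starting the three steps above, it can help to write down the intended node names and addresses and to sanity-check the collected MAC addresses, since a wrong MAC address means a wrong disk gets erased. The sketch below does both. The function names plan_nodes and is_mac are hypothetical, the naming scheme (prefix followed by a running number) and the prefix "oscarnode" are assumptions for the example, and the MAC check only validates the textual format, not whether the address belongs to the intended machine.

```shell
#!/bin/sh
# Print "name ip" pairs for COUNT nodes named PREFIX1, PREFIX2, ...
# starting at START_IP and counting upwards within the same /24.
plan_nodes() {
    prefix=$1 count=$2 start_ip=$3
    net=${start_ip%.*}       # first three octets
    last=${start_ip##*.}     # final octet
    i=1
    while [ "$i" -le "$count" ]; do
        if [ "$last" -gt 254 ]; then
            echo "plan_nodes: ran out of addresses in $net.0/24" >&2
            return 1
        fi
        echo "$prefix$i $net.$last"
        i=$((i + 1))
        last=$((last + 1))
    done
}

# Succeed if the argument looks like an Ethernet MAC address
# (six colon-separated pairs of hexadecimal digits).
is_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

plan_nodes oscarnode 3 192.168.100.2

for mac in 00:0C:29:AB:CD:EF 00:0C:29:AB:CD; do
    if is_mac "$mac"; then
        echo "$mac: format ok"
    else
        echo "$mac: not a valid MAC address"
    fi
done
```

The plan lists oscarnode1 to oscarnode3 with the addresses 192.168.100.2 to 192.168.100.4, and the second example MAC address is rejected because it has only five groups.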