Virtual OSCAR Cluster Headnode

Erich Focht <efocht at gmail dot com>



The Virtual OSCAR Cluster Headnode is based on the OSCAR (Open Source Cluster Application Resources) cluster infrastructure, a collaborative effort to make Beowulf-type clusters easy to use and manage. The author is a core OSCAR developer and uses VMware Player on a regular basis for developing and testing OSCAR. The virtual appliance was created to facilitate the installation and management of clusters based on x86 nodes. The Virtual Headnode comes ready-built, configured and prepared for installing and controlling a cluster; most of the steps required for building an OSCAR cluster have already been done. The setup uses recent developments in OSCAR that reduced the size of the compressed virtual machine disk image to less than 1.5 GB, including one cluster node image. This makes it feasible to back up and copy the image, to use it for testing changes, or to implement highly available solutions based on VMware.



Right now OSCAR clusters are mainly used in High Performance Computing for running simulation codes in areas such as Computational Fluid Dynamics (CFD), crash simulation (e.g. for the automotive industry), bioinformatics, chemistry and molecular dynamics. One headnode is typically used to manage up to 200 cluster nodes, though this is not a limit imposed by the installed software. Users of OSCAR clusters are encouraged to register their clusters in the OSCAR cluster database, part of which is displayed in the OSCAR cluster map.



The VMware OSCAR virtual headnode appliance can be downloaded from http://www.vmware.com/vmtn/appliances/directory/341



The virtual appliance is a fully working OSCAR headnode prepared for the definition of client cluster nodes and their deployment. It is defined with two bridged ethernet interfaces:

A client node image based on Fedora Core 5 is predefined and ready built for deployment to cluster nodes. It is called fc5image. You can build additional images or import node images from other distributions.

Functionality & Components

  1. Image-based installation / node deployment: built with the System Installation Suite (SystemImager, SystemInstaller-OSCAR, SystemConfigurator).

  2. Resource Manager: the virtual headnode has the resource manager Torque installed and preconfigured with one queue. The resource scheduling is controlled by Maui. Use the command qsub for submitting jobs, qstat for checking their status, qdel for deleting them, pbsnodes for finding out about the availability of nodes.

  3. Parallel message passing libraries: MPI, PVM:

  4. Environment switcher: allows switching easily between environments such as different MPI versions, compilers, etc. Read the man pages of switcher. All delivered MPIs come with preconfigured switcher/modules.

  5. Live cluster management tools: C3 (Cluster Command and Control) and SC3 (Scalable (Sub-)Cluster Command and Control). They allow parallel command execution across the cluster with commands like cexec, cpush, ckill, cget, scexec, scpush, scrpm...

  6. Cluster user database synchronization via opium/sync_files: periodically pushes the password databases from the master node to the cluster nodes. If the master node is integrated into the enterprise LDAP or NIS+ environment, the client nodes will get the passwords assembled from the LDAP or NIS+ database on the master node. There is no need to integrate the cluster nodes into LDAP as well.

  7. Shared /home filesystem across the cluster: the /home filesystem of the master node is exported to the client nodes via NFS. This allows them to share the SSH keys. SSH keys for accessing the cluster are generated automatically and allow access to any node within the cluster without entering passwords.

  8. Cluster performance monitoring system: Ganglia 3.0.3 is installed on the master node and made accessible over its web server. The generic URL is http://master_node_address/ganglia . This allows monitoring a large set of metrics related to the machines' activity and performance, including the time history of the measured values.

  9. Cluster health monitoring system: Nagios 1.3 is installed on the master node and automatically configured for monitoring services on the master node and the accessibility of client nodes, their filesystems and swap space. Nagios uses an event handler to restart certain services on the master node if they die, thus keeping them highly available.
    Nagios is extremely versatile in its configuration and in the way it notifies the administrators of any failures. The default configuration for OSCAR is to send notification emails to the mail alias nagios-admin on the headnode. Edit the /etc/aliases file if you want to change this behavior.

  10. NetBootManager: manages how and what the client nodes boot if they are configured in the BIOS to boot via PXE. This application manages symbolic links to PXE configuration files for each cluster node.
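
As a rough illustration of the resource manager workflow described in item 2, the following sketch writes a minimal Torque job script and shows, in comments, how it might be submitted and tracked. The job name, node count, wall time and file name are hypothetical values chosen for this example, and the queue is simply left at the preconfigured default; adjust everything to your own cluster.

```shell
#!/bin/sh
# Write a minimal Torque job script.
# All values in the script (name, node count, wall time) are illustrative assumptions.
write_job_script() {
    # $1: path of the job script to create
    cat > "$1" <<'EOF'
#!/bin/sh
#PBS -N hello_oscar
#PBS -l nodes=2:ppn=1
#PBS -l walltime=00:05:00
#PBS -j oe
# Torque starts the job in the home directory; change to the submit directory.
cd "$PBS_O_WORKDIR"
# List the nodes allocated to this job.
sort -u "$PBS_NODEFILE"
EOF
}

write_job_script hello_oscar.pbs

# On the headnode, as a regular user, the job could then be handled like this:
#   qsub hello_oscar.pbs     # submit the job; prints the job ID
#   qstat                    # check the job status
#   pbsnodes -a              # show the availability of nodes
#   qdel <job ID>            # delete the job if needed
```

The qsub, qstat, pbsnodes and qdel lines are left as comments because they only make sense on a headnode with deployed client nodes.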



Virtual Headnode Usage



Headnode Accessibility

The first step after booting a fresh Virtual OSCAR Headnode and logging in as user “oscar” (password: vmoscar) is to integrate its external interface (eth1) into the local network. Select System -> Administration -> Network, enter the root password (default: vmOSCAR) and edit the “eth1” setting according to your needs. Then open a shell and restart the network:

$ sudo service network restart

Then check that the machine is reachable from outside, for example by pinging its eth1 address from another host on the local network.



Management Interface

The main interface to the OSCAR system is the Management Interface, which can be started by clicking on the OSCAR icon located on the desktop of the user “oscar”. It offers the following buttons:



Links for further documentation, information on the project, mailing lists:



Define and Deploy Nodes

The Virtual OSCAR Headnode appliance comes with no client nodes defined. Defining nodes by pressing the “Add OSCAR Clients” button in the management panel is therefore one of the first steps towards getting an OSCAR cluster up and running.



Adding new nodes to the cluster is a 3-step process. You will be guided through this process by the panel opened after pressing the “Add OSCAR Clients” button.

  1. Define Client Nodes: define the name prefix, the number of nodes, the starting IP address, etc...

  2. Setup Networking: mainly you will need to collect the ethernet MAC addresses of the nodes you intend to add. This can be done by detecting them (i.e. press the “Detect MAC Addresses” button and let the new nodes try to PXE-boot), entering them one by one or loading them from a file.
    After the network has been prepared, start the client node deployment by network-booting the new nodes. They will get an IP address from the headnode and will install themselves by rsync-ing the node image to their local hard disk.
    Of course the client nodes can also be virtual machines, such that the entire cluster can be virtualized.
    ATTENTION: MAKE SURE YOU COLLECT THE RIGHT MAC-ADDRESSES AND INSTALL THE RIGHT NODES! WHEN INSTALLING A NODE ITS HARDDISK WILL BE ERASED AND OVERWRITTEN! THE AUTHOR TAKES NO RESPONSIBILITY FOR NODES AND HARDDISKS WHICH HAVE BEEN ERRONEOUSLY ERASED!

  3. Finish Cluster Setup: this “post-install” step is needed to finalize the configuration files of the cluster applications and bring them in sync with the changed cluster state. Push this button once all client nodes have been installed successfully.
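
To make the Define Client Nodes and Setup Networking steps more concrete, here is a minimal sketch that enumerates node names and IP addresses from a name prefix, a node count and a starting address, and checks that a MAC address is well-formed before it goes into a list. The prefix oscarnode, the 10.0.0.x addresses and the function names are assumptions made for illustration only; the sketch does not reproduce what the OSCAR panel does internally, and it only increments the last octet of the IP address.

```shell
#!/bin/sh
# Enumerate node names and IP addresses.
# Arguments: name prefix, number of nodes, starting IPv4 address.
# Assumption: only the last octet is incremented and it stays below 255.
list_nodes() {
    prefix=$1
    count=$2
    start_ip=$3
    base=${start_ip%.*}       # first three octets, e.g. 10.0.0
    last=${start_ip##*.}      # last octet, e.g. 1
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "${prefix}$((i + 1)) ${base}.$((last + i))"
        i=$((i + 1))
    done
}

# Succeed only if the argument looks like aa:bb:cc:dd:ee:ff
# (six colon-separated pairs of hex digits).
is_valid_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

list_nodes oscarnode 3 10.0.0.1
```

With the values above, the call prints the three lines oscarnode1 10.0.0.1, oscarnode2 10.0.0.2 and oscarnode3 10.0.0.3. Running is_valid_mac on every collected address before deployment is a cheap way to catch typing errors, but it cannot tell you whether an address belongs to the node you actually intend to erase and install.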