Documentation and Information:Computational clusters in Fine Hall

From CompudocWiki
Revision as of 11:16, 15 November 2005 by Plazonic (talk | contribs) (added macomp notes)

The Fine Hall machine room currently hosts three computational clusters:

Comp computational cluster

Description

The Comp cluster is an older cluster of 16 single-CPU AMD Athlon machines with clock speeds around 1.6GHz and between 512MB and 3.5GB of memory per node. The nodes are connected with 100Mb Ethernet and have 20GB to 40GB hard drives.

Configuration

The cluster is integrated into the Fine Hall Math/PACM network. All cluster machines mount the Math/PACM home directories and run the same operating system version as the rest of the Fine Hall Linux machines: PU_IAS/Elders 2WS Linux (a clone of RHEL4). The installed software closely matches that of the Fine Hall Linux workstations, although some graphical/desktop applications with no computational use have not been installed.

For temporary storage, besides /tmp, you can also use /scratch, which has no quotas. Neither /scratch nor /tmp may be used for permanent data storage, and no crucial data should be stored there; use them only for data such as intermediate computational results. /tmp and /scratch are NOT backed up and can be erased at any time, especially if one or more machines must be reinstalled or if one of these directories is full and other users need space. In addition, a system job regularly cleans up /tmp and deletes any file there that has not been accessed in the last 10 days.
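
As an illustration of the intended use of /scratch, here is a minimal sketch of a helper that creates a private working directory there for intermediate results. The function name make_workdir and the SCRATCH_BASE override are hypothetical and not part of the cluster setup; the override exists only so the sketch can be tried on a machine without /scratch.

```shell
# Hypothetical helper (not an installed command): create a private working
# directory for intermediate results under /scratch.
# SCRATCH_BASE is an assumed override for trying the sketch elsewhere.
make_workdir() {
    base="${SCRATCH_BASE:-/scratch}"
    # mktemp -d creates a uniquely named directory accessible only to its owner
    mktemp -d "$base/${USER:-$(id -u)}.XXXXXX"
}

# Typical use (my_computation and the file names are placeholders):
#   work=$(make_workdir)
#   my_computation > "$work/partial.dat"
#   cp "$work/partial.dat" "$HOME/"     # keep anything important in your home directory
#   rm -rf "$work"                      # free the space for other users
```

Copying the results you want to keep back to your home directory and removing the working directory afterwards keeps /scratch available for others.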

There is no special scheduling software in use and users are free to choose one or more machines on which to run their software.

Access

This cluster is fully accessible to all members of Math/PACM.

How to connect

Comp cluster node names are comp01, comp02, ..., comp16. You can connect to them with ssh, but only from math.princeton.edu and pacm.princeton.edu, e.g. ssh comp08. To choose a machine that is idle or under light load, run compload on either math.princeton.edu or pacm.princeton.edu.
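
compload already reports the load on the nodes. If you need to run some other command on every node, a small loop over the node names is enough. The following is only a sketch; comp_nodes is a hypothetical helper, not an installed command.

```shell
# Hypothetical helper: print the Comp node names comp01 ... comp16.
comp_nodes() {
    i=1
    while [ "$i" -le 16 ]; do
        printf 'comp%02d\n' "$i"
        i=$((i + 1))
    done
}

# Example, to be run on math.princeton.edu or pacm.princeton.edu:
#   for node in $(comp_nodes); do
#       printf '%s: ' "$node"
#       ssh "$node" uptime
#   done
```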

Special Notes

We expect this cluster to be retired within a year or so.

MaComp computing cluster

Description

The MaComp computational cluster consists of 26 dual Opteron 248 nodes (2.2GHz operating frequency). The master node is equipped with 8GB of memory and the other nodes with 2GB each. The nodes are connected with gigabit Ethernet and have 120GB IDE hard drives.

Configuration

The cluster is integrated into the Fine Hall Math/PACM network, and all cluster machines mount the Math/PACM home directories. The operating system used on these machines is a clone of RHEL 3.

For temporary storage, besides /tmp, you can also use /scratch, which has no quotas. Neither /scratch nor /tmp may be used for permanent data storage, and no crucial data should be stored there; use them only for data such as intermediate computational results. /tmp and /scratch are NOT backed up and can be erased at any time, especially if one or more machines must be reinstalled or if one of these directories is full and other users need space. In addition, a system job regularly cleans up /tmp and deletes any file there that has not been accessed in the last 10 days.
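
To see which of your own files in /tmp are at risk from the 10-day cleanup, a find command can help. The sketch below is an approximation: list_stale is a hypothetical helper, and the exact rule applied by the cleanup job is not documented here, so the boundary may differ slightly.

```shell
# Hypothetical helper: list your own files under a directory (default /tmp)
# whose last access is older than a given number of days (default 10).
list_stale() {
    dir="${1:-/tmp}"
    days="${2:-10}"
    # -user accepts a numeric user ID. -atime +N matches files whose age since
    # last access, counted in whole 24-hour periods, is greater than N.
    find "$dir" -user "$(id -u)" -type f -atime +"$days" 2>/dev/null
}

# Examples:
#   list_stale            # files in /tmp not accessed for more than 10 days
#   list_stale /tmp 7     # a more cautious threshold
```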

The scheduling software is Sun's Grid Engine (SGE) version 6.0, and all jobs have to be submitted through SGE. Once logged in, please check /usr/finehall/computing/sge/samples/readme.txt for basic instructions on how to submit jobs to SGE and, in particular, how to submit MPICH jobs. You can find sample submission scripts in /usr/finehall/computing/sge/samples.
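
For orientation, a minimal serial SGE job script might look like the sketch below. It is not a copy of the site's sample scripts, and the job name is a placeholder. The lines starting with #$ carry standard qsub options. MPICH jobs additionally need a parallel environment whose name is site-specific, so it is not shown here; consult the readme for that.

```shell
#!/bin/sh
# Minimal serial SGE job script (a sketch, not one of the site's samples).

# Job name (placeholder)
#$ -N sample_job
# Run the job in the directory it was submitted from
#$ -cwd
# Merge standard error into standard output
#$ -j y
# Shell used to interpret this script
#$ -S /bin/sh

# SGE sets JOB_ID inside a running job; the fallback only applies when the
# script is run by hand outside SGE.
job_label="${JOB_ID:-interactive}"
echo "Job $job_label running on $(uname -n)"
```

Assuming the script is saved as sample_job.sh (a hypothetical file name), it would be submitted with qsub sample_job.sh and its state checked with qstat.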

Access

At this time access is restricted to grant applicants/contributors.

How to connect

To connect to the MaComp cluster, first log in to math.princeton.edu; from there you can run:

ssh macomp

Login should proceed without the need to enter any passwords. If you are denied access or asked for a password then your account has not yet been allowed access to the cluster.
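
If you want to check from a script whether your account has been enabled, one option is to ask ssh to fail rather than prompt for a password. The sketch below assumes only standard OpenSSH options; can_login is a hypothetical helper, and a failure can also mean a network problem rather than missing access.

```shell
# Hypothetical helper: succeed only if a non-interactive login to the given
# host works. BatchMode=yes makes ssh fail instead of prompting for a password.
can_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true 2>/dev/null
}

# Example, to be run on math.princeton.edu:
#   if can_login macomp; then
#       echo "access enabled"
#   else
#       echo "no access yet (or host unreachable)"
#   fi
```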

Wiffin computing cluster

Description

The Wiffin computational cluster consists of 20 dual Xeon 2.2GHz nodes. Half of the nodes have 2GB of memory and the other half have 4GB. The nodes are connected with gigabit Ethernet.

Configuration

The cluster runs a version of Red Hat Linux.

The scheduling software is Sun's Grid Engine version 5.3, and all jobs have to be submitted through SGE.

Access

Access to this cluster is restricted to members of Prof. Emily Carter's research group; the cluster is not otherwise part of the Fine Hall network of Math/PACM Linux machines.