Documentation and Information:Computational clusters in Fine Hall

Revision as of 11:32, 26 July 2010

The Fine Hall machine room is currently hosting one mini computational cluster.

NewComp computing cluster

Description

The NewComp mini computational cluster consists of 4 nodes, each with two Intel Xeon X5680 CPUs (6 cores each, 12 cores per node) running at 3.33GHz. Each node has 96GB of memory (8GB per core). The head node is equipped with one Intel Xeon X5650 CPU (6 cores, running at 2.67GHz) and 12GB of memory.

The nodes are connected with gigabit Ethernet networking as well as 4x InfiniBand.

Configuration

The cluster is integrated into Fine Hall Math/PACM network and all the cluster machines mount Math/PACM home directories. The operating system used on these machines is a clone of RHEL 6.

For temporary storage, besides /tmp, one can also use /scratch, which has no quotas. It must be emphasized that neither /scratch nor /tmp can be used for permanent data storage and no crucial data should be kept there; use them for things like intermediate computational results. /tmp and /scratch are NOT backed up and can be erased at any time, especially if a reinstall of one or more machines is required or if one of these directories is full and other users need space. /tmp is also regularly cleaned up by a system job, and any file in /tmp that hasn't been accessed in the last 10 days will be deleted.

The head node's /scratch is approximately 3TB and its subdirectory /scratch/network is exported to all nodes (as /scratch/network). Therefore, if you need to access or write temporary data from all nodes, create a subdirectory of /scratch/network (such as /scratch/network/username) and read/write there.

The nodes also have local /scratch space of approximately 700GB each. This local disk is also quite fast, so consider it for fast data writing and reading. Just like with /scratch/network, create a per-user subdirectory of the local /scratch (i.e. /scratch/username) and read/write from there. As mentioned above, /scratch/network on these nodes is mounted from the head node, and while bigger in size it is also a lot slower than the local disk.
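As a minimal sketch of the layout described above (using your login name as the directory name is just a convention for illustration, not a requirement):

mkdir -p /scratch/network/$USER   # shared scratch, exported from the head node and visible on every node (bigger but slower)
mkdir -p /scratch/$USER           # local scratch; run this on the compute node (e.g. inside your job script) to use the faster local disk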

It cannot be emphasized enough that /scratch (and /scratch/network) is for temporary data storage only. Data placed there will occasionally be purged (without notice, oldest first) as needed to ensure all users have enough space.

Access

At this time the cluster is open to all Math and PACM members.

How to connect

In order to connect to the NewComp cluster you will first have to log in to math.princeton.edu, and from there you can run:

ssh newcomp

Login should proceed without the need to enter any passwords.
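For example, a typical session from your own machine might look like the following (username here is just a placeholder for your Math/PACM account name):

ssh username@math.princeton.edu
ssh newcomp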

Scheduling/Running Jobs

No jobs/computations, except maybe very short test runs, should be run on the head node. Any other jobs found running there will be terminated without prior notice.

All jobs have to be submitted to the scheduler, which will take care of assigning the necessary resources and running the job. Any computations found running without having been submitted through the scheduler, or that were submitted incorrectly (e.g. if the job consumes more cores than allocated or runs after it was supposed to complete), will be terminated without prior notice.

The scheduler in use on newcomp is Torque/Maui.

Torque/Maui Queues

The scheduler will automatically place your job in one of the following queues. Here are their names and their current limits:

Short Length Queue
  • 4 hour wall clock limit
  • 48 max processes total (of all users together)
  • 3 nodes max per job
Medium Length Queue
  • 4-24 hour wall clock limit
  • 24 max processes total (of all users together)
  • 2 nodes max per job
Long Length Queue
  • 24 hours to 7 days wall clock limit
  • 24 max processes total (of all users together)
  • 12 max processes per user
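The names above are descriptive labels; to see the queues as actually configured on the cluster, together with their current limits, you can use Torque's standard queue listing (the exact queue names shown may differ from the labels above):

qstat -q    # one-line summary of every queue and its limits
qstat -Q    # more detailed per-queue status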

Submitting Single Core/Serial Jobs

To run a single core program with an executable called, say, myprogram, you will need to write a job script for Torque. Here is a sample command script, serial.cmd, which uses (of course) 1 core:

cd my_serial_directory
cat serial.cmd

#!/bin/bash
# serial job using 1 node and 1 processor, and runs
# for 3 hours (max).
#PBS -l nodes=1:ppn=1,walltime=3:00:00
#
# sends mail if the process aborts, when it begins, and
# when it ends (abe)
#PBS -m abe
#
cd $HOME/my_serial_directory
./myprogram

To submit the job to the scheduling system, use:

qsub serial.cmd
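After submission, the standard Torque/Maui client commands can be used to follow the job; the job id 12345 below is only a placeholder for whatever id qsub prints back:

qstat -u username    # list your queued and running jobs
checkjob 12345       # Maui's detailed view of a single job
qdel 12345           # remove a queued job or kill a running one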