Main.PongoUse r1.9 - 10 Jun 2007 - 01:03 - DavidBrodbeck

Getting the most out of the compute servers

Sketch of the setup

Pongo is the head node in a cluster with 8 other computers. Each computer has 4GB RAM, but the disk space is mostly on Pongo (the fileserver). The current cluster management/load-balancing software is openMosix (http://openmosix.sourceforge.net/), referred to as Mosix on this page. It allows processes to migrate automatically to other nodes when all of the following conditions hold (see also http://howto.x-tend.be/openMosixWiki/index.php/don't):

  1. At least one other process is running on the head node
  2. The process to migrate has been running for at least a minute
  3. The process to migrate does not (under the current version of Mosix) use shared memory. For practical purposes, this means most programs written in C or C++ are able to migrate, but Java programs are not. (In addition, Matlab and BLAST won't migrate.)

When a process migrates, this is completely transparent to the user: the output is still written to the expected place on Pongo, etc.

How do I make sure my process can migrate if need be?

  • Use C or C++ instead of Java
  • Make sure any libraries you're using don't use shared memory
  • Compile Java code with gcj

There is reason to believe that future releases of Mosix will be able to migrate code with shared memory, but we're not there yet.

How can I tell if my process can migrate?

  1. Run the process and find its process ID (PID). (The command ps u will show you all of the processes that belong to you.)
  2. If the process cannot be migrated, the reason will be listed in the file /proc/PID/cantmove (where PID is replaced by the actual process ID). Note that this file gets cleaned up when the program exits. Here's a description of the possible reasons:

  • clone_vm: the application is using threads
  • monkey: the application is using files as shared memory
  • daemon: daemon process
  • rt_sched: real-time scheduling
  • mmap_dev: process is mapping a device
  • direct_io: direct I/O permission
  • mem_lock: locks memory

How can I tell if my process has migrated?

You can use mtop to tell whether a program has been migrated. mtop is an openMosix-aware version of the standard UNIX utility top that adds two columns to the output relating to process migration. The N# column gives the node number a given process is running on, and the MGS column gives the number of times it has migrated. All user shell interaction takes place on node 0, so if either of these columns contains a non-zero number, the corresponding process has been migrated. See man mtop for more details.

How can I get a snapshot of the load on the pongo cluster?

Run mosmon.

How can I make my process run on multiple machines?

The easy way (which is only applicable to certain types of tasks) is to write a script (Perl script or shell script) which splits the task into separate processes and invokes each one. Mosix can then migrate those processes onto different nodes. For example, if you need to parse 10,000 sentences (stored in one input file), your script can create 10 input files, invoke the parser for each file, and then concatenate the results. The splitting step at the beginning and the concatenating step at the end would each run on only one machine, but the parsing would potentially be split across all the machines.
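
The split, run, and concatenate pattern might look roughly like the shell function below. It is a sketch under stated assumptions: the name run_in_chunks is hypothetical, the command is assumed to read its input on stdin and write results to stdout, and the script does nothing Mosix-specific (whether and where the background processes migrate is left entirely to Mosix).

```shell
# Split INPUT into about N chunks by line, run a command on every chunk in
# the background, then concatenate the outputs in the original order.
# Usage: run_in_chunks INPUT N COMMAND [ARGS...]
run_in_chunks() {
    input=$1; n=$2; shift 2
    total=$(wc -l < "$input")
    [ "$total" -gt 0 ] || return 0          # nothing to do for empty input
    workdir=$(mktemp -d) || return 1
    # Lines per chunk, rounded up so that about n chunks are created.
    per_chunk=$(( (total + n - 1) / n ))
    split -l "$per_chunk" "$input" "$workdir/chunk."
    # The glob is expanded once, before any .out file exists.
    for chunk in "$workdir"/chunk.*; do
        "$@" < "$chunk" > "$chunk.out" &
    done
    wait                                    # exit codes are not checked here
    # split names chunks aa, ab, ...; glob order restores the input order.
    cat "$workdir"/chunk.*.out
    rm -rf "$workdir"
}
```

With a hypothetical parser program and input file, the 10,000-sentence example would be run as: run_in_chunks sentences.txt 10 ./parser > parsed.txt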

What if Mosix migration seems to cause my code to crash?

If your process is crashing mysteriously, especially when you run large data sets, Mosix migration may be an issue. There have been reports of mysterious crashing and segmentation faults occurring as the result of migration of certain Perl and Make scripts, as well as with some Python programs.

If you believe migration may be causing crashes in your code, you can lock your code to a particular node in the cluster using mosrun with the -L switch, which locks the command to a node, and the preferred node number given as a second switch. For example, to lock a hypothetical nlpProcess to node 19, you would type:

mosrun -L -19 nlpProcess

In some cases you may need to lock the process to the head node using runhome, e.g.:

runhome nlpProcess

runhome is a synonym for mosrun -L -1.

What if there is no sensible way to divide the task up into N chunks? Can I still make my programs parallel?

Yes, but it takes more effort. The tool to use is MPI (Message Passing Interface), and to use it you have to write the message passing into your code. This kind of programming is trickier (you have to work out how the parallelization fits into your algorithm), but in the long run it is potentially a valuable skill to have. The relevant libraries are installed on Pongo. For more information, see /usr/share/doc/mpi-doc/ on Pongo and the tutorials at http://www.lam-mpi.org/

How can I check on the status of my processes?

Some useful information here.

Sharing the sandbox

Policy

In order for everyone to get the most out of our servers, everyone needs to play nicely with respect to memory and CPU time. In addition to asking everyone to be mindful of using resources efficiently (are you loading the whole Penn Treebank into RAM?), we have developed the following policy with regard to long-running processes:

  1. Any process that has run for over 4 hours of CPU time will automatically trigger an email to the owner of the process.
  2. If the owner does not reply to that email (within some reasonable amount of time), the process is eligible to be terminated by the system administrators (though it would only be terminated if other processes need the machines).
  3. If you know ahead of time that your process will be long, alert linghelp@u.
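
To see whether any of your own processes are near the 4-hour threshold, something like the following sketch could be used. The function names cpu_seconds and long_running are hypothetical, and the code assumes the Linux (procps) ps, which prints cumulative CPU time as [DD-]HH:MM:SS. It is not the mechanism the administrators use to send the email.

```shell
# Convert a ps cumulative CPU time ([DD-]HH:MM:SS) to seconds.
cpu_seconds() {
    echo "$1" | awk -F'[-:]' '{
        if (NF == 4) s = (($1 * 24 + $2) * 60 + $3) * 60 + $4
        else         s = ($1 * 60 + $2) * 60 + $3
        print s
    }'
}

# List the caller's processes whose CPU time exceeds LIMIT seconds
# (default 14400, i.e. 4 hours).
long_running() {
    limit=${1:-14400}
    ps -u "$(id -un)" -o pid= -o time= -o comm= |
    while read -r pid cputime cmd; do
        if [ "$(cpu_seconds "$cputime")" -gt "$limit" ]; then
            echo "$pid $cputime $cmd"
        fi
    done
}
```

Running long_running with no arguments lists the PID, CPU time, and command name of each process past 4 hours of CPU time; long_running 10800 would use a 3-hour limit instead.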

Strategies for efficient use of resources:

  • Test your program on small amounts of data before going whole hog.
  • If you have to load a lot of data into memory, consider whether there is a more efficient way of doing so.
  • Avoid starting intensive processes at the last minute (end of quarter, homework deadlines, etc).
  • Each node has 2 CPUs. You will generally get the best performance by launching one process for each CPU. For example, if the cluster has 8 nodes running, you would launch 16 processes. The nodecheck command will tell you how many nodes are currently running.
  • If you are running a program that doesn't migrate (such as a Java program), launching too many simultaneous processes will only slow them all down. Any more than two CPU-intensive processes will simply be stealing CPU time from each other. Additionally, running too many memory-intensive processes may cause the system to swap to disk, greatly slowing down performance.
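
For programs that don't migrate, one way to stay within the two-process limit described above is to throttle how many jobs run at once. The sketch below uses a hypothetical function name, run_throttled, and relies on the -P and -0 options of xargs, which are GNU/BSD extensions rather than POSIX.

```shell
# Run COMMAND once per argument, keeping at most MAX_JOBS running at a time.
# Usage: run_throttled MAX_JOBS COMMAND ARG...
run_throttled() {
    max_jobs=$1; cmd=$2; shift 2
    [ "$#" -gt 0 ] || return 0              # no arguments, nothing to run
    # NUL-separate the arguments so file names with spaces survive.
    printf '%s\0' "$@" | xargs -0 -n 1 -P "$max_jobs" "$cmd"
}
```

For example, run_throttled 2 ./myJavaWrapper part1.txt part2.txt part3.txt (with hypothetical file and program names) would start the third job only after one of the first two finishes. For programs that do migrate, the same function could be called with twice the number of running nodes as MAX_JOBS, e.g. 16 for 8 nodes.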

-- EmilyBender - 09 Jan 2006, DavidBrodbeck - 10 Jun 2007
Copyright © 1999-2008 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.