Difference: HowToUseCondor (1 vs. 29)

Revision 29 - 2014-03-12 - ebender

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 37 to 37
 notify_user = jdoe@example.com
Changed:
<
<
The notifications can get overwhelming for large jobs, since you get one email per completed process. You can adjust this by setting the "notify" attribute in your submit description file. Valid options are "Always", "Complete" (the default), "Error", and "Never". For example:
>
>
The notifications can get overwhelming for large jobs, since you get one email per completed process. You can adjust this by setting the "notification" attribute in your submit description file. Valid options are "Always", "Complete" (the default), "Error", and "Never". For example:
 
Changed:
<
<
notify = Error
>
>
notification = Error
 

...will cause email to only be sent in the event the job encounters an error.
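For example, a submit file fragment combining both mail-related settings might look like this sketch (the address is a placeholder):

# Send mail to this address, but only if the job encounters an error
notify_user  = jdoe@example.com
notification = Error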

Revision 28 - 2013-08-06 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 96 to 98
  Because the job will actually be run on a compute node, not on the system you're logged into, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, /projects, /NLP_TOOLS, and /corpora are shared; however, /tmp is not. Make sure everything your job needs is located on one of the shared filesystems.
Added:
>
>
Because of UW IT's use of Kerberos authentication, Condor cannot access the UDrive folder.
If you want to put input, output, or error files on a non-shared filesystem such as /tmp, you can add stream_input=true, stream_output=true, and/or stream_error=true to your submit file. This tells Condor to pipe the output back to the original submitting system instead of creating the file on the node. It may add a slight performance penalty if you're doing a lot of I/O.
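For illustration, a submit file that streams its output and error files back to /tmp on the submitting machine might look like the following sketch (the executable name and paths are placeholders):

Executable    = foobar
getenv        = true
output        = /tmp/jdoe/foobar.out
error         = /tmp/jdoe/foobar.err
# Stream these files back to the submit host instead of writing them on the node
stream_output = true
stream_error  = true
Queue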

To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Additionally, processes on patas itself are limited to no more than 2 GB of RAM. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.

Revision 27 - 2011-09-12 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 80 to 80
 

$(Process) is a variable substitution; it will be replaced by the process number of each process that's queued. Consult the condor_submit manpage ( man condor_submit) for more details.
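As a quick reminder of how this looks in a submit file (file names follow the earlier foobar example), queueing three processes with per-process input, output, and error files might look like:

input  = foobar.in$(Process)
output = foobar.out$(Process)
error  = foobar.error$(Process)
Queue 3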

Added:
>
>

Helping us track research usage

If your job is research-related, please add the following to your submit description file, above the queue line:
+Research = True
This helps us track research vs. non-research jobs on our cluster and potentially qualify for certain tax exemptions. It does not affect job scheduling in any way.
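A minimal sketch of where this line goes (the executable name is a placeholder):

Executable = foobar
getenv     = true
# Research-usage flag, placed above the Queue line
+Research  = True
Queue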
 

Being "nice"

If you have a very large queue of jobs to run, but don't care if they finish quickly, you can add the following to your submit file as a courtesy to other users:

Revision 26 - 2011-09-09 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 16 to 16
Log = foobar.log
arguments = "-a -n"
transfer_executable = false
Added:
>
>
request_memory = 2*1024
 Queue
Line: 23 to 24
 
  • The Executable line tells Condor what program we want to run.
    • The default path is the current directory. If the executable is somewhere else, you need to supply the full path -- Condor will not search for it the way the shell does.
    • If you need to know the full path to a program that's in your default path, use the which command at a shell prompt. For example: which lexparser.csh
Changed:
<
<
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. ("Vanilla" is now the default, as of Condor 7.2.x, so you can safely leave this line out.) Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
>
>
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. ("Vanilla" is now the default, so you can safely leave this line out.) Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
 
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
  • Log indicates where the Condor log file for this job should go.
  • transfer_executable = false tells Condor it does not need to copy the executable file to the compute node. This is usually the case, since the cluster nodes share a common filesystem.
Added:
>
>
  • request_memory = 2*1024 tells Condor this job wants 2 GB (2048 MB) of RAM. If you leave out the request_memory line, the default is 1024 MB. Note that if you over-estimate, you limit the number of machines your job can run on, but if you under-estimate and the job outgrows its memory request, Condor may kill it. The SIZE column in the output of the condor_q command shows the current memory usage in megabytes of a running job.
  /condor/examples contains some sample jobs. You may want to examine some of the submit description files there to get a better feel for how this works in different situations.

Revision 25 - 2011-08-23 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 23 to 23
 
  • The Executable line tells Condor what program we want to run.
    • The default path is the current directory. If the executable is somewhere else, you need to supply the full path -- Condor will not search for it the way the shell does.
    • If you need to know the full path to a program that's in your default path, use the which command at a shell prompt. For example: which lexparser.csh
Changed:
<
<
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. ("Vanilla" is now the default, as of Condor 7.2.x, so you can safely leave this line out.) Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
>
>
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. ("Vanilla" is now the default, as of Condor 7.2.x, so you can safely leave this line out.) Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
 
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
  • Log indicates where the Condor log file for this job should go.
  • transfer_executable = false tells Condor it does not need to copy the executable file to the compute node. This is usually the case, since the cluster nodes share a common filesystem.
Line: 97 to 98
 
Changed:
<
<
>
>

Revision 24 - 2011-08-16 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 35 to 35
 notify_user = jdoe@example.com
Added:
>
>
The notifications can get overwhelming for large jobs, since you get one email per completed process. You can adjust this by setting the "notify" attribute in your submit description file. Valid options are "Always", "Complete" (the default), "Error", and "Never". For example:
notify = Error
...will cause email to only be sent in the event the job encounters an error.
 

Submitting the job

Now that you have a description file, submitting it is as simple as: condor_submit foobar.cmd

Line: 48 to 54
 
  • condor_q lists the job queue.
  • condor_rm deletes a job from the queue.
Changed:
<
<
These commands normally only operate on jobs that have been submitted from the same machine they're run from. condor_q supports a -global switch to see all jobs.
>
>
These commands normally only operate on jobs that have been submitted from the same machine they're run from. condor_q supports a -global switch to see all jobs.
  All of these commands have manual pages that may be displayed with the man command.
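For example, typical usage might look like this (the job ID is a placeholder):

condor_q              # jobs submitted from this machine
condor_q -global      # jobs from all submit machines in the pool
condor_rm 1234.0      # remove job 1234.0 from the queue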
Line: 91 to 97
 
Changed:
<
<
-- brodbd - 27 October 2009
>
>

Revision 23 - 2011-02-28 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 48 to 48
 
  • condor_q lists the job queue.
  • condor_rm deletes a job from the queue.
Added:
>
>
These commands normally only operate on jobs that have been submitted from the same machine they're run from. condor_q supports a -global switch to see all jobs.
 All of these commands have manual pages that may be displayed with the man command.

Additionally, CondorView provides status graphs, updated every 15 minutes.

Revision 22 - 2009-10-27 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 64 to 69
 

$(Process) is a variable substitution; it will be replaced by the process number of each process that's queued. Consult the condor_submit manpage ( man condor_submit) for more details.

Added:
>
>

Being "nice"

If you have a very large queue of jobs to run, but don't care if they finish quickly, you can add the following to your submit file as a courtesy to other users:

nice_user = true
 
Added:
>
>
This tells Condor to let other jobs jump ahead of yours in the queue when a new slot is available; in other words, processes in your job will only start on slots that no other jobs want.
 

Things to keep in mind

Added:
>
>
  Because the job will actually be run on a compute node, not on the system you're logged into, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, /projects, /NLP_TOOLS, and /corpora are shared; however, /tmp is not. Make sure everything your job needs is located on one of the shared filesystems.

If you want to put input, output, or error files on a non-shared filesystem such as /tmp, you can add stream_input=true, stream_output=true, and/or stream_error=true to your submit file. This tells Condor to pipe the output back to the original submitting system instead of creating the file on the node. It may add a slight performance penalty if you're doing a lot of I/O.

Line: 79 to 90
 
Changed:
<
<
-- brodbd - 17 Sep 2009
>
>
-- brodbd - 27 October 2009

Revision 21 - 2009-09-17 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 12 to 12
input = foobar.in
output = foobar.out
error = foobar.error
Changed:
<
<
Log = /tmp/brodbd/foobar.log
>
>
Log = foobar.log
 arguments = "-a -n" transfer_executable = false Queue
Line: 22 to 22
 
  • The Executable line tells Condor what program we want to run.
    • The default path is the current directory. If the executable is somewhere else, you need to supply the full path -- Condor will not search for it the way the shell does.
    • If you need to know the full path to a program that's in your default path, use the which command at a shell prompt. For example: which lexparser.csh
Changed:
<
<
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
>
>
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. ("Vanilla" is now the default, as of Condor 7.2.x, so you can safely leave this line out.) Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
 
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
Changed:
<
<
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea. Input, output, and error files can go to your home directory.
>
>
  • Log indicates where the Condor log file for this job should go.
 
  • transfer_executable = false tells Condor it does not need to copy the executable file to the compute node. This is usually the case, since the cluster nodes share a common filesystem.

/condor/examples contains some sample jobs. You may want to examine some of the submit description files there to get a better feel for how this works in different situations.

Line: 43 to 43
  The easiest way to track the progress of your job is to check its logfile. The following commands are also helpful:
  • condor_status lists available nodes and their status.
  • condor_q lists the job queue.
Changed:
<
<
  • condor_hold and condor_rm put a job on hold and delete it from the queue, respectively.
>
>
  • condor_rm deletes a job from the queue.
 All of these commands have manual pages that may be displayed with the man command.

Additionally, CondorView provides status graphs, updated every 15 minutes.

Line: 54 to 54
 Multiple submissions can also be automated; for example, if we wanted to run the above job three times, with input files named "foobar.in0" through "foobar.in2", we could do the following:
Executable = foobar
Deleted:
<
<
Universe = vanilla
getenv = true
input = foobar.in$(Process)
output = foobar.out$(Process)
Line: 69 to 68
 

Things to keep in mind

Because the job will actually be run on a compute node, not on the system you're logged into, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, /projects, /NLP_TOOLS, and /corpora are shared; however, /tmp is not. Make sure everything your job needs is located on one of the shared filesystems.
Changed:
<
<
If you want to put input, output, or error files on a non-shared filesystem such as /tmp, you can add stream_input=true, stream_output=true, and/or stream_error=true to your submit file. This tells Condor to pipe the output back to the original submitting system instead of creating the file on the node.
>
>
If you want to put input, output, or error files on a non-shared filesystem such as /tmp, you can add stream_input=true, stream_output=true, and/or stream_error=true to your submit file. This tells Condor to pipe the output back to the original submitting system instead of creating the file on the node. It may add a slight performance penalty if you're doing a lot of I/O.
  To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Additionally, processes on patas itself are limited to no more than 2 GB of RAM. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.

Automating Job Submission from the Command Line

The CondorExec program allows you to send an arbitrary command line to Condor.

Added:
>
>

Related Pages

 
Changed:
<
<
-- brodbd - 27 Mar 2009
>
>
-- brodbd - 17 Sep 2009

Revision 20 - 2009-03-27 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 22 to 22
 
  • The Executable line tells Condor what program we want to run.
    • The default path is the current directory. If the executable is somewhere else, you need to supply the full path -- Condor will not search for it the way the shell does.
    • If you need to know the full path to a program that's in your default path, use the which command at a shell prompt. For example: which lexparser.csh
Changed:
<
<
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
>
>
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
 
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea. Input, output, and error files can go to your home directory.
  • transfer_executable = false tells Condor it does not need to copy the executable file to the compute node. This is usually the case, since the cluster nodes share a common filesystem.
Line: 76 to 76
  The CondorExec program allows you to send an arbitrary command line to Condor.
Changed:
<
<
-- brodbd - 21 Feb 2008
>
>
-- brodbd - 27 Mar 2009

Revision 19 - 2009-01-22 - billmcn

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Changed:
<
<
You need to create a submit description file (sometimes referred to as a submit script) telling Condor how to run your program. For example, let's say we have a program called foobar that accepts input on stdin, produces output on stdout, and accepts a few command line arguments. To run this program normally, you might do something like this: foobar -a -n <foobar.in >foobar.out
>
>
You need to create a submit description file (sometimes referred to as a submit script) telling Condor how to run your program. For example, let's say we have a program called foobar that accepts input on stdin, produces output on stdout, and accepts a few command line arguments. To run this program normally, you might do something like this: foobar -a -n <foobar.in >foobar.out
  Here's a sample Condor submit file (let's call it foobar.cmd) that does the same thing:
Line: 36 to 35
 

Submitting the job

Changed:
<
<
Now that you have a description file, submitting it is as simple as: condor_submit foobar.cmd
>
>
Now that you have a description file, submitting it is as simple as: condor_submit foobar.cmd
  The job will be queued and run on the first available machine. You will receive an email message when it completes, either at your UW address or at the one you specified in the notify_user line in the submit file.
Line: 74 to 72
 If you want to put input, output, or error files on a non-shared filesystem such as /tmp, you can add stream_input=true, stream_output=true, and/or stream_error=true to your submit file. This tells Condor to pipe the output back to the original submitting system instead of creating the file on the node.

To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Additionally, processes on patas itself are limited to no more than 2 GB of RAM. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.

Added:
>
>

Automating Job Submission from the Command Line

 
Changed:
<
<
-- brodbd - 21 Feb 2008
>
>
The CondorExec program allows you to send an arbitrary command line to Condor.
 
Added:
>
>
-- brodbd - 21 Feb 2008

Revision 18 - 2008-07-31 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Changed:
<
<
You need to create a submit description file telling Condor how to run your program. For example, let's say we have a program called foobar that accepts input on stdin, produces output on stdout, and accepts a few command line arguments. To run this program normally, you might do something like this:
>
>
You need to create a submit description file (sometimes referred to as a submit script) telling Condor how to run your program. For example, let's say we have a program called foobar that accepts input on stdin, produces output on stdout, and accepts a few command line arguments. To run this program normally, you might do something like this:
 foobar -a -n <foobar.in >foobar.out

Here's a sample Condor submit file (let's call it foobar.cmd) that does the same thing:

Revision 17 - 2008-02-21 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 73 to 73
  If you want to put input, output, or error files on a non-shared filesystem such as /tmp, you can add stream_input=true, stream_output=true, and/or stream_error=true to your submit file. This tells Condor to pipe the output back to the original submitting system instead of creating the file on the node.
Changed:
<
<
To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.
>
>
To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Additionally, processes on patas itself are limited to no more than 2 GB of RAM. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.
 
Changed:
<
<
-- brodbd - 01 Feb 2008
>
>
-- brodbd - 21 Feb 2008
 

Revision 16 - 2008-02-01 - brodbd

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 51 to 51
 Additionally, CondorView provides status graphs, updated every 15 minutes.

Advanced options

Changed:
<
<
It's possible to submit multiple jobs with one submit file, using multiple Queue lines. Each submission can have different parameters. See /condor/condor-6.8.5/examples/loop.cmd for a good, well-documented example of this.
>
>
It's possible to submit multiple jobs with one submit file, using multiple Queue lines. Each submission can have different parameters. See /condor/examples/loop.cmd for a good, well-documented example of this.
  Multiple submissions can also be automated; for example, if we wanted to run the above job three times, with input files named "foobar.in0" through "foobar.in2", we could do the following:
Line: 75 to 75
  To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.
Changed:
<
<
-- DavidBrodbeck - 11 Jan 2008
>
>
-- brodbd - 01 Feb 2008
 

Revision 15 - 2008-01-11 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 28 to 28
 
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea. Input, output, and error files can go to your home directory.
  • transfer_executable = false tells Condor it does not need to copy the executable file to the compute node. This is usually the case, since the cluster nodes share a common filesystem.
Changed:
<
<
/condor/condor-6.8.5/examples contains some sample jobs. You may want to examine some of the submit description files there to get a better feel for how this works in different situations.
>
>
/condor/examples contains some sample jobs. You may want to examine some of the submit description files there to get a better feel for how this works in different situations.
  Note: If your email address is not of the form username@u.washington.edu, or if your cluster login and your University netid don't match, you should add a notify_user line to the submit description file to tell condor where to send mail. For example:
Line: 75 to 75
  To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.
Changed:
<
<
-- DavidBrodbeck - 26 Nov 2007
>
>
-- DavidBrodbeck - 11 Jan 2008
 

Revision 14 - 2007-11-26 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 69 to 69
 $(Process) is a variable substitution; it will be replaced by the process number of each process that's queued. Consult the condor_submit manpage (man condor_submit) for more details.

Things to keep in mind

Changed:
<
<
Because the job will actually be run on a compute node, not on the system you're logged into, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, /projects, /NLP_TOOLS, and /corpora are shared; however, /tmp is not. Condor will automatically transfer executables, input, and output files, but not necessarily libraries, modules, or additional files your software tries to open. Make sure these items are located on one of the shared filesystems.
>
>
Because the job will actually be run on a compute node, not on the system you're logged into, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, /projects, /NLP_TOOLS, and /corpora are shared; however, /tmp is not. Make sure everything your job needs is located on one of the shared filesystems.

If you want to put input, output, or error files on a non-shared filesystem such as /tmp, you can add stream_input=true, stream_output=true, and/or stream_error=true to your submit file. This tells Condor to pipe the output back to the original submitting system instead of creating the file on the node.

  To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.
Changed:
<
<
-- DavidBrodbeck - 11 Oct 2007
>
>
-- DavidBrodbeck - 26 Nov 2007
 

Revision 13 - 2007-10-11 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 20 to 20
 

A few of these lines require explanation.

Added:
>
>
  • The Executable line tells Condor what program we want to run.
    • The default path is the current directory. If the executable is somewhere else, you need to supply the full path -- Condor will not search for it the way the shell does.
    • If you need to know the full path to a program that's in your default path, use the which command at a shell prompt. For example: which lexparser.csh
 
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea. Input, output, and error files can go to your home directory.
Line: 70 to 73
  To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.
Changed:
<
<
-- DavidBrodbeck - 24 Sep 2007
>
>
-- DavidBrodbeck - 11 Oct 2007
 

Revision 12 - 2007-09-24 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 15 to 15
error = foobar.error
Log = /tmp/brodbd/foobar.log
arguments = "-a -n"
Added:
>
>
transfer_executable = false
 Queue
Line: 22 to 23
 
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea. Input, output, and error files can go to your home directory.
Added:
>
>
  • transfer_executable = false tells Condor it does not need to copy the executable file to the compute node. This is usually the case, since the cluster nodes share a common filesystem.
  /condor/condor-6.8.5/examples contains some sample jobs. You may want to examine some of the submit description files there to get a better feel for how this works in different situations.
Line: 45 to 47
  Additionally, CondorView provides status graphs, updated every 15 minutes.
Deleted:
<
<

Reducing startup time

By default, Condor will automatically copy your executable file to the compute node. If you know that the executable files are already accessible to the node (e.g., files in your home directory, which is on a shared filesystem) you can reduce the amount of time needed to start your job by telling Condor not to transfer those files. You can do this by adding the following line to your submit file, above the queue line:
transfer_executable = false
This also greatly reduces the load on the file server, which tends to be a choke point when queueing a large number of jobs simultaneously.

Most of the time you will want this line in your submit file. All of the nodes have the same software load, and /home, /opt, /project, and /NLP_TOOLS are shared by all nodes. Generally speaking you should start with transfer_executable set to false, and only set it to true if you have problems.

 

Advanced options

It's possible to submit multiple jobs with one submit file, using multiple Queue lines. Each submission can have different parameters. See /condor/condor-6.8.5/examples/loop.cmd for a good, well-documented example of this.
Line: 77 to 70
  To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.
Changed:
<
<
-- DavidBrodbeck - 23 Aug 2007
>
>
-- DavidBrodbeck - 24 Sep 2007
 

Revision 11 - 2007-08-23 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 75 to 75
 

Things to keep in mind

Because the job will actually be run on a compute node, not on the system you're logged into, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, /projects, /NLP_TOOLS, and /corpora are shared; however, /tmp is not. Condor will automatically transfer executables, input, and output files, but not necessarily libraries, modules, or additional files your software tries to open. Make sure these items are located on one of the shared filesystems.
Changed:
<
<
-- DavidBrodbeck - 17 Aug 2007
>
>
To keep the cluster responsive, long-running processes run on patas itself will automatically have their CPU priority lowered. Processes submitted to Condor are not affected by this, so you should try to use Condor for anything CPU-intensive.

-- DavidBrodbeck - 23 Aug 2007

 

Revision 10 - 2007-08-22 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 46 to 46
 Additionally, CondorView provides status graphs, updated every 15 minutes.

Reducing startup time

Changed:
<
<
By default, Condor will automatically copy your executable file to the compute node. If you know that the executable files are already accessible to the node (e.g., files in your home directory, which is on a shared filesystem) you can greatly reduce the amount of time needed to start your job by telling Condor not to transfer those files. You can do this by adding the following line to your submit file, above the queue line:
>
>
By default, Condor will automatically copy your executable file to the compute node. If you know that the executable files are already accessible to the node (e.g., files in your home directory, which is on a shared filesystem) you can reduce the amount of time needed to start your job by telling Condor not to transfer those files. You can do this by adding the following line to your submit file, above the queue line:
 
transfer_executable = false

Revision 9 - 2007-08-17 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 45 to 45
  Additionally, CondorView provides status graphs, updated every 15 minutes.
Added:
>
>

Reducing startup time

By default, Condor will automatically copy your executable file to the compute node. If you know that the executable files are already accessible to the node (e.g., files in your home directory, which is on a shared filesystem) you can greatly reduce the amount of time needed to start your job by telling Condor not to transfer those files. You can do this by adding the following line to your submit file, above the queue line:
transfer_executable = false
This also greatly reduces the load on the file server, which tends to be a choke point when queueing a large number of jobs simultaneously.

Most of the time you will want this line in your submit file. All of the nodes have the same software load, and /home, /opt, /project, and /NLP_TOOLS are shared by all nodes. Generally speaking you should start with transfer_executable set to false, and only set it to true if you have problems.

 

Advanced options

It's possible to submit multiple jobs with one submit file, using multiple Queue lines. Each submission can have different parameters. See /condor/condor-6.8.5/examples/loop.cmd for a good, well-documented example of this.
Line: 64 to 73
 $(Process) is a variable substitution; it will be replaced by the process number of each process that's queued. Consult the condor_submit manpage (man condor_submit) for more details.

Things to keep in mind

Changed:
<
<
Because the job will actually be run on a compute node, not on the system you're logged into, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, and /corpora are shared; however, /tmp is not, so if you need to place input or output files there you'll need to investigate the should_transfer_files option, described in the condor_submit manpage and in the Condor user manual.
>
>
Because the job will actually be run on a compute node, not on the system you're logged into, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, /projects, /NLP_TOOLS, and /corpora are shared; however, /tmp is not. Condor will automatically transfer executables, input, and output files, but not necessarily libraries, modules, or additional files your software tries to open. Make sure these items are located on one of the shared filesystems.
 
Changed:
<
<
-- DavidBrodbeck - 19 Jul 2007
>
>
-- DavidBrodbeck - 17 Aug 2007
 

Revision 8 - 2007-07-19 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 43 to 43
 
  • condor_hold and condor_rm put a job on hold and delete it from the queue, respectively.
All of these commands have manual pages that may be displayed with the man command.
Added:
>
>
Additionally, CondorView provides status graphs, updated every 15 minutes.
 

Advanced options

It's possible to submit multiple jobs with one submit file, using multiple Queue lines. Each submission can have different parameters. See /condor/condor-6.8.5/examples/loop.cmd for a good, well-documented example of this.

Revision 7 - 2007-07-19 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 21 to 21
 A few of these lines require explanation.
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
Changed:
<
<
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea.
>
>
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea. Input, output, and error files can go to your home directory.
  /condor/condor-6.8.5/examples contains some sample jobs. You may want to examine some of the submit description files there to get a better feel for how this works in different situations.
Line: 64 to 64
 

Things to keep in mind

Because the job will actually be run on a compute node, not on the system you're logged into, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, and /corpora are shared; however, /tmp is not, so if you need to place input or output files there you'll need to investigate the should_transfer_files option, described in the condor_submit manpage and in the Condor user manual.
Changed:
<
<
-- DavidBrodbeck - 11 Jul 2007
>
>
-- DavidBrodbeck - 19 Jul 2007
 

Revision 6 - 2007-07-18 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 34 to 34
 Now that you have a description file, submitting it is as simple as: condor_submit foobar.cmd
Changed:
<
<
The job will be queued and run on the first available machine. You will receive an email message when it completes.
>
>
The job will be queued and run on the first available machine. You will receive an email message when it completes, either at your UW address or at the one you specified in the notify_user line in the submit file.
 

Managing jobs

Changed:
<
<
The easiest way to track the progress of your job is to check the logfile. The following commands are also helpful:
>
>
The easiest way to track the progress of your job is to check its logfile. The following commands are also helpful:
 
  • condor_status lists available nodes and their status.
  • condor_q lists the job queue.
  • condor_hold and condor_rm put a job on hold and delete it from the queue, respectively.

Revision 5 - 2007-07-13 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 19 to 19
 

A few of these lines require explanation.

Changed:
<
<
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnKong Wiki page for information on how to run PVM directly.
>
>
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; and java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnPatas Wiki page for information on how to run PVM directly.
 
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea.

Revision 4 - 2007-07-12 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 25 to 25
  /condor/condor-6.8.5/examples contains some sample jobs. You may want to examine some of the submit description files there to get a better feel for how this works in different situations.
Added:
>
>
Note: If your email address is not of the form username@u.washington.edu, or if your cluster login and your University netid don't match, you should add a notify_user line to the submit description file to tell condor where to send mail. For example:
notify_user = jdoe@example.com
 

Submitting the job

Now that you have a description file, submitting it is as simple as: condor_submit foobar.cmd

Revision 3 - 2007-07-12 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

Line: 19 to 19
 

A few of these lines require explanation.

Changed:
<
<
>
>
  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; java, for running Java programs directly. See the Condor manual for more information about these universes. The PVM universe is not currently supported, but see the PVMOnKong Wiki page for information on how to run PVM directly.
 
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea.

Revision 2 - 2007-07-11 - DavidBrodbeck

Line: 1 to 1
 

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

You need to create a submit description file telling Condor how to run your program. For example, let's say we have a program called foobar that accepts input on stdin, produces output on stdout, and accepts a few command line arguments. To run this program normally, you might do something like this: foobar -a -n <foobar.in >foobar.out
Changed:
<
<
Here's a sample Condor submit file that does the same thing:
>
>
Here's a sample Condor submit file (let's call it foobar.cmd) that does the same thing:
 
Executable = foobar
Universe   = vanilla
Line: 23 to 23
 
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea.
Added:
>
>
/condor/condor-6.8.5/examples contains some sample jobs. You may want to examine some of the submit description files there to get a better feel for how this works in different situations.
 

Submitting the job

Now that you have a description file, submitting it is as simple as: condor_submit foobar.cmd
Line: 31 to 33
 

Managing jobs

The easiest way to track the progress of your job is to check the logfile. The following commands are also helpful:
Changed:
<
<
  • condor_status lists available nodes and their status
>
>
  • condor_status lists available nodes and their status.
 
  • condor_q lists the job queue.
  • condor_hold and condor_rm put a job on hold and delete it from the queue, respectively.
Added:
>
>
All of these commands have manual pages that may be displayed with the man command.
 

Advanced options

Changed:
<
<
It's possible to submit multiple jobs with one submit file, using multiple Queue lines. Each submission can have different parameters. This can also be automated; for example, if we wanted to run the above job three times, with input files named "foobar.in0" through "foobar.in2", we could do the following:
>
>
It's possible to submit multiple jobs with one submit file, using multiple Queue lines. Each submission can have different parameters. See /condor/condor-6.8.5/examples/loop.cmd for a good, well-documented example of this.

Multiple submissions can also be automated; for example, if we wanted to run the above job three times, with input files named "foobar.in0" through "foobar.in2", we could do the following:

 
Executable = foobar
Universe   = vanilla
Line: 49 to 54
 Queue 3
Changed:
<
<
Consult the condor_submit manpage (man condor_submit) for more details.
>
>
$(Process) is a variable substitution; it will be replaced by the process number of each process that's queued. Consult the condor_submit manpage (man condor_submit) for more details.
 

Things to keep in mind

Changed:
<
<
Because the job will actually be run on a compute node, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, and /corpora are shared; however, /tmp is not, so if you place input or output files there you'll need to investigate the should_transfer_files option, described in the condor_submit manpage and in the Condor user manual.
>
>
Because the job will actually be run on a compute node, not on the system you're logged into, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, and /corpora are shared; however, /tmp is not, so if you need to place input or output files there you'll need to investigate the should_transfer_files option, described in the condor_submit manpage and in the Condor user manual.
 
Changed:
<
<
-- DavidBrodbeck - 10 Jul 2007
>
>
-- DavidBrodbeck - 11 Jul 2007
 

Revision 1 - 2007-07-11 - DavidBrodbeck

Line: 1 to 1
Added:
>
>

How do I use it? A quick Condor tutorial

How do I set up a job?

Creating a submit description file

You need to create a submit description file telling Condor how to run your program. For example, let's say we have a program called foobar that accepts input on stdin, produces output on stdout, and accepts a few command line arguments. To run this program normally, you might do something like this: foobar -a -n <foobar.in >foobar.out

Here's a sample Condor submit file that does the same thing:

Executable = foobar
Universe   = vanilla
getenv     = true
input      = foobar.in
output     = foobar.out
error      = foobar.error
Log        = /tmp/brodbd/foobar.log
arguments  = "-a -n"
Queue

A few of these lines require explanation.

  • Universe = vanilla indicates that this is an ordinary program that does not support checkpointing. Other possibilities include standard, for programs that are linked with the Condor libraries and support checkpointing and restarting; java, for running Java programs directly; and PVM, for PVM applications. See the Condor manual for more information about these universes.
  • getenv = true transfers all the environment variables that are set in the submitter's shell. This is what you want most of the time; much of our software depends on environment variables to locate binaries and libraries.
  • Log indicates where the Condor log file for this job should go. Condor complains if this is located on an NFS filesystem, so putting it in a subdirectory of /tmp is a good idea.

Submitting the job

Now that you have a description file, submitting it is as simple as: condor_submit foobar.cmd

The job will be queued and run on the first available machine. You will receive an email message when it completes.

Managing jobs

The easiest way to track the progress of your job is to check the logfile. The following commands are also helpful:
  • condor_status lists available nodes and their status
  • condor_q lists the job queue.
  • condor_hold and condor_rm put a job on hold and delete it from the queue, respectively.

Advanced options

It's possible to submit multiple jobs with one submit file, using multiple Queue lines. Each submission can have different parameters. This can also be automated; for example, if we wanted to run the above job three times, with input files named "foobar.in0" through "foobar.in2", we could do the following:
Executable = foobar
Universe   = vanilla
getenv     = true
input      = foobar.in$(Process)
output     = foobar.out$(Process)
error      = foobar.error$(Process)
Log        = /tmp/brodbd/foobar.log
arguments  = "-a -n"
Queue 3

Consult the condor_submit manpage (man condor_submit) for more details.

Things to keep in mind

Because the job will actually be run on a compute node, it's important to make sure that it will be able to access all the files it needs. Home directories, /opt, and /corpora are shared; however, /tmp is not, so if you place input or output files there you'll need to investigate the should_transfer_files option, described in the condor_submit manpage and in the Condor user manual.

-- DavidBrodbeck - 10 Jul 2007

 