Basic commands on your machine for Condor
What is this about?
Here you will find the basic commands to check the status of your installed Condor software. Many more details are provided in the official Condor Version 6.6.10 Manual.
Note that all commands are run from the command line. On Windows this means you have to run "cmd" (Start menu->Run->cmd) and change to the Condor directory (e.g. "cd \Condor\bin"). Alternatively, go to the overview page.
Checking the status
- condor_status shows all machines in the cluster and whether they are running jobs; the list should contain your machine as well.
- condor_status -server shows memory and computing power of the cluster.
- condor_status -java shows the Java version available on each machine in the cluster.
- condor_userprio -all shows the current priority status of all active users. This changes with the number of jobs you submit, so that in the long run every user gets a fair share of the available computing power.
- condor_userprio -allusers -all as above, but also shows data about users not currently active.
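The status commands above can be combined into one quick overview. The following is only a minimal Bash sketch: the function name cluster_overview is made up for illustration, and the default CONDOR_BIN path is an assumption that you should replace with your own Condor path.

```shell
#!/usr/bin/env bash
# Assumption: Condor binaries live in CONDOR_BIN; override it for your install.
CONDOR_BIN="${CONDOR_BIN:-/usr/local/condor/bin}"

# Hypothetical helper: run a few of the status commands one after another.
cluster_overview() {
    local cmd
    for cmd in "condor_status" "condor_status -server" "condor_userprio -all"; do
        echo "== $cmd =="
        # ${cmd%% *} is the program name without its options.
        if [ -x "$CONDOR_BIN/${cmd%% *}" ]; then
            # $cmd is left unquoted on purpose so the options are split off.
            "$CONDOR_BIN"/$cmd
        else
            echo "(skipped: ${cmd%% *} not found in $CONDOR_BIN)"
        fi
    done
}

cluster_overview
```

If a command is not found, the sketch prints a "skipped" note instead of failing, so it can be tried safely before the path is set correctly.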
Checking your submitted jobs
Details on how to submit jobs are given in the local submit section; only a compact listing is given here.
In the commands below, $JOB_NUM can always be either a specific job number (e.g. 120.3) or a whole batch (e.g. 120).
- condor_q shows your job queue plus job status (R: running, I: idle, X: marked for removal)
- condor_q -global as above, but shows all queued jobs
- condor_q -run shows only your running jobs
- condor_q -run -global as above but all running jobs
- condor_submit $FILE submit jobs described by $FILE to your queue
- condor_submit_dag $FILE submit jobs that depend on being run in a certain order (handled by DAGMan)
- condor_reschedule contacts the central manager immediately; after submitting jobs this might speed up their start
- condor_prio -p $prio $JOB_NUM assigns priority $prio to job $JOB_NUM, with $prio ranging from -20 (least important) to +20 (most important); this only changes the order among your own jobs, other users' jobs are not affected.
- condor_rm $JOB_NUM removes the job $JOB_NUM from your queue, whether running or idle
- condor_rm -forcex $JOB_NUM immediately kills the job
- condor_qedit $USER ImageSize 60 sets the ImageSize attribute (the amount of memory, in kilobytes, that Condor assumes a job needs) to 60 for all jobs of $USER. Condor sometimes seems to overestimate the size of jobs, and they might then get stuck. This will get them going again (important!).
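A typical sequence of the commands above can be sketched as a small Bash walk-through. This is an illustration under assumptions, not a recommended workflow: the file name myjob.sub, the job number 120, the priority 10 and the helper names run and job_lifecycle are all hypothetical. By default the sketch only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/usr/bin/env bash
# Print each command; execute it only when DRY_RUN=0 (default: print only).
DRY_RUN="${DRY_RUN:-1}"

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Hypothetical walk-through: $1 is a submit file, $2 a job number or batch.
job_lifecycle() {
    local submit_file="$1" job_num="$2"
    run condor_submit "$submit_file"   # queue the jobs described in the file
    run condor_reschedule              # contact the central manager right away
    run condor_q -run                  # list your running jobs
    run condor_prio -p 10 "$job_num"   # raise priority among your own jobs
    run condor_rm "$job_num"           # remove the job (or whole batch) again
}

job_lifecycle myjob.sub 120
```

In real use the job number is not known in advance; condor_submit reports it, and condor_q lists it, so the later steps would normally be run by hand with that number.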
Useful bash scripts for bigger projects
If you have Bash and Grep (most Linux distros and Cygwin), you can download these scripts, which the author has written for housekeeping with many jobs:
Click script name to download, then change "CONDOR_BIN=/usr/local/condor/bin" to your Condor path in these scripts.
- condor_grep $GREP_STRING $CONDOR_COMMAND applies $CONDOR_COMMAND (possibly with parameters) to all jobs matching $GREP_STRING
- condor_forcex will forcefully remove all jobs from the queue which are marked X
- condor_randprio will randomize the priorities of all your jobs (believe me, can be useful)
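To give an idea of how such a helper could work, here is a minimal Bash sketch in the spirit of condor_grep. It is not the downloadable script itself: the function names matching_job_ids and apply_to_matching are made up, and the sketch assumes that job lines in the condor_q output start with a job number of the form 120.3.

```shell
#!/usr/bin/env bash

# Read condor_q output on stdin and print the IDs of all jobs whose line
# matches the grep pattern given as $1.
matching_job_ids() {
    # Header and summary lines are dropped because their first field
    # does not look like a job number (cluster.process).
    grep -- "$1" | awk '$1 ~ /^[0-9]+\.[0-9]+$/ { print $1 }'
}

# Apply a command (possibly with parameters) to every matching job.
# $1 is the grep pattern, the remaining arguments are the command.
apply_to_matching() {
    local pattern="$1"
    shift
    local id
    condor_q | matching_job_ids "$pattern" | while read -r id; do
        "$@" "$id"
    done
}
```

With these definitions loaded, something like "apply_to_matching 'sim\.sh' condor_prio -p 5" would set priority 5 for each of your queued jobs whose line mentions sim.sh (a hypothetical program name). Testing the pattern first with "condor_q | matching_job_ids 'sim\.sh'" shows which jobs would be affected.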