Difference: BatchSystem (1 vs. 16)

Revision 16 (2017-05-30) - GordonStewart

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Line: 63 to 63
  $ qsub <FILENAME>
Changed:
<
<
After running this command, the ID of the newly-submitted will be output. For example, to submit a job defined by the submission script test.pbs:
>
>
After running this command, the ID of the newly-submitted job will be output. For example, to submit a job defined by the submission script test.pbs:
 
$ qsub test.pbs

Revision 15 (2016-07-07) - GordonStewart

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Line: 9 to 9
 The PPE group maintains a PBS cluster for running small quantities of jobs. If you need to run large numbers of jobs, you should investigate the possibility of running on ScotGrid. The current composition of the batch system is as follows:

Nodes Operating System Total CPU Cores
Added:
>
>
node001 to node003 Scientific Linux 6 96
 
node007 Scientific Linux 6 40
node008 Scientific Linux 5 4
Changed:
<
<
node013 to node017 Scientific Linux 5 20
node019 Scientific Linux 6 4
>
>
node013 to node015 Scientific Linux 5 12
 
node034 Scientific Linux 6 56
Deleted:
<
<
tempnode001 to tempnode006 Scientific Linux 5 24
tempnode007 to tempnode015 Scientific Linux 6 36
  The PBS headnode is offler.ppe.gla.ac.uk, and you will see this name in the output of various PBS commands.

Revision 14 (2016-04-25) - GordonStewart

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Added:
>
>

Overview

 The PPE group maintains a PBS cluster for running small quantities of jobs. If you need to run large numbers of jobs, you should investigate the possibility of running on ScotGrid. The current composition of the batch system is as follows:

Nodes Operating System Total CPU Cores
Line: 15 to 17
 
tempnode001 to tempnode006 Scientific Linux 5 24
tempnode007 to tempnode015 Scientific Linux 6 36
Changed:
<
<
The following queues are provided:
>
>
The PBS headnode is offler.ppe.gla.ac.uk, and you will see this name in the output of various PBS commands.

Queues

 
Name Operating System Maximum runtime
short5 Scientific Linux 5 1 hour
Line: 29 to 34
  Jobs running in the vlong* queues can be pre-empted by jobs in the short* and medium* queues. A pre-empted job is placed in the suspended state; it remains in memory on the compute node, but is no longer being executed. Once the pre-empting job has finished, the pre-empted job will be allowed to continue.
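If a job should run in a specific queue rather than the default, the queue can be named at submission time. As a sketch (assuming a submission script called test.pbs, with the queue names taken from the table above), qsub's standard -q option can be used, or the equivalent #PBS -q directive can be placed in the script itself:

$ qsub -q vlong6 test.pbs
$ qsub -q short6 test.pbs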
Changed:
<
<
The PBS headnode is offler.ppe.gla.ac.uk, and you will see this name in the output of various PBS commands.
>
>

Job Prioritisation

The cluster is configured with a fair-share scheduler, which aims to distribute compute time fairly among users. When multiple users are competing for resources, preference will be shown to users whose recent usage has been lower. Short jobs are also generally given priority over longer jobs.
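To see how the scheduler is currently ordering work, the Maui showq commands described later on this page can be used, for example:

$ showq        # list all running and queued jobs on the cluster
$ showq -i     # show the current priorities of idle (waiting) jobs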

 

Using PBS

Line: 42 to 51
 
#PBS -N TestJob

Changed:
<
<
#PBS -l walltime=1,mem=1024Mb
#PBS -m abe
#PBS -M user@machine
#
>
>
#PBS -o test.log
#PBS -j oe
#PBS -l mem=1024Mb
 echo "This is a test..."
Line: 75 to 86
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
Changed:
<
<
1000151.offler.p     rrabbit  medium6  maus_sim_814      56289     1   1    --  05:59 R 03:21   node034
1000152.offler.p     bbunny   long6    test_job          29669     1   1    --  24:00 R 01:24   node007
>
>
1000151.offler.p     rrabbit  medium6  test_job_123      56299     1   1    --  05:59 R 03:21   node034
1000152.offler.p     bbunny   long6    test_job          29369     1   1    --  24:00 R 01:24   node007
 
Changed:
<
<
This

Queues

There are currently eight queues on the batch system. The four queues ending in '4' will run jobs on SL4 machines and the four queues ending in '5' will run jobs on SL5 machines:
>
>
You can also provide a job ID to limit the output to a particular job:
 

Changed:
<
<
Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
short4             --      --    01:00:00   --    0   0 --   E R
medium4            --      --    06:00:00   --    0   0 --   E R
long4              --      --    24:00:00   --    0   0 --   E R
vlong4             --      --    120:00:0   --    0   0 --   E R
short5             --      --    01:00:00   --    0   0 --   E R
medium5            --      --    06:00:00   --    0   0 --   E R
long5              --      --    24:00:00   --    0   0 --   E R
vlong5             --      --    120:00:0   --    0   0 --   E R

where short5 is the default queue and Walltime is the maximum walltime allowed on each queue.

While it is possible to view your own jobs with qstat, the command will not display all jobs. To display all jobs use the Maui client command showq

To see the current priorities of waiting jobs use the command showq -i.

>
>
$ qstat 1000151
 
Changed:
<
<

Job Priority

The priority of a job is the sum of several weighting factors.

>
>
offler.ppe.gla.ac.uk:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
1000151.offler.p     rrabbit  medium6  test_job_123      56299     1   1    --  05:59 R 03:21   node034
 
Deleted:
<
<
  • There is a constant weighting given to short jobs and a smaller weighting given to medium and long jobs, so that if all other factors are equal, short jobs will have priority.
  • The primary weighting is user fairshare. As a user's jobs run, their usage increases and the priority of their queued jobs decreases. This is balanced so that a user who uses exactly their fairshare allotment (currently 20% of the CPU, averaged over the previous 48 days) will have their medium job priority reduced to match the vlong job priority of someone who has not used the batch system in the previous 48 days.
  • A waiting job's priority slowly increases as a function of time spent in the queue. Currently a vlong job would have to wait several weeks to match the priority of a medium queue job, all other things being equal.
 

Delete a job

Revision 13 (2016-04-22) - GordonStewart

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Changed:
<
<
The PPE group maintains a PBS cluster for running small quantities of jobs. If you need to run large numbers of jobs, you should investigate the possibility of running on ScotGrid.
>
>
 
Changed:
<
<
The batch system uses the TORQUE resource manager (based on OpenPBS) and the Maui scheduler. It can be accessed from any Linux desktop using the commands described below.

The current composition of the batch system is as follows:

>
>
The PPE group maintains a PBS cluster for running small quantities of jobs. If you need to run large numbers of jobs, you should investigate the possibility of running on ScotGrid. The current composition of the batch system is as follows:
 
Nodes Operating System Total CPU Cores
Changed:
<
<
node123 to node456 SL5 999
>
>
node007 Scientific Linux 6 40
node008 Scientific Linux 5 4
node013 to node017 Scientific Linux 5 20
node019 Scientific Linux 6 4
node034 Scientific Linux 6 56
tempnode001 to tempnode006 Scientific Linux 5 24
tempnode007 to tempnode015 Scientific Linux 6 36
  The following queues are provided:

Name Operating System Maximum runtime
Changed:
<
<
short5 SL5 1 hour
medium5 SL5 6 hours
long5 SL5 1 day
vlong5 SL5 5 days
short6 SL6 1 hour
medium6 SL6 6 hours
long6 SL6 1 day
vlong6 SL6 5 days
>
>
short5 Scientific Linux 5 1 hour
medium5 Scientific Linux 5 6 hours
long5 Scientific Linux 5 1 day
vlong5 Scientific Linux 5 5 days
short6 Scientific Linux 6 1 hour
medium6 Scientific Linux 6 6 hours
long6 Scientific Linux 6 1 day
vlong6 Scientific Linux 6 5 days

Jobs running in the vlong* queues can be pre-empted by jobs in the short* and medium* queues. A pre-empted job is placed in the suspended state; it remains in memory on the compute node, but is no longer being executed. Once the pre-empting job has finished, the pre-empted job will be allowed to continue.

The PBS headnode is offler.ppe.gla.ac.uk, and you will see this name in the output of various PBS commands.

 

Using PBS

Added:
>
>
Batch jobs can be submitted and managed from any Linux desktop using the commands described in this section. Further information on these commands can be found in the linked documentation and Linux man pages at the bottom of this page.
 

Create a submission script

Jobs are defined using a submission script, which is like a shell script with the addition of certain directives (indicated by the #PBS prefix) which tell PBS how the job should be handled. A simple submission script might look like the following:

Line: 44 to 55
  $ qsub <FILENAME>
Changed:
<
<
To submit a job defined by the submission script test.pbs:
>
>
After running this command, the ID of the newly-submitted will be output. For example, to submit a job defined by the submission script test.pbs:
 
Changed:
<
<
$ qsub test.pbs
>
>
$ qsub test.pbs
1000150.offler.ppe.gla.ac.uk
 
Changed:
<
<
More details can be found in the qsub man page.
>
>
The numerical portion of this ID (1000150 in this example) can be used to manage the job in the future.
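For example, the job submitted above could later be inspected or removed using just this numerical ID (see the qstat and qdel sections below):

$ qstat 1000150
$ qdel 1000150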
 

Show running jobs

Added:
>
>
You can view details of submitted jobs using the qstat command:

$ qstat

offler.ppe.gla.ac.uk:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
1000151.offler.p     rrabbit  medium6  maus_sim_814      56289     1   1    --  05:59 R 03:21   node034
1000152.offler.p     bbunny   long6    test_job          29669     1   1    --  24:00 R 01:24   node007

This

 
Line: 78 to 106
  To see the current priorities of waiting jobs use the command showq -i.
Deleted:
<
<

Job Pre-emption

Jobs in the vlong4 and vlong5 queues can be preempted by jobs waiting in the short4, short5, medium4 or medium5 queues. A preempted job is placed in the suspended state - it remains in memory but is no longer being executed. Once the preempting job has finished, the preempted job starts executing again.

 

Job Priority

The priority of a job is the sum of several weighting factors.

Line: 90 to 114
 
  • The primary weighting is user fairshare. As a user's jobs run, their usage increases and the priority of their queued jobs decreases. This is balanced so that a user who uses exactly their fairshare allotment (currently 20% of the CPU, averaged over the previous 48 days) will have their medium job priority reduced to match the vlong job priority of someone who has not used the batch system in the previous 48 days.
  • A waiting job's priority slowly increases as a function of time spent in the queue. Currently a vlong job would have to wait several weeks to match the priority of a medium queue job, all other things being equal.
Changed:
<
<

Killing a job

>
>

Delete a job

Jobs are deleted using the qdel command:

$ qdel <JOB_ID>

To delete the job with ID 12345:

$ qdel 12345

References

 
Changed:
<
<
Jobs may be terminated by executing qdel JOBID where the JOBID is the numerical ID code returned in the qstat listing.
>
>

Revision 12 (2016-04-22) - GordonStewart

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Changed:
<
<
The PPE group has limited resources for batch computing. The ppepbs batch system is provided for running a small number of jobs. If a large number of jobs are needed then please use the Grid or try the Compute Cluster.
>
>
The PPE group maintains a PBS cluster for running small quantities of jobs. If you need to run large numbers of jobs, you should investigate the possibility of running on ScotGrid.
 
Changed:
<
<
The PPE batch system that is managed via the TORQUE Resource Manager (based on OpenPBS) and the Maui scheduler. The batch system can be accessed from any linux desktop using the pbs commands described below.
>
>
The batch system uses the TORQUE resource manager (based on OpenPBS) and the Maui scheduler. It can be accessed from any Linux desktop using the commands described below.
 
Changed:
<
<
The batch nodes are installed with a mixture of 64 bit SL4 and SL5. There are 47 CPUs for SL5 jobs and 40 CPUs for SL4 jobs. Eight queues are available, split into two groups: four queues for SL4 jobs and four for SL5 jobs (see the queues section below). Executables should be built on one of the PPE Linux desktop machines of the required flavour. The version of Scientific Linux installed on a machine can be checked by examining the /etc/redhat-release file:
>
>
The current composition of the batch system is as follows:
 
Changed:
<
<
cat /etc/redhat-release
>
>
Nodes Operating System Total CPU Cores
node123 to node456 SL5 999
 
Changed:
<
<
and to check whether a machine is 32 or 64 bit:
>
>
The following queues are provided:
 
Changed:
<
<
uname -m
>
>
Name Operating System Maximum runtime
short5 SL5 1 hour
medium5 SL5 6 hours
long5 SL5 1 day
vlong5 SL5 5 days
short6 SL6 1 hour
medium6 SL6 6 hours
long6 SL6 1 day
vlong6 SL6 5 days
 
Changed:
<
<
a 64 bit machine will return x86_64 and a 32 bit machine will return i686.
>
>

Using PBS

 
Changed:
<
<

Job submission

From any ppe linux desktop jobs can be submitted to a TORQUE queue via qsub, e.g.:
>
>

Create a submission script

 
Changed:
<
<
qsub test.job

where test.job might contain

>
>
Jobs are defined using a submission script, which is like a shell script with the addition of certain directives (indicated by the #PBS prefix) which tell PBS how the job should be handled. A simple submission script might look like the following:
 
#PBS -N TestJob

Line: 38 to 38
 echo "This is a test..."
Changed:
<
<
More documentation is given in the qsub man page.
>
>

Submit a job

Jobs are submitted using the qsub command:

$ qsub <FILENAME>

To submit a job defined by the submission script test.pbs:

$ qsub test.pbs

More details can be found in the qsub man page.

Show running jobs

 

Queues

There are currently eight queues on the batch system. The four queues ending in '4' will run jobs on SL4 machines and the four queues ending in '5' will run jobs on SL5 machines:
Line: 77 to 93
 

Killing a job

Jobs may be terminated by executing qdel JOBID where the JOBID is the numerical ID code returned in the qstat listing.

Deleted:
<
<
-- AndrewPickford - 12 Jan 2009

Revision 11 (2015-05-28) - GavinKirby

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Line: 74 to 74
 
  • The primary weighting is user fairshare. As a user's jobs run, their usage increases and the priority of their queued jobs decreases. This is balanced so that a user who uses exactly their fairshare allotment (currently 20% of the CPU, averaged over the previous 48 days) will have their medium job priority reduced to match the vlong job priority of someone who has not used the batch system in the previous 48 days.
  • A waiting job's priority slowly increases as a function of time spent in the queue. Currently a vlong job would have to wait several weeks to match the priority of a medium queue job, all other things being equal.
Added:
>
>

Killing a job

Jobs may be terminated by executing qdel JOBID where the JOBID is the numerical ID code returned in the qstat listing.

  -- AndrewPickford - 12 Jan 2009

Revision 10 (2013-05-21) - AndrewPickford

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Line: 70 to 70
  The priority of a job is the sum of several weighting factors.
Changed:
<
<
  • There is a constant weighting given to short jobs and smaller weighting given to medium and long jobs. So that if all other factors are equal short job will have priority.
  • The primary weighting is user fairshare. As a users jobs run their usage increases and the priority of their queued jobs decreases. This is balanced so that a user who uses exactly their fairshare allotment (currently 20% of the cpu averaged over the previous week) will have their medium job priority decreased such that the medium job priority is equal to someone elses vlong job priority who has not used the batch system in the previous week.
  • A waiting jobs priority slowly increases as a function of time waiting in the queue. Currently a vlong job would have to wait several weeks to match the priority of a medium queue job all other things being equal.
>
>
  • There is a constant weighting given to short jobs and a smaller weighting given to medium and long jobs, so that if all other factors are equal, short jobs will have priority.
  • The primary weighting is user fairshare. As a user's jobs run, their usage increases and the priority of their queued jobs decreases. This is balanced so that a user who uses exactly their fairshare allotment (currently 20% of the CPU, averaged over the previous 48 days) will have their medium job priority reduced to match the vlong job priority of someone who has not used the batch system in the previous 48 days.
  • A waiting job's priority slowly increases as a function of time spent in the queue. Currently a vlong job would have to wait several weeks to match the priority of a medium queue job, all other things being equal.
 

-- AndrewPickford - 12 Jan 2009

Revision 9 (2013-05-07) - AndrewPickford

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Line: 24 to 24
 From any ppe linux desktop jobs can be submitted to a TORQUE queue via qsub, e.g.:


Deleted:
<
<
ssh ppepbs
 qsub test.job

Revision 8 (2013-01-17) - AndrewPickford

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Changed:
<
<
The PPE group has limited resources for batch computing. The ppepbs batch system is provided for running a small number of jobs. If a large number of jobs are needed then please use the Grid or try the Compute Cluster
>
>
The PPE group has limited resources for batch computing. The ppepbs batch system is provided for running a small number of jobs. If a large number of jobs are needed then please use the Grid or try the Compute Cluster.
 
Changed:
<
<

PPEPBS

>
>
The PPE batch system that is managed via the TORQUE Resource Manager (based on OpenPBS) and the Maui scheduler. The batch system can be accessed from any linux desktop using the pbs commands described below.
 
Changed:
<
<
The PPE group has a small batch system that is managed via the TORQUE Resource Manager (based on OpenPBS) and the Maui scheduler. The batch nodes are installed with a mixture of 64 bit SL4 and SL5. There are 47 CPUs for SL5 jobs and 40 CPUs for SL4 jobs. Eight queues are available, split into two groups: four queues for SL4 jobs and four for SL5 jobs (see the queues section below). Executables should be built on one of the PPE Linux desktop machines of the required flavour. The version of Scientific Linux installed on a machine can be checked by examining the /etc/redhat-release file:
>
>
The batch nodes are installed with a mixture of 64 bit SL4 and SL5. There are 47 CPUs for SL5 jobs and 40 CPUs for SL4 jobs. Eight queues are available, split into two groups: four queues for SL4 jobs and four for SL5 jobs (see the queues section below). Executables should be built on one of the PPE Linux desktop machines of the required flavour. The version of Scientific Linux installed on a machine can be checked by examining the /etc/redhat-release file:
 
cat /etc/redhat-release

Line: 21 to 21
 a 64 bit machine will return x86_64 and a 32 bit machine will return i686.

Job submission

Changed:
<
<
Jobs can be submitted to a TORQUE queue via qsub, e.g.:
>
>
From any ppe linux desktop jobs can be submitted to a TORQUE queue via qsub, e.g.:
 
ssh ppepbs

Line: 39 to 39
 echo "This is a test..."
Changed:
<
<
More documentation is given in the qsub man page. The TORQUE documentation pages installed on ppepbs can be listed via

ssh ppepbs
rpm -ql torque-docs
>
>
More documentation is given in the qsub man page.
 

Queues

Changed:
<
<
There are currently eight queues on ppepbs. The four queues ending in '4' will run jobs on SL4 machines and the four queues ending in '5' will run jobs on SL5 machines:
>
>
There are currently eight queues on the batch system. The four queues ending in '4' will run jobs on SL4 machines and the four queues ending in '5' will run jobs on SL5 machines:
 
Queue            Memory CPU Time Walltime Node  Run Que Lm  State

Revision 7 (2012-01-18) - AndrewPickford

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Line: 64 to 64
  where short5 is the default queue and Walltime is the maximum walltime allowed on each queue.
Added:
>
>
While it is possible to view your own jobs with qstat, the command will not display all jobs. To display all jobs use the Maui client command showq

To see the current priorities of waiting jobs use the command showq -i.

Job Pre-emption

 Jobs in the vlong4 and vlong5 queues can be preempted by jobs waiting in the short4, short5, medium4 or medium5 queues. A preempted job is placed in the suspended state - it remains in memory but is no longer being executed. Once the preempting job has finished, the preempted job starts executing again.
Changed:
<
<
While it is possible to view your own jobs with qstat, the command will not display all jobs. To display all jobs use the Maui client command showq
>
>

Job Priority

The priority of a job is the sum of several weighting factors.

  • There is a constant weighting given to short jobs and smaller weighting given to medium and long jobs. So that if all other factors are equal short job will have priority.
  • The primary weighting is user fairshare. As a users jobs run their usage increases and the priority of their queued jobs decreases. This is balanced so that a user who uses exactly their fairshare allotment (currently 20% of the cpu averaged over the previous week) will have their medium job priority decreased such that the medium job priority is equal to someone elses vlong job priority who has not used the batch system in the previous week.
  • A waiting jobs priority slowly increases as a function of time waiting in the queue. Currently a vlong job would have to wait several weeks to match the priority of a medium queue job all other things being equal.
 

-- AndrewPickford - 12 Jan 2009

Revision 6 (2011-05-10) - AndrewPickford

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Line: 62 to 62
 vlong5 -- -- 120:00:0 -- 0 0 -- E R
Changed:
<
<
where short5 is the default queue and Walltime is the maximum walltime allowed on each queue. While it is possible to view your own jobs with qstat, the command will not display all jobs. To display all jobs use the Maui client command showq
>
>
where short5 is the default queue and Walltime is the maximum walltime allowed on each queue.

Jobs in the vlong4 and vlong5 queues can be preempted by jobs waiting in the short4, short5, medium4 or medium5 queues. A preempted job is placed in the suspended state - it remains in memory but is no longer being executed. Once the preempting job has finished, the preempted job starts executing again.

While it is possible to view your own jobs with qstat, the command will not display all jobs. To display all jobs use the Maui client command showq

 

-- AndrewPickford - 12 Jan 2009

Revision 5 (2010-07-21) - AndrewPickford

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Line: 6 to 6
 

PPEPBS

Changed:
<
<
The PPE group has a small batch system that is managed via the TORQUE Resource Manager (based on OpenPBS) and the Maui scheduler. The batch nodes are installed with a mixture of 64 bit SL4 and SL5. Eight queues are available, split into two groups: four queues for SL4 jobs and four for SL5 jobs (see the queues section below). Executables should be built on one of the PPE Linux desktop machines of the required flavour. The version of Scientific Linux installed on a machine can be checked by examining the /etc/redhat-release file:
>
>
The PPE group has a small batch system that is managed via the TORQUE Resource Manager (based on OpenPBS) and the Maui scheduler. The batch nodes are installed with a mixture of 64 bit SL4 and SL5. There are 47 CPUs for SL5 jobs and 40 CPUs for SL4 jobs. Eight queues are available, split into two groups: four queues for SL4 jobs and four for SL5 jobs (see the queues section below). Executables should be built on one of the PPE Linux desktop machines of the required flavour. The version of Scientific Linux installed on a machine can be checked by examining the /etc/redhat-release file:
 
cat /etc/redhat-release

Revision 4 (2010-07-21) - AndrewPickford

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Line: 6 to 6
 

PPEPBS

Changed:
<
<
The PPE group has a small batch system that is managed via the TORQUE Resource Manager (based on OpenPBS) and the Maui scheduler. Each of the nodes within this batch system has 64 bit SL 4.4 installed. Currently the LINUX desktops are a mixture of SL4x and SL5x. To run on the batch system executables should therefore be built on an 64 bit SL4x PPE LINUX desktop or ppelx64sl4 (see LoginServices). To check the version scientific linux installed:
>
>
The PPE group has a small batch system that is managed via the TORQUE Resource Manager (based on OpenPBS) and the Maui scheduler. The batch nodes are installed with a mixture of 64 bit SL4 and SL5. Eight queues are available, split into two groups: four queues for SL4 jobs and four for SL5 jobs (see the queues section below). Executables should be built on one of the PPE Linux desktop machines of the required flavour. The version of Scientific Linux installed on a machine can be checked by examining the /etc/redhat-release file:
 
cat /etc/redhat-release

Line: 47 to 47
 

Queues

Changed:
<
<
There are currently three queues on ppepbs:
>
>
There are currently eight queues on ppepbs. The four queues ending in '4' will run jobs on SL4 machines and the four queues ending in '5' will run jobs on SL5 machines:
 
Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----

Changed:
<
<
short              --      --    01:00:00   --    0   0 --   E R
medium             --      --    06:00:00   --    0   0 --   E R
long               --      --    24:00:00   --    0   0 --   E R
vlong              --      --    120:00:0   --    0   0 --   E R
>
>
short4             --      --    01:00:00   --    0   0 --   E R
medium4            --      --    06:00:00   --    0   0 --   E R
long4              --      --    24:00:00   --    0   0 --   E R
vlong4             --      --    120:00:0   --    0   0 --   E R
short5             --      --    01:00:00   --    0   0 --   E R
medium5            --      --    06:00:00   --    0   0 --   E R
long5              --      --    24:00:00   --    0   0 --   E R
vlong5             --      --    120:00:0   --    0   0 --   E R
 
Changed:
<
<
where short is the default queue and Walltime is the maximum walltime allowed on each queue. While it is possible to view your own jobs with qstat, the command will not display all jobs. To display all jobs use the Maui client command showq
>
>
where short5 is the default queue and Walltime is the maximum walltime allowed on each queue. While it is possible to view your own jobs with qstat, the command will not display all jobs. To display all jobs use the Maui client command showq
 

-- AndrewPickford - 12 Jan 2009

Revision 3 (2010-03-03) - MichaelWright

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Revision 2 (2009-05-08) - AndrewPickford

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

Batch System

Line: 6 to 6
 

PPEPBS

Changed:
<
<
The PPE group has a small batch system that is managed via the TORQUE Resource Manager (based on OpenPBS) and the Maui scheduler. Each of the nodes within this batch system has 64 bit SL 4.4 installed. Currently the LINUX desktops are a mixture of SL4x and SL5x. To run on the batch system executables should therefore be built on an SL4x PPE LINUX desktop or PPELX (see LoginServices). To check the version scientific linux installed:
>
>
The PPE group has a small batch system that is managed via the TORQUE Resource Manager (based on OpenPBS) and the Maui scheduler. Each of the nodes within this batch system has 64 bit SL 4.4 installed. Currently the Linux desktops are a mixture of SL4x and SL5x. To run on the batch system, executables should therefore be built on a 64 bit SL4x PPE Linux desktop or ppelx64sl4 (see LoginServices). To check the version of Scientific Linux installed:
 
cat /etc/redhat-release
Added:
>
>
and to check whether a machine is 32 or 64 bit:

uname -m

a 64 bit machine will return x86_64 and a 32 bit machine will return i686.

 

Job submission

Jobs can be submitted to a TORQUE queue via qsub, e.g.:

Revision 1 (2009-01-12) - AndrewPickford

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WebHome"

Batch System

The PPE group has limited resources for batch computing. The ppepbs batch system is provided for running a small number of jobs. If a large number of jobs are needed then please use the Grid or try the Compute Cluster

PPEPBS

The PPE group has a small batch system that is managed via the TORQUE Resource Manager (based on OpenPBS) and the Maui scheduler. Each of the nodes within this batch system has 64 bit SL 4.4 installed. Currently the Linux desktops are a mixture of SL4x and SL5x. To run on the batch system, executables should therefore be built on an SL4x PPE Linux desktop or PPELX (see LoginServices). To check the version of Scientific Linux installed:

cat /etc/redhat-release

Job submission

Jobs can be submitted to a TORQUE queue via qsub, e.g.:

ssh ppepbs
qsub test.job

where test.job might contain

#PBS -N TestJob
#PBS -l walltime=1,mem=1024Mb
#PBS -m abe
#PBS -M user@machine
#
echo "This is a test..."

More documentation is given in the qsub man page. The TORQUE documentation pages installed on ppepbs can be listed via

ssh ppepbs
rpm -ql torque-docs

Queues

There are currently three queues on ppepbs:

Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
short              --      --    01:00:00   --    0   0 --   E R
medium             --      --    06:00:00   --    0   0 --   E R
long               --      --    24:00:00   --    0   0 --   E R
vlong              --      --    120:00:0   --    0   0 --   E R

where short is the default queue and Walltime is the maximum walltime allowed on each queue. While it is possible to view your own jobs with qstat, the command will not display all jobs. To display all jobs use the Maui client command showq

-- AndrewPickford - 12 Jan 2009

 