Difference: ProductionRota (1 vs. 21)

Revision 21 - 2016-03-11 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 14 to 14
  When this is done, and if this wiki is not updated, continue with the next production on the list, moving upwards on the page.
Changed:
<
<
ALERT! This wiki section will be updated often, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately. Read the logbook/blog for more up-to-date information: http://na62shifts.wordpress.com
>
>
ALERT! This wiki section will be updated often, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately. Read the logbook/blog for more up-to-date information: https://na62.gla.ac.uk/elog (or older elog at http://na62shifts.wordpress.com/).
 

Shifts sign up

Line: 27 to 27
  The production plan is located here:
Changed:
<
<
http://na62.gla.ac.uk/index.php?task=production
>
>
https://na62.gla.ac.uk/index.php?task=production
  This displays a database-generated table that fills up along the way. New items may be added by the production coordinator along the way. Only Mark and Dan can use Ganga for job management at this time, but it will be available soon for everyone.
Line: 41 to 41
 
  • intervene if possible to fix the error(s)
  • notify the production coordinator if not possible to directly intervene to rectify the error
Changed:
<
<
  • notify the site admin if jobs systematically fail at a given site. Check the jobs history to detect any problems at sites (e.g. jobs consistently finishing early, or going to status CLEARED without registering any output). ALERT! If jobs fail at a site, exclude it from job submissions and notify the site admin!
>
>
  • notify the site admin if jobs systematically fail at a given site. Check the jobs history to detect any problems at sites (e.g. jobs consistently finishing early, or going to status CLEARED without registering any output). ALERT! If jobs fail at a site, exclude it from job submissions and notify the site admin!
 
  • notify Janusz if outputs fail to be replicated on CERN Castor (labelled LS instead of CC in the files table)
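
The site check described in the bullets above can be sketched in a few lines of Python. This is only an illustration: the job record layout (keys `site`, `status`, `outputs`), the function name and the two thresholds are assumptions made for the example, not the schema of the NA62 jobs table.

```python
from collections import defaultdict


def flag_suspect_sites(jobs, min_jobs=5, max_bad_fraction=0.5):
    """Return the sites where most jobs went to CLEARED without output.

    `jobs` is a list of dicts with the (hypothetical) keys
    'site', 'status' and 'outputs' (number of registered output files).
    """
    totals = defaultdict(int)
    bad = defaultdict(int)
    for job in jobs:
        totals[job["site"]] += 1
        # A CLEARED job that registered no output counts as a failure.
        if job["status"] == "CLEARED" and job["outputs"] == 0:
            bad[job["site"]] += 1
    # Only judge sites with enough jobs to be meaningful.
    return sorted(
        site
        for site, n in totals.items()
        if n >= min_jobs and bad[site] / n > max_bad_fraction
    )
```

Any site returned by such a check would be a candidate for exclusion from job submissions and for a message to its site admin, as the ALERT above requires.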

Checklist

The person in charge must do the following:

Changed:
<
<
  1. check current production status from the wiki and from our logbook
>
>
  1. check current production status from the wiki and from our logbook
 
  1. keep an eye on the jobs list and make sure we have 200 jobs RUNNING at all times and not more than 50 SCHEDULED. ALERT! Please note that these numbers will change when new resources are added. Check this number at the beginning of each shift, read the (b)logbook and, if necessary, contact the previous shifter for clarification.
  2. if queue is low (less than 100 RUNNING), use online script creator to submit a new batch of jobs
Changed:
<
<
  1. check production page to see if we've reached 100%. If yes, then start next production round on the list. Rounds that need to be done are the ones with 0 runs, and 0 files, and marked with "Not yet started".
>
>
  1. check production page to see if we've reached 100%. If yes, then start next production round on the list. Rounds that need to be done are the ones with 0 runs, and 0 files, and marked with "Not yet started".
 
  1. re-validate results weekly (every Monday) with the physics group (email Tonino to check the outputs on Castor)
  2. notify production coordinator if any major errors occur
Changed:
<
<
  1. log everything in our logbook/blog: http://na62shifts.wordpress.com
>
>
  1. log everything in our logbook (https://na62.gla.ac.uk/elog) or blog (http://na62shifts.wordpress.com, no longer used)
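
The queue thresholds in the checklist (200 RUNNING as target, fewer than 100 RUNNING as "low", at most 50 SCHEDULED) can be written as a small decision helper. This is a minimal sketch: the function name, the returned labels and the order in which the conditions are tested are assumptions, and the numbers must be updated whenever the checklist changes.

```python
# Thresholds copied from the checklist; they change when resources are added.
TARGET_RUNNING = 200
LOW_WATER_RUNNING = 100
MAX_SCHEDULED = 50


def queue_action(running, scheduled):
    """Suggest what the shifter should do for the given job counts."""
    if running < LOW_WATER_RUNNING:
        return "submit"  # queue is low: submit a new batch of jobs
    if scheduled > MAX_SCHEDULED:
        return "hold"    # too many jobs waiting: do not submit more
    if running < TARGET_RUNNING:
        return "watch"   # below target but not yet low: check again soon
    return "ok"
```

For example, `queue_action(80, 10)` suggests submitting a batch, while `queue_action(150, 60)` suggests holding off until the scheduled jobs start running.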
 

Troubleshooting

Revision 20 - 2013-03-18 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 8 to 8
 

Today's production plan

Changed:
<
<
Keep the queue full, which means 300+ jobs running at all times, and less than 100 scheduled.
>
>
Keep the queue full, which means 200-300 jobs running at all times, and less than 100 scheduled.
 
Changed:
<
<
We run decay type 1, Kch2pipi0 simulations with software version v7/r193 at IC, RAL, GLA, LIV and BIR. Revision 193 is 5 times faster so we will run 6000 events runs. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
>
>
We run decay type 10, Kch2pipipi simulations with software version v9/r261 at IC, RAL, GLA, LIV, BIR and UCL. We run 6000 events jobs.
 
Changed:
<
<
Production jobs from the Kch2pipi0-1 production round have decay type 1, description set to "Kch2pipi0-1 production round", 1500 events for v6 runs and 6000 events for v7 runs. Please pay attention when submitting jobs.

When the Kch2pipi0-1 production round is done, we will switch to the next round, which is Kch2munu-1 (decay type 30). For Kch2munu, it takes only 1 s per event, so we could run 40000-50000 events jobs. Check jobs table to extract these statistics.

>
>
When this is done, and if this wiki is not updated, continue with the next production on the list, moving upwards on the page.
  ALERT! This wiki section will be updated often, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately. Read the logbook/blog for more up-to-date information: http://na62shifts.wordpress.com

Shifts sign up

Changed:
<
<
As of October 15, 2012 we have the following names: Antonio Cassese, Mark Slater, Philip Rubin, Paolo Massarotti, Monica Pepe, Spasimir Balev, Vito Palladino, Mario Vormstein , Karim Massri.
>
>
We have the following shifters: Antonio Cassese, Mark Slater, Philip Rubin, Paolo Massarotti, Monica Pepe, Spasimir Balev, Vito Palladino, Mario Vormstein, Karim Massri.
 
Changed:
<
<
We have a Doodle poll entitled "NA62 Grid Production Shifts", where volunteers can sign up for shifts. If you've volunteered here for shifts, please check your email for the poll address. Here's a snapshot from Doodle: schedule 1.
>
>
We have a Doodle poll entitled "NA62 Grid Production Shifts", where volunteers can sign up for shifts. If you've volunteered here for shifts, please check your email for the poll address.
 

Production plan

Line: 31 to 29
  http://na62.gla.ac.uk/index.php?task=production
Changed:
<
<
This displays a database-generated table that fills up along the way. New items will be added by the production coordinator as discussed at the Siena meeting in August 2012.
>
>
This displays a database-generated table that fills up along the way. New items may be added by the production coordinator along the way. Only Mark and Dan can use Ganga for job management at this time, but it will be available soon for everyone.
 

Grid Production Shifts

Line: 40 to 37
 

What to do on shift

Changed:
<
<
MC jobs should be submitted manually. The person overseeing the production (the "shift" taker) will have to check periodically if everything goes according to plan, and
>
>
Until Ganga is available, MC jobs should be submitted manually. The person overseeing the production (the "shift" taker) will have to check periodically if everything goes according to plan, and
 
  • intervene if possible to fix the error(s)
  • notify the production coordinator if not possible to directly intervene to rectify the error
Line: 52 to 49
 The person in charge must do the following:

  1. check current production status from the wiki and from our logbook
Changed:
<
<
  1. keep an eye on the jobs list and make sure we have 200 jobs RUNNING at all times and not more than 50 SCHEDULED. ALERT! Please note that these numbers will change when new resources are added. Check this number at the beginning of each shift.
>
>
  1. keep an eye on the jobs list and make sure we have 200 jobs RUNNING at all times and not more than 50 SCHEDULED. ALERT! Please note that these numbers will change when new resources are added. Check this number at the beginning of each shift, read the (b)logbook and, if necessary, contact the previous shifter for clarification.
 
  1. if queue is low (less than 100 RUNNING), use online script creator to submit a new batch of jobs
Changed:
<
<
  1. check production page to see if we've reached 100%. If yes, then start next production round on the list.
  2. re-validate results weekly (every Monday) with the physics group (email Tonino to check the outputs on CAstor)
>
>
  1. check production page to see if we've reached 100%. If yes, then start next production round on the list. Rounds that need to be done are the ones with 0 runs, and 0 files, and marked with "Not yet started".
  2. re-validate results weekly (every Monday) with the physics group (email Tonino to check the outputs on Castor)
 
  1. notify production coordinator if any major errors occur
  2. log everything in our logbook/blog: http://na62shifts.wordpress.com

Revision 19 - 2012-10-25 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 8 to 8
 

Today's production plan

Changed:
<
<
Keep the queue full, which means 200+ jobs running at all times, and less than 100 scheduled.
>
>
Keep the queue full, which means 300+ jobs running at all times, and less than 100 scheduled.
 
Changed:
<
<
We run decay type 1, Kch2pipi0 simulations with software version v7/r193 at IC, RAL, GLA and BIR. Revision 193 is 5 times faster so we will run 6000 events runs. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
>
>
We run decay type 1, Kch2pipi0 simulations with software version v7/r193 at IC, RAL, GLA, LIV and BIR. Revision 193 is 5 times faster so we will run 6000 events runs. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
  Production jobs from the Kch2pipi0-1 production round have decay type 1, description set to "Kch2pipi0-1 production round", 1500 events for v6 runs and 6000 events for v7 runs. Please pay attention when submitting jobs.

Revision 18 - 2012-10-24 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 10 to 10
  Keep the queue full, which means 200+ jobs running at all times, and less than 100 scheduled.
Changed:
<
<
We run decay type 1, Kch2pipi0 simulations with software version v7/r193 at IC, RAL and BIR. Revision 193 is 5 times faster so we will run 6000 events runs. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
>
>
We run decay type 1, Kch2pipi0 simulations with software version v7/r193 at IC, RAL, GLA and BIR. Revision 193 is 5 times faster so we will run 6000 events runs. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
  Production jobs from the Kch2pipi0-1 production round have decay type 1, description set to "Kch2pipi0-1 production round", 1500 events for v6 runs and 6000 events for v7 runs. Please pay attention when submitting jobs.

Revision 17 - 2012-10-24 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 14 to 14
  Production jobs from the Kch2pipi0-1 production round have decay type 1, description set to "Kch2pipi0-1 production round", 1500 events for v6 runs and 6000 events for v7 runs. Please pay attention when submitting jobs.
Changed:
<
<
When the Kch2pipi0-1 production round is done, we will switch to the next round, which is Kch2munu-1 (decay type 30).
>
>
When the Kch2pipi0-1 production round is done, we will switch to the next round, which is Kch2munu-1 (decay type 30). For Kch2munu, it takes only 1 s per event, so we could run 40000-50000 events jobs. Check jobs table to extract these statistics.
  ALERT! This wiki section will be updated often, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately. Read the logbook/blog for more up-to-date information: http://na62shifts.wordpress.com
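
The job-size estimate for Kch2munu above is simple arithmetic: at about 1 s per event, a 40000-event job runs for roughly 11 hours and a 50000-event job for roughly 14 hours. A minimal sketch of that calculation follows; the function names are made up for the example, and the per-event time should be taken from the jobs table rather than assumed.

```python
def job_hours(events, seconds_per_event):
    """Estimated wall-clock time of one job, in hours."""
    return events * seconds_per_event / 3600.0


def events_for_target(hours, seconds_per_event):
    """Largest whole number of events that fits in the given wall-clock time."""
    return int(hours * 3600.0 / seconds_per_event)
```

With these helpers, `job_hours(40000, 1.0)` is about 11.1 and `job_hours(50000, 1.0)` is about 13.9, while a 12-hour job at 1 s per event would hold `events_for_target(12, 1.0)` events, i.e. 43200.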
Line: 23 to 23
  As of October 15, 2012 we have the following names: Antonio Cassese, Mark Slater, Philip Rubin, Paolo Massarotti, Monica Pepe, Spasimir Balev, Vito Palladino, Mario Vormstein , Karim Massri.
Changed:
<
<
We have a Doodle poll entitled "NA62 Grid Production Shifts", where volunteers can sign up for shifts. If you've volunteered here for shifts, please check your email for the poll address.
>
>
We have a Doodle poll entitled "NA62 Grid Production Shifts", where volunteers can sign up for shifts. If you've volunteered here for shifts, please check your email for the poll address. Here's a snapshot from Doodle: schedule 1.
 

Production plan

Line: 64 to 64
 In case you find an error produced by the online interface, please immediately notify Dan, Janusz and Tonino.

META FILEATTACHMENT attachment="NA62_requirements_table.pdf" attr="" comment="Production plan (August 2012)" date="1348235473" name="NA62_requirements_table.pdf" path="NA62_requirements_table.pdf" size="38998" stream="NA62_requirements_table.pdf" tmpFilename="/usr/tmp/CGItemp26671" user="DanProtopopescu" version="1"
Added:
>
>
META FILEATTACHMENT attachment="MCshifts1.png" attr="" comment="MC shifts snapshot from Doodle (1)" date="1351079875" name="MCshifts1.png" path="MCshifts1.png" size="52170" stream="MCshifts1.png" tmpFilename="/usr/tmp/CGItemp24832" user="DanProtopopescu" version="1"

Revision 16 - 2012-10-24 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 8 to 8
 

Today's production plan

Added:
>
>
Keep the queue full, which means 200+ jobs running at all times, and less than 100 scheduled.
 We run decay type 1, Kch2pipi0 simulations with software version v7/r193 at IC, RAL and BIR. Revision 193 is 5 times faster so we will run 6000 events runs. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.

Production jobs from the Kch2pipi0-1 production round have decay type 1, description set to "Kch2pipi0-1 production round", 1500 events for v6 runs and 6000 events for v7 runs. Please pay attention when submitting jobs.

Revision 15 - 2012-10-23 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 36 to 36
  If all goes well, GP "shifts" will not require too much work. Shifts are defined as 9am to 9pm CERN time but it is enough if the person on shift checks the production status and submits jobs twice a day or so. If problems arise, this might require another hour or so of your time, since you will not be expected to actually fix the problem but only to notify the admins.
Changed:
<
<

How it all works

>
>

What to do on shift

 
Changed:
<
<
MC jobs will be submitted automatically by cron jobs. The person overseeing the production (the "shift" taker) will have to check periodically if everything goes according to plan, and
>
>
MC jobs should be submitted manually. The person overseeing the production (the "shift" taker) will have to check periodically if everything goes according to plan, and
 
  • intervene if possible to fix the error(s)
  • notify the production coordinator if not possible to directly intervene to rectify the error

Revision 14 - 2012-10-23 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 8 to 8
 

Today's production plan

Changed:
<
<
We run decay type 1, Kch2pipi0 simulations with software version v6/r188 at GLA and RAL, and with v7/r193 at IC, RAL and BIR. We will switch to r193 (which is 5 times faster) as soon as it is installed on sites. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
>
>
We run decay type 1, Kch2pipi0 simulations with software version v7/r193 at IC, RAL and BIR. Revision 193 is 5 times faster so we will run 6000 events runs. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
 
Changed:
<
<
Production jobs from the Kch2pipi0-1 production round have decay type 1, description set to "Kch2pipi0-1 production round", 1500 events for v6 runs and 5000 events for v7 runs. Please pay attention when submitting jobs.
>
>
Production jobs from the Kch2pipi0-1 production round have decay type 1, description set to "Kch2pipi0-1 production round", 1500 events for v6 runs and 6000 events for v7 runs. Please pay attention when submitting jobs.
  When the Kch2pipi0-1 production round is done, we will switch to the next round, which is Kch2munu-1 (decay type 30).
Changed:
<
<
ALERT! This wiki section will be updated along the way, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately. Read the logbook/blog for more up-to-date information: http://na62shifts.wordpress.com
>
>
ALERT! This wiki section will be updated often, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately. Read the logbook/blog for more up-to-date information: http://na62shifts.wordpress.com
 

Shifts sign up

Revision 13 - 2012-10-22 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 8 to 8
 

Today's production plan

Changed:
<
<
We run decay type 1, Kch2pipi0 simulations with software version v6/r188 at GLA and RAL, and with v7/r193 at IC and BIR. We will switch to r193 (which is 5 times faster) as soon as it is installed on sites. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
>
>
We run decay type 1, Kch2pipi0 simulations with software version v6/r188 at GLA and RAL, and with v7/r193 at IC, RAL and BIR. We will switch to r193 (which is 5 times faster) as soon as it is installed on sites. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
  Production jobs from the Kch2pipi0-1 production round have decay type 1, description set to "Kch2pipi0-1 production round", 1500 events for v6 runs and 5000 events for v7 runs. Please pay attention when submitting jobs.
Line: 49 to 49
  The person in charge must do the following:
Added:
>
>
  1. check current production status from the wiki and from our logbook
 
  1. keep an eye on the jobs list and make sure we have 200 jobs RUNNING at all times and not more than 50 SCHEDULED. ALERT! Please note that these numbers will change when new resources are added. Check this number at the beginning of each shift.
  2. if queue is low (less than 100 RUNNING), use online script creator to submit a new batch of jobs
  3. check production page to see if we've reached 100%. If yes, then start next production round on the list.

Revision 12 - 2012-10-22 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 14 to 14
  When the Kch2pipi0-1 production round is done, we will switch to the next round, which is Kch2munu-1 (decay type 30).
Changed:
<
<
ALERT! This wiki section will be updated along the way, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately.
>
>
ALERT! This wiki section will be updated along the way, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately. Read the logbook/blog for more up-to-date information: http://na62shifts.wordpress.com
 

Shifts sign up

Line: 54 to 54
 
  1. check production page to see if we've reached 100%. If yes, then start next production round on the list.
  2. re-validate results weekly (every Monday) with the physics group (email Tonino to check the outputs on CAstor)
  3. notify production coordinator if any major errors occur
Added:
>
>
  1. log everything in our logbook/blog: http://na62shifts.wordpress.com
 

Troubleshooting

Revision 11 - 2012-10-22 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 8 to 8
 

Today's production plan

Changed:
<
<
We run decay type 1, Kch2pipi0 simulations with software version v6/r188 at GLA and BIR, and with v7/r193 at IC. We will switch to r193 (which is 5 times faster) as soon as it is installed on sites. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
>
>
We run decay type 1, Kch2pipi0 simulations with software version v6/r188 at GLA and RAL, and with v7/r193 at IC and BIR. We will switch to r193 (which is 5 times faster) as soon as it is installed on sites. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.
  Production jobs from the Kch2pipi0-1 production round have decay type 1, description set to "Kch2pipi0-1 production round", 1500 events for v6 runs and 5000 events for v7 runs. Please pay attention when submitting jobs.

Revision 10 - 2012-10-22 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 6 to 6
 
Added:
>
>

Today's production plan

We run decay type 1, Kch2pipi0 simulations with software version v6/r188 at GLA and BIR, and with v7/r193 at IC. We will switch to r193 (which is 5 times faster) as soon as it is installed on sites. More sites will be included in the production, as soon as the software is installed on them. Check here for current status.

Production jobs from the Kch2pipi0-1 production round have decay type 1, description set to "Kch2pipi0-1 production round", 1500 events for v6 runs and 5000 events for v7 runs. Please pay attention when submitting jobs.

When the Kch2pipi0-1 production round is done, we will switch to the next round, which is Kch2munu-1 (decay type 30).

ALERT! This wiki section will be updated along the way, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately.

 

Shifts sign up

As of October 15, 2012 we have the following names: Antonio Cassese, Mark Slater, Philip Rubin, Paolo Massarotti, Monica Pepe, Spasimir Balev, Vito Palladino, Mario Vormstein , Karim Massri.

Revision 9 - 2012-10-16 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 38 to 38
  The person in charge must do the following:
Changed:
<
<
  1. keep an eye on the jobs list and make sure we 200 jobs RUNNING at all times and not more than 50 SCHEDULED.* ALERT! Please note that these numbers will change when new resources are added. Check this number at the beginning of each shift.
>
>
  1. keep an eye on the jobs list and make sure we have 200 jobs RUNNING at all times and not more than 50 SCHEDULED. ALERT! Please note that these numbers will change when new resources are added. Check this number at the beginning of each shift.
 
  1. if queue is low (less than 100 RUNNING), use online script creator to submit a new batch of jobs
  2. check production page to see if we've reached 100%. If yes, then start next production round on the list.
  3. re-validate results weekly (every Monday) with the physics group (email Tonino to check the outputs on CAstor)

Revision 8 - 2012-10-16 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"
Changed:
<
<

NA62 Monte Carlo Grid Production Rota

>
>

NA62 Monte Carlo Grid Production Shifts

  This wiki contains information about and for the volunteers for the NA62 MC Grid production rota. Check this page regularly for updates.

Changed:
<
<

List of volunteers

>
>

Shifts sign up

  As of October 15, 2012 we have the following names: Antonio Cassese, Mark Slater, Philip Rubin, Paolo Massarotti, Monica Pepe, Spasimir Balev, Vito Palladino, Mario Vormstein , Karim Massri.
Deleted:
<
<
A schedule will be posted in here as soon as people sign up for shifts.
 
Added:
>
>
We have a Doodle poll entitled "NA62 Grid Production Shifts", where volunteers can sign up for shifts. If you've volunteered here for shifts, please check your email for the poll address.
 

Production plan

Line: 23 to 23
 

Grid Production Shifts

Changed:
<
<
GP "shifts" will not require too much work, and it is enough if the person on shift will check the production status twice a day or so. It is expected that GP 'shifts' will be at least a day long.
>
>
If all goes well, GP "shifts" will not require too much work. Shifts are defined as 9am to 9pm CERN time but it is enough if the person on shift checks the production status and submits jobs twice a day or so. If problems arise, this might require another hour or so of your time, since you will not be expected to actually fix the problem but only to notify the admins.
 

How it all works

Revision 7 - 2012-10-16 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Rota

Line: 38 to 38
  The person in charge must do the following:
Changed:
<
<
  1. keep an eye on the jobs list and make sure we 200 jobs RUNNING at all times and not more than 100 SCHEDULED.* ALERT! Please note that these numbers will change when new resources are added. Check this number at the beginning of each shift.
>
>
  1. keep an eye on the jobs list and make sure we 200 jobs RUNNING at all times and not more than 50 SCHEDULED.* ALERT! Please note that these numbers will change when new resources are added. Check this number at the beginning of each shift.
 
  1. if queue is low (less than 100 RUNNING), use online script creator to submit a new batch of jobs
Changed:
<
<
  1. check production page to see if we've reached 100%
>
>
  1. check production page to see if we've reached 100%. If yes, then start next production round on the list.
 
  1. re-validate results weekly (every Monday) with the physics group (email Tonino to check the outputs on CAstor)
  2. notify production coordinator if any major errors occur

Revision 6 - 2012-10-15 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Rota

Line: 31 to 31
 
  • intervene if possible to fix the error(s)
  • notify the production coordinator if not possible to directly intervene to rectify the error
Changed:
<
<
  • notify the site admin if jobs systematically fail at a given site
>
>
  • notify the site admin if jobs systematically fail at a given site. Check the jobs history to detect any problems at sites (e.g. jobs consistently finishing early, or going to status CLEARED without registering any output). ALERT! If jobs fail at a site, exclude it from job submissions and notify the site admin!
 
  • notify Janusz if outputs fail to be replicated on CERN Castor (labelled LS instead of CC in the files table)

Checklist

The person in charge must do the following:

Changed:
<
<
  1. keep an eye on the jobs list and make sure we have 200+ jobs queued at all times
  2. if queue is low, use online script creator to submit a new batch of jobs
>
>
  1. keep an eye on the jobs list and make sure we 200 jobs RUNNING at all times and not more than 100 SCHEDULED.* ALERT! Please note that these numbers will change when new resources are added. Check this number at the beginning of each shift.
  2. if queue is low (less than 100 RUNNING), use online script creator to submit a new batch of jobs
 
  1. check production page to see if we've reached 100%
Changed:
<
<
  1. re-validate results weekly (every Monday) with the physics group (email Tonino)
>
>
  1. re-validate results weekly (every Monday) with the physics group (email Tonino to check the outputs on CAstor)
 
  1. notify production coordinator if any major errors occur

Troubleshooting

Revision 5 - 2012-10-15 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Rota

Line: 8 to 8
 

List of volunteers

Changed:
<
<
As of October 12, 2012 we have the following names: Antonio Cassese, Mark Slater, Philip Rubin.
>
>
As of October 15, 2012 we have the following names: Antonio Cassese, Mark Slater, Philip Rubin, Paolo Massarotti, Monica Pepe, Spasimir Balev, Vito Palladino, Mario Vormstein , Karim Massri. A schedule will be posted in here as soon as people sign up for shifts.
 

Production plan

Revision 4 - 2012-10-12 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Rota

Changed:
<
<
This wiki will contain information about and for the volunteers for the NA62 MC Grid production rota.
>
>
This wiki contains information about and for the volunteers for the NA62 MC Grid production rota. Check this page regularly for updates.
 

List of volunteers

Changed:
<
<
To be filled.
>
>
As of October 12, 2012 we have the following names: Antonio Cassese, Mark Slater, Philip Rubin.
 

Production plan

The production plan is located here:

Changed:
<
<
http://ppewww.physics.gla.ac.uk/~protopop/na62/mc/tools/index.php?task=production
>
>
http://na62.gla.ac.uk/index.php?task=production
  This displays a database-generated table that fills up along the way. New items will be added by the production coordinator as discussed at the Siena meeting in August 2012.

Grid Production Shifts

Changed:
<
<
GP "shifts" will not require too much work, and it is enough if the person on shift will check the production status every few hours or so. It is expected that GP 'shifts' will be at least a day long.
>
>
GP "shifts" will not require too much work, and it is enough if the person on shift will check the production status twice a day or so. It is expected that GP 'shifts' will be at least a day long.
 

How it all works

Line: 29 to 29
 
  • intervene if possible to fix the error(s)
  • notify the production coordinator if not possible to directly intervene to rectify the error
Added:
>
>
  • notify the site admin if jobs systematically fail at a given site
  • notify Janusz if outputs fail to be replicated on CERN Castor (labelled LS instead of CC in the files table)
 

Checklist

The person in charge must do the following:

  1. keep an eye on the jobs list and make sure we have 200+ jobs queued at all times
Changed:
<
<
  1. if queue is low, use online script creator to submit a new batch of jobs
>
>
  1. if queue is low, use online script creator to submit a new batch of jobs
 
  1. check production page to see if we've reached 100%
Changed:
<
<
  1. re-validate results weekly (every Tuesday) with the physics group
>
>
  1. re-validate results weekly (every Monday) with the physics group (email Tonino)
 
  1. notify production coordinator if any major errors occur

Troubleshooting

In case you find an error produced by the online interface, please immediately notify Dan, Janusz and Tonino.

Deleted:
<
<
-- DanProtopopescu - 2012-09-20
 
META FILEATTACHMENT attachment="NA62_requirements_table.pdf" attr="" comment="Production plan (August 2012)" date="1348235473" name="NA62_requirements_table.pdf" path="NA62_requirements_table.pdf" size="38998" stream="NA62_requirements_table.pdf" tmpFilename="/usr/tmp/CGItemp26671" user="DanProtopopescu" version="1"

Revision 3 - 2012-09-21 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Rota

Line: 45 to 45
 In case you find an error produced by the online interface, please immediately notify Dan, Janusz and Tonino.

-- DanProtopopescu - 2012-09-20

Added:
>
>
META FILEATTACHMENT attachment="NA62_requirements_table.pdf" attr="" comment="Production plan (August 2012)" date="1348235473" name="NA62_requirements_table.pdf" path="NA62_requirements_table.pdf" size="38998" stream="NA62_requirements_table.pdf" tmpFilename="/usr/tmp/CGItemp26671" user="DanProtopopescu" version="1"

Revision 2 - 2012-09-20 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Rota

Line: 28 to 28
 MC jobs will be submitted automatically by cron jobs. The person overseeing the production (the "shift" taker) will have to check periodically if everything goes according to plan, and

* intervene if possible to fix the error(s)

Deleted:
<
<
  * notify the production coordinator if not possible to directly intervene to rectify the error

Checklist

The person in charge must do the following:

Changed:
<
<
1. keep an eye on the jobs list and make sure we have 200+ jobs queued at all times
2. if queue is low, use online script creator to submit a new batch of jobs
3. check production page to see if we've reached 100%
4. re-validate results weekly (every Tuesday) with the physics group
5. notify production coordinator if any major errors occur
>
>
  1. keep an eye on the jobs list and make sure we have 200+ jobs queued at all times
  2. if queue is low, use online script creator to submit a new batch of jobs
  3. check production page to see if we've reached 100%
  4. re-validate results weekly (every Tuesday) with the physics group
  5. notify production coordinator if any major errors occur
 

Troubleshooting

Revision 1 - 2012-09-20 - DanProtopopescu

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Rota

This wiki will contain information about and for the volunteers for the NA62 MC Grid production rota.

List of volunteers

To be filled.

Production plan

The production plan is located here:

http://ppewww.physics.gla.ac.uk/~protopop/na62/mc/tools/index.php?task=production

This displays a database-generated table that fills up along the way. New items will be added by the production coordinator as discussed at the Siena meeting in August 2012.

Grid Production Shifts

GP "shifts" will not require too much work, and it is enough if the person on shift will check the production status every few hours or so. It is expected that GP 'shifts' will be at least a day long.

How it all works

MC jobs will be submitted automatically by cron jobs. The person overseeing the production (the "shift" taker) will have to check periodically if everything goes according to plan, and

* intervene if possible to fix the error(s)

* notify the production coordinator if not possible to directly intervene to rectify the error

Checklist

The person in charge must do the following:

1. keep an eye on the jobs list and make sure we have 200+ jobs queued at all times
2. if queue is low, use online script creator to submit a new batch of jobs
3. check production page to see if we've reached 100%
4. re-validate results weekly (every Tuesday) with the physics group
5. notify production coordinator if any major errors occur

Troubleshooting

In case you find an error produced by the online interface, please immediately notify Dan, Janusz and Tonino.

-- DanProtopopescu - 2012-09-20

 