Difference: ProductionRota (20 vs. 21)

Revision 21, 2016-03-11 - DanProtopopescu

Line: 1 to 1
 
META TOPICPARENT name="WebHome"

NA62 Monte Carlo Grid Production Shifts

Line: 14 to 14
  When this is done, and if this wiki is not updated, continue with the next production on the list, moving upwards on the page.
Changed:
<
<
ALERT! This wiki section will be updated often, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately. Read the logbook/blog for more up-to-date information: http://na62shifts.wordpress.com
>
>
ALERT! This wiki section will be updated often, so please check carefully before your shift. If you find any discrepancies, please notify Dan and Janusz immediately. Read the logbook/blog for more up-to-date information: https://na62.gla.ac.uk/elog (or older elog at http://na62shifts.wordpress.com/).
 

Shifts sign up

Line: 27 to 27
  The production plan is located here:
Changed:
<
<
http://na62.gla.ac.uk/index.php?task=production
>
>
https://na62.gla.ac.uk/index.php?task=production
  This displays a database-generated table that fills up as the production progresses. The production coordinator may add new items at any time. Only Mark and Dan can use Ganga for job management at this time, but it will soon be available to everyone.
Line: 41 to 41
 
  • intervene if possible to fix the error(s)
  • notify the production coordinator if you cannot intervene directly to fix the error
Changed:
<
<
  • notify the site admin if jobs systematically fail at a given site. Check the jobs history to detect any problems at sites (e.g. jobs consistently finishing early, or going to status CLEARED without registering any output). ALERT! If jobs fail at a site, exclude it from job submissions and notify the site admin!
>
>
  • notify the site admin if jobs systematically fail at a given site. Check the jobs history to detect any problems at sites (e.g. jobs consistently finishing early, or going to status CLEARED without registering any output). ALERT! If jobs fail at a site, exclude it from job submissions and notify the site admin!
 
  • notify Janusz if outputs fail to be replicated on CERN Castor (labelled LS instead of CC in the files table)
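
The site check described in the list above can be sketched in a few lines of Python. This is only an illustration, not the production tooling: the job records (dicts with the keys `site`, `status` and `n_outputs`) and the two thresholds `min_jobs` and `max_bad_fraction` are assumptions made for the example. Only the CLEARED status and the "no registered output" symptom come from the instructions above.

```python
from collections import Counter


def flag_suspect_sites(jobs, min_jobs=5, max_bad_fraction=0.5):
    """Return a sorted list of sites where most CLEARED jobs have no output.

    jobs: iterable of dicts with hypothetical keys 'site', 'status', 'n_outputs'.
    min_jobs and max_bad_fraction are illustrative thresholds, not official ones.
    """
    total = Counter()  # CLEARED jobs per site
    bad = Counter()    # CLEARED jobs per site with no registered output
    for job in jobs:
        if job["status"] != "CLEARED":
            continue
        total[job["site"]] += 1
        if job["n_outputs"] == 0:
            bad[job["site"]] += 1
    return sorted(
        site
        for site, n in total.items()
        if n >= min_jobs and bad[site] / n > max_bad_fraction
    )
```

A site returned by this function would be a candidate for exclusion from job submissions and for a message to its admin. The decision itself stays with the shifter.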

Checklist

The person in charge must do the following:

Changed:
<
<
  1. check current production status from the wiki and from our logbook
>
>
  1. check current production status from the wiki and from our logbook
 
  1. keep an eye on the jobs list and make sure we have 200 jobs RUNNING at all times and no more than 50 SCHEDULED. ALERT! Please note that these numbers will change when new resources are added. Check them at the beginning of each shift, read the (b)logbook and, if necessary, contact the previous shifter for clarification.
  2. if the queue is low (fewer than 100 RUNNING), use the online script creator to submit a new batch of jobs
Changed:
<
<
  1. check production page to see if we've reached 100%. If yes, then start next production round on the list. Rounds that need to be done are the ones with 0 runs, and 0 files, and marked with "Not yet started".
>
>
  1. check production page to see if we've reached 100%. If yes, then start next production round on the list. Rounds that need to be done are the ones with 0 runs, and 0 files, and marked with "Not yet started".
 
  1. re-validate results weekly (every Monday) with the physics group (email Tonino to check the outputs on Castor)
  2. notify the production coordinator if any major errors occur
Changed:
<
<
  1. log everything in our logbook/blog: http://na62shifts.wordpress.com
>
>
  1. log everything in our logbook (https://na62.gla.ac.uk/elog) or blog (http://na62shifts.wordpress.com, no longer used)
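
The queue thresholds in the checklist (200 RUNNING, at most 50 SCHEDULED, "low" below 100 RUNNING) can be expressed as a small helper. This is a sketch under stated assumptions: the input is simply a list of job status strings, the function name and return labels are invented for illustration, and the rule that an over-full SCHEDULED queue takes precedence over a low RUNNING count is an assumption, not something the checklist states. The numbers themselves will change when new resources are added.

```python
from collections import Counter

# Thresholds quoted in the checklist; expected to change over time.
TARGET_RUNNING = 200
MAX_SCHEDULED = 50
LOW_RUNNING = 100


def queue_advice(statuses):
    """Map a list of job status strings to a one-word hint for the shifter."""
    counts = Counter(statuses)
    running = counts["RUNNING"]
    scheduled = counts["SCHEDULED"]
    # Assumption: do not add jobs while too many are already waiting.
    if scheduled > MAX_SCHEDULED:
        return "hold"
    if running < LOW_RUNNING:
        return "submit"
    if running < TARGET_RUNNING:
        return "top-up"
    return "ok"
```

For example, `queue_advice(["RUNNING"] * 80)` returns `"submit"`, which corresponds to using the online script creator to send a new batch of jobs.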
 

Troubleshooting

 