
NA62 Monte Carlo Grid Production Rota

This page contains information for and about the volunteers on the NA62 MC Grid production rota. Check this page regularly for updates.

List of volunteers

As of October 15, 2012 we have the following names: Antonio Cassese, Mark Slater, Philip Rubin, Paolo Massarotti, Monica Pepe, Spasimir Balev, Vito Palladino, Mario Vormstein, Karim Massri. A schedule will be posted here as soon as people sign up for shifts.

Production plan

The production plan is located here:

http://na62.gla.ac.uk/index.php?task=production

This displays a database-generated table that is filled in as production progresses. New items will be added by the production coordinator, as discussed at the Siena meeting in August 2012.

Grid Production Shifts

GP "shifts" will not require much work: it is enough for the person on shift to check the production status about twice a day. GP "shifts" are expected to be at least a day long.

How it all works

MC jobs will be submitted automatically by cron jobs. The person overseeing the production (the "shift" taker) will have to check periodically that everything is going according to plan, and:

  • intervene to fix the error(s) where possible
  • notify the production coordinator if you cannot intervene directly to fix the error
  • notify the site admin if jobs systematically fail at a given site. Check the jobs history to detect any problems at sites (e.g. jobs consistently finishing early, or going to status CLEARED without registering any output). ALERT! If jobs fail at a site, exclude it from job submissions and notify the site admin!
  • notify Janusz if outputs fail to be replicated on CERN Castor (labelled LS instead of CC in the files table)
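
The site and replication checks in the list above can be sketched in Python. This is only an illustration under stated assumptions: the record fields (`site`, `status`, `has_output`, `name`, `label`), the thresholds `min_jobs` and `max_bad_fraction`, and both function names are hypothetical and are not part of the production interface. Only the status name CLEARED and the LS/CC labels come from this page.

```python
from collections import defaultdict


def sites_to_exclude(jobs, min_jobs=5, max_bad_fraction=0.5):
    """Return sites where most jobs went to CLEARED without registering output.

    `jobs` is a list of dicts with the hypothetical keys
    'site', 'status' and 'has_output'. The thresholds are assumptions.
    """
    total = defaultdict(int)
    bad = defaultdict(int)
    for job in jobs:
        total[job["site"]] += 1
        if job["status"] == "CLEARED" and not job["has_output"]:
            bad[job["site"]] += 1
    # Only judge sites with enough jobs to call the failures systematic.
    return sorted(
        site
        for site, n in total.items()
        if n >= min_jobs and bad[site] / n > max_bad_fraction
    )


def unreplicated_files(files):
    """Return names of files labelled LS rather than CC in the files table.

    `files` is a list of dicts with the hypothetical keys 'name' and 'label'.
    """
    return [f["name"] for f in files if f["label"] == "LS"]
```

Sites returned by `sites_to_exclude` would be candidates for exclusion and a note to the site admin; a non-empty result from `unreplicated_files` would be reported to Janusz.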

Checklist

The person in charge must do the following:

  1. keep an eye on the jobs list and make sure we have 200 jobs RUNNING at all times and no more than 50 SCHEDULED. ALERT! Please note that these numbers will change when new resources are added. Check them at the beginning of each shift.
  2. if the queue is low (fewer than 100 RUNNING), use the online script creator to submit a new batch of jobs
  3. check the production page to see if we've reached 100%. If yes, then start the next production round on the list.
  4. re-validate results weekly (every Monday) with the physics group (email Tonino to check the outputs on Castor)
  5. notify production coordinator if any major errors occur
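
The queue thresholds in items 1 and 2 can be written down as a small check. This is a minimal sketch, assuming the RUNNING and SCHEDULED counts have already been read off the jobs list by hand or by some other means. The function name and the message strings are hypothetical, and the numbers are the ones quoted above, which will change when new resources are added.

```python
# Thresholds quoted in the checklist; re-check them at the start of each shift.
TARGET_RUNNING = 200
MAX_SCHEDULED = 50
LOW_RUNNING = 100


def queue_actions(running, scheduled):
    """Return a list of notes for the shift taker given the current job counts."""
    actions = []
    if running < LOW_RUNNING:
        # Checklist item 2: the queue is low, so a new batch is needed.
        actions.append("submit new batch")
    elif running < TARGET_RUNNING:
        actions.append("below RUNNING target")
    if scheduled > MAX_SCHEDULED:
        actions.append("SCHEDULED above limit")
    return actions
```

For example, `queue_actions(80, 10)` returns `["submit new batch"]`, while `queue_actions(200, 50)` returns an empty list.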

Troubleshooting

In case you find an error produced by the online interface, please immediately notify Dan, Janusz and Tonino.

Topic attachments
Attachment | History | Size | Date | Who | Comment
NA62_requirements_table.pdf | r1 | 38.1 K | 2012-09-21 13:51 | DanProtopopescu | Production plan (August 2012)
Topic revision: r7 - 2012-10-16 - DanProtopopescu
 