
NA62 Monte Carlo Production Howto

This wiki explains how to submit NA62 Monte Carlo jobs on the Grid using the custom-written tools and their online interface. It is written for NA62 members who have volunteered to participate in the production rota.

Monitoring

The web interface for NA62 MC Grid jobs scripting, monitoring and accounting is located at:

http://na62.gla.ac.uk/index.php?task=production

You can use this interface to monitor running and completed jobs, output files and production status. You can also use the iPhone app to monitor jobs, files and production status.

The iPhone app is available from the "Get the NA62 iPhone App" link.

The aim is to keep the production rate at the maximum that the available resources allow. To achieve this, the person on shift must submit new jobs whenever the number of waiting and running jobs is low. "Low" is defined as follows: there should be 200 jobs RUNNING at all times and no more than 50 SCHEDULED. ALERT! These numbers will change when new resources are added, so check them at the beginning of each shift.
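
The rule above can be written down as a small check. This is only a sketch of one possible reading of the rule: it assumes that newly submitted jobs count as SCHEDULED until they start, so the number of jobs that can be added is the SCHEDULED cap minus the jobs already waiting. The function name `check_queue` is hypothetical, and the counts must be read by hand from the monitoring page; nothing here queries the web interface.

```python
# Thresholds quoted in this wiki; they change when new resources are added.
TARGET_RUNNING = 200   # jobs that should be RUNNING at all times
MAX_SCHEDULED = 50     # upper limit on SCHEDULED jobs


def check_queue(running, scheduled,
                target_running=TARGET_RUNNING,
                max_scheduled=MAX_SCHEDULED):
    """Return (below_target, room) for the counts seen on the monitoring page.

    below_target: True if fewer jobs are RUNNING than the target.
    room: how many jobs fit under the SCHEDULED cap (assumed reading:
          new jobs count as SCHEDULED until they start running).
    """
    if running < 0 or scheduled < 0:
        raise ValueError("job counts cannot be negative")
    below_target = running < target_running
    room = max(max_scheduled - scheduled, 0)
    return below_target, room


if __name__ == "__main__":
    # Example values only, not real monitoring data.
    below_target, room = check_queue(running=180, scheduled=10)
    print("running below target:", below_target)
    print("room under SCHEDULED cap:", room)
```

The thresholds are passed as parameters so that they can be updated at the start of a shift without editing the function.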

Job submissions in production mode are done via the Scripter interface, as explained below.

Scripter

The Scripter is a user-friendly UI that produces all the scripts needed for NA62 MC job submission (JDL, wrapper and .mac file), together with the submission commands, in both single and multiple job submission modes. The Scripter is located here:

http://na62.gla.ac.uk/scripter.php

This is an HTML form with many input fields, most of them self-explanatory. The pre-filled values are inherited from the previous submission (which could have been a test job, for example), so you must check that they fit the production round you are managing.

Here is what the Scripter interface looks like:

ScripterUI.001.png

Description of the form fields:

  1. Description - this is a short description of the job. It must contain the production tag (get it from here if unsure) and the keyword "production". It should not contain quotes or any other non-text characters. To submit a few test jobs, leave out the tag and replace "production" with "test".
  2. Run interval - the start run is pre-filled with the next available run number (from the DB). Choose the upper limit so that you submit no more than 100 jobs at a time. Do not use job cloning for production at the moment.
  3. Number of events - this is the number of events per job (run). We aim to keep the job runtime below 20 hours, so for channel 1 that means 1500 events per job. For other channels it could be more or less than that. Check previous production jobs to find the optimal number of events. Leave the random seed as it is, because it will be set automatically for each run.
  4. MC software version - you must use the latest software version (check here if unsure), and make sure the wrapper script supports this version. Take a look at the scripts of previous jobs to make sure (click here, select the last production job, then click the corresponding green box link on the "Exe" column and check the line vers="v?"). There is a grid "version" for each installed software revision (e.g. v6/r188), see this wiki.
  5. Decay type - this is the reaction channel to simulate. Choose from the drop-down menu and make sure it corresponds to the current production tag and description (check here if unsure). Leave default values for the remaining options, unless instructed otherwise.
  6. Destination - tick here only the sites that have the chosen MC software version installed. Check this table to make sure. Check the jobs history to detect any problems at sites (e.g. jobs consistently finishing early, or going to status CLEARED without registering any output). ALERT! If jobs fail at a site, uncheck it here and notify the site admin!
  7. Executable - automatically selected now - this is the name of the wrapper script that is executed on the worker node. It checks if the software is installed, runs the actual MC simulation, registers the output and triggers the FTS transfer(s). In single jobs mode, you can display commented lines from scripts in case you would like to check extra settings, comments etc. For production, leave this unchecked.
  8. User and password - for multiple job submissions, you need to tick the "Write scripts to disk" checkbox and enter your uid and password for this interface. ALERT! You must have registered and your credentials must have been validated for this to work. In single job mode, uid and password are not needed, since you will have to submit the (test) job with your credentials from your UI.
  9. Click Prepare, and you are taken to a new page. ALERT! If the page says "There are scheduled submissions in there. Please try again in 10 minutes", it means that you have (or someone else has) just scheduled another batch of jobs. You have to wait until these are actually submitted, otherwise the scripts may be overwritten, with unpredictable results.
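
Two of the fields above follow rules that are easy to check before filling in the form. The sketch below is an illustration only and is not part of the Scripter. The helper names are hypothetical, and "text characters" is assumed here to mean ASCII letters, digits, spaces and the characters `. _ / -`. The timing figure is also derived rather than measured: if 1500 channel 1 events fill the 20 hour limit, one event takes about 48 seconds. For other channels the per-event time has to come from previous production jobs.

```python
import re

# Assumption: allowed "text characters" are ASCII letters, digits,
# spaces and . _ / - (the wiki does not give an exact list).
_ALLOWED = re.compile(r"[A-Za-z0-9 ._/-]+")

MAX_RUNTIME_HOURS = 20  # job runtime limit quoted in this wiki


def check_description(description, tag, test=False):
    """Return a list of problems found in a job description (empty if none)."""
    problems = []
    if not _ALLOWED.fullmatch(description):
        problems.append("empty, or contains quotes or other non-text characters")
    lowered = description.lower()
    if test:
        # Test jobs: no tag, and "test" replaces "production".
        if "test" not in lowered:
            problems.append('missing keyword "test"')
        if "production" in lowered:
            problems.append('test jobs should not use the keyword "production"')
    else:
        if "production" not in lowered:
            problems.append('missing keyword "production"')
        if tag not in description:
            problems.append("missing production tag " + tag)
    return problems


def events_per_job(seconds_per_event, max_hours=MAX_RUNTIME_HOURS):
    """Largest number of events that fits in max_hours at the given rate."""
    if seconds_per_event <= 0:
        raise ValueError("seconds_per_event must be positive")
    return int(max_hours * 3600 // seconds_per_event)


if __name__ == "__main__":
    # "tagX" is a placeholder, not a real production tag.
    print(check_description("tagX production channel 1", tag="tagX"))
    print(events_per_job(48))
```

With the derived 48 seconds per event, `events_per_job` reproduces the 1500 events quoted for channel 1; a channel that is twice as slow would get half as many events per job.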

Multiple Submissions

Below is the screen that will be displayed after clicking the Prepare button for multiple submissions. Carefully double-check the settings here as well:

ScripterUI2.002.png

This example shows only two jobs. You can submit up to 100 at a time, but it is best to submit batches of 50 (these numbers may change, check this wiki before your shift). You can open the linked files to check if all settings are correct. Do not use manual submission. Click Schedule to send these jobs to the bot. Relax. A cronjob will pick up these commands and execute them within the next 10 minutes. You will be able to see the result of your multiple submission by checking the jobs table.
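
When more jobs are needed than fit in one batch, the run intervals can be worked out in advance. The sketch below assumes one job per run and an inclusive run interval (start and end run both submitted); the function name `run_batches` and the run numbers in the example are hypothetical.

```python
def run_batches(first_run, n_jobs, batch_size=50):
    """Split n_jobs consecutive runs into inclusive (start, end) intervals.

    Assumes one job per run. batch_size defaults to the recommended 50
    and may not exceed the limit of 100 jobs per submission.
    """
    if n_jobs < 0:
        raise ValueError("n_jobs cannot be negative")
    if not 1 <= batch_size <= 100:
        raise ValueError("batch_size must be between 1 and 100")
    batches = []
    start = first_run
    last = first_run + n_jobs - 1
    while start <= last:
        end = min(start + batch_size - 1, last)
        batches.append((start, end))
        start = end + 1
    return batches


if __name__ == "__main__":
    # Example run numbers only; the real start run comes from the DB.
    for start, end in run_batches(first_run=1001, n_jobs=120):
        print(start, end)
```

Each interval corresponds to one pass through the Scripter form. Since the start run is pre-filled from the DB, only the upper limit needs to be typed in, and the next batch should not be prepared until the previous one has actually been submitted by the cronjob.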

ALERT! If the action is not executed, you must notify the Glasgow Scotgrid team <uki-scotgrid-glasgow@physics.gla.ac.uk> as soon as possible, because something is wrong with the WMS endpoint.

Manual job submission

Jobs can be submitted manually, one by one, with your own credentials (i.e. your grid certificate) from the command line on your Grid UI. Run the Scripter in single run mode (no password is required), paste the commands it provides into your UI terminal, and press enter. If you have submitted a job this way (e.g. for testing the system), then use the form at the bottom of the page

DBForm.png

to insert the job specs and status URL in the run database. ALERT! You must have registered and your credentials must have been validated for this to work.

Remember that to submit jobs with your credentials, you must:

  1. have a valid Grid certificate (e.g. a CERN certificate)
  2. register for NA62 VO membership via https://voms.gridpp.ac.uk:8443/voms/na62.vo.gridpp.ac.uk/user/home.action (you must have the certificate uploaded into the browser for this).
  3. have access to a Grid UI (a computer with the necessary software and settings)

Make sure you have all the above. Familiarize yourself with Grid commands before trying this feature.

Troubleshooting

In case you find an error produced by the online interface, please immediately notify Dan, Janusz and Tonino.

Topic revision: r12 - 2012-10-22 - DanProtopopescu
 