ATLAS Production System Twiki

Thursday, November 21, 2013

Notes on Template Based Job Parametrization

1. Datasets

Problem: job templates created by the converter are not really templates, since they contain information that varies from job to job, such as dataset names.

Solution: use a more appropriate information source, namely the DEFT dataset DB table, which already has provisions for the names and other attributes of datasets. Use the same "placeholder" (variable) approach and syntax as for other parameters.

  "jobParameters": [
        {
            "dataset": "${DEFT_DATASET_IN}",
            "param_type": "input",
            "format": "AOD",
            "type": "template",
            "value": "inputAODFile=${IN}"
        },
        {
            "type": "constant",
            "value": "maxEvents=1000 RunNumber=213816 autoConfiguration=everything preExec=\"from BTagging.BTaggingFlags import BTaggingFlags;BTaggingFlags.CalibrationTag=\"BTagCalibALL-07-02\"\""
        },
        {
            "attribute": "repeat,nosplit",
            "dataset": "${DEFT_DATASET_IN}",
            "param_type": "input",
            "flavor": "dbrelease",
            "type": "template",
            "value": "DBRelease=${DBR}"
        },
        {
            "type": "constant",
            "value": "AMITag=p1462"
        },
        {
            "dataset": "${DEFT_OUTPUT}",
            "param_type": "output",
            "flavor": "pool",
            "format": "root",
            "token": "ATLASDATADISK",
            "type": "template",
            "value": "${SN}"
        }
  ]

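The placeholder approach described above could be implemented along the following lines. This is only a minimal sketch, not the actual JEDI or DEFT code: the function name resolve_placeholders, the record dictionary and the example dataset name are all made up for illustration, and Python's standard string.Template is used simply because it happens to share the ${NAME} syntax.

```python
from string import Template


def resolve_placeholders(job_parameters, dataset_record):
    """Return a copy of job_parameters with ${NAME} placeholders filled in.

    dataset_record is a hypothetical mapping of placeholder names to values,
    standing in for a row read from the DEFT dataset table.
    """
    resolved = []
    for param in job_parameters:
        filled = {}
        for key, value in param.items():
            # safe_substitute leaves unknown placeholders (e.g. ${IN})
            # untouched, so they can be filled in at a later stage.
            filled[key] = Template(value).safe_substitute(dataset_record)
        resolved.append(filled)
    return resolved


# Two entries taken from the template above.
template = [
    {"dataset": "${DEFT_DATASET_IN}", "param_type": "input", "format": "AOD",
     "type": "template", "value": "inputAODFile=${IN}"},
    {"type": "constant", "value": "AMITag=p1462"},
]

# Hypothetical dataset name, for illustration only.
record = {"DEFT_DATASET_IN": "example.input.dataset.AOD"}

resolved = resolve_placeholders(template, record)
print(resolved[0]["dataset"])  # placeholder replaced by the dataset name
print(resolved[0]["value"])    # ${IN} is not in the record, so it stays as is
```

Placeholders that are not present in the record are deliberately left in place, on the assumption that per-job values such as ${IN} are filled in later than per-task values such as the dataset name.
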
2. TRF

The following nomenclature is used:
  • "TRANSUSES" - defines the base release of ATLAS software to be used by the transform
  • "TRANSHOME" - the cache release, which effectively overlays the base release
  • "TRANSPATH" - simply the path (pretty much the filename) of the transformation script
Action item (Nov. 2013): the JEDI-alpha "template" has these values hardcoded (similar to the dataset case), so this needs to be changed. They are in fact proper attributes in the DEFT_TASK table, and JEDI can easily obtain them from there, as opposed to consuming a prefabricated string.
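
To illustrate reading the transform attributes from a table rather than from a prefabricated string, here is a rough sketch using an in-memory SQLite database as a stand-in for the real DB. The column names (TASK_ID, TRANSUSES, TRANSHOME, TRANSPATH), the function get_transform_attributes and the sample values are assumptions made for this example and do not reflect the actual DEFT_TASK schema.

```python
import sqlite3


def get_transform_attributes(conn, task_id):
    """Fetch the three transform attributes for one task as a dict."""
    row = conn.execute(
        "SELECT TRANSUSES, TRANSHOME, TRANSPATH "
        "FROM DEFT_TASK WHERE TASK_ID = ?",
        (task_id,),
    ).fetchone()
    if row is None:
        raise KeyError(f"no task with id {task_id}")
    return {"TRANSUSES": row[0], "TRANSHOME": row[1], "TRANSPATH": row[2]}


# Stand-in table with assumed column names and placeholder values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DEFT_TASK ("
    "TASK_ID INTEGER PRIMARY KEY, TRANSUSES TEXT, TRANSHOME TEXT, TRANSPATH TEXT)"
)
conn.execute(
    "INSERT INTO DEFT_TASK VALUES (?, ?, ?, ?)",
    (1, "base-release", "cache-release", "transform_script.py"),
)

trf = get_transform_attributes(conn, 1)
print(trf)
```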

3. Architecture, corecount and other attributes

I observed that a few other parameters are parsed from JSON (in the mid-November version of JEDI) and inserted as proper columns into the JEDI_TASKS table. It makes sense to augment the DEFT schemas accordingly, for consistency, to save a little JSON, and to enable searches (e.g. on architecture).

Other examples: VO, Working Group, cloud.

4. Summary of attributes to be read by JEDI from the DEFT tables

For backward compatibility, I propose the following:
  • For each task, JEDI attempts to locate the usual attributes (corecount, architecture, etc.) in the DEFT table
  • If such an attribute is not found, JEDI takes its value from the parsed JSON data
This way the "alpha/converter" functionality will still work, while a proper DEFT schema becomes possible.
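
The proposed lookup order can be sketched as follows. This is an illustration only: get_task_attribute is a hypothetical helper, both sources are modelled as plain dictionaries, and a NULL column (None) is treated the same as a missing attribute, which is an assumption on my part.

```python
def get_task_attribute(name, deft_row, parsed_json):
    """Look up a task attribute in the DEFT row first, then in the JSON.

    deft_row    -- dict standing in for the task's row in the DEFT table
    parsed_json -- dict standing in for the parsed JSON task description
    Returns None if the attribute is found in neither source.
    """
    value = deft_row.get(name)
    if value is not None:
        return value
    # Fall back to the JSON data, which keeps the "alpha/converter" path working.
    return parsed_json.get(name)


# Illustrative values only.
deft_row = {"corecount": 8, "architecture": None}
parsed_json = {"corecount": 1, "architecture": "example-arch"}

corecount = get_task_attribute("corecount", deft_row, parsed_json)
architecture = get_task_attribute("architecture", deft_row, parsed_json)
cloud = get_task_attribute("cloud", deft_row, parsed_json)
print(corecount, architecture, cloud)
```

Here corecount comes from the DEFT row, architecture falls back to the JSON because the column is empty, and cloud is absent from both.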

In summary, the following parameters have been refactored from JSON into RDBMS:
  • dataset, along with its format and "flavor"
  • TRANS*
  • Architecture
  • Corecount
  • VO
  • Working Group
  • Cloud
Run number also needs to be added for consistency.

5. More on Datasets


Task ID will be read by JEDI.

Sunday, November 3, 2013

DEFT Development in November 2013

Development after 11/8:
  • Corrected the dataset object schema (will need further tweaking)
  • Added the dataset view
  • Improved the "developer's editor", added controls to delete datasets from the DB 
  • Corrected deft-cli to catch up with the schema changes
  • Worked out an improved version of the task template (parametrization of the dataset and similar info)

Summary 11/8/2013:
  • The first integration test worked, i.e. JEDI picked up the test task from DEFT and put it into its own queue
  • Added a stub for XML input to the template library page in the UI. The idea is to reuse the DEFT-CLI functionality to inject templates from XML source at will.
  • Added PRODSYS_COMM interface to the UI
  • Cleaned up a few pages (removed redundant columns etc).
  • Testing form-based editor (works for all important fields)
  • Ensured non-editable fields
  • Corrected the PRODSYS_COMM schema, which was simple but obsolete
    • new input from Tadashi
    • need to add the "recipient" column for cleaner logic
  • Started using "real" job templates in task templates
  • Added PRODSYS_COMM interface to CLI
  • TWiki updated
  • Small bug fixes 
  • Added the new DEFT_JOB_TEMPLATE table to handle job templates. TBD with Sasha, Dmitry and others.