1. Datasets
Problem: the job templates created by the converter are not really templates, since they contain information that varies from job to job, such as dataset names.
Solution: use a more appropriate information source, namely the DEFT dataset DB table, which already has provisions for the name and other attributes of a dataset. Use the same "placeholder"/variable approach, with the same syntax, as for other parameters.
"jobParameters": [
{
"dataset": "${DEFT_DATASET_IN}",
"param_type": "input",
"format": "AOD",
"type": "template",
"value": "inputAODFile=${IN}"
},
{
"type": "constant",
"value": "maxEvents=1000 RunNumber=213816 autoConfiguration=everything preExec=\"from BTagging.BTaggingFlags import BTaggingFlags;BTaggingFlags.CalibrationTag=\"BTagCalibALL-07-02\"\""
},
{
"attribute": "repeat,nosplit",
"dataset": "${DEFT_DATASET_IN}",
"param_type": "input",
"flavor": "dbrelease",
"type": "template",
"value": "DBRelease=${DBR}"
},
{
"type": "constant",
"value": "AMITag=p1462"
},
{
"dataset": "${DEFT_OUTPUT}",
"param_type": "output",
"flavor": "pool",
"format": "root",
"token": "ATLASDATADISK",
"type": "template",
"value": "${SN}"
}
]
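To illustrate the placeholder idea, here is a minimal sketch of how the dataset placeholders above could be filled in from a DEFT dataset record. It is not the actual DEFT or JEDI code: the function name `resolve_placeholders`, the dictionary `deft_row` and the dataset names in it are all hypothetical, and Python's `string.Template` is used only because it happens to share the `${NAME}` syntax. Placeholders that are not dataset-related (`${IN}`, `${DBR}`, `${SN}`) are deliberately left untouched.

```python
import json
from string import Template


def resolve_placeholders(job_parameters, dataset_row):
    """Return a copy of job_parameters with known ${...} placeholders filled in.

    dataset_row is a hypothetical stand-in for a record read from the
    DEFT dataset table, keyed by placeholder name.
    """
    resolved = []
    for param in job_parameters:
        new_param = {}
        for key, value in param.items():
            if isinstance(value, str):
                # safe_substitute leaves unknown placeholders such as ${IN} as-is
                value = Template(value).safe_substitute(dataset_row)
            new_param[key] = value
        resolved.append(new_param)
    return resolved


# Two entries taken from the template above (shortened).
template = [
    {"dataset": "${DEFT_DATASET_IN}", "param_type": "input", "format": "AOD",
     "type": "template", "value": "inputAODFile=${IN}"},
    {"dataset": "${DEFT_OUTPUT}", "param_type": "output",
     "type": "template", "value": "${SN}"},
]

# Hypothetical values; real names would come from the DEFT dataset table.
deft_row = {
    "DEFT_DATASET_IN": "example.input.dataset.AOD",
    "DEFT_OUTPUT": "example.output.dataset",
}

resolved = resolve_placeholders(template, deft_row)
print(json.dumps(resolved, indent=2))
```

The template itself is not modified, so the same template can be reused for every task and only the dataset record changes.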
2. TRF
The following nomenclature is used:
- "TRANSUSES" - defines the base release of ATLAS software to be used by the transform
- "TRANSHOME" - the cache release, which effectively overlays the base release
- "TRANSPATH" - simply the path (pretty much the filename) of the transformation script
3. Architecture, corecount and other attributes
I observed that a few other parameters are parsed from JSON (in the mid-November version of JEDI) and inserted as proper columns into the JEDI_TASKS table. It makes sense to augment the DEFT schemas accordingly: for consistency, to save a little JSON, and to enable searches (e.g. on architecture). Other examples: VO, Working Group, cloud.
4. Summary of attributes to be read by JEDI from the DEFT tables
For backward compatibility, I propose the following:
- For each task, JEDI first attempts to locate the usual attributes (corecount, architecture, etc.) in the DEFT table
- If an attribute is not found there, JEDI takes its value from the parsed JSON data
In summary, the following parameters have been refactored from JSON into RDBMS:
- dataset, along with its format and "flavor"
- TRANS*
- Architecture
- Corecount
- VO
- Working Group
- Cloud
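The backward-compatible lookup proposed in this section can be sketched as follows. This is only an illustration of the fallback rule, not JEDI code: the function `get_task_attribute`, the dictionaries standing in for a DEFT table row and for the parsed JSON, and the key names and values are all assumptions made for the example.

```python
def get_task_attribute(name, deft_row, task_json):
    """Look up a task attribute in the DEFT row first, then in the parsed JSON.

    deft_row  -- hypothetical dict for one task's row in the DEFT table;
                 a missing key or a None value means the column is not filled
    task_json -- dict obtained by parsing the task's JSON description
    """
    value = deft_row.get(name)
    if value is not None:
        return value
    # Fall back to the legacy location of the attribute.
    return task_json.get(name)


# Hypothetical example data: corecount is set in the table,
# architecture is only available in the JSON.
deft_row = {"corecount": 8, "architecture": None}
task_json = {"corecount": 1, "architecture": "example-arch"}

corecount = get_task_attribute("corecount", deft_row, task_json)
architecture = get_task_attribute("architecture", deft_row, task_json)
cloud = get_task_attribute("cloud", deft_row, task_json)
print(corecount, architecture, cloud)
```

With this rule the table value always wins when present, so older tasks that carry the attributes only in JSON keep working unchanged.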
5. More on Datasets
nameoffset
Task ID will be read by JEDI.