ATLAS Production System Twiki


Thursday, March 14, 2013

March 2013. Updated List of requirements (ProdSys SW development).

This is an updated list; the previous list can be found at the link below:

http://prodsys.blogspot.ch/2012/10/prodsys-splinter-meeting-october-2012.html


  1. AP transient datasets deletion
       Alexei, before May 1st
         Mar 30: a version for testing is ready; reported to MC Coordination
  2. 'clone' tasks use-cases
       Valeri, Dmitry (Wolfgang for validation and testing)
  3. fair share and priority policy
        Kaushik, Tadashi (Rod, Kaushik for testing)
  4. New G4 TRF integration (March-August)
       Sasha, Dmitry (Wolfgang, Jose for validation and testing)
   4.1 StoppedParticleG4_tf.py ☑ done March 27, 2013
   4.2 ISF Simulation: Sim_tf.py ☑ done April 11, 2013
   4.3 Simulation:
    4.3.1 AtlasG4_tf.py ☑ done April 11, 2013
    4.3.2 HITSMerge_tf.py ☑ done
    4.3.3 FilterHit_tf.py ☑ done
   4.4 Reconstruction:
    4.4.1 Reco_tf.py ☑ done
    4.4.2 Digi_tf.py ☑ done
    4.4.3 AODMerge_tf.py ☑ done
    4.4.4 ESDMerge_tf.py ☑ done
   4.5 Overlay:
     4.5.1 RAWOverlayFilter_trf.py ☑ done
     4.5.2 BSOverlayFilter_trf.py
  5. log files archiving (TBD)
      Simone for technical specs
  6. For prodsysII: if a task has more than one output dataset, the destination should be configurable per dataset. For example, AOD datasets should go to DATADISK (default), RDO/ESD datasets should be replicated to group space, and log files should go to DATADISK (default).
     MORE INFO IS NEEDED. PLEASE DO NOT POST RANDOM REQUIREMENTS WITHOUT DISCUSSING THEM FIRST.
  7. Implementation of the FTK emulation in the production system (March)
      High priority for the Trigger TDRs in April and September 2013.
      MC samples with FTK simulation are needed. (A driver sketch for this chain follows the list below.)
   7.1 Skim silicon data (1 RDO event -> 1 FTK input event) ☑ done
   7.2 FTK emulation for each FTK input event split into 64 tower regions ☑ done
      a) Emulate the FTK response for each tower region with TrigFTKSim_tf.py, with each region split into four subregions
      b) Merge every four subregions into one tower region with TrigFTKMerge_tf.py
   7.3 Merge 64 regions into 1 FTK event with TrigFTKMerge_tf.py  ☑ done
   7.4 Combine same RDO and FTK events for reconstruction  ☑ done
  8. Integrate new FTK transformations (April)
      Sasha, Dmitry (Wolfgang for validation and testing)
   8.1 TrigFTKSM4_tf.py ☑ done
   8.2 TrigFTKMergeReco_tf.py ☑ done
  9. Provide RW needed for dynamic task brokerage (TBD)
      Sasha
10. Support for event counting in MC and GP (TBD)
      Sasha
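
To make the chain in item 7 concrete, here is a minimal driver sketch in Python. The transform names are taken from the list above, but every command-line option, file name, and the run() helper are hypothetical illustrations; the actual transform interfaces are not spelled out in this post.

    import subprocess

    N_TOWERS = 64       # tower regions per FTK input event (step 7.2)
    N_SUBREGIONS = 4    # subregions emulated per tower (step 7.2a)

    def run(cmd):
        # Run one transform invocation; fail loudly on a non-zero exit code.
        print("running: " + " ".join(cmd))
        subprocess.check_call(cmd)

    tower_files = []
    for tower in range(N_TOWERS):
        sub_files = []
        for sub in range(N_SUBREGIONS):
            out = "tower%02d_sub%d.root" % (tower, sub)
            # 7.2a: emulate the FTK response for one subregion of one tower.
            run(["TrigFTKSim_tf.py",
                 "--region=%d" % tower, "--subregion=%d" % sub,
                 "--output=" + out])
            sub_files.append(out)
        merged = "tower%02d.root" % tower
        # 7.2b: merge the four subregions into a single tower region.
        run(["TrigFTKMerge_tf.py"]
            + ["--input=" + f for f in sub_files]
            + ["--output=" + merged])
        tower_files.append(merged)

    # 7.3: merge the 64 tower regions into one FTK event. Step 7.4 (combining
    # the FTK event with its original RDO event for reconstruction) is then
    # handled by the item-8 transforms.
    run(["TrigFTKMerge_tf.py"]
        + ["--input=" + f for f in tower_files]
        + ["--output=ftk_event.root"])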

Friday, March 8, 2013

DEFT/JEDI Communication Redux

General Notes on Communication

There has been progress in the design of the JEDI database schemas, documented in a separate section of the JEDI Twiki. Among a few other details, there is a "COMMAND" column in the task table. This is a reminder that the DB acts as the point of interaction and effectively as an asynchronous messaging medium between DEFT and JEDI. Both are allowed to post requests to each other. Human operators can also post requests to either of these systems, under certain conditions.

We reiterate what was previously stated with regard to DEFT/JEDI interaction:
  • both components periodically do a database sweep, i.e. the operation is 100% asynchronous and there are no in-memory processes bound to a specific PanDA task or job
  • the database is the medium for DEFT/JEDI communication
  • JEDI never creates or deletes tasks, nor otherwise modifies the meta-task topology; it can post a request to DEFT to get this done. Thus the important functionality remains within one unit of code and stays manageable. This sounds simple, but it is fundamental for the viability of the system.
  • examples of the previous item in action include live task augmentation and the related subject of handling missing files: in the latter case, an additional task is added to the Meta-Task to make up for the lost statistics. Such a request is formulated by JEDI, picked up by DEFT, and translated into a task, which is in turn picked up by JEDI.
After some consideration, we arrived at a solution with a separate table to store the semaphores/commands.
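
To illustrate the mechanism, below is a minimal sketch of the sweep-based exchange. The table and column names are purely illustrative (not the actual JEDI schema), and sqlite3 stands in for the Oracle DB:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE prodsys_comm (
            comm_id  INTEGER PRIMARY KEY,
            task_id  INTEGER,
            receiver TEXT,   -- 'DEFT' or 'JEDI': which component should act
            command  TEXT,   -- e.g. 'append_recovery_task', 'abort'
            status   TEXT    -- 'pending' -> 'done'
        )""")

    def post_command(receiver, task_id, command):
        # Either component (or a human operator) posts a request for the other.
        conn.execute("INSERT INTO prodsys_comm (task_id, receiver, command, status)"
                     " VALUES (?, ?, ?, 'pending')", (task_id, receiver, command))
        conn.commit()

    def sweep(receiver):
        # The periodic database sweep: pick up and execute pending commands
        # addressed to this component; no in-memory state is shared.
        rows = conn.execute("SELECT comm_id, task_id, command FROM prodsys_comm"
                            " WHERE receiver = ? AND status = 'pending'",
                            (receiver,)).fetchall()
        for comm_id, task_id, command in rows:
            print("%s: executing %s for task %d" % (receiver, command, task_id))
            conn.execute("UPDATE prodsys_comm SET status = 'done'"
                         " WHERE comm_id = ?", (comm_id,))
        conn.commit()

    # JEDI asks DEFT to extend the Meta-Task (e.g. missing-file recovery);
    # DEFT picks the request up on its next sweep and creates the task.
    post_command("DEFT", 1042, "append_recovery_task")
    sweep("DEFT")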

Task Parameters

Task parameters are a necessary attribute of the task object. They are essentially schema-free tuples of strings (they could have a different implementation, but that is the current reality). We will use a CLOB to store them in the Oracle DB, probably in a separate table so as not to impact performance.

In the context of JEDI-alpha, a good choice (agreed on by most) is JSON as the format for both storage and inter-process communication.
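
As an illustration, a task's parameters could then round-trip as follows; the field names and values here are invented for the example, not taken from a real task:

    import json

    # Schema-free string-valued parameters, as described above.
    task_params = {
        "taskName": "mc12_8TeV.987654.example",
        "transformation": "Sim_tf.py",
        "nEventsPerJob": "1000",
        "destination": "DATADISK",
    }

    # Serialize for storage in the CLOB column of the parameter table ...
    clob_payload = json.dumps(task_params)

    # ... and decode again on the consumer side (e.g. JEDI).
    restored = json.loads(clob_payload)
    assert restored == task_params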