ATLAS Production System Twiki


Permanent Documentation Links
Blog Tags: prodsys1, prodsys2
Showing posts with label maxim. Show all posts

Monday, February 11, 2013

February-March 2013: ProdSys II progress report (Maxim)


Documentation work:
  • This blog: created tags "prodsys1" and "prodsys2" for better search capability.
  • Created a common navigation header (bar) that can be included in all ProdSys TWiki pages.
  • References to DEFT and further details added to documentation on the ProdSys pages.
  • Corrections in DEFT/JEDI interface description as per Tadashi's comments.
  • Prepared Abstract for the ProdSys paper (CHEP). Abstract approved by ATLAS and submitted.
  • Presentation for the CMS/ATLAS Common Analysis Platform on 2/28/2013:
    • Based on the announcement on 2/14 of a CMS development largely parallel to what we do in ATLAS
    • Potential redundancy, under-utilization of PanDA capability, suboptimal database load
    • Clear potential for common development and platform
  • Presentation for the ATLAS Software and Computing Workshop, March 11-15 2013
  • Meeting with Wolfgang to discuss progress and requirements
Development:
  • DEFT prototype: functionality complete
  • SVN project created, code checked in
    • Continuous updates and checkpoints
    • Naming of the SVN tree as per Tadashi's comments
  • Tested database schemas for the Task, Dataset and Meta-Task objects.
  • Extensive refactoring and rewrite of the main code unit, prompted by new functionality and increased complexity. The application is now a simple CLI driver for the underlying classes.
  • Dedicated test of the code state-switching functionality
  • Improvements in logging functionality: a Logger class created on top of the standard Python logging package
  • Started work on the Dependency Model for datasets
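
The state-switching test and the Logger class mentioned above can be illustrated with a minimal sketch. This is not the DEFT code: the state names, the transition table, and the `Task` and `make_logger` names are hypothetical, since the real DEFT/JEDI state set is not listed in this report. Only the standard `logging` package is used.

```python
import logging

# Hypothetical states and allowed transitions, for illustration only.
TRANSITIONS = {
    "pending": {"running", "aborted"},
    "running": {"done", "failed", "aborted"},
    "failed": {"pending"},
    "done": set(),
    "aborted": set(),
}


def make_logger(name="deft", level=logging.INFO):
    """Return a logger from the standard logging package with one stream handler."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger


class Task:
    """A task that only accepts transitions listed in TRANSITIONS."""

    def __init__(self, name, logger=None):
        self.name = name
        self.state = "pending"
        self.log = logger or make_logger()

    def switch(self, new_state):
        """Move to new_state, or raise ValueError and leave the state unchanged."""
        if new_state not in TRANSITIONS.get(self.state, set()):
            self.log.error("%s: illegal switch %s -> %s", self.name, self.state, new_state)
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.log.info("%s: %s -> %s", self.name, self.state, new_state)
        self.state = new_state
```

A dedicated test of this kind would exercise both a legal switch (`Task("t1").switch("running")`) and an illegal one, checking that the latter raises and leaves the state untouched.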

Friday, January 11, 2013

January 2013: ProdSys II Progress Report (Maxim)

01/10/13 to 01/31/13

Documentation work:
  • Updates in the general PanDA TWiki pages
  • References to JEDI documentation on the ProdSys pages
  • Additions to Workflow page based on recent experience
Development:
  • DEFT prototype development in progress with code cleanup and proper logging added
  • Implemented XML file-based meta-task database as a testing and integration tool
  • Implemented and tested the meta-task template functionality as defined in the requirements
  • Set-up of Oracle software at BNL to move towards DB integration with JEDI
Graph processing:
  • Installation of, and experimentation with, the following tools: graphviz and Gephi
  • Conclusion: Gephi offers adequate graph editing and visualization capabilities, good enough to actually edit meta-tasks
  • R&D with the Wireit and jsPlumb packages as tools for visualizing and editing meta-tasks and tasks in a Web client such as a browser
Misc:
  • LBNE Documentation Review. "Redmine" pages, SVN integration

Sunday, December 2, 2012

December 2012: ProdSys II Progress Report (Maxim)

12/16/12 to 12/31/12

Documentation work:
  • Quick links added to the top of all core pages in the ProdSys TWiki
  • Added "Meta-Task Recovery" to the Requirements and to the Workflow pages, to reflect the previously documented requirements for the system. This dates back to February 2012, and has been mentioned elsewhere.
Development:
  • Evaluation of Python packages for Workflow Management:
    • Soma workflow
    • Weaver workflow
    • Other items as per TWiki documentation
  • Evaluation: graph representation in XML, with standard solutions studied in conjunction with parsing tools
  • Preliminary selection of GraphML as the language, given its standardization, fairly simple syntax and parsing support
  • A prototype of the graph builder and workflow engine created, based on:
    • "GraphML" as the input language
    • "NetworkX" as the serializer/deserializer
    • "PyUtilib" as the workflow engine
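
As an illustration of the GraphML-based approach, here is a minimal sketch of reading a meta-task graph and deriving an execution order. It is not the prototype itself: the prototype uses NetworkX and PyUtilib, whereas this sketch stays within the Python standard library (`xml.etree.ElementTree` and `graphlib`, Python 3.9+) to remain dependency-free. The task names (`evgen`, `simul`, `recon`) and the `load_predecessors` helper are hypothetical.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Standard GraphML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical three-step meta-task expressed in GraphML.
SAMPLE = """
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="metatask" edgedefault="directed">
    <node id="evgen"/>
    <node id="simul"/>
    <node id="recon"/>
    <edge source="evgen" target="simul"/>
    <edge source="simul" target="recon"/>
  </graph>
</graphml>
"""


def load_predecessors(text):
    """Parse GraphML text into a {task: set of predecessor tasks} mapping."""
    root = ET.fromstring(text.strip())
    graph = root.find("g:graph", NS)
    preds = {node.get("id"): set() for node in graph.findall("g:node", NS)}
    for edge in graph.findall("g:edge", NS):
        preds[edge.get("target")].add(edge.get("source"))
    return preds


def execution_order(text):
    """Return the tasks in an order that respects every edge of the graph."""
    return list(TopologicalSorter(load_predecessors(text)).static_order())


if __name__ == "__main__":
    print(execution_order(SAMPLE))
```

With NetworkX the parsing step would instead go through its GraphML reader, which returns a graph object directly; the hand-written parser here only covers plain nodes and edges.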

12/01/12 to 12/15/12

Meetings:
  • BNL
    • a meeting was held at BNL to discuss the current status of the ProdSys documentation and the initial design of the new system. Present: K.De, V.Fine, A.Klimentov, S.Panitkin, M.Potekhin, T.Wenaus
    • the Graph Model was reviewed and approved
    • decision made to avoid heavy-weight XML-based solutions
  • FNAL/CMS
    • a Common Platform Meeting was held at FNAL (Dec.5-7). M.Potekhin attended in person, and T.Wenaus remotely.
    • a review of the work previously done under the mandate of Feasibility Study of PanDA/GlideinWMS Integration. Fairly detailed discussion of scalability issues.
    • a brief overview of the current work on ProdSys II. Burt suggested looking into the DAGMan application (note: rejected after documentation review).
  • FNAL/LBNE
    • Initial meeting at FNAL with LBNE team, some ProdSys ideas discussed, among other issues pertaining to the general software management in LBNE.
  • FNAL/BNL
    • Initial meeting at BNL with the local LBNE team. Followed up on the FNAL discussion and agreed on basic principles of software coordination
    • A draft document created describing the principles of Software Coordination at LBNE

Documentation work:
  • Workflow page has been created
  • Considerable cleanup of TWiki pages for Panda, ProdSys, TaskModel
  • Some presentation material added
  • Added "Augmenting Live Meta-Task" and "Partially Defined Meta-Task" use cases
  • Permanent links added to this Blog (front page)
Development:
  • Evaluation of Python packages for Workflow Management:
    • pyutilib.workflow
    • romanchyla/workflow on Github

Monday, November 19, 2012

November 2012: ProdSys II Progress Report (Maxim)

11/16/12 to 11/30/12

Design and Documentation work:
  • Added the description of a few more tables to the DB page (current DB)
  • Added a chapter on RDBMS representation of the graph model, with four methods of graph representation considered
  • Worked on the ProdSys object model and schemas for the following components:
    • Meta-Task
    • Task
    • Adjacency map, apparently the most efficient way to represent the Tasks in an RDBMS
What's new - the model:
  • If datasets are properties of the edges in the graph representing the Meta-Task, the workflow logic has a reasonable implementation: in the model currently used by Coordinators, dependencies between tasks adjacent in the graph are established on the basis of the data being available for the next step
  • New set of "states" for the task, aligned with JEDI
  • Introduced Pseudo-tasks: entry and exit, a common practice in Grid-based workflow management
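
The model above (tasks as nodes, datasets as edge properties, entry and exit pseudo-tasks) can be sketched with an adjacency table in a relational database. This is an illustration under stated assumptions, not the ProdSys schema: the table and column names, the state names, and the dataset names are all hypothetical, and SQLite stands in for the production RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task (id TEXT PRIMARY KEY, state TEXT NOT NULL);
-- Adjacency table: one row per edge; the dataset is a property of the edge.
CREATE TABLE edge (
    source    TEXT NOT NULL REFERENCES task(id),
    target    TEXT NOT NULL REFERENCES task(id),
    dataset   TEXT NOT NULL,
    available INTEGER NOT NULL DEFAULT 0
);
""")

# 'entry' and 'exit' are pseudo-tasks that bracket the real ones.
conn.executemany("INSERT INTO task VALUES (?, ?)", [
    ("entry", "done"), ("evgen", "pending"),
    ("simul", "pending"), ("exit", "pending"),
])
conn.executemany("INSERT INTO edge VALUES (?, ?, ?, ?)", [
    ("entry", "evgen", "input.cards", 1),  # input already available
    ("evgen", "simul", "evgen.out", 0),
    ("simul", "exit", "simul.out", 0),
])


def runnable(conn):
    """Pending tasks whose incoming edges all carry an available dataset."""
    rows = conn.execute("""
        SELECT t.id FROM task t
        WHERE t.state = 'pending'
          AND NOT EXISTS (SELECT 1 FROM edge e
                          WHERE e.target = t.id AND e.available = 0)
        ORDER BY t.id
    """).fetchall()
    return [row[0] for row in rows]


def finish(conn, task_id):
    """Mark a task done and flag the datasets on its outgoing edges as available."""
    conn.execute("UPDATE task SET state = 'done' WHERE id = ?", (task_id,))
    conn.execute("UPDATE edge SET available = 1 WHERE source = ?", (task_id,))
```

In this sketch a task becomes runnable purely because the data on its incoming edges exists, which mirrors the dependency rule described above. The entry pseudo-task gives every real task at least one incoming edge, so "no unavailable inputs" never holds trivially.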

11/01/12 to 11/15/12

Documentation work:
  • Cleaned up documentation on the main ProdSys page
  • Added descriptions of a few more "T-tables" to the DB page. Up to 20 tables have been identified as no longer used, orphaned or invalid
  • More information has been added to the Main ProdSys Twiki page, based on the Production Group documentation and inspection of the code used in preparation of the LIST data.
  • An additional Task Model Page has been created for better organization of the documentation.
  • The description of the Production Database has been supplemented with information about additional tables
Operations and Development:
  • Performed maintenance of the development server at BNL, necessary due to migration to new hardware
  • Continued practicing with the "Spreadsheet Process" workflow management scripts, inspected produced data, documented the experience on the ProdSys page