ATLAS Production System Twiki



Monday, October 29, 2012

ProdSys splinter meeting (October 2012) action items


  1. Combined list requests (Dmitry, Sasha), October 2012 ☑ done
  2. SSO for list request (Dmitry), October 2012 ☑ done
  3. Automatic task splitting (Dmitry), End October - November 2012
  4. Tasks cloning (Alexei, Valeri, Dmitry), October 2012
    1. I/F part is ready (Nov 5, 2012): http://pandamon.cern.ch/tasks/clonetask
    2. Task Request I/F parameter checking is in progress
  5. Running from nightlies (Andrej, Rod), October 2012
  6. Tag definition I/F. New implementation (Sasha, Dmitry), December 2012
  7. TR features for Group Production. Hiding unnecessary fields (Nurcan). Not assigned. More info is needed to implement it
  8. Start-up of pile-up tasks before simulation is done (Sasha), October 2012 ☑ done
  9. 1% issue. Implementation is postponed
  10. Scouts info usage for simulation tasks (Andrej, Rod, Wolfgang), Oct-Nov 2012 ☑ done
  11. FTK, file naming convention for merging step (Sasha, Graeme), October 2012 ☑ done
  12. CPU consumption information taken from TRF (Sasha, Graeme, 'Wuppertal group'), November 2012
  13. Meta-Language for Task Requests ☑ done (GraphML schema chosen - Maxim)
  14. CAPTCHA in TR I/F (Dmitry), October-November 2012 ☑ done
    1. It will be implemented via SSO, so the CAPTCHA option will no longer be needed
  15. Requestor I/F (RIF) (Wolfgang, Maxim, Valeri)
    1. RIF specs (Wolfgang, Maxim)
    2. Twiki from Maxim: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/ProdSys
    3. Mid-Nov: Wolfgang will prepare an initial list of requirements
  16. Task Request CLI
    1. postponed until ProdSys II
  17. Documentation, Savannah, Twiki (Maxim, Dmitry)
  18. AGIS/PanDA integration (Ale, Alden, AlexeyA)
    1. "Alden part", December 2012
    2. End-to-end test, Jan 2013
    3. Production version, Feb 2013
  19. Task search options (Alexei), October 2012 ☑ done
  20. Monitoring.
    1. Long-running jobs/tasks
    2. Task progress based on the task's submission info
    3. Failed jobs monitoring (by error type)
    4. 'Stuck' tasks
    5. Integration of existing group production monitoring tools with PanDA Classical and Dashboard monitoring (Nurcan, Jarka, Laura, Valeri)
    6. PanDA classical pages response time (Valeri)


9 comments:

  1. The doc reads,
    " . . .
    19. Monitoring.
    5. Integration of existing group production monitoring tools with PanDA
    . . . "

    Is it possible to provide a URL for the "existing group production monitoring tools" and/or list those tools explicitly?

    1. Valeri, a customized monitoring page was set up by Rob Henderson to monitor the semi-automatic submission of the merging tasks in group production:

      http://lapb.lancs.ac.uk/atlas_validation/gmerge.php

      We can choose an output type from this page. The displayed information is the run number (first column), a link to the input tasks (second column) and a link to the merging task (third column). The idea was to see the last run submitted and to check whether the input tasks are done and the merging tasks were submitted without delay.

      As far as I understand from this ticket:

      https://savannah.cern.ch/bugs/index.php?98935

      you have set up a monitoring page to choose a particular output type for a given p-tag. Rob's page covers all p-tags used in the merging of a given output type. I'll discuss with Rob how much of the info on his page we can monitor in PanDA, and I'll report back here.

  2. Alexei, I have a comment on the item "7. TR features for Group Production. Hiding unnecessary fields (Nurcan). Not assigned. More info is needed to implement it". This refers to the task request page:

    http://panda.cern.ch/server/pandamon/query?mode=reqtask1

    where we fill in only 3 fields to submit a task: Dataset, Configuration tag, userid. The other 3 fields can be hidden: Project, Transformation Type, Transformation Version.

  3. Hi Nurcan,

    Please give me the default values to be used for the hidden fields. Cheers, Alexei

  4. As to the Meta-Language for Task Requests: we are very close to consensus (and have probably arrived at the decision) that we'll use a graph model to describe Tasks. This is completely in line with what's done in industry and research. As such, the problem of picking the language becomes at least partially defined, as we need to look at the representation of graphs and their state ("markings").
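
    To make the idea of a graph with "markings" concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the task names, the three states and the `ready_tasks` helper are hypothetical, and the marking is simplified to one state per node. It is not the GraphML schema mentioned in the list above.

    ```python
    # Hypothetical task graph: each task maps to the tasks it waits for.
    DEPENDS_ON = {
        "evgen": [],
        "simul": ["evgen"],
        "recon": ["simul"],
        "merge": ["recon"],
    }

    # The "marking" is the current state of every node (assumed states:
    # "waiting", "running", "done").
    MARKING = {
        "evgen": "done",
        "simul": "running",
        "recon": "waiting",
        "merge": "waiting",
    }


    def ready_tasks(depends_on, marking):
        """Return the waiting tasks whose prerequisites are all done."""
        return sorted(
            task
            for task, deps in depends_on.items()
            if marking[task] == "waiting"
            and all(marking[dep] == "done" for dep in deps)
        )
    ```

    With the marking shown, nothing is ready because "simul" is still running; once "simul" is marked "done", `ready_tasks` returns "recon".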

  5. Progress report:
    4. Tasks cloning.
    The application has been installed and is available at
    https://pandamon.cern.ch/tasks/clonetask
    The details are available in the Savannah ticket https://savannah.cern.ch/bugs/?98392

    4.1. The 'https' protocol and CERN SSO are used to restrict access and to populate "Task Request" fields such as 'email' and "comment" automatically from the user's Grid certificate.

    4.2. The clone task logic is presented at http://pandamon.atlascloud.org/static/doc/tasks/clonetask.html, pending review by Sasha and Dmitri.
    4.3. The application API is
    curl http://pandamon.cern.ch/clonetask?tid=[task Request Id]
    It is available via the New Monitor ‘?’ link.

    4.4. The "action" parameter was added to the "List the Task Requests" application to allow using the well-known Web API to search and select the "Task Requests" and apply "action=cloneaction" for each selected request if needed.
    https://pandamon.cern.ch/tasks/listtasks1?action=clonetask has been

    4.5. Wolfgang is testing the application (see https://savannah.cern.ch/bugs/?98392 )

    4.6. The main issue is the lack of a reliable, up-to-date “sanity” function from the prodsys group.
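
    The URLs quoted in 4.3 and 4.4 can also be assembled programmatically. A minimal Python sketch, assuming only what the report states: the `tid` parameter on the clone URL and the `action` parameter on the list URL. The helper names and the example task request id are hypothetical, and the sketch only builds the URLs without contacting the server.

    ```python
    from urllib.parse import urlencode

    # Base URLs as quoted in items 4.3 and 4.4 of the report.
    CLONE_URL = "http://pandamon.cern.ch/clonetask"
    LIST_URL = "https://pandamon.cern.ch/tasks/listtasks1"


    def clone_task_url(task_request_id):
        """Build the clone URL for one task request (hypothetical helper)."""
        return CLONE_URL + "?" + urlencode({"tid": task_request_id})


    def list_tasks_url(action="clonetask"):
        """Build the list URL that applies an action to the selected requests."""
        return LIST_URL + "?" + urlencode({"action": action})


    if __name__ == "__main__":
        print(clone_task_url(12345))  # 12345 is a made-up task request id
        print(list_tasks_url())
    ```

    Using `urlencode` rather than string concatenation keeps the query string valid if a value ever contains characters that need escaping.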

  6. Progress report:

    15. Requestor I/F . RIF. Wolfgang, Maxim, Valeri

    Neither clear requirements nor a spec to implement have been provided yet.

  7. Progress report.

    20. Monitoring

    20.6 PanDA classical pages response time (Valeri)

    The link http://panda.cern.ch/server/pandamon/query?mode=listtask is now connected to the new Panda Monitor Server, which provides a 2-5 sec response time. The parameter “classic=true” can be used to access the old page if needed. This link is available from the new page as well.
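
    As an illustration of the “classic=true” switch, here is a minimal Python sketch that builds both forms of the link. It assumes the parameter is simply appended to the existing query string; the helper name is hypothetical, and the sketch does not contact the server.

    ```python
    from urllib.parse import urlencode

    # Base URL and "mode" value as quoted in the report above.
    QUERY_URL = "http://panda.cern.ch/server/pandamon/query"


    def list_task_url(classic=False):
        """Build the task list link; classic=True requests the old page."""
        params = {"mode": "listtask"}
        if classic:
            # Assumption: "classic=true" is added as an extra query parameter.
            params["classic"] = "true"
        return QUERY_URL + "?" + urlencode(params)
    ```

    Calling `list_task_url()` gives the link quoted in the report, and `list_task_url(classic=True)` adds the parameter for the old page.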
