This lesson is being piloted (Beta version)

DUNE Computing Training January 2023: Glossary

Key Points

Workshop Welcome and Introduction
  • This workshop is brought to you by the DUNE Computing Consortium.

  • The goal is to give you the computing basics needed to work on DUNE.

Storage Spaces
  • Home directories are centrally managed by the Computing Division and are meant to store setup scripts; do NOT store certificates here.

  • Network-attached storage (NAS) at /dune/app is primarily for code development.

  • The NAS /dune/data is for storing ntuples and small datasets.

  • dCache volumes (tape, resilient, scratch, persistent) offer large storage with various retention lifetimes.

  • The tool suites ifdh and XRootD allow access to data with an appropriate transfer method, in a scalable way.
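As an illustration of the last point, a minimal sketch. The path and file name are hypothetical; ifdh chooses an appropriate transfer protocol for you, while xrdcp copies or streams over the XRootD protocol. Since both tools require a DUNE environment, the runnable part of the sketch only checks whether they are present:

```shell
# Copy a file out of dCache with ifdh (hypothetical path and file name):
#   ifdh cp /pnfs/dune/scratch/users/$USER/myfile.root ./myfile.root
# The same file via XRootD, through the FNAL dCache door:
#   xrdcp root://fndca1.fnal.gov:1094//pnfs/fnal.gov/usr/dune/scratch/users/$USER/myfile.root .
# The commands above need a DUNE environment, so here we only report
# whether each tool is available before suggesting its use.
for tool in ifdh xrdcp; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: available"
  else
    echo "$tool: not in this environment"
  fi
done
```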

Data Management
  • SAM and Rucio are data handling systems used by the DUNE collaboration to retrieve data.

  • Staging is a necessary step to make sure files are on disk in dCache (as opposed to only on tape).

  • XRootD allows users to stream data files.

  • The Unix Product Setup (UPS) is a tool to ensure consistency between different software versions and reproducibility.

  • The multi-repository build tool (mrb) supports modifying code in multiple repositories at once, which matters for a large project like LArSoft, where both end users and developers need consistent builds.

  • CVMFS (the CernVM File System) distributes software and related files without installing them on the target computer, presenting them through a read-only network file system.
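For example, on a machine with a CVMFS client configured, the DUNE software repository simply appears under /cvmfs with nothing installed locally. A small sketch that checks for the mount (the repository name is the standard DUNE one; a laptop without CVMFS will report it as not mounted):

```shell
# CVMFS repositories appear as read-only directories under /cvmfs when the
# client is configured; no software is installed on the local disk.
repo=/cvmfs/dune.opensciencegrid.org
if [ -d "$repo" ]; then
  status="mounted"
else
  status="not mounted"
fi
echo "$repo is $status"
```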

Introduction to art and LArSoft
  • Art provides the tools physicists in a large collaboration need in order to contribute software to a large, shared effort without getting in each other's way.

  • Art helps us keep track of our data and job configuration, reducing the chance of producing mystery data whose provenance no one knows.

  • LArSoft is a set of simulation and reconstruction tools shared among the liquid-argon TPC collaborations.

Expert in the Room - LArSoft: How to modify a module
  • DUNE’s software stack is built out of a tree of UPS products.

  • You don’t have to build all of the software to make modifications – you can check out and build one or more products to achieve your goals.

  • You can set up pre-built CVMFS versions of products you aren’t developing, and UPS will check version consistency, though it is up to you to request the right versions.

  • mrb is the tool DUNE uses to check out software from multiple repositories and build it in a single test release.

  • mrb uses git and cmake under the hood, and aspects of both are exposed to the user.
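The development cycle implied by the points above can be sketched as follows. These commands require a DUNE/UPS environment, so the sketch prints the sequence rather than executing it; the repository name dunesw is just one example of a product you might check out.

```shell
# A typical mrb development cycle (printed, not executed, since mrb and UPS
# are only available on machines with the DUNE software stack installed):
steps=(
  "mrb newDev      # create a fresh development area"
  "mrb g dunesw    # git-clone one product into srcs/"
  "mrbsetenv       # set up the local products for building"
  "mrb i -j4       # build with cmake/make and install into localProducts"
)
printf '%s\n' "${steps[@]}"
```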

Code-makeover on how to code for better efficiency
  • CPU, memory, and build time optimizations are possible when good code practices are followed.

Grid Job Submission and Common Errors
  • When in doubt, ask! Understand that policies and procedures that seem annoying, overly complicated, or unnecessary (especially when compared to running an interactive test) are there to ensure efficient operation and scalability. They are also often the result of someone breaking something in the past, or of simpler approaches not scaling well.

  • Send test jobs after creating new workflows or making changes to existing ones. If things don’t work, don’t blindly resubmit and expect things to magically work the next time.

  • Only copy what you need in input tar files. In particular, avoid copying log files, .git directories, temporary files, etc. from interactive areas.

  • Take care to follow best practices when setting up input and output file locations.

  • Always, always, always prestage input datasets. No exceptions.
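The point about input tar files can be made concrete. A runnable sketch with a throwaway directory (all names hypothetical), showing how tar's --exclude options keep .git directories and log files out of a grid input tarball:

```shell
# Build a toy work area, then tar it up while excluding .git and *.log,
# as you would when preparing a grid input tarball (names are hypothetical).
mkdir -p myproj/.git myproj/src
echo 'int main(void){return 0;}' > myproj/src/main.c
echo 'debug output' > myproj/build.log
tar czf input.tar.gz --exclude='.git' --exclude='*.log' myproj
tar tzf input.tar.gz   # the listing has no .git directory and no build.log
```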

Code-makeover - Submit with POMS
  • Always, always, always prestage input datasets. No exceptions.

Closing Remarks
  • The DUNE Computing Consortium has presented this workshop to broaden familiarity with the software tools used for DUNE analysis.