Online Tutorial Welcome and Introduction
|
|
Storage Spaces (2024)
|
Home directories are centrally managed by the Computing Division and are meant to store setup scripts. Do NOT store certificates here.
Network-attached storage (NAS) /dune/app is primarily for code development.
The NAS /dune/data is for storing ntuples and small datasets.
dCache volumes (tape, resilient, scratch, persistent) offer large storage with different retention lifetimes.
The tool suites ifdh and XRootD allow data to be accessed with the appropriate transfer method and in a scalable way.
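As a rough illustration of the storage areas listed above, the following Python sketch maps a path to the area it belongs to. The `/dune/app` and `/dune/data` prefixes come from the points above; the `/pnfs/dune/...` prefixes used for the dCache volumes and the helper name `describe_storage` are assumptions made for illustration and should be checked against current DUNE documentation.

```python
# Hypothetical lookup table. The /pnfs/dune/... prefixes are assumed for
# illustration only; the NAS prefixes are the ones named in the key points.
STORAGE_AREAS = [
    ("/dune/app", "NAS: code development"),
    ("/dune/data", "NAS: ntuples and small datasets"),
    ("/pnfs/dune/scratch", "dCache: scratch volume"),
    ("/pnfs/dune/persistent", "dCache: persistent volume"),
    ("/pnfs/dune/resilient", "dCache: resilient volume"),
    ("/pnfs/dune/tape_backed", "dCache: tape-backed volume"),
]


def describe_storage(path):
    """Return a short description of the storage area that contains path."""
    for prefix, description in STORAGE_AREAS:
        # Match the prefix itself or anything below it, but not look-alike
        # names such as /dune/application.
        if path == prefix or path.startswith(prefix + "/"):
            return description
    return "unknown storage area"


if __name__ == "__main__":
    print(describe_storage("/dune/app/users/someuser/mycode"))
    print(describe_storage("/pnfs/dune/scratch/users/someuser/out.root"))
```

A check like this could run before a job is submitted, for example to warn when output is about to be written to an area meant for code.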
|
Data Management (2024 updated for metacat/justin/rucio)
|
SAM and Rucio are data handling systems used by the DUNE collaboration to retrieve data.
Staging is a necessary step to make sure files are on disk in dCache (as opposed to only on tape).
XRootD allows users to stream data files.
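To stream a file from dCache, a locally mounted `/pnfs/...` path is usually rewritten as an XRootD URL. The sketch below shows one way that rewrite might look. The door host and port (`fndca1.fnal.gov:1094`), the `/pnfs/fnal.gov/usr/` prefix, and the function name `pnfs_to_xrootd` are assumptions for illustration; consult current documentation for the values in production use.

```python
# Assumed values: the XRootD door and the server-side path prefix below are
# illustrative and may differ from the current production setup.
XROOTD_DOOR = "root://fndca1.fnal.gov:1094"
PNFS_MOUNT = "/pnfs/"
PNFS_XROOTD_PREFIX = "/pnfs/fnal.gov/usr/"


def pnfs_to_xrootd(path):
    """Convert a locally mounted /pnfs/... path into an XRootD URL (sketch)."""
    if not path.startswith(PNFS_MOUNT):
        raise ValueError(f"not a /pnfs path: {path}")
    # Keep everything after the leading /pnfs/ and graft it onto the
    # server-side prefix; the extra "/" after the door marks an absolute path.
    relative = path[len(PNFS_MOUNT):]
    return f"{XROOTD_DOOR}/{PNFS_XROOTD_PREFIX}{relative}"


if __name__ == "__main__":
    # Hypothetical file path, used only to show the shape of the result.
    print(pnfs_to_xrootd("/pnfs/dune/scratch/users/someuser/file.root"))
```

The resulting URL can then be handed to any XRootD-aware client instead of copying the file locally first.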
|
The old UPS code management system
|
UNIX Product Support (UPS) is a tool that keeps different software versions consistent with one another and supports reproducibility.
CVMFS (the CernVM File System) distributes software and related files without installing them on the target computer.
|
CVMFS distributed file system
|
|
Introduction to art and LArSoft (2024 - Apptainer version)
|
Art provides the tools physicists in a large collaboration need in order to contribute software to a large, shared effort without getting in each other’s way.
Art helps us keep track of our data and job configuration, reducing the chances of producing mystery data whose origin no one knows.
LArSoft is a set of simulation and reconstruction tools shared among the liquid-argon TPC collaborations.
|
End of the basics lesson - Continue on your own to learn how to build code and submit batch jobs
|
|
Bonus episode -- Code-makeover on how to code for better efficiency
|
|
Multi Repository Build (mrb) system (2024)
|
|
Expert in the Room - LArSoft How to modify a module - in progress
|
|
Grid Job Submission and Common Errors
|
When in doubt, ask! Understand that policies and procedures that seem annoying, overly complicated, or unnecessary (especially when compared to running an interactive test) are there to ensure efficient operation and scalability. They are also often the result of someone breaking something in the past, or of simpler approaches not scaling well.
Send test jobs after creating new workflows or making changes to existing ones. If things don’t work, don’t blindly resubmit and expect things to magically work the next time.
Only copy what you need in input tar files. In particular, avoid copying log files, .git directories, temporary files, etc. from interactive areas.
Take care to follow best practices when setting up input and output file locations.
Always, always, always prestage input datasets. No exceptions.
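The advice above about copying only what you need into input tar files can be sketched with Python's standard `tarfile` module. The exclusion lists and the names `skip_unneeded` and `make_input_tarball` are illustrative assumptions, not part of any DUNE tool; adjust them to the project at hand.

```python
import fnmatch
import os
import tarfile

# Illustrative exclusion rules (assumptions): directories and file patterns
# that rarely belong in a grid input tarball.
EXCLUDED_DIRS = {".git", "__pycache__"}
EXCLUDED_PATTERNS = ["*.log", "*.tmp", "*~", "*.swp"]


def skip_unneeded(tarinfo):
    """tarfile filter: return None to drop a member, or the member to keep it."""
    parts = tarinfo.name.split("/")
    # Drop excluded directories and everything beneath them.
    if any(part in EXCLUDED_DIRS for part in parts):
        return None
    # Drop log, temporary, and editor backup files by name pattern.
    if tarinfo.isfile() and any(
        fnmatch.fnmatch(parts[-1], pattern) for pattern in EXCLUDED_PATTERNS
    ):
        return None
    return tarinfo


def make_input_tarball(source_dir, output_path):
    """Create a gzipped tarball of source_dir without the excluded content."""
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(output_path, "w:gz") as tar:
        tar.add(source_dir, arcname=top_level, filter=skip_unneeded)
    return output_path
```

Listing the archive contents with `tarfile.open(path).getnames()` before submitting is a cheap way to confirm that nothing unexpected was included.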
|
Submit grid jobs with JustIn
|
|
Expert in the Room Grid and Batch System
|
|
Closing Remarks
|
|