PLEASE USE THE NEW JUSTIN SYSTEM INSTEAD OF POMS
The JustIn Tutorial is currently in docdb at: JustIn Tutorial
The justIN system is described in detail at:
Note: more documentation is coming soon.
justIN
- is the new workflow system replacing POMS
- can be used to process many input files by submitting batch jobs to the grid
- processes data by coordinating the data catalog and data location services, the Rapid Code Distribution Service, and job submission to the grid.
justIN ties together:
- MetaCat search queries that obtain lists of files to process
- Rucio knowledge of where replicas of files are
- a table of site-to-storage distances to make the best choices about where to run each type of job
To process data using justIN:
You need to provide a jobscript (a shell script) that performs some basic tasks (a minimal sketch follows below):
- set up the software environment
- use Rucio to find where the data is
- process the data
- save the output in a defined location
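A minimal sketch of such a jobscript, assuming the justin-get-file helper that justIN makes available to jobscripts for fetching the next input file; the fcl name and output handling are placeholders, see the full examples referenced below for the exact usage:
#!/bin/bash
# 1) Set up the software environment
source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
setup dunesw "${DUNE_VERSION:-v09_81_00d02}" -q "${DUNE_QUALIFIER:-e26:prof}"
# 2) Ask justIN/Rucio for the next input file (DID, PFN, RSE) and extract the PFN
did_pfn_rse=$(justin-get-file)
pfn=$(echo "$did_pfn_rse" | cut -f2 -d' ')
# 3) Process the data
lar -c "${FCL_FILE:-standard_ana_dune10kt_1x2x6.fcl}" -o myoutput_ana.root "$pfn"
# 4) Leave the output in the job's working directory; justIN copies files
#    matching the --output-pattern given at submission time to the destination.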
To submit a workflow, run:
justin simple-workflow <args...>
Once you run the command, you get the workflow ID.
In case of any problem, you can stop your workflow by running:
justin finish-workflow --workflow-id <ID>
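For example (the MQL string and jobscript name here are only illustrative; the full submission options are shown later):
justin simple-workflow --mql "files from <dataset> ordered limit 10" --jobscript my.jobscript
# justIN replies with a workflow ID, e.g. 1234; to stop that workflow:
justin finish-workflow --workflow-id 1234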
Next topics:
- Understand how a jobscript is structured
- Process data using standard code
- Process data using customized fcl files and/or customized code
- Select the input dataset
- Specify where your output should go (jobs writing to scratch)
Examples of jobscripts are provided in the GitHub production repository.
A jobscript checklist is available in the backup.
Two general remarks:
Note: ALWAYS test your code and jobscript before sending jobs to the grid.
For any large processing (MC or DATA) producing large output that has to be shared within the Collaboration, please contact the production group.
Things you can do
- Process data (submit a job to the grid) if you are using code from the base release and you don't actually modify any of it.
- Once you have identified what data you want to process, you can see the most recent (official) datasets available at:
https://wiki.dunescience.org/wiki/Data_Collections_Manager/data_sets
Example: let's say you want to run mergeana for electron neutrinos.
First: where is the data?
- In DUNE we provide datasets to easily identify a collection of files, for example:
fardet-hd:fardet-hd__fd_mc_2023a_reco2__full-reconstructed__v09_81_00d02__standard_reco2_dune10kt_nu_1x2x6__prodgenie_nue_dune10kt_1x2x6__out1__validation
Dataset names tend to be self-explanatory and include the detector type, the fcl files used to produce the data, the software version, the data tier, and a tag; in this case the tag is "validation".
- Let's try to run mergeana on the first 100 files in that dataset.
- MetaCat relies on MetaCat Query Language (MQL) queries to select a collection of files, in this case the first 100 files of a given dataset. The query would be something like:
"files from fardet-hd:fardet-hd__fd_mc_2023a_reco2__full-reconstructed__v09_81_00d02__standard_reco2_dune10kt_nu_1x2x6__prodgenie_nue_dune10kt_1x2x6__out1__validation ordered limit 100"
- The keyword 'ordered' is crucial to ensure reproducibility.
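You can test the query interactively with the MetaCat command-line client before submitting anything (this assumes MetaCat has been set up as described in the backup section):
metacat query "files from fardet-hd:fardet-hd__fd_mc_2023a_reco2__full-reconstructed__v09_81_00d02__standard_reco2_dune10kt_nu_1x2x6__prodgenie_nue_dune10kt_1x2x6__out1__validation ordered limit 100" | head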
Example jobscript:
https://github.com/DUNE/dune-prod-utils/blob/main/justIN-examples/submit_ana.jobscript
# fcl file and DUNE software version/qualifier to be used
FCL_FILE=${FCL_FILE:-standard_ana_dune10kt_1x2x6.fcl}
DUNE_VERSION=${DUNE_VERSION:-v09_81_00d02}
DUNE_QUALIFIER=${DUNE_QUALIFIER:-e26:prof}
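These values are only defaults: because the jobscript reads them from environment variables, you can override them at submission time with justIN's --env option (the same option is used later for INPUT_TAR_DIR_LOCAL). For example, to use a different fcl file (the fcl name here is a placeholder):
justin simple-workflow --mql "<your MQL query>" --jobscript submit_ana.jobscript --rss-mb 4000 --env FCL_FILE=my_other_ana.fcl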
A bit further down:
# Setup DUNE environment
source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
setup dunesw "$DUNE_VERSION" -q "$DUNE_QUALIFIER"
And here is how you do the actual processing:
# Here is where the LArSoft command is called
(
# Do the scary preload stuff in a subshell!
export LD_PRELOAD=${XROOTD_LIB}/libXrdPosixPreload.so
echo "$LD_PRELOAD"
lar -c $FCL_FILE $events_option -o $outFile "$pfn" > ${fname}_ana_${now}.log 2>&1
)
The scary preload is to allow xrootd to read HDF5 files.
Process data (submit a job to the grid) if you are just using code from the base release and you don't actually modify any of it:
$ USERF=$USER
$ FNALURL='https://fndcadoor.fnal.gov:2880/dune/scratch/users'
$ justin simple-workflow --mql "files from fardet-hd:fardet-hd__fd_mc_2023a_reco2__full-reconstructed__v09_81_00d02__standard_reco2_dune10kt_nu_1x2x6__prodgenie_nu_dune10kt_1x2x6__out1__validation skip 5 limit 5 ordered" --jobscript submit_ana.jobscript --rss-mb 4000 --output-pattern "*_ana*.root:$FNALURL/$USERF"
You can look at your job status using the justIN dashboard: https://justin.dune.hep.ac.uk/dashboard/?method=list-workflows
Custom fcl file
- Process data (submit a job to the grid) if you are using code from the base release and you want to use a customized fcl file.
- To do that, the best approach is to use the Rapid Code Distribution Service (RCDS) via cvmfs, as explained in the tutorial.
- Let's say you have a customized fcl file that you need to run over some datasets. As per the instructions in the DUNE justIN tutorial, you need to tar the files needed and put them on cvmfs.
$ tar cvzf my_fcls.tar my_fcls
$ source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
$ setup justin
$ rm -f /tmp/x509up_u`id -u`
$ kx509
$ INPUT_TAR_DIR_LOCAL=`justin-cvmfs-upload my_fcls.tar`
Wait a few minutes, then check the files:
$ ls -l $INPUT_TAR_DIR_LOCAL
You can look at the example at https://github.com/DUNE/dune-prod-utils/blob/main/justIN-examples/submit_local_fcl.jobscript
The key part is the following:
justin simple-workflow --mql "files from fardet-hd:fardet-hd__fd_mc_2023a_reco2__full-reconstructed__v09_81_00d02__standard_reco2_dune10kt_nu_1x2x6__prodgenie_nu_dune10kt_1x2x6__out1__validation skip 5 limit 5 ordered" --jobscript submit_local_fcl.jobscript --rss-mb 4000 --env INPUT_TAR_DIR_LOCAL="$INPUT_TAR_DIR_LOCAL"
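Inside the jobscript the uploaded tarball is then visible under $INPUT_TAR_DIR_LOCAL. Here is a sketch of how the custom fcl files can be made visible to LArSoft, assuming the my_fcls directory layout used above (check submit_local_fcl.jobscript for the exact lines):
# prepend the uploaded fcl directory to the fcl search path
export FHICL_FILE_PATH="${INPUT_TAR_DIR_LOCAL}/my_fcls:${FHICL_FILE_PATH}"
lar -c my_custom.fcl "$pfn"   # my_custom.fcl and $pfn are placeholders for your fcl and the input file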
Things you can do
- Process data (submit a job to the grid) if you are NOT using code from the base release and you want to use customized code.
- Probably you are developing some reconstruction algorithm and want to check the results on a large sample before committing your software to GitHub.
- You can use your customized software (e.g. a local installation of dunereco) and use justIN to process the data with your new LArSoft module.
- Similar to the previous part, you will need to provide all the pieces in a tar file and put them on cvmfs:
$ tar cvzf my_code.tar my_code
Here my_code.tar includes a directory with my_fcls files and one with my local products (e.g. localProducts_larsoft_v09_85_00_e26_prof); this is similar to what you used to do when using jobsub with customized code.
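Inside the jobscript, a sketch of how the local products from the tarball can be set up on top of the base release (the directory name matches the example above; mrbslp assumes the tarball came from an mrb development area):
# base release first, then the locally built packages from the tarball
source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
setup dunesw "$DUNE_VERSION" -q "$DUNE_QUALIFIER"
source "${INPUT_TAR_DIR_LOCAL}/localProducts_larsoft_v09_85_00_e26_prof/setup"
mrbslp   # make the local products (e.g. your modified dunereco) take precedence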
Things you can do
How to navigate the justIN dashboard. Example: you want to check the outputs/logs for jobs from workflow 1850.
To access full statistics: the sites where jobs ran, and the storage used for input/output.
To access the details of each job (see next page).
To access the log files.
For each file, you see where it was processed and which Rucio Storage Element it came from.
What it looks like if there are failed jobs.
To list storage elements (where data can be).
Backup
How to set up MetaCat, Rucio and justIN (on a dunegpvm)
First run:
/cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer shell --shell=/bin/bash -B /cvmfs,/exp,/nashome,/pnfs/dune,/opt,/run/user,/etc/hostname,/etc/hosts,/etc/krb5.conf --ipc --pid /cvmfs/singularity.opensciencegrid.org/fermilab/fnal-dev-sl7:latest
Then:
source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
setup python v3_9_15
setup rucio
kx509
export RUCIO_ACCOUNT=$USER
export METACAT_SERVER_URL=https://metacat.fnal.gov:9443/dune_meta_prod/app
export METACAT_AUTH_SERVER_URL=https://metacat.fnal.gov:8143/auth/dune
setup metacat
setup justin
justin version
rm -f /var/tmp/justin.session.`id -u`
justin time
Links
MetaCat web interface: https://metacat.fnal.gov:9443/dune_meta_prod/app/auth/login
justIN: https://justin.dune.hep.ac.uk/docs/
Slack channels: #workflow