Output format

larnd-sim can generate a realistic datastream saved in the HDF5 file format. HDF5 is a widely used, high-performance file format; see the HDF5 website for more details. Internally, larnd-sim uses h5py, an open-source Python package (built with Cython), to manage access to the output file, and we highly recommend it if you are getting started with HDF5.

Charge data

For the charge simulation output, larnd-sim uses the same datasets as those generated by larpix-control, so that downstream analysis code needs minimal modification when switching between data and simulation.

You can read more about this format here.

In addition to the datasets defined in the above link, larnd-sim adds true particle association data within the mc_packets_assn dataset. This dataset is a 2-dimensional array with the same first dimension as the packets dataset and a second dimension corresponding to the edep-sim track segments that contributed to the ADC value. The dataset has two fields: track_ids, the index into the tracks dataset for each entry; and fraction, the fraction of the ADC’s value that can be attributed to that edep-sim segment. Because an arbitrary number of track segments can contribute to each trigger, this dataset is a “ragged” array, padded with null entries marked by track_ids == -1.
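As a minimal sketch of how the null entries might be handled, the snippet below builds a small synthetic stand-in for mc_packets_assn and keeps only the valid associations for one packet. The structured dtype, the value of n_max, and the helper name contributing_tracks are assumptions made for illustration; check the dtype of the dataset in your own file before relying on this layout.

```python
import numpy as np

# Synthetic stand-in for f['mc_packets_assn'][:].
# Assumed layout: one record per packet, each field padded to n_max entries.
n_max = 4  # hypothetical maximum number of stored associations per packet
assn_dtype = np.dtype([('track_ids', 'i4', (n_max,)),
                       ('fraction', 'f4', (n_max,))])
assn = np.zeros(3, dtype=assn_dtype)
assn['track_ids'] = -1                    # start with every entry null

assn['track_ids'][0, :2] = [10, 11]       # packet 0: two contributing segments
assn['fraction'][0, :2] = [0.75, 0.25]
assn['track_ids'][1, :1] = [11]           # packet 1: one contributing segment
assn['fraction'][1, :1] = [1.0]
# packet 2 keeps only null entries (no truth association)


def contributing_tracks(assn, packet_index):
    """Return (track_ids, fraction) for one packet with null entries removed."""
    ids = assn['track_ids'][packet_index]
    valid = ids != -1                     # null entries are flagged by -1
    return ids[valid], assn['fraction'][packet_index][valid]


ids, frac = contributing_tracks(assn, 0)
print(ids, frac)
```

The returned track_ids can then be used to index the tracks dataset, as described above.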

Light data

For the light simulation output, larnd-sim uses a data structure analogous to the ADC64 format generated by the DUNE ND-LAr light system readout electronics. The datasets are summarized below:

light_trig, shape (n_triggers,): metadata associated with each light system self-trigger

    op_channel, shape (n_optical_channels_per_trig,): 32-bit integers giving the optical channel IDs included in this trigger

    ts_s: 64-bit double giving the global timestamp of the trigger in seconds

    ts_sync: 64-bit unsigned integer giving the larpix timestamp of the trigger

light_wvfm, shape (n_triggers, n_optical_channels_per_trig, n_adc_samples): the ADC samples of each light system self-trigger

light_dat, shape (n_edepsim_segments, n_optical_channels): the true number of photoelectrons and first photon arrival time on each optical channel generated by each edep-sim track segment

    n_photons_det: number of photoelectrons generated by the track segment

    t0_det: arrival time of the first photon from the track segment

light_wvfm_mc_assn, shape (n_triggers, n_optical_channels_per_trig, n_adc_samples): the true contribution of each edep-sim track segment to each ADC sample in the light waveform data (only generated if full MC truth propagation is enabled)

    track_ids, shape (n_max_tracks,): index of the true edep-sim segment contributing to the ADC value

    pe_current, shape (n_max_tracks,): true equivalent photocurrent generated by the edep-sim segment
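To illustrate how these datasets relate to each other, here is a minimal sketch on synthetic arrays. It looks up the waveform of a given optical channel within one trigger, and sums the true photoelectrons per channel from light_dat. The dtypes, array sizes, and the helper name channel_waveform are assumptions chosen to match the shapes listed above; they are not taken from larnd-sim itself.

```python
import numpy as np

# Synthetic stand-ins for the light datasets (dtypes are assumptions).
n_triggers, n_channels, n_samples = 2, 3, 4
light_trig = np.zeros(n_triggers, dtype=[('op_channel', 'i4', (n_channels,)),
                                         ('ts_s', 'f8'),
                                         ('ts_sync', 'u8')])
light_trig['op_channel'] = [[0, 1, 2], [3, 4, 5]]
light_wvfm = np.arange(n_triggers * n_channels * n_samples).reshape(
    n_triggers, n_channels, n_samples)


def channel_waveform(light_trig, light_wvfm, trigger, channel):
    """Return the ADC samples of `channel` in `trigger`, or None if absent."""
    # The second axis of light_wvfm follows the order of op_channel.
    slot = np.flatnonzero(light_trig['op_channel'][trigger] == channel)
    if slot.size == 0:
        return None
    return light_wvfm[trigger, slot[0]]


wvfm = channel_waveform(light_trig, light_wvfm, trigger=1, channel=4)

# light_dat: one entry per (edep-sim segment, optical channel).
light_dat = np.zeros((3, 2), dtype=[('n_photons_det', 'f4'), ('t0_det', 'f4')])
light_dat['n_photons_det'] = [[5, 0], [2, 1], [0, 4]]
light_dat['t0_det'] = [[1.0, 0.0], [3.0, 2.0], [0.0, 5.0]]

# Total true photoelectrons seen by each optical channel.
pe_per_channel = light_dat['n_photons_det'].sum(axis=0)

# Earliest arrival time per channel, ignoring segments that produced no light.
has_light = light_dat['n_photons_det'] > 0
first_arrival = np.where(has_light, light_dat['t0_det'], np.inf).min(axis=0)
```

Masking on n_photons_det before taking the minimum is a design choice of this sketch: it avoids treating the t0_det value of a segment that produced no photoelectrons as a real arrival time.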

Charge/light matching

Because there is no common trigger for the charge and light digitizations, association between the two datastreams must be done using the timestamp. A short example using numpy is provided here:

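import h5py
import numpy as np

# Replace the placeholder with the path to your larnd-sim output file
f = h5py.File('<larnd-sim file>', 'r')

# Select the charge trigger packets (packet_type == 7)
charge_packets = f['packets'][:]
charge_trigger = charge_packets['packet_type'] == 7
charge_trigger_index = np.flatnonzero(charge_trigger)
charge_trigger_timestamp = charge_packets['timestamp'][charge_trigger]

light_trigger_timestamp = f['light_trig'][:]['ts_sync']

# return_indices=True is required to also get the positions of the matches
timestamp, trigger_index, light_index = np.intersect1d(
    charge_trigger_timestamp, light_trigger_timestamp, return_indices=True)

# Convert positions within the trigger selection to positions in the packets dataset
packet_index = charge_trigger_index[trigger_index]

# Packets and waveforms between the first and second matched triggers
event0_packets = charge_packets[packet_index[0]:packet_index[1]]
event0_waveforms = f['light_wvfm'][light_index[0]:light_index[1]]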
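The same timestamp matching can be tried without an input file. The sketch below uses small synthetic arrays in place of the packets and light_trig datasets; the two-field packet dtype and the timestamp values are assumptions made for illustration. It relies on np.intersect1d with return_indices=True, which returns the positions of the common values in each input, and then maps the positions within the trigger selection back to positions in the full packet array.

```python
import numpy as np

# Synthetic stand-in for f['packets'][:] with only the two fields used here.
packet_dtype = np.dtype([('packet_type', 'u1'), ('timestamp', 'u8')])
packets = np.array([(7, 100), (0, 105), (0, 110),
                    (7, 200), (0, 205),
                    (7, 300)], dtype=packet_dtype)

# Synthetic stand-in for f['light_trig'][:]['ts_sync'].
light_ts = np.array([100, 200, 300], dtype='u8')

# Positions and timestamps of the charge trigger packets (packet_type == 7).
is_trigger = packets['packet_type'] == 7
trigger_pos = np.flatnonzero(is_trigger)
trigger_ts = packets['timestamp'][is_trigger]

# Common timestamps, plus the index of each match in both inputs.
matched_ts, trig_idx, light_idx = np.intersect1d(
    trigger_ts, light_ts, return_indices=True)

# trig_idx indexes the trigger-only selection; map it back to the packet array.
packet_idx = trigger_pos[trig_idx]

# All packets from the first matched trigger up to (not including) the second.
event0_packets = packets[packet_idx[0]:packet_idx[1]]
print(event0_packets['timestamp'])
```

Note that np.intersect1d only pairs identical timestamps and reports the first occurrence of each value, so this sketch assumes trigger timestamps are unique within each stream.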