Output format
larnd-sim can generate a realistic datastream saved in the HDF5 file format. HDF5 is a widely used, high-performance file format; see the HDF5 website for more details.
Internally, larnd-sim uses h5py, an open-source Python package, to manage access to the output file, and we highly recommend it if you are getting started with HDF5.
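As a minimal sketch of inspecting an output file with h5py, the snippet below writes a small stand-in file and then lists the datasets it contains. The dataset names packets and light_trig come from this page, but the file name, dtypes, and contents are invented for illustration and are not the real larnd-sim schema. With an actual larnd-sim file, only the read-only part is needed.

```python
import os
import tempfile

import h5py
import numpy as np

summary = {}
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.h5")  # hypothetical file name

    # Write a tiny stand-in file; dtypes and values are illustrative only.
    with h5py.File(path, "w") as f:
        f.create_dataset(
            "packets",
            data=np.zeros(5, dtype=[("packet_type", "u1"), ("timestamp", "u8")]),
        )
        f.create_dataset(
            "light_trig",
            data=np.zeros(2, dtype=[("ts_s", "f8"), ("ts_sync", "u8")]),
        )

    # Open read-only and record the shape and field names of each dataset.
    with h5py.File(path, "r") as f:
        for name, dset in f.items():
            summary[name] = (dset.shape, dset.dtype.names)
            print(name, dset.shape, dset.dtype.names)
```

Printing the shape and field names first is a quick way to check which datasets a given file actually contains before reading them into memory.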
Charge data
For the charge simulation output, larnd-sim uses the same datasets as generated by larpix-control, so that downstream analysis code needs minimal modification when working with data or simulation.
You can read more about this format here.
In addition to the datasets defined in the above link, larnd-sim adds true particle association data in the mc_packets_assn dataset. This dataset is a 2-dimensional array with the same first dimension as the packets dataset and a second dimension corresponding to the edep-sim track segments that contributed to the ADC value. The dataset has two fields: track_ids, the index into the tracks dataset for each entry, and fraction, the fraction of the ADC’s value that can be attributed to that edep-sim segment. Because an arbitrary number of track segments can contribute to each trigger, this dataset is a “ragged” array, with null entries marked by track_ids == -1.
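The null-entry convention can be handled with a boolean mask. The sketch below uses a synthetic stand-in for mc_packets_assn: the field names track_ids and fraction and the -1 marker follow the description above, while the array sizes, dtypes, and values are assumptions made for illustration. It counts the contributing segments for each packet and picks the segment with the largest fraction.

```python
import numpy as np

# Synthetic stand-in for f['mc_packets_assn'][:]; sizes and dtypes are assumed.
n_packets, n_max = 3, 4
assn = np.zeros((n_packets, n_max), dtype=[("track_ids", "i8"), ("fraction", "f8")])
assn["track_ids"] = -1  # start with every entry null

# Packet 0 has two contributing segments, packet 1 has one, packet 2 has none.
assn["track_ids"][0, :2] = [10, 11]
assn["fraction"][0, :2] = [0.75, 0.25]
assn["track_ids"][1, :1] = [11]
assn["fraction"][1, :1] = [1.0]

# Mask out the null entries of the ragged array.
valid = assn["track_ids"] != -1
n_contrib = valid.sum(axis=1)

# Segment with the largest fraction per packet; stays -1 when nothing contributed.
masked_fraction = np.where(valid, assn["fraction"], 0.0)
best = masked_fraction.argmax(axis=1)
dominant_track = assn["track_ids"][np.arange(n_packets), best]

print(n_contrib)
print(dominant_track)
```

The values in dominant_track are indices into the tracks dataset, so they can be used directly to look up the corresponding segments.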
Light data
For the light simulation output, larnd-sim uses a data structure analogous to the ADC64 format generated by the DUNE ND-LAr light system readout electronics. The datasets are summarized below:
- light_trig, shape (n_triggers,): metadata associated with each light system self-trigger
    - op_channel, shape (n_optical_channels_per_trig,): 32-bit integer indicating the optical channel IDs included in this trigger
    - ts_s: 64-bit double indicating the global timestamp of the trigger in seconds
    - ts_sync: 64-bit unsigned integer indicating the larpix timestamp of the trigger
- light_wvfm, shape (n_triggers, n_optical_channels_per_trig, n_adc_samples): the ADC samples of each light system self-trigger
- light_dat, shape (n_edepsim_segments, n_optical_channels): the true number of photoelectrons and first photon arrival time on each optical channel generated by each edep-sim track segment
    - n_photons_det: number of photoelectrons generated by the track segment
    - t0_det: arrival time of the first photon from the track segment
- light_wvfm_mc_assn, shape (n_triggers, n_optical_channels_per_trig, n_adc_samples) (only generated if full MC truth propagation is enabled): the true contribution of each edep-sim track segment to each ADC sample in the light waveform data
    - track_ids, shape (n_max_tracks,): index of the true edep-sim segment contributing to the ADC value
    - pe_current, shape (n_max_tracks,): true equivalent photocurrent generated by the edep-sim segment
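To illustrate how these shapes fit together, the sketch below builds synthetic stand-ins for light_trig, light_wvfm, and light_dat and uses them to look up the waveform of one optical channel and the total true photoelectrons per channel. The field names and the integer widths of light_trig follow the summary above. The array sizes, the waveform dtype, the light_dat dtypes, and all values are assumptions made for illustration.

```python
import numpy as np

n_triggers, n_ch_per_trig, n_samples = 2, 3, 8

# Stand-in for f['light_trig'][:]; op_channel lists the channels read out per trigger.
light_trig = np.zeros(
    n_triggers,
    dtype=[("op_channel", "i4", (n_ch_per_trig,)), ("ts_s", "f8"), ("ts_sync", "u8")],
)
light_trig["op_channel"] = [[0, 1, 2], [3, 4, 5]]

# Stand-in for f['light_wvfm'][:]; the 16-bit sample type is an assumption.
light_wvfm = np.arange(n_triggers * n_ch_per_trig * n_samples, dtype="i2").reshape(
    n_triggers, n_ch_per_trig, n_samples
)

# Find every (trigger, slot) where optical channel 4 was read out, then pull its samples.
trig_idx, slot_idx = np.nonzero(light_trig["op_channel"] == 4)
wvfm_ch4 = light_wvfm[trig_idx, slot_idx]  # one row of samples per matching trigger

# Stand-in for f['light_dat'][:], indexed by (segment, optical channel).
n_segments, n_channels = 4, 6
light_dat = np.zeros((n_segments, n_channels), dtype=[("n_photons_det", "f8"), ("t0_det", "f8")])
light_dat["n_photons_det"][0, 4] = 12.0
light_dat["n_photons_det"][2, 4] = 3.0

# Total true photoelectrons on each optical channel, summed over all segments.
pe_per_channel = light_dat["n_photons_det"].sum(axis=0)

print(wvfm_ch4.shape, pe_per_channel)
```

The second axis of light_wvfm is a slot within the trigger rather than a global channel ID, which is why the lookup goes through op_channel first.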
Charge/light matching
Because there is no common trigger for the charge and light digitizations, association between the two datastreams must be done using the timestamp. A short example using numpy is provided here:
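import h5py
import numpy as np

larnd_sim_file = 'larnd-sim-output.h5'  # placeholder: replace with the path to your larnd-sim file
f = h5py.File(larnd_sim_file, 'r')

# Select the charge trigger packets and their timestamps
charge_packets = f['packets'][:]
charge_trigger_index = np.flatnonzero(charge_packets['packet_type'] == 7)
charge_trigger_timestamp = charge_packets[charge_trigger_index]['timestamp']
light_trigger_timestamp = f['light_trig'][:]['ts_sync']

# Match triggers by timestamp; return_indices=True is required to get the index arrays
timestamp, trigger_index, light_index = np.intersect1d(
    charge_trigger_timestamp, light_trigger_timestamp, return_indices=True)
# Convert positions in the trigger-only array back to positions in the full packets dataset
packet_index = charge_trigger_index[trigger_index]

# Charge packets and light waveforms between the first and second matched triggers
event0_packets = charge_packets[packet_index[0]:packet_index[1]]
event0_waveforms = f['light_wvfm'][light_index[0]:light_index[1]]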
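The index bookkeeping in timestamp matching can be checked on synthetic arrays without an input file. This sketch assumes the field names packet_type, timestamp, and ts_sync used on this page; the timestamps, dtypes, and packet layout are invented. Note that np.intersect1d only returns index arrays when called with return_indices=True, and that those indices refer to the arrays passed in, so positions in the trigger-only selection have to be mapped back to the full packet array.

```python
import numpy as np

# Synthetic packets: packet_type 7 marks a trigger, anything else is a non-trigger packet.
packets = np.zeros(8, dtype=[("packet_type", "u1"), ("timestamp", "u8")])
packets["packet_type"] = [7, 0, 0, 7, 0, 0, 0, 7]
packets["timestamp"] = [100, 105, 110, 200, 204, 209, 215, 300]

# Synthetic light trigger timestamps (stand-in for light_trig['ts_sync']).
light_ts_sync = np.array([100, 200, 300], dtype="u8")

# Positions of the trigger packets within the full packet array.
trigger_pos = np.flatnonzero(packets["packet_type"] == 7)

# Common timestamps plus the index of each match in both input arrays.
matched_ts, trig_i, light_i = np.intersect1d(
    packets["timestamp"][trigger_pos], light_ts_sync, return_indices=True
)

# Map from positions in the trigger-only selection to positions in `packets`.
packet_i = trigger_pos[trig_i]

# All packets from the first matched trigger up to (not including) the second.
event0 = packets[packet_i[0]:packet_i[1]]
print(packet_i, light_i, len(event0))
```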