Output format
=============

``larnd-sim`` can generate a realistic datastream saved in the HDF5 file format.
HDF5 is a widely used, high-performance file format; see `the HDF5 website `_
for more details. Internally, ``larnd-sim`` uses `h5py `_, an open source Cython
package, to manage access to the output file. We highly recommend it if you are
getting started with HDF5.

Charge data
-----------

For the charge simulation output, ``larnd-sim`` uses the same datasets as those
generated by ``larpix-control``, so that downstream analysis code needs only
minimal modification to run on either data or simulation. You can read more about
this format `here `_.

In addition to the datasets defined in the above link, ``larnd-sim`` adds true
particle association data in the ``mc_packets_assn`` dataset. This dataset is a
2-dimensional array with the same first dimension as the ``packets`` dataset and a
second dimension that runs over the edep-sim track segments that contributed to
the ADC value. The dataset has two fields: ``track_ids``, the index into the
``tracks`` dataset for each entry, and ``fraction``, the fraction of the ADC value
that can be attributed to that edep-sim segment. Because an arbitrary number of
track segments can contribute to each trigger, this dataset is a "ragged" array
in which null entries are marked by ``track_ids == -1``.

Light data
----------

For the light simulation output, ``larnd-sim`` uses a data structure analogous to
the ADC64 format generated by the DUNE ND-LAr light system readout electronics.
The datasets are summarized below:

- ``light_trig``, shape ``(n_triggers,)``: metadata associated with each light system self-trigger

  - ``op_channel``, shape ``(n_optical_channels_per_trig,)``: 32-bit integer indicating the optical channel ids included in this trigger
  - ``ts_s``: 64-bit double indicating the global timestamp of the trigger in seconds
  - ``ts_sync``: 64-bit unsigned integer indicating the larpix timestamp of the trigger

- ``light_wvfm``, shape ``(n_triggers, n_optical_channels_per_trig, n_adc_samples)``: the ADC samples of each light system self-trigger

- ``light_dat``, shape ``(n_edepsim_segments, n_optical_channels)``: the true number of photoelectrons and first photon arrival time on each optical channel generated by each edep-sim track segment

  - ``n_photons_det``: number of photoelectrons generated by the track segment
  - ``t0_det``: arrival time of the first photon from the track segment

- ``light_wvfm_mc_assn``, shape ``(n_triggers, n_optical_channels_per_trig, n_adc_samples)`` (only generated if full MC truth propagation is enabled): the true contribution of each edep-sim track segment to each ADC sample in the light waveform data

  - ``track_ids``, shape ``(n_max_tracks,)``: index of the true edep-sim segment contributing to the ADC value
  - ``pe_current``, shape ``(n_max_tracks,)``: true equivalent photocurrent generated by the edep-sim segment

Charge/light matching
---------------------

Because there is no common trigger for the charge and light digitizations,
association between the two datastreams must be done using the timestamp.
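A short example using numpy is provided here::

    import h5py
    import numpy as np

    filename = 'larnd-sim-output.h5'  # placeholder: path to your larnd-sim output file
    f = h5py.File(filename, 'r')
    charge_packets = f['packets'][:]
    # positions of the trigger packets (packet_type == 7) within ``packets``
    charge_trigger_index = np.flatnonzero(charge_packets['packet_type'] == 7)
    charge_trigger_timestamp = charge_packets['timestamp'][charge_trigger_index]
    light_trigger_timestamp = f['light_trig'][:]['ts_sync']
    # return_indices=True is needed to also get the matching positions in each input
    timestamp, trigger_index, light_index = np.intersect1d(
        charge_trigger_timestamp, light_trigger_timestamp, return_indices=True)
    # map positions in the trigger-only array back to positions in ``packets``
    packet_index = charge_trigger_index[trigger_index]
    event0_packets = charge_packets[packet_index[0]:packet_index[1]]
    event0_waveforms = f['light_wvfm'][light_index[0]:light_index[1]]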
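
Once the packets belonging to an event have been selected, the truth association described under Charge data can be unpacked by masking out the null entries of the ragged array. The following is a minimal sketch on synthetic data, not part of ``larnd-sim`` itself. The helper name ``dominant_segment``, the field widths in ``assn_dtype`` and the array sizes are assumptions made for illustration; only the field names ``track_ids`` and ``fraction`` and the ``-1`` null marker come from the description above.

```python
import numpy as np

# Synthetic stand-in for f['mc_packets_assn'][:] (3 packets, up to 4 segments each).
# Field names follow the text; the integer/float widths are assumed.
assn_dtype = np.dtype([('track_ids', 'i8'), ('fraction', 'f8')])
assn = np.zeros((3, 4), dtype=assn_dtype)
assn['track_ids'] = -1                    # start with every entry null
assn['track_ids'][0, :2] = [5, 9]         # packet 0: two contributing segments
assn['fraction'][0, :2] = [0.25, 0.75]
assn['track_ids'][1, :1] = [2]            # packet 1: one contributing segment
assn['fraction'][1, :1] = [1.0]
# packet 2 keeps only null entries (no truth information)


def dominant_segment(assn):
    """Hypothetical helper: per packet, the track index with the largest
    fraction, or -1 if the packet has no valid association."""
    track_ids = assn['track_ids']
    valid = track_ids != -1
    # Null entries must never win the argmax, so push them to -inf.
    masked_fraction = np.where(valid, assn['fraction'], -np.inf)
    best = np.argmax(masked_fraction, axis=1)
    best_ids = np.take_along_axis(track_ids, best[:, None], axis=1)[:, 0]
    return np.where(valid.any(axis=1), best_ids, -1)


n_contributors = (assn['track_ids'] != -1).sum(axis=1)
print(n_contributors.tolist())            # [2, 1, 0]
print(dominant_segment(assn).tolist())    # [9, 2, -1]
```

With a real file, the same function could be applied to the slice of ``mc_packets_assn`` that covers the same index range as ``event0_packets``, since the two datasets share their first dimension.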