This lesson is still being designed and assembled (Pre-Alpha version)

Conditions Database Tutorial

Introduction To The Conditions Database

Overview

Teaching: 30 min
Exercises: 0 min
Questions
  • What is the conditions database?

  • What kind of data is stored in this database?

Objectives
  • Learn the basic structure and what information is stored on the conditions database

The Conditions Database

HEP experiments take vast amounts of data with their main detectors, but they also collect a lot of additional data, known as metadata, from other sources. This metadata describes the data coming from the read-out of the primary detectors.

Conditions data

The conditions data is a subset of the experimental metadata, specifically referring to the metadata required for offline data analysis and reconstruction. The conditions data of ProtoDUNE is stored in a dedicated database, the Conditions Database.

The conditions database consists of two PostgreSQL databases.

  1. The master store of metadata (UConDB) database is the centralized place where all the information is stored as blobs. Adding new information is straightforward, and there is no need to have a predefined database schema.
  2. The run conditions database stores a subset of the UConDB metadata in a table. This facilitates querying the metadata with conditions on the table values, for example to get all the runs with certain characteristics, like runs with High Voltage = 175 kV.

Conditions database architecture

An illustration of both ProtoDUNE’s metadata stream and the design of the conditions database. The direction of the metadata stream is shown by the arrows. Users may obtain the metadata by interacting with the conditions database APIs.

The conditions database has the following characteristics:

The master store of metadata (UConDB) database

ProtoDUNE UConDB has a collection of folders which in turn can have a collection of objects. Each object has a unique key that should be used to retrieve the information. The key can be the run number or a timestamp.

The UConDB stores all ProtoDUNE metadata in the folder protodune_conditions. Each kind of metadata has its own object.

Getting started

There are different ways to access and upload/download the contents of the ucondb.

A REST API was created by Igor V Mandrichenko. It also contains a command line interface. Instructions on how to install it and how to use it can be found in the UConDB documentation.

The UConDB URL server must be provided, for ProtoDUNE data use:

export UCONDB_SERVER_URL='https://dbdata0vm.fnal.gov:9443/protodune_ucon_prod/app'

The ProtoDUNE folder should also be provided which is: protodune_conditions

Quick Access

Alternatively to the REST API, the curl command can be used to retrieve the data:

curl -o output.file "{UCONDB_SERVER_URL}/get?folder={folder_name}&object={object_name}&tv={key}"

where {UCONDB_SERVER_URL}, {folder_name}, {object_name}, and {key} must be replaced with the desired information. An example that retrieves data from the folder named test, the object named test, using the run key 12008 is provided below.

curl -o output.file "https://dbdata0vm.fnal.gov:9443/protodune_ucon_prod/app/get?folder=test&object=test&tv=12008"

For ProtoDUNE data the folder name protodune_conditions should be used.

Remember

ProtoDUNE data is stored in the folder: protodune_conditions
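The same GET request can also be issued from Python. The sketch below is only an illustration: the helper name build_get_url is ours, and the endpoint pattern comes from the curl example above; the actual network call is left commented out.

```python
from urllib.parse import urlencode
from urllib.request import urlopen  # only needed for the actual fetch

# UConDB server URL for ProtoDUNE data (from the export above)
UCONDB_SERVER_URL = 'https://dbdata0vm.fnal.gov:9443/protodune_ucon_prod/app'

def build_get_url(server_url, folder, obj, key):
    """Assemble the UConDB GET URL: {server}/get?folder=...&object=...&tv=..."""
    query = urlencode({'folder': folder, 'object': obj, 'tv': key})
    return f'{server_url}/get?{query}'

# Reproduces the curl example: folder 'test', object 'test', run key 12008
url = build_get_url(UCONDB_SERVER_URL, 'test', 'test', 12008)
print(url)

# Uncomment to actually fetch the object over the network:
# with urlopen(url) as response:
#     data = response.read()
```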

Upload data to the database

UConDB implements strong client authentication for all requests which modify the state of the database. A username and a password must be used to upload content to the DB. Contact Ana Paula Vizcaya or Norm Buchanan from the database group to get them.

The command line interface of the python web API can be used to upload data, as explained in its documentation webpage.

Alternatively, the curl command can also be used:

curl -T data.file --digest -u user:password -X PUT "{UCONDB_SERVER_URL}/data/{folder_name}/{object_name}/key={key}"

where {UCONDB_SERVER_URL}, {folder_name}, {object_name}, and {key} must be replaced with the desired information. The key can be a float, usually corresponding to the run number or timestamp.
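The upload can be scripted the same way. The sketch below only assembles the PUT URL from the pattern above (the helper build_put_url and the object name my_object are ours, purely for illustration); the authenticated network call is left commented out.

```python
UCONDB_SERVER_URL = 'https://dbdata0vm.fnal.gov:9443/protodune_ucon_prod/app'

def build_put_url(server_url, folder, obj, key):
    """Assemble the UConDB PUT URL: {server}/data/{folder}/{object}/key={key}"""
    return f'{server_url}/data/{folder}/{obj}/key={key}'

# 'my_object' is a hypothetical object name used only for this example
url = build_put_url(UCONDB_SERVER_URL, 'protodune_conditions', 'my_object', 12008)
print(url)

# Uncomment to upload with the requests library (digest auth, valid credentials required):
# import requests
# from requests.auth import HTTPDigestAuth
# with open('data.file', 'rb') as payload:
#     r = requests.put(url, data=payload, auth=HTTPDigestAuth('user', 'password'))
#     r.raise_for_status()
```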

Key Points

  • There are several APIs available to access the information in the conditions database. Which to use depends on your specific needs


Run Conditions Database (Condb2)

Overview

Teaching: 30 min
Exercises: 0 min
Questions
  • What is the Run Conditions Database (Condb2)?

  • How to access it?

Objectives
  • Learn how to upload and download data from the Run Conditions Database

The Run Conditions Database (Condb2)

The Run Conditions Database is a PostgreSQL relational database that records and keeps track of the conditions parameters of ProtoDUNE. It is based on the conditions database developed at FNAL.

It stores a subset of the ProtoDUNE metadata, organized in tables.

Quick look at the data

The Conditions Database web interface shows all the table names of the conditions database. There, it is also possible to plot values from certain tables, as shown in the following picture.

Example plot in conditions database web interface

As an alternative to the web interface, the curl command can be used to quickly access the data in the tables:

curl "{CONDB_SERVER_URL}/get?folder={folder_name}&t={key}"

where {CONDB_SERVER_URL}, {folder_name}, and {key} must be replaced with the desired information. The folder_name must follow the format schema.table_name (like pdunesp.test), and the key is usually a run number or a timestamp (t >= 0). ProtoDUNE tables are located in the schema pdunesp. An example that retrieves data from the table pdunesp.test for run 23300 is provided below.

curl "https://dbdata0vm.fnal.gov:9443/dune_runcon_prod/get?folder=pdunesp.test&t=23300"

Getting started with the condb2 api

There are two methods for installing the condb2 python API used to interact with the database.

  1. Follow the instructions on the condb2 webpage
    • Where the ConDB URL server for ProtoDUNE data is:
      CONDB_SERVER_URL='https://dbdata0vm.fnal.gov:9443/dune_runcon_prod'
      
  2. Install it using Spack
    • On a dunegpvm machine running Alma9, run the following commands:
       source /cvmfs/larsoft.opensciencegrid.org/spack-packages/setup-env.sh
       spack load py-condb2@2.1.4
       condb2
      

      and it should give the following output

      condb 
        create [options] <database name> <folder_name> <column>:<type> [...]
        write  [options] <database name> <folder_name> < <CSV file>
        read   [options] <database name> <folder_name>
        put    [options] <folder_name> < <CSV file>
        get    [options] <folder_name>
        tag    [options] <folder_name> <tag name>
    

How to upload data

Alert

Condb2 implements strong client authentication for all requests which modify the state of the database. A username and a password must be provided to create a table and upload content to the DB. Contact Ana Paula Vizcaya or Norm Buchanan to get them.

Depending on what information the user wants to store, there are different ways to create and fill out the tables.

Creating tables

When creating a table, the condb2 API will automatically create the following columns:

  1. channel: a channel number, used to give the table another dimension.
  2. tv: the key of the table; its data type is a float. There are different ways to use it, described in more detail below, but it usually holds run numbers, the timestamp of when the row was uploaded, or the same value for all rows.
  3. tr: the timestamp (unix time) of when the row data was uploaded; this is used to create the tags, or version pointers, of the tables.
  4. data_type: an optional key of the table. With it, it is possible to specify an extra condition, for example detector type, or data vs Monte Carlo.

The next table shows the four columns of all conditions tables, in the order in which they are created and must be filled, plus a 'user_defined' column that represents the additional columns that users can add.

channel tv tr data_type user_defined
0 1 30000 TEST other
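As a sketch of this layout in code, the following assembles the example row above as a CSV in the described column order. This is only an illustration: whether tr must appear in an uploaded CSV depends on the API, since the text notes it is filled automatically.

```python
import csv
import io

# Columns in the order described above, plus one user-defined column
header = ['channel', 'tv', 'tr', 'data_type', 'user_defined']
rows = [
    [0, 1, 30000, 'TEST', 'other'],  # mirrors the example row above
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
print(buffer.getvalue())
```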

Table with run number as tv_key

In this type of table it is possible to store information concerning a run. The column tv is used to store the run number. The column channel can be left at a constant value (zero) if there are no multiple channels, or it can be used as intended, with one number per channel. The column tr is filled automatically with the timestamp (unix time) at which the row was uploaded. The column data_type is optional and can be used as an extra key, so the user can choose whether to specify a string.

The next table is an example of a table with run number as key. The Monte Carlo data has just one channel, so if the data is retrieved for those run numbers it will return just one value. Since the raw data has multiple channels, retrieving data for run 3 will return 3 rows.

channel tv tr data_type user_defined
0 1 1712184741 mc_data other
0 2 1712184742 mc_data other
0 3 1712184743 raw_data other
1 3 1712184744 raw_data other
2 3 1712184745 raw_data other

Run numbers are interpolated

It is important to note that when using the run number as tv_key to extract data, the function interpolates the tv_key values. So, if a run does not have any row, the 'get' function will return the row with the run number closest to the value given.
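As a toy model of this rule (this is not the condb2 implementation, only an illustration of returning the closest run when the requested one has no row):

```python
def closest_row(rows, run):
    """Return the row whose tv (run number) is closest to the requested run.

    rows: list of dicts, each with a 'tv' key; ties resolve to the lower run.
    """
    return min(rows, key=lambda row: (abs(row['tv'] - run), row['tv']))

rows = [
    {'tv': 1, 'user_defined': 'a'},
    {'tv': 2, 'user_defined': 'b'},
    {'tv': 6, 'user_defined': 'c'},
]

# Run 5 has no row of its own, so the closest run, 6, is returned
print(closest_row(rows, 5))  # {'tv': 6, 'user_defined': 'c'}
```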

Table with timestamp as tv_key

In this type of table it is possible to store information that changes with time, for example slow control parameters.

The column tv is used to store the timestamp (unix time) at which the data was taken. The column channel can be left at a constant value if there are no multiple channels, or it can be used as intended, with one number per channel. The column tr is filled automatically with the timestamp (unix time) at which the row was uploaded. The column data_type is optional, so the user can choose whether to specify a string.

The next table shows an example with timestamp as tv_key.

channel tv tr data_type user_defined
0 1712184001 1712184741 mc_data other
0 1712184002 1712184742 mc_data other
0 1712184003 1712184743 raw_data other
1 1712184003 1712184744 raw_data other
2 1712184003 1712184745 raw_data other
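When preparing or reading such rows, the unix timestamps in tv and tr can be converted to readable dates with Python's standard datetime module; a quick sketch using a tv value from the table above:

```python
from datetime import datetime, timezone

tv = 1712184001  # timestamp key taken from the example table above

# Convert the unix timestamp to a readable UTC date, and back
when = datetime.fromtimestamp(tv, tz=timezone.utc)
print(when.isoformat())  # 2024-04-03T22:40:01+00:00

assert int(when.timestamp()) == tv  # round trip
```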

Example of how to create a table using the condb2 python API

The following is an example of how to create a table using the condb2 API:

import subprocess

# Table information ('###' placeholders must be replaced with real values)
table_name = 'pdunesp.test_prueba'  # schema.table_name
host = '###'          # database host
port = '###'          # database port
user = '###'          # database username
passw = '###'         # database password
r_permission = '###'  # DB users to grant read permissions to
w_permission = '###'  # DB users to grant write permissions to
database = '###'

payloads = 'start_time:float run_type:text extra_condition:hstore'
com = (f'condb create -h {host} -p {port} -U {user} -w {passw} -s '
       f'-R {r_permission} -W {w_permission} {database} {table_name} {payloads}')

try:
    subprocess.run(com.split(), check=True)
except subprocess.CalledProcessError:
    print('something did not work with the creation of the table')

Uploading data to the table

Now it is time to upload data to the previously created table.

The following examples use the condb2 API.

Using the command line interface

condb2 put [options] <folder_name> < <CSV file>

  Options:
      -s <server URL>             the CONDB_SERVER_URL environment variable can be used instead
      -U <username>
      -w <password>
      -d <data type>

Using the web access python API

In the following example, it is possible to fill out all the columns of the table or to specify which ones to fill:

from condb2 import ConDBClient

client = ConDBClient('https://dbdata0vm.fnal.gov:9443/dune_runcon_prod')
folder = 'pdune.ExampleTable'

data = [0, 18000, "test"]                    # values for channel, tv, user_defined
columns = ["channel", "tv", "user_defined"]  # specify columns to fill, or leave blank for all
client.put_data(folder, data, columns, data_type="pdune_hd")  # data_type can be used as another key in the table

Tag a table

It is possible to tag a table to create different versions. The API will always return values from the newest version unless otherwise specified. The tag is related to the tr column, i.e. the time when the data was uploaded. It is also possible to leave comments when creating a new table version.

The following is an example of how to create a new version (v1.001) of the table "ExampleTable" in the namespace "pdune" using the condb2 web access python API.

from condb2 import ConDBClient

client = ConDBClient('https://dbdata0vm.fnal.gov:9443/dune_runcon_prod')
folder = 'pdune.ExampleTable'

client.tag_state(folder, tag='v1.001', tr=None, copy_from=None, override=False)

It is also possible to look at the tags that a table already has. For that, we need to use the direct access python API. In general, it is better to use the web access python API; more information can be found on the condb2 webpage. The following is an example of how to look at the tags.

from condb2 import ConDB

def connect():
    # '###' placeholders must be replaced with real values
    host    = '###'  # database host
    port    = '###'  # database port
    db_name = '###'  # database name
    user    = '###'  # database username
    passw   = '###'  # database password
    connstr = f'host={host} port={port} dbname={db_name} user={user} password={passw}'
    db = ConDB(connstr=connstr)
    return db.openFolder('pdune.ExampleTable')

folder = connect()

tags = folder.tags()
for tag in tags:
    print(tag, 'This is the tag')

More in depth look at the data

It is possible to examine the data by applying conditions to the table columns to filter results.

from condb2 import ConDBClient

client = ConDBClient('https://dbdata0vm.fnal.gov:9443/dune_runcon_prod')
folder = 'pdune.ExampleTable'

# get_data example - get data from a given run
run = 18000
columns, data = client.get_data(folder, t0=run)
print("columns:", ','.join(columns))
for line in data:
    print(line)

# search_data example - Search data, or runs that comply with the following conditions
con = [("run_type","=",'PROD'),("buffer",">=",0)] # Example conditions on the data
col, data1 = client.search_data(folder, conditions=con)
print("columns:", ','.join(col))
for line in data1:
    print(line)

Key Points

  • There are several APIs available to access the information in the conditions database. Which to use depends on your specific needs


Run Conditions Table

Overview

Teaching: 30 min
Exercises: 0 min
Questions
  • What is the Run Conditions Table?

  • What data is stored in this table?

  • How to access the data?

Objectives
  • Learn how to access and stream the data from tables in the conditions database by using the run conditions table as an example

The Run Conditions Table

The run conditions table is stored at the ProtoDUNE conditions database. It contains the conditions for each run, and the data can be accessed through various methods to suit different user needs.

Table information

As of now, the table contains the following metadata:

  Column                Unit    Example                                          Comment
  tv (run)              N/A     25034                                            Run number
  tr                    Unix    1713497099.738875                                Used for versioning
  data_type             N/A     np02_coldbox or np04_hd
  upload_time           Unix    1713497099.7388604
  start_time            Unix    1713268519.0
  stop_time             Unix    1713269109.0
  run_type              N/A     PROD                                             data_stream
  detector_id           N/A     np02_coldbox, np04_hd or np02_hermes_WIB_conf
  software_version      N/A     fddaq-v4.4.0-rc3-a9

  data_quality          N/A     good or bad                                      offline good runs
  ac_couple             N/A     dc_coupling or ac_coupling
  baseline              N/A     2                                                0 (900 mV), 1 (200 mV), 2 (200 mV collection, 900 mV induction)
  buffering             N/A     0                                                0 (no buffer), 1 (se buffer), 2 (sedc buffer)
  enabled               N/A     True                                             True if FEMB should be configured and read out by WIB
  gain                  mV/fC   14.0                                             Options: 14, 25, 7.8, 4.7 mV/fC
  gain_match            N/A     True                                             Enable pulser DAC gain matching
  leak                  pA      500.0
  leak_10x              N/A     False                                            Multiply leak current by 10 if true
  leak_f                pA      None                                             final leak value
  peak_time             us      2.0                                              Channel peak time selector
  pulse_dac             N/A     0                                                Pulser DAC setting [0-63]

  strobe_delay          N/A     255                                              64 MHz periods to skip after 2 MHz edge for strobe (pulser offset 0-255)
  strobe_length         N/A     255                                              Length of strobe in 64 MHz periods (pulser length 0-255)
  strobe_skip           N/A     255                                              2 MHz periods to skip after strobe (pulser period 0-255)
  test_cap              N/A     False                                            Enable the test capacitor
  adc_test_pattern      N/A     False                                            True if the COLDADC test pattern should be enabled
  cold                  N/A     False                                            True if the front end electronics are COLD (77 K)
  detector_type         N/A     wib_default                                      Options: WIB default, upper APA, lower APA, CRP
  pulser                N/A     False                                            True if the calibration pulser should be enabled

  beam_momentum         GeV/c   +5                                               indirectly calculated using magnet MBPL.022.692 current, and rounded up
  beam_polarity         N/A     positive or negative                             looking at magnet MBPL.022.692 current
  detector_hv           V       175000                                           from sensorID: 47894774153498, which is NP04_DCS_01:Heinz_V
  wire_bias_g
  wire_bias_u
  wire_bias_x
  lar_purity
  lar_top_temp_mean
  lar_bottom_temp_mean

How to access the data

Curl command

There are several methods available for users to access the data of the run conditions table. The most straightforward way is to access the data with a curl command on the terminal or on the web.

What you need to know

  1. The database url. For our example: https://dbdata0vm.fnal.gov:9443/dune_runcon_prod
  2. The table name. For our example: pdunesp.run_conditionstest
curl "https://dbdata0vm.fnal.gov:9443/dune_runcon_prod/get?folder=pdunesp.run_conditionstest&t=28650"

The above example queries the conditions of run 28650, or the closest run to 28650.

Remember

  1. If the given run does not exist, the previous query will return the closest run.
  2. The column representing run number is called tv.

To get the conditions of a range of runs, for example [28650, 28655], use the following query:

curl "https://dbdata0vm.fnal.gov:9443/dune_runcon_prod/get?folder=pdunesp.run_conditionstest&t0=28650&t1=28655"
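The same range query can be assembled in Python (the helper name build_range_url is ours; the t0/t1 parameters come from the curl example above):

```python
from urllib.parse import urlencode

CONDB_SERVER_URL = 'https://dbdata0vm.fnal.gov:9443/dune_runcon_prod'

def build_range_url(server_url, folder, run0, run1):
    """Assemble a conditions GET URL for a range of runs [run0, run1]."""
    query = urlencode({'folder': folder, 't0': run0, 't1': run1})
    return f'{server_url}/get?{query}'

url = build_range_url(CONDB_SERVER_URL, 'pdunesp.run_conditionstest', 28650, 28655)
print(url)
```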

Python API (Condb2)

Detailed documentation about the conditions database python REST API can be found in the condb2 webpage.

To use the API, it is necessary to install it, as explained on the webpage. Once that is done, it is recommended to use the condb2 web access python API instead of the direct access one. For the run conditions table, the following information must be provided:

  1. The database url: https://dbdata0vm.fnal.gov:9443/dune_runcon_prod
  2. The table/folder name: pdunesp.run_conditionstest

The web access python API includes functions for retrieving and uploading data, as well as creating tables and table versions. Detailed documentation on how to use these functions can be found on the web access python API webpage. To use the functions get_data or search_data it is not necessary to provide a username or password; nevertheless, they are required to use the function put_data. To get a valid username and password, contact Ana Paula Vizcaya or Norm Buchanan from the database team.

C++ interface and art service

The run conditions table uses the C++ interface and art service of the conditions database mentioned on the conditions database of ProtoDUNE webpage.

It is provided by the dunecalib package; the GitHub repository is dunecalib. A presentation with information on how to use the run conditions C++ interface and service can be found here.

C++ interface

To start using run conditions parameters, include the following header files in your C++ file:

#include "dunecalib/ConInt/RunConditionsProtoDUNE.h"
#include "nuevdb/IFDatabase/Table.h"

and set up the table to load:

condb::RunConditionsProtoDUNE* runCond = new condb::RunConditionsProtoDUNE();
runCond->SetTableURL("https://dbdata0vm.fnal.gov:9443/dune_runcon_prod/");
runCond->SetTableName("pdunesp.run_conditionstest");
runCond->SetVerbosity(0); // How much output, (0,3) - (none, more)
runCond->SetRunNumber1(0); //Change if a range of runs is desired
runCond->UpdateRN(25034); //Run Number
//runCond->SetTag(gDBTag); // If database has more than one version
runCond->LoadConditionsT();

Now you are ready to load and use the conditions parameters! The following is just one example to output some conditions on the terminal:

condb::RunCond_t rc = runCond->GetRunConditions(run);
std::cout << "\tStart time = " << rc.start_time
          << "\n\tdata type = " << rc.data_type
          << "\n\trun number = " << rc.run_number
          << "\n\tupload time = " << rc.upload_t
          << "\n\tsoftware version = " << rc.software_version
          << "\n\tstop_time = " << rc.stop_time
          << "\n\tbuffer = " << rc.buffer
          << "\n\tac_couple = " << rc.ac_couple
          << "\n\trun type = " << rc.run_type << std::endl;

A complete example script can be found in dunecalib/dunecalib/ConInt/getRunConditionsPDUNE.cc, and it can be run as follows:

getRunConditionsPDUNE -r 25016

art service

You can also opt to use an art service. A presentation with some information on how to use it can be found here. The source scripts are located in the dunecalib git repository: dunecalib/dunecalib/ConIntServices.

Alert

This section needs more details

To set up the art service in your fhicl file, include the header file:

#include "runconditions_pdune.fcl"

And then in services:

services:
{
   servicerun: @local::pdune_runconditions
}

That should run the run conditions table service!

The fcl file with the variables that can change, such as the run number, is dunecalib/dunecalib/ConIntServices/runconditions_pdune.fcl. The URL and name of the run conditions table are also specified there:

BEGIN_PROLOG

pdune_runconditions :
{
  service_provider: "RunConditionsServicePDUNE"

  TableURL: "https://dbdata0vm.fnal.gov:9443/dune_runcon_prod/" 
  TableName: "pdunesp.test"
  RunNumber: 23302.0
  RunNumber1: 0
  DBTag:     "v1.1"
  Verbosity: 1
}

END_PROLOG

Metacat filter

The run conditions table also works as a filter in ProtoDUNE’s metadata catalog (Metacat) queries. With it, you can search for run files using parameters from the run conditions table as filters.

What you need to know

The run conditions table filter name, for metacat, is dune_runshistdb.

To use the filter in metacat to query files, follow the next steps.

  1. Go to Metacat web interface
  2. Click on the query option at the top of the webpage
  3. Enter the query and run

The following is an example of a Metacat query to look for files of run 25016 where the run_type is PROD. This last condition looks at the data from the run conditions table.

filter dune_runshistdb() (files from hd-protodune:hd-protodune_25016) where runs_history.run_type = PROD

The first part contains the filter name dune_runshistdb. The part in parentheses specifies the dataset of files to search: (files from hd-protodune:hd-protodune_25016). The final part contains the filter condition using the run conditions parameter: runs_history.run_type = PROD.

Key Points

  • There are several APIs available to access the information in the conditions database. Which to use depends on your specific needs.