{ "cells": [ { "cell_type": "markdown", "id": "3beb21b8-a705-45a7-bd38-9e9fb5afe9f4", "metadata": {}, "source": [ "# Usage\n", "\n", "Here is an example of using the pre-configured models in this package." ] }, { "cell_type": "markdown", "id": "7c03136b-dc3b-47a1-bf15-f77dd43a68f1", "metadata": {}, "source": [ "```{eval-rst}\n", ".. currentmodule:: ctao_datamodel\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "2eb21616-bfee-44d2-9152-71670261b473", "metadata": {}, "outputs": [], "source": [ "import uuid\n", "\n", "import ctao_datamodel as dm\n", "import ctao_datamodel.models.dataproducts as dp\n", "from astropy.time import Time" ] }, { "cell_type": "markdown", "id": "b47d7c60-c118-4b63-8036-5baad234e097", "metadata": {}, "source": [ "Data product metadata is stored in the {py:class}`Product ` class. You can inspect any model with ``print_model``, which gives an overview of which fields are available, which types they use, and which are optional." ] }, { "cell_type": "code", "execution_count": null, "id": "d036344b-9bca-4979-aa27-25ea2a377e86", "metadata": {}, "outputs": [], "source": [ "dm.print_model(dp.ProductType)" ] }, { "cell_type": "markdown", "id": "58af060c-8667-46ac-ace2-fd1a7442cc61", "metadata": {}, "source": [ "## Creating metadata for a data product" ] }, { "cell_type": "markdown", "id": "0ca991be-4329-4db8-b3fc-dcf98979c413", "metadata": {}, "source": [ "First, determine the data product type you need by consulting the table in the appendix of the _CTAO DataProducts Data Model Specification_. In this example, we will describe a category-3 (final) DL3 event list from an observation. 
\n", "\n", "### ProductType\n", "\n", "The {py:class}`ProductType ` represents the \"class\" of the data product; all products with the same type should contain the same metadata fields and data format.\n", "\n", "To fill in the {py:class}`ProductType `, we can either use the enumerations directly or, if we know them, their string representations:" ] }, { "cell_type": "code", "execution_count": null, "id": "3a9b0f8f-7fde-407a-9eed-cf13262a3163", "metadata": {}, "outputs": [], "source": [ "thetype = dp.ProductType(\n", " level=\"DL3\", division=\"Event\", association=\"Subarray\", type=dp.DataType.OBSERVATION\n", ")\n", "thetype" ] }, { "cell_type": "code", "execution_count": null, "id": "e2e31649-62e3-4955-93d0-18fd4e3c3083", "metadata": {}, "outputs": [], "source": [ "print(thetype)" ] }, { "cell_type": "markdown", "id": "9b1d6b99-4ffc-47e9-bc23-178718add923", "metadata": {}, "source": [ "Note that converting it to a string yields a standard representation. You can also create a ``ProductType`` from such a string:" ] }, { "cell_type": "code", "execution_count": null, "id": "c2b28260-8438-4dc9-af20-088c9f9b6f37", "metadata": {}, "outputs": [], "source": [ "thetype = dp.ProductType.from_str(\"DL3/Event/Subarray/Observation\")\n", "thetype" ] }, { "cell_type": "markdown", "id": "5b693555-d92b-4387-840a-58082625f7fd", "metadata": {}, "source": [ "We can use the {py:func}`flatten_model_instance` function to see where we are so far:" ] }, { "cell_type": "code", "execution_count": null, "id": "cd8d85cf-34f5-4a06-867c-1c5377bb46ab", "metadata": {}, "outputs": [], "source": [ "dm.flatten_model_instance(thetype)" ] }, { "cell_type": "markdown", "id": "da12acc9-6431-48f5-ba10-de38a87fbdad", "metadata": {}, "source": [ "### InstanceIdentifier\n", "\n", "Next we need to know which instance identifier fields are required for this data product. Again, consult the table in the _CTAO DataProducts Data Model Specification_, which shows that we need the ``obs_id``. 
Let's use ``1000012345``." ] }, { "cell_type": "code", "execution_count": null, "id": "2f0ec956-48a5-4593-a836-2f44dcd3e74e", "metadata": {}, "outputs": [], "source": [ "instance = dp.InstanceIdentifier(obs_id=1000012345)\n", "dm.flatten_model_instance(instance)" ] }, { "cell_type": "markdown", "id": "748f0dc1-9721-4b91-8dd5-07a539e79b45", "metadata": {}, "source": [ "### Curation" ] }, { "cell_type": "code", "execution_count": null, "id": "62bb3d3b-f3f7-41e3-8252-bb53cd5e44d0", "metadata": {}, "outputs": [], "source": [ "curation = dp.Curation(\n", " release=\"CTAO/DL3-DR1\", copyright=\"CTAO gGmbH\", rights=dp.DataRights.PUBLIC\n", ")\n", "dm.flatten_model_instance(curation)" ] }, { "cell_type": "markdown", "id": "946b877c-e0e8-43d7-b8f2-0a19faf67a95", "metadata": {}, "source": [ "### DataModel and Contact\n", "\n", "We also need to describe the data model that was used to serialize the data product, as well as some contact information:" ] }, { "cell_type": "code", "execution_count": null, "id": "97c02b44-cd58-4d15-a5c0-fef3f1a40b10", "metadata": {}, "outputs": [], "source": [ "model = dp.DataModel(\n", " name=\"GADF\",\n", " version=\"v0.3\",\n", " url=\"https://gamma-astro-data-formats.readthedocs.io/en/v0.3/\",\n", ")\n", "model" ] }, { "cell_type": "code", "execution_count": null, "id": "e49a1cfa-aae4-40a4-95a4-15168de3eab4", "metadata": {}, "outputs": [], "source": [ "contact = dp.Contact(name=\"CTAO HelpDesk\", email=\"help@ctao.org\", organization=\"CTAO\")\n", "contact" ] }, { "cell_type": "markdown", "id": "a155cc6d-dca3-4f6c-92a1-dd6e224b3ad8", "metadata": {}, "source": [ "### Activity Provenance metadata\n", "\n", "The {py:class}`models.dataproducts.Activity` metadata describes part of the local provenance of the data product by specifying the software or human process that generated it. Let's define an example here. 
The ``id`` of the activity should be a unique UUID generated when the activity started, and it allows one to link multiple data products that were generated by the same software." ] }, { "cell_type": "code", "execution_count": null, "id": "78f5baaa-6f49-4acb-9ee9-3d0fecfe7d50", "metadata": {}, "outputs": [], "source": [ "activity = dp.Activity(\n", " process=dp.ObservatoryProcess.DATA_PROCESSING,\n", " name=\"generate-dl3.cwl\",\n", " description=\"Workflow that produces DL3 data products for an observation\",\n", " id=uuid.uuid1(),\n", " start=Time(\"2025-11-28 13:45:12.62\"),\n", " software=dp.Software(\n", " name=\"ctao-datapipe\",\n", " version=\"v1.1.0\",\n", " url=\"http://cta-computing.gitlab-pages.cta-observatory.org/dpps/datapipe/datapipe/latest/\",\n", " ),\n", " configuration_id=\"hillas-standard\",\n", ")\n", "dm.flatten_model_instance(activity)" ] }, { "cell_type": "markdown", "id": "7de4ac38-a413-44e4-b006-7520de9df68b", "metadata": {}, "source": [ "Optionally, we can also add linked data products that were used as input to the activity. Here we expect the {py:class}`models.dataproducts.ExternalDataProduct` _id_ to be the unique _instance identifier id_ of the other data product." 
] }, { "cell_type": "code", "execution_count": null, "id": "4b26e000-88ef-4575-ae01-f937ba703ec2", "metadata": {}, "outputs": [], "source": [ "activity.inputs = [\n", " dp.ExternalDataProduct(\n", " id=\"1fcade22-d04d-11f0-84af-acde48001122\",\n", " uri=\"file:./irf_calibration.fits\",\n", " role=\"IRF calibration coefficients\",\n", " ),\n", " dp.ExternalDataProduct(\n", " id=\"1e6f6306-d050-11f0-8af5-acde48001122\",\n", " uri=\"file:./other_input.fits\",\n", " role=\"Some critical input\",\n", " ),\n", "]" ] }, { "cell_type": "markdown", "id": "89bc8c92-83a7-48a7-944e-3ece9ab82d36", "metadata": {}, "source": [ "### Observation context\n", "\n", "Since this data product is associated with an observation, we can describe its spatial, temporal, and spectral coverage. This information is optional because it is already linked to the ``obs_id`` we set above, but including it in the data product metadata itself is often useful for discoverability." ] }, { "cell_type": "code", "execution_count": null, "id": "f92cbce8-6035-40e4-8cb2-c3b7d946900d", "metadata": {}, "outputs": [], "source": [ "observation = dp.Observation(\n", " coverage=dp.Coverage(\n", " time=dp.TimeCoverage(\n", " t_min=\"2026-10-02 15:13:21.1\", t_max=\"2026-10-02 15:13:41.244\"\n", " ),\n", " space=dp.SpaceCoverage(frame=\"ICRS\", ra=129.23, dec=-42.102, field_of_view=6.0),\n", " energy=dp.EnergyCoverage(energy_unit=\"TeV\", energy_min=0.003, energy_max=300.0),\n", " ),\n", ")\n", "dm.flatten_model_instance(observation)" ] }, { "cell_type": "markdown", "id": "b557c797-2c57-477d-beff-132c6ad3aadd", "metadata": {}, "source": [ "### Full Product metadata\n", "\n", "Finally, let's build the full {py:class}`Product ` metadata:" ] }, { "cell_type": "code", "execution_count": null, "id": "bde3e2cc-4deb-4331-8fbf-f09c409bd86d", "metadata": {}, "outputs": [], "source": [ "product = dp.Product(\n", " data=thetype,\n", " instance=instance,\n", " description=\"An example DL3 Event list\",\n", " 
creation_time=Time(\"2025-11-28 14:15:16.123\"),\n", " curation=curation,\n", " model=model,\n", " contact=contact,\n", " activity=activity,\n", " observation=observation,\n", ")\n", "dm.flatten_model_instance(product)" ] }, { "cell_type": "markdown", "id": "682e41ba-b0b8-49d2-8956-0ae410111e7b", "metadata": {}, "source": [ "Note that a few fields have been filled in automatically, such as ``instance.id``, which should be unique for each metadata instance created." ] }, { "cell_type": "markdown", "id": "ce41c5d1-69b7-409e-97df-aa34c720f2ae", "metadata": {}, "source": [ "## Conversion to and from FITS style keys\n", "\n", "By default, any field with a ``fits_keyword`` mapping attribute is translated to that keyword automatically. Fields without one use the ``HIERARCH CTAO X X X`` long-keyword convention." ] }, { "cell_type": "code", "execution_count": null, "id": "7353e3af-8abf-4326-9b95-4b112a57d454", "metadata": {}, "outputs": [], "source": [ "header = dm.instance_to_fits_header(product)\n", "header" ] }, { "cell_type": "code", "execution_count": null, "id": "8e12f250-c836-4ae0-9584-99d92df76c08", "metadata": {}, "outputs": [], "source": [ "new_instance = dm.fits_header_to_instance(header, dp.Product)\n", "new_instance" ] }, { "cell_type": "markdown", "id": "d3f79e28-9c8c-43b7-a67c-b66afd5641e4", "metadata": {}, "source": [ "## Visualizing\n", "\n", "This package includes a simple wrapper class for visualizing models with PlantUML in a notebook, which can be used as follows:" ] }, { "cell_type": "code", "execution_count": null, "id": "200037b1-db1a-42d3-9830-9aa3f4a683b7", "metadata": {}, "outputs": [], "source": [ "dm.PlantUMLDiagram(dp.ProductType)" ] }, { "cell_type": "markdown", "id": "b69c0c97-5ffc-401a-aa7d-1ef3a28ad81f", "metadata": {}, "source": [ "By default, related classes are shown without details; to add more detail, you can combine diagrams:" ] }, { "cell_type": "code", "execution_count": null, "id": "25f6bd5d-9445-47a5-864f-dc52a65996cf", "metadata": {}, 
"outputs": [], "source": [ "dm.PlantUMLDiagram(dp.ProductType) + dm.PlantUMLDiagram(dp.DataDivision)" ] }, { "cell_type": "markdown", "id": "a81c806a-77c7-4e4a-b0a1-394e5eb587eb", "metadata": {}, "source": [ "To visualize full diagrams, pass ``details=True``:" ] }, { "cell_type": "code", "execution_count": null, "id": "6a5191f4-b820-4687-9943-d35022f4ef2e", "metadata": {}, "outputs": [], "source": [ "dm.PlantUMLDiagram(dp.ProductType, details=True)" ] }, { "cell_type": "markdown", "id": "392542b8-d49c-485a-ab29-ac3fe694046b", "metadata": {}, "source": [ "You can also add custom diagram text by summing diagrams, for example to change colors or to add hidden connectors that influence the diagram layout." ] }, { "cell_type": "code", "execution_count": null, "id": "f4cc3d2a-8835-4962-8225-a399f68cdcc6", "metadata": {}, "outputs": [], "source": [ "preamble = dm.PlantUMLDiagram(\n", " \"\"\"\n", " hide circles\n", " package CTAO.DataProducts #lightblue-white {\n", " }\n", " class CTAO.DataProducts.DataType #red\n", "\n", " CTAO.DataProducts.ProductType -[hidden]u-> CTAO.DataProducts.DataType\n", " \"\"\"\n", ")\n", "\n", "preamble + dm.PlantUMLDiagram(dp.ProductType)" ] } ], "metadata": {}, "nbformat": 4, "nbformat_minor": 5 }