Data-processing services
New in version 0.6.0
One of the core features of the Galactica web application is to provide access to online data-processing services through a web form. Authenticated users can submit job requests online to trigger the remote execution of post-processing services (a.k.a. Terminus services). Once these jobs are completed, the web application notifies the requesting users by email so they can retrieve their post-processed datasets online.
For Galactica platform contributors, the astrophysix package provides a way to bind data-processing services already available on the Galactica web application (and defined by an admin) to Snapshots and Catalogs, so that authenticated visitors of the web application can submit job requests on these Snapshots/Catalogs.
Warning
When you upload your SimulationStudy HDF5 file to Galactica, you must be in the list of service providers of the Terminus data host servers defined in your study. Otherwise, you won’t have the necessary permissions to bind your Snapshots/Catalogs to the services available on those hosts. Get in touch with a Galactica admin to register as a service provider for a specific data host.
Snapshot-bound services
To link a particular Snapshot with a data-processing service, you must define a DataProcessingService with the mandatory service_name and data_host attributes and add it to the Snapshot.processing_services list property:
>>> from astrophysix.simdm.results import Snapshot
>>> from astrophysix.simdm.services import DataProcessingService
>>>
>>> sn_Z2 = Snapshot(name="Z~2", data_reference="output_00481")
>>>
>>> # Add data processing services to a snapshot
>>> dps = DataProcessingService(service_name="column_density_map",
... data_host="My_Dept_Cluster")
>>> sn_Z2.processing_services.add(dps)
Catalog-bound services
To link the items of a particular Catalog with a data-processing service, you must define a CatalogDataProcessingService with the mandatory service_name and data_host attributes and add it to the Catalog.processing_services list property:
>>> from astrophysix.simdm.services import CatalogDataProcessingService
>>> from astrophysix.simdm.catalogs import TargetObject, ObjectProperty, Catalog, CatalogField
>>> import numpy as N
>>> from astrophysix import units as U
>>>
>>> # Define a Target object : a spiral galaxy
>>> tobj = TargetObject(name="Spiral galaxy")
>>> x = tobj.object_properties.add(ObjectProperty(property_name="x", unit=U.Mpc,
... description="Galaxy position coordinate along x-axis"))
>>> y = tobj.object_properties.add(ObjectProperty(property_name="y", unit=U.Mpc,
... description="Galaxy position coordinate along y-axis"))
>>> z = tobj.object_properties.add(ObjectProperty(property_name="z", unit=U.Mpc,
... description="Galaxy position coordinate along z-axis"))
>>> rad = tobj.object_properties.add(ObjectProperty(property_name="radius", unit=U.kpc,
... description="Galaxy half-mass radius"))
>>> m = tobj.object_properties.add(ObjectProperty(property_name="M_gas", unit=U.Msun,
... description="Galaxy gas mass"))
>>>
>>> # Define a catalog of spiral galaxies
>>> gal_cat = Catalog(target_object=tobj, name="Spiral galaxy catalog")
>>> # Add the catalog fields to the catalog (positions, radii, masses)
>>> fx = gal_cat.catalog_fields.add(CatalogField(x, values=N.array([...]))) # xgal1, xgal2, ... xgaln
>>> fy = gal_cat.catalog_fields.add(CatalogField(y, values=N.array([...]))) # ygal1, ygal2, ... ygaln
>>> fz = gal_cat.catalog_fields.add(CatalogField(z, values=N.array([...]))) # zgal1, zgal2, ... zgaln
>>> frad = gal_cat.catalog_fields.add(CatalogField(rad, values=N.array([...]))) # rgal1, rgal2, ... rgaln
>>> fm = gal_cat.catalog_fields.add(CatalogField(m, values=N.array([...]))) # mgal1, mgal2, ... mgaln
>>>
>>> # Add the catalog to the snapshot sn_Z2 defined earlier
>>> # (this won't work if you insert it into a GenericResult instead)
>>> sn_Z2.catalogs.add(gal_cat)
>>>
>>> # Add a data processing service to the galaxy catalog
>>> dps = CatalogDataProcessingService(service_name="column_density_map",
... data_host="Inst_cluster")
>>> gal_cat.processing_services.add(dps)
Warning
Only Catalogs belonging to a Snapshot can be bound to a CatalogDataProcessingService.
Catalog field bindings
For Catalogs, a data-processing service is meant to target a user-selected item in the catalog. To execute a service for that specific catalog item, at least some properties of the catalog item must be linked to parameters of the data-processing service.
Otherwise, the data-processing service does not target any specific item of the catalog and is only executed as a generic data-processing service on the Catalog’s parent Snapshot.
As an example, let us assume one needs to run a 2D column density map service (with e.g. map center coordinates, map size and image resolution parameters) on a selection of galaxies identified in a catalog of spiral galaxies extracted from a cosmological simulation.
All the galaxies of the catalog are characterized by x/y/z coordinates, mass and radius properties.
To post-process column density maps of a set of galaxies from this catalog:
the coordinates (x/y/z) of the galaxies need to be used as the map center coordinates parameter values of the service,
the radius of each galaxy needs to be used as the map size parameter value of the service (modulo a chosen scaling factor).
To define which CatalogField must be used as the input value for a given data-processing service parameter, CatalogFieldBinding instances must be created and added to the CatalogDataProcessingService through its catalog_field_bindings property. Optionally, you can define a scaling relation \(\textrm{param\_value} = \textrm{scale} \times \textrm{field\_value} + \textrm{offset}\):
>>> from astrophysix.simdm.services import CatalogFieldBinding
>>>
>>> # Here the galaxy coordinates are defined in the catalog with respect to the center
>>> # of the box (100 Mpc wide), in the range [-50;50] Mpc.
>>> # Galaxy position normalisation [-50 Mpc; 50 Mpc] / 100 Mpc + 0.5 = [0.0; 1.0]
>>> fbx = CatalogFieldBinding(param_key="xmap", catalog_field=fx,
... scale=1.0e-2, offset=0.5)
>>> fby = CatalogFieldBinding(param_key="ymap", catalog_field=fy,
... scale=1.0e-2, offset=0.5)
>>> fbz = CatalogFieldBinding(param_key="zmap", catalog_field=fz,
... scale=1.0e-2, offset=0.5)
>>> # The 'column_density_map' service map center parameters are in box normalised units ([0.; 1.])
>>> # The bindings are added to the CatalogDataProcessingService (dps) defined above
>>> dps.catalog_field_bindings.add(fbx)
>>> dps.catalog_field_bindings.add(fby)
>>> dps.catalog_field_bindings.add(fbz)
>>>
>>> # Here we choose to create a map four times larger than the galaxy radius.
>>> fb_rad = CatalogFieldBinding(param_key="map_size", catalog_field=frad,
... scale=4.0)
>>> dps.catalog_field_bindings.add(fb_rad)
Note
By default, the scaling factor is 1.0 and the offset is 0.0 (no scaling).
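To make the arithmetic of the scaling relation concrete, here is a minimal sketch in plain Python that does not use astrophysix at all. The helper names `bind` and `service_parameters`, the dictionary layout and the galaxy values are hypothetical and chosen for illustration only; how the Galactica web application actually evaluates bindings is not described here. The sketch only applies param_value = scale × field_value + offset to a single catalog item, with the same scale and offset values as in the bindings above.

```python
import math

# Hypothetical stand-in for a field binding: (catalog field name, scale, offset).
# The defaults mirror the documented ones: scale=1.0 and offset=0.0 (no scaling).
def bind(field_name, scale=1.0, offset=0.0):
    return (field_name, scale, offset)

def service_parameters(catalog_item, bindings):
    """Apply param_value = scale * field_value + offset for each binding."""
    params = {}
    for param_key, (field_name, scale, offset) in bindings.items():
        params[param_key] = scale * catalog_item[field_name] + offset
    return params

# Service parameter key -> bound catalog field, as in the example above.
bindings = {
    "xmap": bind("x", scale=1.0e-2, offset=0.5),
    "ymap": bind("y", scale=1.0e-2, offset=0.5),
    "zmap": bind("z", scale=1.0e-2, offset=0.5),
    "map_size": bind("radius", scale=4.0),
}

# Illustrative (made-up) galaxy: positions in Mpc relative to the box center,
# radius in kpc, gas mass in Msun. Units are not tracked in this sketch.
galaxy = {"x": -50.0, "y": 0.0, "z": 50.0, "radius": 12.0, "M_gas": 3.0e10}

params = service_parameters(galaxy, bindings)
print(params)
```

With these made-up values, a galaxy on the lower x edge of the box gets a normalised map center coordinate of 0.0, one at the box center gets 0.5, one on the upper edge gets 1.0, and the map size is four times the 12 kpc radius. The unbound M_gas field does not appear among the service parameters.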