Data Products

Within 4MOST we define three “levels” of data. Level 0 (L0) data are raw data, calibration data, environmental data, and log files. Level 1 (L1) data are one-dimensional (1D), calibrated, science-ready spectra extracted from the raw data. Level 2 (L2) data are the products of science analyses of the 1D spectra, in particular physical properties of 4MOST targets. Examples of L2 data products include element abundances for stars, or redshifts and stellar ages for galaxies. L2 products also include spectra stacked over several Observation Blocks (OBs). L2 products that are to be delivered back to ESO in Phase 3 for public dissemination through ESO’s Science Archive Facility (SAF) are called deliverable L2 (DL2) products. Most DL2 products of Participating Surveys will be generated by the advanced pipelines described below. Any survey may, of course, generate and publish additional advanced data products (AL2).

Raw (L0) data

Since all observations with 4MOST will be obtained within the framework of ESO Public Surveys, all L0 data will be publicly available in ESO’s SAF immediately.

Spectra (L1 data)

The data reduction pipeline is currently under development. This pipeline will remove the instrumental signatures and calibrate the raw data. It will produce all L1 data products, including the science-ready, calibrated 1D spectra, their associated variances and bad pixel masks as well as any other associated information. During operations, this pipeline will be run by the Data Management System on all raw data, including the data collected by Non-Participating Surveys. In other words, the Data Management System will provide Non-Participating Surveys with their L1 data as a service. The data reduction pipeline will also generate per-target progress information to be used by the Operations System in its progress monitor and for the preparation of future observations.
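To illustrate why the variances and bad pixel masks matter to downstream users, the sketch below combines repeat spectra of one target with an inverse-variance weighted mean that skips masked pixels. It is a minimal example under stated assumptions, not the 4MOST pipeline: the function name `stack_spectra`, the list-based layout, and the convention that `True` marks a bad pixel are all hypothetical, and the spectra are assumed to share a common wavelength grid.

```python
from typing import List, Optional, Tuple


def stack_spectra(
    fluxes: List[List[float]],
    variances: List[List[float]],
    masks: List[List[bool]],
) -> Tuple[List[Optional[float]], List[Optional[float]]]:
    """Inverse-variance weighted mean per pixel.

    Assumptions (hypothetical, for illustration only): all spectra share
    one wavelength grid, and mask value True marks a bad pixel.
    Pixels with no usable input are returned as None.
    """
    n_pix = len(fluxes[0])
    stacked_flux: List[Optional[float]] = []
    stacked_var: List[Optional[float]] = []
    for i in range(n_pix):
        weight_sum = 0.0
        weighted_flux = 0.0
        for flux, var, mask in zip(fluxes, variances, masks):
            # Skip flagged pixels and pixels without a usable variance.
            if mask[i] or var[i] <= 0.0:
                continue
            weight = 1.0 / var[i]
            weight_sum += weight
            weighted_flux += weight * flux[i]
        if weight_sum > 0.0:
            stacked_flux.append(weighted_flux / weight_sum)
            # Variance of an inverse-variance weighted mean.
            stacked_var.append(1.0 / weight_sum)
        else:
            stacked_flux.append(None)
            stacked_var.append(None)
    return stacked_flux, stacked_var
```

With two exposures of equal variance, an unmasked pixel receives the plain average and half the input variance, while a pixel masked in one exposure simply takes the value from the other.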

Advanced (L2) data products

Within the 4MOST project there are four pipelines producing advanced data products. These are the 4MOST Selection Functions Pipeline (4SP), the 4MOST Galactic Pipeline (4GP), the 4MOST Extragalactic Pipeline (4XP), and the 4MOST Classification Pipeline (4CP). These pipelines will only be available to Participating Surveys. They are described in more detail below.

While the pipelines are still in development and none of the data products have been finalized, the pipeline Infrastructure Working Groups (IWGs) have already drafted example data products that could be produced by their pipelines. The data products follow a format we call “Data eXchange Unit (DXU)”. The current versions of the DXU documents are provided below for information.

4MOST Selection Functions Pipeline (4SP)

4SP will consist of two parts: an object selection function (4SP-OSF) and a geometric selection function (4SP-GSF). The aim of 4SP-OSF is to provide multi-dimensional probability maps as a function of parameters defined by the individual surveys. The probability maps will quantify the expected observational biases that the 4MOST instrumental setup will imprint on the spectroscopic success rate as a function of these parameters (e.g. signal-to-noise ratio, magnitude, redshift, line width, temperature, metallicity). The principal task of 4SP-GSF is to track survey completeness as a function of position on the sky, accounting for observational effects such as fibre placement constraints and observing conditions. Additionally, using the survey simulator, we will estimate for each object the probability that it is successfully observed. Combining 4SP-GSF and 4SP-OSF will allow science users to fully model the 4MOST survey.
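As a rough illustration of how a science user might combine the two parts, the sketch below multiplies a position-dependent completeness by a parameter-dependent success probability. Everything in it is an assumption made for the example: the one-dimensional magnitude binning, the tile names, the numerical values, and above all the treatment of the two terms as independent factors. The actual 4SP products and their format are not yet finalized.

```python
from bisect import bisect_right
from typing import Dict, List

# Hypothetical 1D slice of an object selection function: spectroscopic
# success rate in bins of magnitude (bin edges and values are made up).
MAG_EDGES: List[float] = [16.0, 18.0, 20.0, 22.0]
OSF_SUCCESS: List[float] = [0.98, 0.90, 0.60]

# Hypothetical geometric selection function: completeness per sky tile.
GSF_COMPLETENESS: Dict[str, float] = {"tile_A": 0.8, "tile_B": 0.5}


def osf_probability(mag: float) -> float:
    """Look up the success rate of the magnitude bin containing `mag`."""
    if mag < MAG_EDGES[0] or mag >= MAG_EDGES[-1]:
        return 0.0  # outside the tabulated range
    return OSF_SUCCESS[bisect_right(MAG_EDGES, mag) - 1]


def selection_probability(tile: str, mag: float) -> float:
    """Combined probability, ASSUMING the two terms are independent."""
    return GSF_COMPLETENESS.get(tile, 0.0) * osf_probability(mag)
```

In a real analysis the object selection function would span several parameters at once, and the independence assumption would need to be checked against the survey simulator.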

4MOST Galactic Pipeline (4GP)

4GP will analyse the high-resolution and low-resolution spectra of stellar sources, ranging from O to M spectral types and including variable stars and white dwarfs. For all of the sources, 4GP will measure heliocentric line-of-sight velocities, stellar parameters, and chemical abundances; for FGK-type stars, up to ~20 individual chemical abundances will be extracted from the spectra. Whenever possible, non-LTE and 3D hydrodynamic models will be used. The pipeline will be able to derive stellar parameters, including ages, from the spectra while also taking into account astrometric, photometric, and asteroseismic information when available, e.g. data from the Gaia satellite.

4MOST Extragalactic Pipeline (4XP)

4XP will measure the spectral properties of all extragalactic sources observed with 4MOST. The primary, and mandatory, data products produced by 4XP will be robust spectroscopic redshifts and single-component emission/absorption line measurements. Spectroscopic redshifts will be derived using both template cross-correlation and emission line pattern matching, each with and without photometric priors. In addition, 4XP will derive higher-order galaxy properties from these measurements, such as star-formation rates, gas-phase metallicities, and ionization diagnostic line ratios, and will fit continuum stellar population models to the spectra to estimate stellar ages and masses.
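The idea behind emission line pattern matching can be sketched with a toy example: each pairing of an observed line with a known rest-frame line proposes a redshift, and the redshift that explains the most observed lines wins. This is not the 4XP algorithm. The function name `match_redshift`, the three-line catalogue, and the tolerance are illustrative assumptions, and the rest wavelengths are approximate air values in ångström.

```python
from typing import Dict, List, Optional, Tuple

# Small illustrative line list (approximate air wavelengths in angstrom).
REST_LINES: Dict[str, float] = {
    "Hbeta": 4861.33,
    "[OIII]5007": 5006.84,
    "Halpha": 6562.80,
}


def match_redshift(
    observed: List[float], tol: float = 2.0
) -> Tuple[Optional[float], int]:
    """Return (redshift, number of matched lines) for the best candidate.

    Every (observed, rest) pair proposes z = observed / rest - 1; the
    candidate under which most observed lines fall within `tol` angstrom
    of a shifted catalogue line is kept.
    """
    best_z: Optional[float] = None
    best_count = 0
    for obs in observed:
        for rest in REST_LINES.values():
            z = obs / rest - 1.0
            if z < 0.0:
                continue  # this toy example ignores blueshifted solutions
            count = sum(
                1
                for o in observed
                if any(abs(o - r * (1.0 + z)) <= tol for r in REST_LINES.values())
            )
            if count > best_count:
                best_z, best_count = z, count
    return best_z, best_count
```

A single detected line is degenerate in this scheme, since every catalogue line yields an equally good candidate. This is one reason why priors, such as the photometric priors mentioned above, are useful.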

4MOST Classification Pipeline (4CP)

4CP will use machine learning methods to classify objects. It will divide sources into broad astrophysical categories, such as stars, galaxies, and quasars, and will produce probabilistic outputs. In contrast with the 4GP and 4XP pipelines, 4CP will not use any synthetic templates; instead, it will be trained only on empirical data. Some peculiar objects may clearly not fit into any known category and will be treated as outliers. Other objects may be hard to classify, e.g. because of low signal, and will be placed in the unknown category. Both outliers and unknown objects may be among the most interesting for potential discoveries.
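One simple way to turn probabilistic outputs into labels that include an unknown category is to require a minimum probability for the winning class. The sketch below shows that rule only. The function names, the softmax step, and the threshold of 0.7 are assumptions for illustration and do not describe the 4CP design, and outlier detection would need a separate method that is not shown here.

```python
import math
from typing import Dict


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert raw classifier scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(value - top) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def assign_label(probs: Dict[str, float], min_confidence: float = 0.7) -> str:
    """Return the most probable class, or "unknown" if it is not confident.

    The threshold is a hypothetical choice; a real survey would tune it
    against the purity and completeness it needs.
    """
    label = max(probs, key=lambda name: probs[name])
    return label if probs[label] >= min_confidence else "unknown"
```

For example, a source with clearly dominant star-like scores is labelled as a star, whereas a low-signal source with nearly equal scores across all classes falls into the unknown category.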