\subsection{Data Facilities} \label{sec:datafacilities}
As noted in \autoref{sec:arch}, data processing will occur at three data facilities, located in the USA, France, and the UK. In particular, preparation of the (typically annual) Data Releases will be distributed across these three facilities using specialised software tools and techniques for distributed data management and remote job submission adopted from the high-energy physics community, with DM providing the required interfaces to the Science Pipeline.
In this arrangement, the USDF will coordinate each processing {\em Campaign} and be the primary curation site, holding a copy of all raw, intermediate, and science-ready products from each production run of the Science Pipeline. The USDF will also be solely responsible for Prompt Processing.
\subsubsection{US Data Facility} \label{sec:usdf}
\subsubsection{French Data Facility} \label{sec:frdf}
The computing centre of France's National Institute of Nuclear and Particle Physics (IN2P3)\footnote{\url{https://cc.in2p3.fr}} hosts and operates Rubin's French Data Facility (FrDF)\footnote{\url{https://doc.lsst.eu}}. This computing and storage infrastructure is sized to store a full copy of the raw images and to contribute $40\%$ of the image processing capacity required to produce the Data Releases, for the duration of the observatory's operations phase.
A compute element exposes the site's batch farm to Rubin's central campaign management system, and a Butler-compatible storage element (see \autoref{sec:dataabstraction}) stores input data as well as locally produced data products. At the end of each processing campaign, final products are replicated to the US Data Facility, where they are combined to compose the Data Release.
FrDF builds and packages the LSST Science Pipelines for distribution via a software distribution service based on CERN's CernVM File System\footnote{\url{https://sw.lsst.eu}}. This distribution mechanism, to which all the Rubin data facilities subscribe, ensures that they all use an identical copy of the pipelines when producing the Data Releases.
In addition, the French Data Facility contributes to realistic test campaigns of the distributed system that Rubin is developing to prepare the Data Releases, including the development of the inter-facility data replication system. Evaluation instances of the Rubin Science Platform and the catalog database have been deployed locally and operated continuously for several years. The facility also hosts Fink \citep{10.1093/mnras/staa3602}, one of the Rubin community alert brokers.
\subsubsection{UK Data Facility} \label{sec:ukdf}
UK interest in the Vera C. Rubin Observatory is coordinated by the LSST:UK Consortium, which has 36 partners representing all major UK astronomy research groups.
Via the Rubin In-kind Contribution program, LSST:UK has proposed --- among other things --- to provide computing resources and associated staff time to undertake $25\%$ of the computing associated with the preparation of each Data Release.
The infrastructure for this and other significant in-kind contributions (the UK Data Facility) has been secured from the UK IRIS programme (\url{www.iris.ac.uk}) and comprises a mix of grid, high-performance computing, and research cloud facilities.
In particular, it is proposed that Data Release Processing (DRP) will occur on grid-computing services at Lancaster University and the Rutherford Appleton Laboratory (RAL). Staff at Lancaster and RAL are directly involved in developing the distributed DRP approach, with particular contributions to data distribution and progress tracking, job handling, and infrastructure health monitoring.
LSST:UK has also proposed to operate a full Independent Data Access Center (IDAC), with capacity to serve the two most recent Data Releases to $20\%$ of the anticipated Rubin international community via the Rubin Science Platform.
The UK IDAC is an integral part of the UK Data Facility and is hosted mostly on on-premises cloud resources at the University of Edinburgh, with some ancillary services provided by RAL. At the time of writing, LSST:UK has been running a prototype IDAC for more than two years, hosting precursor and ancillary astronomy surveys for around 20 early adopters.
Other contributions provided by the UK Data Facility include a Rubin Community Broker called Lasair, and an HPC-based instance of the Science Pipeline for the production of specific User-generated Products that support the fusion of LSST with compatible near-infrared surveys and the crossmatch of LSST object catalogues with contemporary surveys.