The Earth and its external fluid envelopes are complex dynamical systems characterized by physical, chemical and biological processes interacting across a broad continuum of temporal and spatial scales. The Earth is also the home of human societies that interact ever more closely with these environments. Observing, understanding and modelling the history and integrated functioning of the Earth system, and predicting its responses to global changes, is a key and pressing research challenge and a necessity for many environmental and socio-economic applications related to the implementation of the UN Sustainable Development Goals (SDGs). Access to the full diversity of data from the various subsystems and environments is vital to address the challenges that face us, such as natural hazards, increased anthropogenic pressures, climate change, resource and biodiversity issues and their impacts on health.


Different information systems have been developed to make data findable, accessible, interoperable and reusable (FAIR) in the different domains, through the national data centres dedicated to the atmosphere, oceans, land surfaces and solid Earth (Data Terra Research Infrastructure (RI)) and the national biodiversity data centre (PNDB RI), together with results from reference climate simulations (CLIMERI-France RI).

The landscape nevertheless remains fragmented into domains of independent research, which leads to a proliferation of data sources, standards and tools, together with a wide diversity of data, data products and data volumes, some of which already exceed the petabyte scale. Integrated, transparent and seamless access to this cornucopia of data across a continuum of interoperable and distributed infrastructures is today a critical issue for Earth system, biodiversity and environmental sciences. Such access is needed to overcome this fragmentation and to accelerate the science-driven extraction, composition and use of these data, enabling innovative research practices and discovery processes that address inter- and transdisciplinary scientific challenges and can inform decision support systems. At the same time, it must allow high-quality data products synthesised from different sources to be easily transferred back into knowledge on the subsystems.


The GAIA Data project is led by the three Research Infrastructures referenced above, all of them registered on the national roadmap of research infrastructures (e-infrastructures or cyber-infrastructures).

The objective of the project is to co-develop and co-implement an integrated platform of distributed data and services, supported across a continuum of science-driven data centres, for the understanding of the Earth system, the environment and biodiversity. Targeted at the scientific community, the public and socio-economic stakeholders, these services will be accessible through user-oriented interfaces (e.g. portals and gateways) enabling smart uses of multi-source data (acquired by satellites, ships, aircraft, drones, submersibles, balloons, in-situ devices, inventories, observatories and experiments, as well as from reference simulations) in inter- and transdisciplinary research practices.


The open and interoperable platform of data and services will enable transparent and seamless access to the full range of data sources, and their extraction and combination to develop high-quality synthesised data products that meet FAIR and open science criteria across all compartments of the Earth system and their interactions, from the Earth’s core to the Earth’s fluid envelopes (atmosphere, ocean) and biosphere. It will also deliver scalable thematic services meeting the needs of scientific communities and research practices relying on observation, experimentation and modelling of the Earth system, biodiversity and the environment.

For a more detailed description of the project and more information, please write to contact@dataterra.org