Over the past few decades, machine learning has evolved from a theoretical concept into a practical tool, thanks to advancements in computational architectures around the early 2000s. Initially applied predominantly in language and image processing contexts, it was only around 2015 that researchers began exploring its potential in climate science.
However, it was not until approximately two years ago that a significant breakthrough occurred, marking a pivotal moment in the field. “This happened when the emergence of fully data-driven models, which outperformed traditional numerical models, particularly in atmospheric weather forecasting, demonstrated the viability of data-driven approaches in climate modeling,” says Italo Epicoco, principal scientist at CMCC and Assistant Professor at the University of Salento, working on machine learning approaches for climate modeling.
This watershed moment has opened new avenues in climate research, ushering in an era where data-driven methodologies are becoming increasingly prominent in understanding and predicting complex climate systems.
In this framework, CMCC has joined the shift with exploratory endeavors focused on downscaling techniques, which have already yielded promising results in regional applications. “We must be careful not to flatten machine learning only on earth science modeling,” says Epicoco. “CMCC is traditionally not only concerned with climate models, but also with models that describe how climate change impacts various contexts, such as the impacts on soil and agriculture, or on coastal erosion. These are examples where CMCC has already started working for some years now.”
For some years, CMCC has been actively exploring contextualized machine learning approaches to address the socio-economic impacts of climate change and to support coastal management, with successful applications in monitoring and fishing-related activities. Additionally, CMCC is working to harness machine learning techniques to analyze satellite data for land use classification, facilitating informed decision-making in agriculture and environmental management. Ongoing initiatives, such as fire monitoring, underscore CMCC’s commitment to integrating machine learning methodologies across various domains to enhance climate research and resilience efforts.
CMCC is seeking to identify and, if possible, anticipate the challenges of the future by developing comprehensive models that incorporate human behavior and economic dynamics. Epicoco acknowledges the complexity of this endeavor: “To address climate impacts and its interactions with socio-economic systems, we need to be able to manage such an amount of data that is much bigger than what you face in the case of forecasts,” he explains.
Machine learning as an innovative approach to climate research
For instance, the CLINT (CLimate INTelligence) project, a collaborative effort involving CMCC, Politecnico di Milano, and other European partners, utilizes Artificial Intelligence (AI) and Machine Learning techniques to enhance the detection, causation, and attribution of extreme events. Within this project, researchers at CMCC are investigating ways to improve the detection and forecasting of extreme weather events, particularly tropical cyclones and heatwaves, by integrating AI into standard detection methods.
Recent developments, such as the enhancement of global indices for tropical cyclone occurrence, demonstrate the potential of this collaboration to yield significant outcomes. Ongoing efforts within the CLINT project aim to refine detection mechanisms, improve the representation of extreme precipitation, and advance seasonal predictions. With progress made since its inception in 2021, CLINT is poised to make further strides in 2024, contributing to more accurate and timely forecasts of extreme weather phenomena.
The iMagine project focuses on leveraging machine learning to facilitate more efficient processing and analysis of imaging data in marine and freshwater research. The project’s overarching goal is to accelerate scientific insights related to healthy oceans, seas, coastal and inland waters, encompassing areas such as water pollution mitigation, biodiversity studies, climate change analysis, and beach monitoring. “In the iMagine project, rather than replacing traditional methods, machine learning is employed to enhance them,” says Epicoco. “In a specific application, by effectively calibrating parameters used in oil spill models, our machine learning approach contributed to refining detection processes.” iMagine aims to provide free image datasets and high-performance image analysis tools empowered with AI.
With its sights set on co-designing and implementing an interdisciplinary Digital Twin Engine (DTE), InterTwin seeks to establish an open-source platform equipped with generic and tailored software components for modeling and simulation. By embracing principles of open standards and interoperability, InterTwin endeavors to cultivate a common approach applicable across scientific disciplines, from climate research to environmental monitoring. Within the project, CMCC is leading research on evaluating wildfire risk and its impacts using machine learning approaches.
“Similarly to InterTwin, SDGs-EYES operates as an infrastructural project rather than an application-focused one, aligning with the UN 2030 Agenda for Sustainable Development’s data-driven approach,” says Epicoco. Both projects emphasize developing tools that facilitate data management and optimize the training of machine learning models.
SDGs-EYES specifically targets boosting European capacity for monitoring Sustainable Development Goals (SDGs) based on Copernicus data, aligning with the EU Green Deal priorities. It aims to build decision-making tools for monitoring SDG indicators related to climate, ocean, and land sectors, utilizing a combination of scientific, technological, and user engagement frameworks.
SDGs-EYES will demonstrate Copernicus’s potential for monitoring key indicators, including GHG emissions and forest cover change, while also addressing socio-economic factors like human health and resource security. By combining top-down scientific approaches with stakeholder-driven methodologies, SDGs-EYES seeks to provide actionable information for SDG indicator assessment, co-designed with users to ensure usability and effectiveness.
“AdriaClim is another project that employs machine learning technologies for interesting applications, such as monitoring the intrusion of salt into rivers in the Adriatic region,” says Epicoco. AdriaClim aims to bolster climate resilience by enhancing the capacity to develop climate adaptation plans and mitigation strategies, and by harmonizing and improving access to climate observation and modeling tools, thereby facilitating the definition and implementation of adaptation plans for pilot sites in Italy and Croatia.
Focused on coastal and marine regions facing risks such as sea level rise and coastal erosion, the project endeavors to provide high-resolution, reliable climate information and integrated modeling.
Through the development of regional and local-scale information systems, AdriaClim seeks to offer accessible databases and knowledge-based tools for effective climate adaptation planning across the Adriatic region. Its overarching goals include improving climate change adaptation capacity, enhancing knowledge and cooperation on climate observation and modeling systems, and developing advanced information systems and indicators for optimal adaptation planning.
Capacity building for machine learning innovation
“We consider machine learning to be an essential area for the future of climate research,” says Epicoco. The challenges in this respect are many and engaging for scientists working on advanced computing research: “Despite our most recent hardware infrastructure, Juno, not being optimized for machine learning applications, we recognize the proven potential of machine learning in climate modeling. Our ambition is to equip the CMCC supercomputing center with architecture specifically tailored for machine learning applications in the near future.”
But infrastructure alone is not sufficient to lead in the application of machine learning to climate research; comprehensive capacity building across the board is also needed.
“The long-term goal of our Machine Learning initiative is inherently collaborative across all research areas at CMCC,” says Epicoco. “This cross-institutional and multidisciplinary approach will aim to bolster the use of machine learning techniques at all levels. Consequently, we wish that every researcher could recognize that, alongside numerical methods, machine learning theory is a valuable tool. This will broaden their problem-solving landscape, allowing for a data-driven approach where applicable. Thus, a secondary objective of the program would be to disseminate the culture of machine learning, enriching the collective knowledge base. It is an ambitious program, but I believe we are well-positioned to succeed.”