Seismic waves from earthquakes and other sources are used to infer the structure and properties of Earth’s interior. The availability of large-scale seismic datasets and the suitability of deep-learning techniques for seismic data processing have pushed deep learning to the forefront of fundamental, long-standing research investigations in seismology. Some aspects of applying deep learning to seismology are also likely to prove instructive for the geosciences, and perhaps for other research areas more broadly. Deep learning is a powerful approach, but there are subtleties and nuances in its application. We present a systematic overview of trends, challenges, and opportunities in applications of deep-learning methods in seismology.
The large volume and ready availability of seismological datasets create a great opportunity to apply machine learning and artificial intelligence to data processing. Mousavi and Beroza provide a comprehensive review of the deep-learning techniques being applied to seismic datasets, covering approaches, limitations, and opportunities. The trends in data processing and analysis can be instructive for geoscience and other research areas more broadly. —BG
The ways in which deep learning can help process and analyze large seismological datasets are reviewed.
Seismology is the study of seismic waves to understand their origin—most obviously, sudden fault slip in earthquakes, but also explosions, volcanic eruptions, glaciers, landslides, ocean waves, vehicular traffic, aircraft, trains, wind, air guns, and thunderstorms. Seismology uses those same waves to infer the structure and properties of planetary interiors. Because sources can generate waves at any time, seismic ground motion is recorded continuously, at typical sampling rates of 100 samples per second, for three components of motion, and on arrays that can include thousands of sensors. Seismology is clearly a data-rich science, and it is often a data-driven science as well, with new phenomena and unexpected behavior discovered with regularity. For at least some tasks, the careful and painstaking work of seismic analysts over decades and around the world has also made seismology a data label–rich science. This facet makes it fertile ground for deep learning, which has entered almost every subfield of seismology and outperforms classical approaches, often dramatically, for many seismological tasks.
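To give a sense of scale for the recording parameters above, the following minimal Python sketch works through the arithmetic. The sampling rate and component count come from the text; the array size of 1,000 sensors and the 4-byte (32-bit) sample size are assumptions chosen only for illustration, and the function names are hypothetical.

```python
SAMPLES_PER_SECOND = 100   # typical sampling rate quoted in the text
COMPONENTS = 3             # three components of ground motion
SECONDS_PER_DAY = 86_400


def daily_samples(n_sensors: int) -> int:
    """Samples recorded per day by an array of n_sensors three-component sensors."""
    return SAMPLES_PER_SECOND * COMPONENTS * SECONDS_PER_DAY * n_sensors


def daily_bytes(n_sensors: int, bytes_per_sample: int = 4) -> int:
    """Uncompressed daily volume; 4 bytes per sample is an assumption."""
    return daily_samples(n_sensors) * bytes_per_sample


n_sensors = 1_000  # assumed array size ("thousands of sensors" in the text)
print(f"{daily_samples(n_sensors):,} samples per day")
print(f"{daily_bytes(n_sensors) / 1e9:.2f} GB per day (uncompressed)")
```

Under these assumptions a single sensor produces about 26 million samples per day, and a 1,000-sensor array produces roughly 100 GB of uncompressed data per day.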
Identifying seismic waves and determining the onset times (first breaks) of seismic P and S waves within continuous seismic data are foundational to seismology and are particularly well suited to deep learning because of the availability of massive, labeled datasets. These tasks have received particularly close attention, which has led, for example, to the development of deep learning–based earthquake catalogs that can feature more than an order of magnitude more events than are present in conventional catalogs. Deep learning has shown the ability to outperform classical approaches for other important seismological tasks as well, including the discrimination of earthquakes from explosions and other sources, separation of seismic signals from background noise, seismic image processing and interpretation, and Earth model inversion.
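For context on what a classical approach to onset-time determination can look like, the sketch below implements a short-term-average over long-term-average (STA/LTA) energy detector in Python with NumPy. STA/LTA is not named in the text above; it is used here only as a familiar classical baseline. The function names, window lengths, threshold, and synthetic trace are all illustrative assumptions.

```python
import numpy as np


def trailing_mean(x, n):
    """Mean of the n samples ending at each index; NaN until n samples exist."""
    c = np.cumsum(np.insert(x, 0, 0.0))
    out = np.full(x.shape, np.nan)
    out[n - 1:] = (c[n:] - c[:-n]) / n
    return out


def sta_lta_onset(trace, fs, sta_s=0.5, lta_s=10.0, threshold=4.0):
    """Return the first sample index where STA/LTA of signal energy exceeds
    the threshold, or None if it never does. Window lengths are in seconds."""
    energy = np.asarray(trace, dtype=float) ** 2
    sta = trailing_mean(energy, int(round(sta_s * fs)))
    lta = trailing_mean(energy, int(round(lta_s * fs)))
    with np.errstate(divide="ignore", invalid="ignore"):
        # NaN entries (before the long window fills) compare as False.
        hits = np.flatnonzero(sta / lta > threshold)
    return int(hits[0]) if hits.size else None


# Synthetic single-component trace (assumed): 60 s of unit-variance noise
# at 100 samples per second, with a much stronger arrival starting at 30 s.
fs = 100
rng = np.random.default_rng(0)
trace = rng.normal(0.0, 1.0, 60 * fs)
true_onset = 30 * fs
trace[true_onset:] += rng.normal(0.0, 10.0, trace.size - true_onset)

pick = sta_lta_onset(trace, fs)
print("picked sample:", pick, "true onset sample:", true_onset)
```

A detector of this kind depends on hand-tuned windows and a threshold, whereas a deep-learning picker learns the mapping from waveform to onset time from analyst-labeled examples, which is why the availability of labeled datasets matters for this task.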
The development of increasingly cost-effective sensors and of emerging ground-motion sensing technologies, such as fiber optic cable and accelerometers in smart devices, portends continued accelerating growth in seismological data volumes, so that deep learning is likely to become essential to seismology’s future. Deep learning’s nonlinear mapping ability, sequential data modeling, automatic feature extraction, dimensionality reduction, and reparameterization are all advantageous for processing high-dimensional seismic data, particularly because those data are noisy and, from the point of view of mathematical inference, incomplete. Deep learning for scientific discovery and direct extraction of insight into seismological processes is clearly just getting started.
Aspects of seismology pose interesting additional challenges for deep learning. Many of the most important problems in earthquake seismology—such as earthquake forecasting, ground motion prediction, and rapid earthquake alerting—concern large and damaging earthquakes that are (fortunately) rare. That rarity poses a fundamental challenge for the data-hungry methods of deep learning: How can we train reliable models, and how do we validate them well enough to rely on them when data are scarce and opportunities to test models are infrequent? Further, how can we operationalize deep-learning techniques in such a situation, when the mechanisms by which they make predictions from data may not be easily explained, and the consequences of incorrect models are high? Incorporating domain knowledge through physics-based and explainable deep learning and setting up standard benchmarking and evaluation protocols will help ensure progress, as will the nascent emergence of a seismological data science ecosystem. More generally, a combination of improved data science literacy among geoscientists and the recruitment of data science expertise will help to ensure that deep-learning seismology reaches its full potential.