The Multimodal Universe: 100 TB of Machine Learning Ready Astronomical Data

Angeloudi, Eirini; Audenaert, Jeroen; Bowles, Micah; Boyd, Benjamin M.; Chemaly, David; Cherinka, Brian; Ciucă, Ioana; Cranmer, Miles; Do, Aaron; Grayling, Matthew; Hayes, Erin E.; Hehir, Tom; Ho, Shirley; Huertas-Company, Marc; Iyer, Kartheik G.; Jablonska, Maja; Lanusse, Francois; Leung, Henry W.; Mandel, Kaisey; Martínez-Galarza, Juan Rafael; Melchior, Peter; Meyer, Lucas; Parker, Liam H.; Qu, Helen; Shen, Jeff; Smith, Michael J.; Walmsley, Mike; Wu, John F.; Multimodal Universe Collaboration
Bibliographic reference

Research Notes of the American Astronomical Society

Publication date: December 2024
Number of authors: 29
Number of IAC authors: 2
Number of citations: 0
Number of refereed citations: 0
Description
We present the Multimodal Universe, a new framework collating over 100 TB of multimodal astronomical data for its first release, spanning images, spectra, time series, tabular and hyperspectral data. This unified collection enables a wide variety of machine learning (ML) applications and research across astronomical domains. The dataset brings together observations from multiple surveys, facilities, and wavelength regimes, providing standardized access to diverse data types. By providing uniform access to this diverse data, the Multimodal Universe aims to accelerate the development of ML methods for observational astronomy that generalize across the large differences between astronomical datasets. The framework is actively supported and is designed to be extended while enforcing minimal, self-consistent conventions that make contributing data as simple and practical as possible.
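The abstract describes uniform, standardized access to many surveys and data types. As a minimal sketch of what such access could look like, the snippet below streams a single modality with the Hugging Face `datasets` library; the repository path, config, and field names are illustrative assumptions, not confirmed by this note.

```python
# Minimal sketch: streaming one modality from an ML-ready collection.
# Assumptions (not stated in the note above): the data are exposed as
# Hugging Face datasets, and "MultimodalUniverse/legacysurvey" is a
# placeholder repository path used purely for illustration.
from datasets import load_dataset

# Stream to avoid downloading the full (100+ TB) collection locally.
ds = load_dataset(
    "MultimodalUniverse/legacysurvey",  # hypothetical repository path
    split="train",
    streaming=True,
)

# Inspect one example; the actual field names depend on each survey's schema.
example = next(iter(ds))
print(sorted(example.keys()))
```

Streaming access is shown here because it matches the scale quoted in the abstract: individual examples can be iterated for training without first materializing the entire dataset on disk.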