Using active learning to improve quasar identification for the DESI spectra processing pipeline

Bibcode

2025JCAP...10..087G

DOI

10.1088/1475-7516/2025/10/087

Green, Dylan; Kirkby, David; Aguilar, J.; Ahlen, S.; Alexander, D. M.; Armengaud, E.; Bailey, S.; Bault, A.; Bianchi, D.; Brodzeller, A.; Brooks, D.; Claybaugh, T.; de Belsunce, R.; de la Macorra, A.; Doel, P.; Fawcett, V. A.; Ferraro, S.; Font-Ribera, A.; Forero-Romero, J. E.; Gaztañaga, E.; Gontcho A Gontcho, S.; Gutierrez, G.; Ishak, M.; Juneau, S.; Kehoe, R.; Kisner, T.; Kremin, A.; Lambert, A.; Landriau, M.; Le Guillou, L.; Levi, M. E.; Manera, M.; Meisner, A.; Miquel, R.; Moustakas, J.; Myers, A. D.; Palanque-Delabrouille, N.; Prada, F.; Pérez-Ràfols, I.; Rossi, G.; Sanchez, E.; Saulder, C.; Schlegel, D.; Schubnell, M.; Seo, H.; Sinigaglia, F.; Sprayberry, D.; Tan, T.; Tarlé, G.; Weaver, B. A.; Youles, S.; Yu, J.; Zhou, R.; Zou, H.

Bibliographical reference

Journal of Cosmology and Astroparticle Physics

Advertised on:

2025

Description

The Dark Energy Spectroscopic Instrument (DESI) survey uses an automatic spectral classification pipeline to classify spectra. QuasarNET is a convolutional neural network used as part of this pipeline and was originally trained using data from the Baryon Oscillation Spectroscopic Survey (BOSS). In this paper we implement an active learning algorithm to optimally select spectra for training a new version of the QuasarNET weights file using only DESI data, with the goal of improving classification accuracy. This active learning algorithm includes a novel outlier rejection step that uses a Self-Organizing Map to ensure we label spectra representative of the larger quasar sample observed in DESI. We perform two iterations of the active learning pipeline, assembling a final dataset of 5600 labeled spectra, a small subset of the approximately 1.3 million quasar targets in DESI's Data Release 1. When splitting the spectra into training and validation subsets, we achieve completeness and purity on the validation dataset similar to those of the previously trained weights file, but with less than one tenth of the training data. The new weights also classify objects more consistently than the old weights file when used on unlabeled data. In the process of improving QuasarNET's classification accuracy, we discovered a systemic error in QuasarNET's redshift estimation and used our findings to improve our understanding of QuasarNET's redshifts.
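The selection step and the validation metrics named in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not the paper's implementation: the function names (`train_som`, `quantization_error`, `select_for_labeling`, `purity_completeness`) are hypothetical, uncertainty is taken to mean a classifier confidence close to 0.5, outliers are flagged by a large Self-Organizing Map quantization error, and purity and completeness use the standard true-positive definitions, which may differ from the definitions used in the paper.

```python
import numpy as np

def train_som(data, grid=(4, 4), n_iter=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a tiny Self-Organizing Map; returns cell weights (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of each cell, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Start all cells near the data mean with a small random jitter.
    jitter = rng.standard_normal((rows * cols, data.shape[1]))
    weights = data.mean(axis=0) + 0.01 * data.std(axis=0) * jitter
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 1e-3     # neighbourhood radius shrinks
        x = data[rng.integers(len(data))]        # one random training sample
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))  # best-matching unit
        grid_d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def quantization_error(weights, data):
    """Euclidean distance from each sample to its best-matching SOM cell."""
    d2 = np.sum((data[:, None, :] - weights[None, :, :]) ** 2, axis=2)
    return np.sqrt(d2.min(axis=1))

def select_for_labeling(confidence, features, n_select, outlier_quantile=0.95):
    """Return indices of the most uncertain samples that are not SOM outliers."""
    weights = train_som(features)
    qerr = quantization_error(weights, features)
    # Assumed criterion: drop samples whose quantization error is in the top tail.
    candidates = np.flatnonzero(qerr <= np.quantile(qerr, outlier_quantile))
    # Assumed uncertainty measure: confidence closest to 0.5 comes first.
    order = np.argsort(np.abs(confidence[candidates] - 0.5))
    return candidates[order[:n_select]]

def purity_completeness(is_quasar_true, is_quasar_pred):
    """Purity = TP / predicted positives; completeness = TP / true positives."""
    truth = np.asarray(is_quasar_true, dtype=bool)
    pred = np.asarray(is_quasar_pred, dtype=bool)
    tp = np.sum(truth & pred)
    purity = tp / pred.sum() if pred.sum() else float("nan")
    completeness = tp / truth.sum() if truth.sum() else float("nan")
    return purity, completeness
```

In this sketch, `features` stands for any fixed-length representation of each spectrum and `confidence` for a per-spectrum classifier score in [0, 1]; both are placeholders. Rejecting outliers before ranking by uncertainty keeps unusual or corrupted spectra from dominating the labeling budget, which matches the motivation the abstract gives for the outlier rejection step.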

Related projects

Cosmology with Large Scale Structure Probes

The Cosmic Microwave Background (CMB) contains the statistical information about the early seeds of the structure formation in our Universe. Its natural counterpart in the local universe is the distribution of galaxies that arises as a result of gravitational growth of those primordial and small density fluctuations. The characterization of the

FRANCISCO SHU KITAURA JOYANES

In progress
