F: Análisis de espectros: biblioteca Specutils

specutils (documentación) es un paquete de Python para representar, cargar, manipular y analizar datos espectroscópicos astronómicos. Los contenedores de datos genéricos y los módulos que los acompañan proporcionan una caja de herramientas que la comunidad astronómica puede usar para crear más paquetes específicos. Para installar specutils ejecutamos en la línea de comandos:

> conda install -c conda-forge specutils

> conda install -c astropy astroquery

Como ejemplo básico, consideramos un espectro de galaxias de línea de emisión del SDSS.

Comenzamos con algunas importaciones básicas:

[1]:
from astropy.io import fits
from astropy import units as u
import numpy as np
from matplotlib import pyplot as plt
from astropy.visualization import quantity_support
quantity_support()  # for getting units on the axes below

import numpy as np
from astropy import units as u
from astropy.nddata import StdDevUncertainty
from astropy.modeling import models
from specutils import Spectrum1D, SpectralRegion

Lectura desde un archivo

Specutils aprovecha la maquinaria Astropy IO y permite cargar y escribir en archivos. El siguiente ejemplo muestra la carga de un archivo FITS:

[2]:
f = fits.open('espectros/spec-1323-52797-0012.fits')
specdata = f[1].data
f.close()

Reformateamos este conjunto de datos en cantidades astronómicas y creamos un objeto Spectrum1D:

[3]:
lamb = 10**specdata['loglam'] * u.AA
flux = specdata['flux'] * 10**-17 * u.Unit('erg cm-2 s-1 AA-1')
spec = Spectrum1D(spectral_axis=lamb, flux=flux)

Y lo graficamos:

[4]:
f, ax = plt.subplots()
ax.step(spec.spectral_axis, spec.flux)
[4]:
[<matplotlib.lines.Line2D at 0x7f2137fc2790>]
_images/F-5-espectros_10_1.png

Análisis de espectros

El paquete specutils viene con un conjunto de herramientas para realizar tareas de análisis comunes en espectros astronómicos. A continuación se describen algunos ejemplos de la aplicación de estas herramientas. El espectro básico que se muestra aquí se usa en los ejemplos de las subsecciones siguientes: una línea de perfil gaussiano con un flujo de 5 GHz Jy.

[5]:
np.random.seed(42)
spectral_axis = np.linspace(11., 1., 200) * u.GHz
spectral_model = models.Gaussian1D(amplitude=5*(2*np.pi*0.8**2)**-0.5*u.Jy, mean=5*u.GHz, stddev=0.8*u.GHz)

flux = spectral_model(spectral_axis)
flux += np.random.normal(0., 0.05, spectral_axis.shape) * u.Jy

uncertainty = StdDevUncertainty(0.2*np.ones(flux.shape)*u.Jy)
noisy_gaussian = Spectrum1D(spectral_axis=spectral_axis, flux=flux, uncertainty=uncertainty)
import matplotlib.pyplot as plt
plt.step(noisy_gaussian.spectral_axis, noisy_gaussian.flux)
[5]:
[<matplotlib.lines.Line2D at 0x7f20c5a75d00>]
_images/F-5-espectros_13_1.png

La relación señal-ruido de un espectro suele ser buen indicador para evaluar la calidad de un espectro. La función snr realiza esta tarea, ya sea en el espectro como un todo o en subregiones de un espectro:

[6]:
from specutils.analysis import snr
snr(noisy_gaussian)
[6]:
$2.4773067 \; \mathrm{}$
[7]:
snr(noisy_gaussian, SpectralRegion(2*u.GHz, 8*u.GHz))
[7]:
$4.1561837 \; \mathrm{}$

Búsqueda de líneas

Hay dos técnicas implementadas para encontrar líneas de emisión y/o absorción en un espectro Spectrum1D.

  • La primera técnica es find_lines_threshold que encontrará líneas mediante el umbral del flujo en función de un factor aplicado a la incertidumbre del espectro.

  • La segunda técnica es find_lines_derivative que encontrará las líneas basándose en el cálculo de la derivada y luego en el umbral basado en ella.

Empezamos con un espectro sintético:

[8]:
import numpy as np
from astropy.modeling import models
import astropy.units as u
from specutils import Spectrum1D, SpectralRegion
[9]:
np.random.seed(42)
g1 = models.Gaussian1D(1, 4.6, 0.2)
g2 = models.Gaussian1D(2.5, 5.5, 0.1)
g3 = models.Gaussian1D(-1.7, 8.2, 0.1)
x = np.linspace(0, 10, 200)
y = g1(x) + g2(x) + g3(x) + np.random.normal(0., 0.2, x.shape)
spectrum = Spectrum1D(flux=y*u.Jy, spectral_axis=x*u.um)
[10]:
from matplotlib import pyplot as plt
plt.plot(spectrum.spectral_axis, spectrum.flux)
plt.xlabel('Spectral Axis ({})'.format(spectrum.spectral_axis.unit))
plt.ylabel('Flux Axis({})'.format(spectrum.flux.unit))
plt.grid(True)
_images/F-5-espectros_22_0.png

Si bien conocemos la verdadera incertidumbre aquí, a menudo este no es el caso con datos reales. Por lo tanto, dado find_lines_threshold que requiere una incertidumbre, produciremos una estimación de la incertidumbre llamando a la función noise_region_uncertainty:

[11]:
import warnings
from specutils.manipulation import noise_region_uncertainty

noise_region = SpectralRegion(0*u.um, 3*u.um)
spectrum = noise_region_uncertainty(spectrum, noise_region)
[12]:
from specutils.fitting import find_lines_threshold
with warnings.catch_warnings():  # Ignore warnings
    warnings.simplefilter('ignore')
    lines = find_lines_threshold(spectrum, noise_factor=3)

lines[lines['line_type'] == 'emission']
[12]:
QTable length=4
line_centerline_typeline_center_index
um
float64str10int64
4.572864321608041emission91
4.824120603015076emission96
5.477386934673367emission109
8.99497487437186emission179
[13]:
lines[lines['line_type'] == 'absorption']
[13]:
QTable length=1
line_centerline_typeline_center_index
um
float64str10int64
8.190954773869347absorption163
[14]:
lines
[14]:
QTable length=5
line_centerline_typeline_center_index
um
float64str10int64
4.572864321608041emission91
4.824120603015076emission96
5.477386934673367emission109
8.99497487437186emission179
8.190954773869347absorption163

Ajuste de línea

El primer paso es crear un conjunto de modelos con condiciones iniciales como parámetros. Para lograr mejores ajustes, puede ser conveniente incluir un conjunto de límites para cada parámetro, pero eso es opcional.

Ejemplo sencillo

A continuación se muestra un ejemplo simple para demostrar cómo usar el método fit_lines para ajustar un espectro a una suposición inicial del modelo de Astropy.

[15]:
import numpy as np
import matplotlib.pyplot as plt

from astropy.modeling import models
from astropy import units as u

from specutils.spectra import Spectrum1D
from specutils.fitting import fit_lines

# Create a simple spectrum with a Gaussian.
np.random.seed(0)
x = np.linspace(0., 10., 200)
y = 3 * np.exp(-0.5 * (x- 6.3)**2 / 0.8**2)
y += np.random.normal(0., 0.2, x.shape)
spectrum = Spectrum1D(flux=y*u.Jy, spectral_axis=x*u.um)

# Fit the spectrum and calculate the fitted flux values (``y_fit``)
g_init = models.Gaussian1D(amplitude=3.*u.Jy, mean=6.1*u.um, stddev=1.*u.um)
g_fit = fit_lines(spectrum, g_init)
y_fit = g_fit(x*u.um)

# Plot the original spectrum and the fitted.
plt.plot(x, y, label="Original spectrum")
plt.plot(x, y_fit, label="Fit result")
plt.title('Single fit peak')
plt.grid(True)
plt.legend()
[15]:
<matplotlib.legend.Legend at 0x7f20c4d503a0>
_images/F-5-espectros_32_1.png

Ejemplo real: galaxia SDSSCGB 56936.1

Para este ejemplo instalaremos el paquete astroquery que permite descargar fácilmente espectros del SDSS:

> conda install -c astropy astroquery
[16]:
from astropy import units as u
#from astropy.coordinates import SkyCoord
from astropy import coordinates as coords
ra = '12h39m38.17s'
dec = '+44d49m22.52s'
pos = coords.SkyCoord(ra, dec, frame='icrs')
pos
[16]:
<SkyCoord (ICRS): (ra, dec) in deg
    (189.90904167, 44.82292222)>
[17]:
from astroquery.sdss import SDSS

xid = SDSS.query_region(pos, spectro=True)
print(xid)
       ra              dec        ...      specobjid       run2d
---------------- ---------------- ... ------------------- -------
189.909048699636 44.8229242986661 ... 7470471882854080512 v5_10_0
/home/zerjillo/anaconda3/envs/cursoAstronomia2/lib/python3.8/site-packages/astroquery/sdss/core.py:874: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
  arr = np.atleast_1d(np.genfromtxt(io.BytesIO(response.content),
[18]:
specs = SDSS.get_spectra(matches=xid)
[19]:
im = SDSS.get_images(matches=xid, band='g')
im[0].info()
Filename: (No file associated with this HDUList)
No.    Name      Ver    Type      Cards   Dimensions   Format
  0  PRIMARY       1 PrimaryHDU      96   (2048, 1489)   float32
  1                1 ImageHDU         6   (2048,)   float32
  2                1 BinTableHDU     27   1R x 3C   [49152E, 2048E, 1489E]
  3                1 BinTableHDU     79   1R x 31C   [J, 3A, J, A, D, D, 2J, J, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, E, E]
[20]:
im[0][0].header
[20]:
SIMPLE  =                    T /
BITPIX  =                  -32 / 32 bit floating point
NAXIS   =                    2
NAXIS1  =                 2048
NAXIS2  =                 1489
EXTEND  =                    T /Extensions may be present
BZERO   =              0.00000 /Set by MRD_SCALE
BSCALE  =              1.00000 /Set by MRD_SCALE
TAI     =        4555198318.03 / 1st row - Number of seconds since Nov 17 1858
RA      =            190.72577 / 1st row - Right ascension of telescope boresigh
DEC     =            44.856717 / 1st row - Declination of telescope boresight (d
SPA     =              94.045  / 1st row - Camera col position angle wrt north (
IPA     =             189.551  / 1st row - Instrument rotator position angle (de
IPARATE =              0.0024  / 1st row - Instrument rotator angular velocity (
AZ      =            123.00517 / 1st row - Azimuth  (encoder) of tele (0=N?) (de
ALT     =            55.179280 / 1st row - Altitude (encoder) of tele        (de
FOCUS   =           -482.80000 / 1st row - Focus piston (microns?)
DATE-OBS= '2003-03-24'         / 1st row - TAI date
TAIHMS  = '04:51:58.03'        / 1st row - TAI time (HH:MM:SS.SS) (TAI-UT = appr
COMMENT  TAI,RA,DEC,SPA,IPA,IPARATE,AZ,ALT,FOCUS at reading of col 0, row 0
ORIGIN  = 'SDSS    '
TELESCOP= '2.5m    '
TIMESYS = 'TAI     '
RUN     =                 3813 / Run number
FRAME   =                  258 / Frame sequence number within the run
CCDLOC  =                   53 / Survey location of CCD (e.g., rowCol)
STRIPE  =                   28 / Stripe index number (23 <--> eta=0)
STRIP   = 'N       '           / Strip in the stripe being tracked.
FLAVOR  = 'science '           / Flavor of this run
OBSERVER= 'prn     '           / Observer
SYS_SCN = 'mean    '           / System of the scan great circle (e.g., mean)
EQNX_SCN=           2000.00    / Equinox of the scan great circle. (years)
NODE    =           95.00000   / RA of the great circle's ascending node (deg)
INCL    =           45.00000   / Great circle's inclination wrt cel. eq. (deg)
XBORE   =           -22.74     / Boresight x offset from the array center (mm)
YBORE   =           0.00       / Boresight x offset from the array center (mm)
OBJECT  = '28 N    '           / e.g., 'stripe 50.6 degrees, north strip'
EXPTIME = '53.907456'          / Exposure time (seconds)
SYSTEM  = 'FK5     '           / System of the TCC coordinates (e.g., mean)
CCDMODE = 'DRIFT   '           / 'STARING' or 'DRIFT'
C_OBS   =                26322 / CCD row clock rate (usec/row)
COLBIN  =                    1 / Binning factor perpendicular to the columns
ROWBIN  =                    1 / Binning factor perpendicular to the rows
DAVERS  = 'v14_47  '           / Version of DA software
SCDMETHD= 'sqrtDynamic'        / scdMethod
SCDWIDTH=                 1280 / scdDisplayWidth
SCDDECMF=                    1 / scdDecimateFactor
SCDOFSET=                  410 / scdDisplayOffset
SCDDYNTH=                  -18 / scdDynamicThresh
SCDSTTHL=                 1835 / scdStaticThreshL
SCDSTTHR=                 1835 / scdStaticThreshR
SCDREDSZ=                  471 / scdReduceSize
SCDSKYL =                 1962 / scdSkyLeft
SCDSKYR =                 1962 / scdSkyRight
COMMENT  CCD-specific parameters
CAMROW  =                    5 / Row in the imaging camera
BADLINES=                    0 / Number of bad lines in frame
EQUINOX =              2000.00 /
SOFTBIAS=                 1000 / software "bias" added to all DN
BUNIT   = 'nanomaggy'          / 1 nanomaggy = 3.631e-6 Jy
FILTER  = 'g       '           / filter used
CAMCOL  =                    3 / column in the imaging camera
VERSION = 'v5_6_3  '
DERV_VER= 'NOCVS:v8_23'
ASTR_VER= 'NOCVS:v5_24'
ASTRO_ID= '2009-05-27T14:05:51 23189'
BIAS_ID = 'PS      '
FRAME_ID= '2009-09-24T00:24:22 11893'
KO_VER  = 'devel   '
PS_ID   = '2009-05-27T11:56:52 23646 camCol 3'
ATVSN   = 'NOCVS:v5_24'        / ASTROTOOLS version tag
RADECSYS= 'ICRS    '           / International Celestial Ref. System
CTYPE1  = 'RA---TAN'           /Coordinate type
CTYPE2  = 'DEC--TAN'           /Coordinate type
CUNIT1  = 'deg     '           /Units
CUNIT2  = 'deg     '           /Units
CRPIX1  =        1025.00000000 /X of reference pixel
CRPIX2  =        745.000000000 /Y of reference pixel
CRVAL1  =        189.831798553 /RA of reference pixel (deg)
CRVAL2  =        44.7927880102 /Dec of reference pixel (deg)
CD1_1   =    6.56924555710E-06 /RA deg per column pixel
CD1_2   =    0.000109830777750 /RA deg per row pixel
CD2_1   =    0.000109746453097 /Dec deg per column pixel
CD2_2   =   -6.55661466001E-06 /Dec deg per row pixel
HISTORY GSSSPUTAST: Sep 24 00:24:28 2009
COMMENT  Calibration parameters
COMMENT  Floats truncated at 10 binary digits with FLOATCOMPRESS
NMGY    =           0.00375519 / Calibration factor [nMgy per count]
NMGYIVAR=            0.0547761 / Calibration factor inverse variance
VERSIDL = '7.0     '           / Version of IDL
VERSUTIL= 'v5_5_5  '           / Version of idlutils
VERSPOP = 'v1_11_1 '           / Version of photoop product
PCALIB  = '/clusterfs/riemann/raid006/bosswork/groups/boss/calib/2009-06-14/cal'
PSKY    = '/clusterfs/riemann/raid006/bosswork/groups/boss/photo/sky' / Value of
RERUN   = '301     '           / rerun
HISTORY SDSS_FRAME_ASTROM: Astrometry fixed for dr9 Sun Jun 17 01:33:15 2012
[21]:
from astropy.wcs import WCS
w = WCS(im[0][0].header)
x, y = w.world_to_pixel(pos)
x, y
WARNING: FITSFixedWarning: RADECSYS= 'ICRS ' / International Celestial Ref. System
the RADECSYS keyword is deprecated, use RADESYSa. [astropy.wcs.wcs]
WARNING: FITSFixedWarning: 'datfix' made the change 'Set MJD-OBS to 52722.000000 from DATE-OBS'. [astropy.wcs.wcs]
[21]:
(array(1327.5350548), array(1224.68214133))
[22]:
from specutils import Spectrum1D
espectro = Spectrum1D.read(specs[0], format="SDSS-III/IV spec")
WARNING: UnitsWarning: '1E-17 erg/cm^2/s/Angstrom' contains multiple slashes, which is discouraged by the FITS standard [astropy.units.format.generic]
[23]:
espectro
[23]:
<Spectrum1D(flux=<Quantity [-21.237778 ,   9.708488 ,   6.143837 , ...,  -1.8208474,
            -2.3184657,   8.073518 ] 1e-17 erg / (Angstrom cm2 s)>, spectral_axis=<SpectralAxis [ 3566.1538,  3566.9744,  3567.7969, ..., 10337.131 , 10339.515 ,
   10341.888 ] Angstrom>, uncertainty=InverseVariance([0.01676147, 0.0153255 , 0.01600573, ..., 0.0552746 ,
                 0.06065338, 0.05814334]))>
[24]:
import matplotlib.pyplot as plt
import numpy as np
plt.figure('Espectro', figsize=(8, 8))
plt.plot(espectro.spectral_axis, espectro.flux)
plt.xlabel("Longitud de onda [A]")
plt.ylabel("Flux [erg / (Angstrom cm2 s)]")
[24]:
Text(0, 0.5, 'Flux [erg / (Angstrom cm2 s)]')
_images/F-5-espectros_42_1.png
[25]:
plt.figure('Espectro', figsize=(8, 8))
plt.imshow(im[0][0].data, vmin=np.min(im[0][0].data), vmax=np.mean(im[0][0].data)*2, origin='lower')
plt.plot(x, y, color='red', marker='+')
plt.xlabel("pixel")
plt.ylabel("pixel")
[25]:
Text(0, 0.5, 'pixel')
_images/F-5-espectros_43_1.png
[26]:
import warnings
from specutils.manipulation import noise_region_uncertainty
from specutils.spectra import SpectralRegion

noise_region = SpectralRegion(7000*u.AA, 7100*u.AA)# aqui es donde yo mido el ruido
espectro = noise_region_uncertainty(espectro, noise_region)
[27]:
from specutils.fitting import find_lines_threshold
with warnings.catch_warnings():  # Ignore warnings
    warnings.simplefilter('ignore')
    lines = find_lines_threshold(espectro, noise_factor=3)

tablaEmision = lines[lines['line_type'] == 'emission']
[28]:
tablaAbsorcion = lines[lines['line_type'] == 'absorption']
[29]:
import pandas as pd
lineas=pd.read_csv('espectros/lines.raw', delimiter = ';')
[30]:
lineas['linel_0'][11], lineas['valor'][11]
[30]:
('[OIII]', 4363.21)
[31]:
tablaEmision['line_center']
[31]:
$[3569.4385,~3572.7283,~3585.0903,~\dots,~10329.994,~10334.747,~10341.888] \; \mathrm{\mathring{A}}$
[32]:
import numpy as np
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]
[33]:
find_nearest(tablaEmision['line_center'], lineas['valor'][11])
[33]:
3768.7732
[34]:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.pyplot import text
plt.figure('Espectro', figsize=(8, 8))
plt.plot(espectro.spectral_axis, espectro.flux)
plt.vlines(lineas['valor'][11], ymin=0, ymax=100, color = 'r')

for i in range(len(lineas['valor'])):
    plt.vlines(lineas['valor'][i], ymin=0, ymax=100, color = 'r')
    text(lineas['valor'][i], 100-2*i, f"{lineas['linel_0'][i]}" , rotation=0, verticalalignment='center')
plt.xlabel("Longitud de onda [A]")
plt.ylabel("Flux [erg / (Angstrom cm2 s)]")
[34]:
Text(0, 0.5, 'Flux [erg / (Angstrom cm2 s)]')
_images/F-5-espectros_52_1.png
[35]:
espectro_observado = []
for i in range(len(lineas['valor'])):
    espectro_observado.append(find_nearest(tablaEmision['line_center'], lineas['valor'][i]))
[36]:
print('Espectro observado no corregido de redshift:')
for i in range(len(espectro_observado)):
    print(lineas['linel_0'][i], espectro_observado[i], lineas['valor'][i])
Espectro observado no corregido de redshift:
[OII] 3720.489 3727.42
H12 3735.9417 3750.151
H11 3768.7732 3770.63
H10 3768.7732 3797.898
H9 3768.7732 3835.384
[NeIII] 3768.7732 3869.06
H8 3768.7732 3889.049
H-epsilon 3768.7732 3970.072
[SII] 3768.7732 4072.5
H-delta 3768.7732 4101.734
H-gamma 3768.7732 4340.464
[OIII] 3768.7732 4363.21
H-beta 5017.646 4861.325
[OIII] 5017.646 4958.911
[OIII] 5017.646 5006.843
HeI 5893.8633 5875.97
[NII] 6575.0635 6548.04
H-alpha 6575.0635 6562.8
[NII] 6575.0635 6583.46
HeI 6575.0635 6678.15
[SII] 6575.0635 6716.44
[SII] 6575.0635 6730.81
[ArIII] 7276.124 7135.8
[SIII] 9088.661 9068.6
[SIII] 9523.576 9530.6

La correccion de redshift se haría como:

\[z= \frac{\lambda_{observado}-\lambda_{emitida}}{\lambda_{emitida}}\]

Por lo tanto la correccion de redshift es:

[37]:
z = (lineas['valor'][0]-espectro_observado[0])/espectro_observado[0]
z
[37]:
0.0018629234766331566