deepSIP.preprocessing

class deepSIP.preprocessing.Spectrum(filename, z, obsframe=True)

Bases: object

container for individual spectra

Parameters:
filename : str

name of text file to read spectrum from

z : float

redshift of SN

obsframe : bool, optional

indicates spectrum is in observer frame (and thus needs to be de-redshifted)
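
The de-redshifting implied by obsframe can be sketched as follows. This is a minimal illustration of the standard rest-frame conversion, not the library's own code; the helper name to_rest_frame and the sample values are hypothetical.

```python
import numpy as np

def to_rest_frame(wave_obs, z):
    """Convert observer-frame wavelengths to the rest frame.

    Hypothetical helper: divides by (1 + z), the usual de-redshifting step.
    """
    return np.asarray(wave_obs, dtype=float) / (1.0 + z)

# Illustrative observer-frame wavelengths (angstroms) and redshift.
wave_obs = np.array([4000.0, 5000.0, 6000.0])
wave_rest = to_rest_frame(wave_obs, z=0.25)
print(wave_rest)  # [3200. 4000. 4800.]
```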

Attributes:
signal_window_angstroms : int

window size in angstroms for signal smoothing

signal_smoothing_order : int

polynomial order for signal smoothing

continuum_window_angstroms : int

window size in angstroms for continuum smoothing

continuum_smoothing_order : int

polynomial order for continuum smoothing

lwave_bounds : tuple, list, or other iterable of length 2

lower and upper limit of logarithmic wavelength array

lwave_n_bins : int

number of bins in logarithmic wavelength grid

lwave : np.array

logarithmic wavelength grid

apodize_end_pct : float

percentage from each end of flux array to apodize

aug_drop_frac : float

maximum percentage of each end of flux array to drop during augmentation

wave : np.array

rest frame wavelength grid

flux : np.array

fluxes on wavelength grid
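
A logarithmic wavelength grid of the kind described by lwave_bounds and lwave_n_bins can be sketched with NumPy. The bound and bin values below are placeholders (the actual defaults are not listed here), and it is an assumption that the bounds are given in angstroms and that grid points, rather than bin edges, are stored.

```python
import numpy as np

# Hypothetical values; the real defaults are not given in this reference.
lwave_bounds = (3450.0, 7500.0)
lwave_n_bins = 1024

# Points evenly spaced in log(wavelength): neighbouring points share a
# constant ratio rather than a constant difference.
lwave = np.geomspace(lwave_bounds[0], lwave_bounds[1], lwave_n_bins)
ratios = lwave[1:] / lwave[:-1]
```

A constant ratio between neighbouring points means a fixed velocity width per bin, which is the usual motivation for log binning of spectra.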

Methods

SNID(self, **kwargs)            run SNID on spectrum using pySNID (if available)
augprocess(self, augz, sig_wA)  process spectrum with augmentation
plot(self[, show])              plot original or processed spectrum for quick inspection
process(self)                   process spectrum with no augmentation

process(self)

process spectrum with no augmentation

complete pre-processing: smooths spectrum, identifies and subtracts pseudo-continuum, does log binning, scales flux, and apodizes edges

Returns:
np.array

fully pre-processed flux array
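
The sequence of steps listed for process can be sketched end to end with NumPy alone. This is an illustration under stated assumptions, not deepSIP's implementation: the smoother is a plain local polynomial fit, the pseudo-continuum is taken to be a much broader smoothing of the same signal, windows are given in samples rather than angstroms, scaling divides by the peak absolute value, and the taper is a cosine bell. All function names are hypothetical.

```python
import numpy as np

def poly_smooth(y, window, order):
    """Local polynomial smoothing (a plain Savitzky-Golay-style fit)."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    idx = np.arange(len(y))
    out = np.empty(len(y))
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        # Fit in coordinates centred on i, so the constant term is the fit at i.
        coeffs = np.polyfit(idx[lo:hi] - i, y[lo:hi], order)
        out[i] = coeffs[-1]
    return out

def apodize(flux, end_pct):
    """Taper end_pct percent of the samples at each end with a cosine bell."""
    flux = np.array(flux, dtype=float)
    n_edge = int(len(flux) * end_pct / 100.0)
    if n_edge > 0:
        ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(n_edge) / n_edge))
        flux[:n_edge] *= ramp
        flux[-n_edge:] *= ramp[::-1]
    return flux

def preprocess(wave, flux, lwave, signal_window=15, continuum_window=301,
               order=3, end_pct=5.0):
    """Smooth, subtract pseudo-continuum, log-bin, scale, and apodize."""
    smoothed = poly_smooth(flux, signal_window, order)
    # Assumption: the pseudo-continuum is a much broader smoothing.
    continuum = poly_smooth(smoothed, continuum_window, order)
    residual = smoothed - continuum
    # Resample onto the log grid; points outside the data are set to zero.
    binned = np.interp(lwave, wave, residual, left=0.0, right=0.0)
    peak = np.max(np.abs(binned))
    scaled = binned / peak if peak > 0 else binned
    return apodize(scaled, end_pct)

# Synthetic spectrum for demonstration only.
rng = np.random.default_rng(0)
wave = np.linspace(3500.0, 7500.0, 400)
flux = 1.0 + 0.3 * np.sin(wave / 150.0) + 0.02 * rng.normal(size=wave.size)
lwave = np.geomspace(3600.0, 7400.0, 256)
X = preprocess(wave, flux, lwave)
```

The returned array has one value per point of the log grid, which matches the (number of spectra, lwave_n_bins) layout described for EvaluationSpectra.X.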

augprocess(self, augz, sig_wA)

process spectrum with augmentation

perturbs wavelength by a multiplicative factor of (1 + augz), drops ends randomly up to a maximum of aug_drop_frac, modifies signal_window_angstroms to sig_wA, and then performs pre-processing according to the process method

Returns:
np.array

augmented and fully pre-processed flux array
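
The wavelength perturbation and random end-dropping can be sketched as below. This is a hedged illustration: the helper name is hypothetical, aug_drop_frac is assumed to be a fraction applied independently to each end, and the number of dropped samples is assumed to be drawn uniformly.

```python
import numpy as np

def augment_wave_flux(wave, flux, augz, aug_drop_frac, rng):
    """Shift wavelengths by (1 + augz) and randomly trim both ends.

    Hypothetical helper; each end loses between 0 and
    aug_drop_frac * len(wave) samples (assumed uniform).
    """
    wave = np.asarray(wave, dtype=float) * (1.0 + augz)
    flux = np.asarray(flux, dtype=float)
    n = len(wave)
    max_drop = int(n * aug_drop_frac)
    drop_lo = int(rng.integers(0, max_drop + 1))
    drop_hi = int(rng.integers(0, max_drop + 1))
    return wave[drop_lo:n - drop_hi], flux[drop_lo:n - drop_hi]

# Synthetic input for demonstration only.
rng = np.random.default_rng(42)
wave = np.linspace(3500.0, 7500.0, 200)
flux = np.ones_like(wave)
aug_wave, aug_flux = augment_wave_flux(wave, flux, augz=0.002,
                                       aug_drop_frac=0.1, rng=rng)
```

The trimmed arrays would then go through the same steps as process, with the signal smoothing window set to sig_wA.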

plot(self, show='processed')

plot original or processed spectrum for quick inspection

Parameters:
show : str, optional

type of plot to show (‘processed’ or ‘original’)

SNID(self, **kwargs)

run SNID on spectrum using pySNID (if available)

Parameters:
**kwargs

arbitrary keyword arguments for pySNID

Returns:
pySNID outputs

class deepSIP.preprocessing.EvaluationSpectra(spectra, savefile='eval.spectra.sav')

Bases: object

class for preparing spectra for evaluation

Parameters:
spectra : pd.DataFrame

spectra to prepare for evaluation; must have columns [SN, filename, z] and optionally obsframe as bool

savefile : str, optional

name of save file

Attributes:
X : np.ndarray with dimensions (number of spectra, lwave_n_bins)

processed spectra (attribute set by process method)

SNID_results : pd.DataFrame

pySNID results for each spectrum (attribute set by SNID method)
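
An input table with the required columns might be assembled as in the sketch below. The SN names, file names, and redshifts are placeholders, and filling a missing obsframe column with True is an assumption based on the Spectrum default. The commented lines use only the constructor and methods listed in this reference.

```python
import pandas as pd

# Placeholder rows; real file names and redshifts come from your own data.
spectra = pd.DataFrame({
    "SN": ["sn_a", "sn_b"],
    "filename": ["sn_a.flm", "sn_b.flm"],
    "z": [0.012, 0.034],
})

# Check the required columns before handing the table over.
required = {"SN", "filename", "z"}
missing = required - set(spectra.columns)

# Assumption: spectra without an obsframe flag are in the observer frame,
# mirroring the obsframe=True default of Spectrum.
if "obsframe" not in spectra.columns:
    spectra["obsframe"] = True

# With deepSIP installed, usage would then look like:
# from deepSIP.preprocessing import EvaluationSpectra
# es = EvaluationSpectra(spectra, savefile='eval.spectra.sav')
# es.process()
# es.to_npy()
```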

Methods

SNID(self[, selection, status])  run SNID on selected (via boolean array) spectra in dataset
SNID_to_csv(self)                write SNID results to csv file
load(self)                       load from savefile
process(self[, status])          process all loaded spectra with no augmentation
save(self)                       save current state to savefile
to_npy(self)                     write processed spectra to .npy file

save(self)

save current state to savefile

load(self)

load from savefile

process(self, status=True)

process all loaded spectra with no augmentation

Parameters:
status : bool, optional

show status bars

SNID(self, selection='all', status=True, **kwargs)

run SNID on selected (via boolean array) spectra in dataset

Parameters:
selection : boolean array or ‘all’, optional

spectra from data set to run SNID on via pySNID

status : bool, optional

show status bars

**kwargs

arbitrary keyword arguments for pySNID

SNID_to_csv(self)

write SNID results to csv file

to_npy(self)

write processed spectra to .npy file

class deepSIP.preprocessing.TVTSpectra(spectra, savefile='tvt.spectra.sav', prep=1, val_frac=0.1, test_frac=0.1, phase_bounds=(-10, 18), dm15_bounds=(0.85, 1.55), phase_binsize=4, dm15_binsize=0.1, aug_num=5000, aug_z_bounds=(-0.004, 0.004), aug_signal_window_angstroms=(50, 150), random_state=100)

Bases: deepSIP.preprocessing.EvaluationSpectra

class for preparing Training, Validation, and Testing sets

Parameters:
spectra : pd.DataFrame

spectra to prepare; must have columns [SN, filename, z] and optionally obsframe as bool

savefile : str, optional

name of save file

prep : int, optional

preparation mode (1 for all spectra, 2 for domain-restricted subset)

val_frac : float, optional

fraction of full set to split for validation

test_frac : float, optional

fraction of full set to split for testing

phase_bounds : tuple, list, or other iterable of length 2, optional

lower and upper phase limits of domain

dm15_bounds : tuple, list, or other iterable of length 2, optional

lower and upper dm15 limits of domain

Other Parameters:
 
phase_binsize : int or float, optional

phase bin size for pseudo-stratified splitting

dm15_binsize : int or float, optional

dm15 bin size for pseudo-stratified splitting

aug_num : int, optional

final size of training set after augmentation

aug_z_bounds : tuple, list, or other iterable of length 2, optional

lower and upper limits of randomly selected redshifts for augmented spectra

aug_signal_window_angstroms : tuple, list, or other iterable of length 2, optional

lower and upper limits of randomly selected signal windows for augmented spectra

random_state : int, optional

seed for random number generator

Attributes:
spectra_[out,in] : pd.DataFrame

subset of spectra that are [outside, inside] the selected domain

spectra_aug : pd.DataFrame

augmented spectra

Ycol : list

columns in spectra that correspond to labels

[train,aug,val,test]X : np.ndarray

processed spectra (attribute set by [aug]process method)

[train,aug,val,test]Y : np.ndarray

targets (attribute set by [aug]process method)
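
The in-domain and out-of-domain subsets follow from phase_bounds and dm15_bounds. The sketch below uses the default bounds from the signature with made-up phase and dm15 values; treating both bounds as inclusive is an assumption.

```python
import numpy as np

phase_bounds = (-10, 18)
dm15_bounds = (0.85, 1.55)

# Made-up labels for five spectra.
phase = np.array([-12.0, -3.0, 5.0, 20.0, 10.0])
dm15 = np.array([1.10, 0.80, 1.20, 1.00, 1.50])

# A spectrum is in the domain only if both labels fall within their bounds.
in_domain = ((phase >= phase_bounds[0]) & (phase <= phase_bounds[1]) &
             (dm15 >= dm15_bounds[0]) & (dm15 <= dm15_bounds[1]))
out_domain = ~in_domain
```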

Methods

SNID(self[, selection, status])  run SNID on selected (via boolean array) spectra in dataset
SNID_to_csv(self)                write SNID results to csv file
augprocess(self[, status])       process all loaded spectra with augmentation
load(self)                       load from savefile
process(self[, status])          process all loaded spectra with no augmentation
save(self)                       save current state to savefile
split(self[, force, method])     split spectra into train/val/test subsets
to_npy(self)                     write processed spectra and targets to .npy files

split(self, force=False, method='stratified')

split spectra into train/val/test subsets

two splitting methods are available:
  1. ‘stratified’ - split in-domain spectra in pseudo-stratified
    fashion by randomly selecting subsets from bins
  2. ‘randomized’ - random selection

Parameters:
force : bool, optional

overwrite pre-existing splits if True

method : str, optional

splitting method (‘stratified’ or ‘randomized’)
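
One way to read the ‘stratified’ method is sketched below: bin spectra in the (phase, dm15) plane, then draw the validation and test members at random from within each bin. This is an interpretation, not the library's code; the function name, the rounding of per-bin counts, and the synthetic labels are all assumptions.

```python
import numpy as np

def pseudo_stratified_split(phase, dm15, val_frac=0.1, test_frac=0.1,
                            phase_binsize=4, dm15_binsize=0.1, seed=100):
    """Return index arrays (train, val, test) drawn bin by bin."""
    rng = np.random.default_rng(seed)
    phase = np.asarray(phase, dtype=float)
    dm15 = np.asarray(dm15, dtype=float)
    # Integer bin label per spectrum in the (phase, dm15) plane.
    p_bin = np.floor(phase / phase_binsize).astype(int)
    d_bin = np.floor(dm15 / dm15_binsize).astype(int)
    train, val, test = [], [], []
    for key in sorted(set(zip(p_bin.tolist(), d_bin.tolist()))):
        members = np.flatnonzero((p_bin == key[0]) & (d_bin == key[1]))
        members = rng.permutation(members)
        # Assumption: per-bin counts are rounded to the nearest integer.
        n_val = int(round(val_frac * len(members)))
        n_test = int(round(test_frac * len(members)))
        val.extend(members[:n_val].tolist())
        test.extend(members[n_val:n_val + n_test].tolist())
        train.extend(members[n_val + n_test:].tolist())
    return (np.array(train, dtype=int), np.array(val, dtype=int),
            np.array(test, dtype=int))

# Synthetic in-domain labels for demonstration only.
demo_rng = np.random.default_rng(0)
phase = demo_rng.uniform(-10, 18, size=500)
dm15 = demo_rng.uniform(0.85, 1.55, size=500)
train_idx, val_idx, test_idx = pseudo_stratified_split(phase, dm15)
```

Drawing within bins keeps the validation and test sets spread across the label domain, which a purely random draw does not guarantee for sparsely populated regions.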

process(self, status=True)

process all loaded spectra with no augmentation

Parameters:
status : bool, optional

show status bars

augprocess(self, status=True)

process all loaded spectra with augmentation

Parameters:
status : bool, optional

show status bars
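
The augmentation parameters (aug_num, aug_z_bounds, aug_signal_window_angstroms) suggest a draw like the one sketched below. The function name is hypothetical, and the sketch assumes that augmented copies are drawn with replacement from the training set until it reaches aug_num, with uniform sampling inside each pair of bounds.

```python
import numpy as np

def draw_augmentations(n_train, aug_num=5000, aug_z_bounds=(-0.004, 0.004),
                       aug_signal_window_angstroms=(50, 150), seed=100):
    """Draw source indices, redshift offsets, and signal windows."""
    rng = np.random.default_rng(seed)
    # Assumption: aug_num counts originals plus augmented copies.
    n_extra = max(aug_num - n_train, 0)
    source = rng.integers(0, n_train, size=n_extra)
    augz = rng.uniform(*aug_z_bounds, size=n_extra)
    sig_wA = rng.uniform(*aug_signal_window_angstroms, size=n_extra)
    return source, augz, sig_wA

source, augz, sig_wA = draw_augmentations(n_train=1200)
```

Each (source, augz, sig_wA) triple would then correspond to one call of Spectrum.augprocess(augz, sig_wA) on the selected training spectrum.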

to_npy(self)

write processed spectra and targets to .npy files