deepSIP.preprocessing

class deepSIP.preprocessing.Spectrum(filename, z, obsframe=True)

Bases: object

container for individual spectra

Parameters:

filename : str: name of text file to read spectrum from
z : float: redshift of SN
obsframe : bool, optional: indicates spectrum is in observer frame (and thus needs to be de-redshifted)

Attributes:

signal_window_angstroms : int: window size in angstroms for signal smoothing
signal_smoothing_order : int: polynomial order for signal smoothing
continuum_window_angstroms : int: window size in angstroms for continuum smoothing
continuum_smoothing_order : int: polynomial order for continuum smoothing
lwave_bounds : tuple, list, or other iterable of length 2: lower and upper limit of logarithmic wavelength array
lwave_n_bins : int: number of bins in logarithmic wavelength grid
lwave : np.array: logarithmic wavelength grid
apodize_end_pct : float: percentage from each end of flux array to apodize
aug_drop_frac : float: maximum percentage of each end of flux array to drop during augmentation
wave : np.array: rest frame wavelength grid
flux : np.array: fluxes on wavelength grid
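
As a rough illustration of how the wave, lwave_bounds, lwave_n_bins, and lwave attributes relate, the sketch below de-redshifts an observer-frame wavelength array and builds a logarithmically spaced grid with NumPy. The bounds and bin count are placeholder values rather than deepSIP defaults, the helper names are hypothetical, and the sketch assumes lwave holds log-spaced wavelengths (this page does not say whether it stores wavelengths or their logarithms).

```python
import numpy as np

# Placeholder values for illustration only (not deepSIP defaults)
lwave_bounds = (3450.0, 7500.0)   # lower and upper wavelength limits in angstroms
lwave_n_bins = 1024               # number of bins in the logarithmic grid


def deredshift(wave_obs, z):
    """Convert observer-frame wavelengths to the rest frame."""
    return np.asarray(wave_obs, dtype=float) / (1.0 + z)


def log_wavelength_grid(bounds, n_bins):
    """Return n_bins wavelengths spaced uniformly in log(wavelength)."""
    return np.geomspace(bounds[0], bounds[1], n_bins)


wave_obs = np.linspace(3600.0, 9000.0, 2000)   # synthetic observer-frame grid
wave = deredshift(wave_obs, z=0.02)
lwave = log_wavelength_grid(lwave_bounds, lwave_n_bins)

# Uniform spacing in log space means a constant ratio between neighbouring bins
ratios = lwave[1:] / lwave[:-1]
```
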

Methods

`SNID`(self, **kwargs)	run SNID on spectrum using pySNID (if available)
`augprocess`(self, augz, sig_wA)	process spectrum with augmentation
`plot`(self[, show])	plot original or processed spectrum for quick inspection
`process`(self)	process spectrum with no augmentation

process(self)

process spectrum with no augmentation

complete pre-processing: smooths spectrum, identifies and subtracts pseudo-continuum, performs log binning, scales flux, and apodizes edges

Returns:	np.array fully pre-processed flux array
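
The steps listed above can be sketched with NumPy alone. This is not deepSIP's implementation: the function names are hypothetical, a sliding-window polynomial fit stands in for whatever smoother the package uses, windows are given in pixels rather than angstroms, and the scaling (division by the peak absolute value) and cosine taper are assumed choices.

```python
import numpy as np


def local_poly_smooth(y, window, order):
    """Smooth y with a polynomial fit in a sliding window (window in pixels)."""
    half = max(window // 2, order)            # keep enough points for the fit
    x = np.arange(len(y), dtype=float)
    out = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        coeffs = np.polyfit(x[lo:hi] - x[i], y[lo:hi], order)
        out[i] = coeffs[-1]                   # fitted value at the window centre
    return out


def apodize(flux, end_pct):
    """Taper end_pct percent of the array at each end with a cosine ramp."""
    n = int(len(flux) * end_pct / 100.0)
    out = flux.copy()
    if n > 0:
        ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(n) / n))  # rises from 0
        out[:n] *= ramp
        out[-n:] *= ramp[::-1]
    return out


def preprocess(wave, flux, lwave, sig_px=15, cont_px=151, order=3, end_pct=5.0):
    """Smooth, subtract pseudo-continuum, log-bin, scale, and apodize."""
    smoothed = local_poly_smooth(flux, sig_px, order)
    continuum = local_poly_smooth(smoothed, cont_px, order)   # much wider window
    residual = smoothed - continuum
    binned = np.interp(lwave, wave, residual, left=0.0, right=0.0)
    peak = np.max(np.abs(binned))
    scaled = binned / peak if peak > 0 else binned
    return apodize(scaled, end_pct)


# Synthetic spectrum for demonstration
rng = np.random.default_rng(0)
wave = np.linspace(3500.0, 7400.0, 800)
flux = 1.0 + 0.3 * np.sin(wave / 150.0) + 0.05 * rng.normal(size=wave.size)
lwave = np.geomspace(3600.0, 7300.0, 256)
processed = preprocess(wave, flux, lwave)
```

The result has one value per bin of the logarithmic grid, which matches the documented shape of a fully pre-processed flux array.
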

augprocess(self, augz, sig_wA)

process spectrum with augmentation

perturbs wavelength by a multiplicative factor of (1 + augz), randomly drops the ends up to a maximum of aug_drop_frac, sets signal_window_angstroms to sig_wA, and then performs pre-processing according to the process method

Returns:	np.array augmented and fully pre-processed flux array
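
The first two augmentation steps (wavelength perturbation and random end trimming) might look like the following. The function name is hypothetical, aug_drop_frac is treated as a fraction between 0 and 1, and drawing the two end cuts independently from a uniform distribution is an assumption, not something this page specifies.

```python
import numpy as np


def augment(wave, flux, augz, aug_drop_frac, rng):
    """Perturb the wavelength scale and randomly trim both ends."""
    wave_aug = np.asarray(wave, dtype=float) * (1.0 + augz)
    flux = np.asarray(flux, dtype=float)
    n = len(wave_aug)
    # Independently choose how many points to drop at each end (assumed uniform)
    drop_lo = int(rng.uniform(0.0, aug_drop_frac) * n)
    drop_hi = int(rng.uniform(0.0, aug_drop_frac) * n)
    keep = slice(drop_lo, n - drop_hi)
    return wave_aug[keep], flux[keep]


rng = np.random.default_rng(42)
wave = np.linspace(3500.0, 7500.0, 1000)
flux = np.ones_like(wave)
wave_aug, flux_aug = augment(wave, flux, augz=0.002, aug_drop_frac=0.1, rng=rng)
```

After this step the signal smoothing window would be set to sig_wA and the usual pre-processing applied to the trimmed arrays.
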

plot(self, show='processed')

plot original or processed spectrum for quick inspection

Parameters:	show : str, optional: type of plot to show (‘processed’ or ‘original’)

SNID(self, **kwargs)

run SNID on spectrum using pySNID (if available)

Parameters:	**kwargs: arbitrary keyword arguments for pySNID
Returns:	pySNID outputs

class deepSIP.preprocessing.EvaluationSpectra(spectra, savefile='eval.spectra.sav')

Bases: object

class for preparing spectra for evaluation

Parameters:

spectra : pd.DataFrame: spectra to prepare for evaluation; must have columns [SN, filename, z] and optionally obsframe (bool)
savefile : str, optional: name of save file

Attributes:

X : np.ndarray with dimensions (number of spectra, lwave_n_bins): processed spectra (attribute set by process method)
SNID_results : pd.DataFrame: pySNID results for each spectrum (attribute set by SNID method)

Methods

`SNID`(self[, selection, status])	run SNID on selected (via boolean array) spectra in dataset
`SNID_to_csv`(self)	write SNID results to csv file
`load`(self)	load from savefile
`process`(self[, status])	process all loaded spectra with no augmentation
`save`(self)	save current state to savefile
`to_npy`(self)	write processed spectra to .npy file

save(self): save current state to savefile

load(self): load from savefile

process(self, status=True)

process all loaded spectra with no augmentation

Parameters:	status : bool, optional: show status bars

SNID(self, selection='all', status=True, **kwargs)

run SNID on selected (via boolean array) spectra in dataset

Parameters:

selection : boolean array, optional: spectra from data set to run SNID on via pySNID (default is 'all')
status : bool, optional: show status bars
**kwargs: arbitrary keyword arguments for pySNID

SNID_to_csv(self): write SNID results to csv file

to_npy(self): write processed spectra to .npy file
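
A typical EvaluationSpectra workflow, pieced together from the signatures documented above, might look like this sketch. The file names, SN names, and redshifts are made up for illustration, and the wrapper function prepare_for_evaluation is hypothetical. deepSIP is imported inside the function so that the input table can be built and inspected without the package installed.

```python
import pandas as pd

# Hypothetical SN names, file names, and redshifts, for illustration only
spectra = pd.DataFrame({
    "SN": ["SN A", "SN B"],
    "filename": ["sn_a.flm", "sn_b.flm"],
    "z": [0.012, 0.034],
    "obsframe": [True, True],      # optional column
})


def prepare_for_evaluation(spectra, savefile="eval.spectra.sav"):
    """Process every spectrum in the table and persist the results."""
    # Imported lazily; requires deepSIP and the listed spectrum files
    from deepSIP.preprocessing import EvaluationSpectra

    prepared = EvaluationSpectra(spectra, savefile=savefile)
    prepared.process(status=False)   # sets prepared.X
    prepared.to_npy()                # write processed spectra to a .npy file
    prepared.save()                  # save current state to savefile
    return prepared
```

Calling prepare_for_evaluation(spectra) would only succeed if the listed files exist on disk; after it returns, the X attribute should hold one row per spectrum.
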

class deepSIP.preprocessing.TVTSpectra(spectra, savefile='tvt.spectra.sav', prep=1, val_frac=0.1, test_frac=0.1, phase_bounds=(-10, 18), dm15_bounds=(0.85, 1.55), phase_binsize=4, dm15_binsize=0.1, aug_num=5000, aug_z_bounds=(-0.004, 0.004), aug_signal_window_angstroms=(50, 150), random_state=100)

Bases: deepSIP.preprocessing.EvaluationSpectra

class for preparing Training, Validation, and Testing sets

Parameters:

spectra : pd.DataFrame: spectra to prepare; must have columns [SN, filename, z] and optionally obsframe (bool)
savefile : str, optional: name of save file
prep : int, optional: preparation mode (1 for all spectra, 2 for domain-restricted subset)
val_frac : float, optional: fraction of full set to split for validation
test_frac : float, optional: fraction of full set to split for testing
phase_bounds : tuple, list, or other iterable of length 2, optional: lower and upper phase limits of domain
dm15_bounds : tuple, list, or other iterable of length 2, optional: lower and upper dm15 limits of domain

Other Parameters:

phase_binsize : int or float, optional: phase bin size for pseudo-stratified splitting
dm15_binsize : int or float, optional: dm15 bin size for pseudo-stratified splitting
aug_num : int, optional: final size of training set after augmentation
aug_z_bounds : tuple, list, or other iterable of length 2, optional: lower and upper limits of randomly selected redshifts for augmented spectra
aug_signal_window_angstroms : tuple, list, or other iterable of length 2, optional: lower and upper limits of randomly selected signal windows for augmented spectra
random_state : int, optional: seed for random number generator

Attributes:

spectra_[out,in] : pd.DataFrame: subset of spectra that are [out of, in] the selected domain
spectra_aug : pd.DataFrame: augmented spectra
Ycol : list: columns in spectra that correspond to labels
[train,aug,val,test]X : np.ndarray: processed spectra (attribute set by [aug]process method)
[train,aug,val,test]Y : np.ndarray: targets (attribute set by [aug]process method)

Methods

`SNID`(self[, selection, status])	run SNID on selected (via boolean array) spectra in dataset
`SNID_to_csv`(self)	write SNID results to csv file
`augprocess`(self[, status])	process all loaded spectra with augmentation
`load`(self)	load from savefile
`process`(self[, status])	process all loaded spectra with no augmentation
`save`(self)	save current state to savefile
`split`(self[, force, method])	split spectra into train/val/test subsets
`to_npy`(self)	write processed spectra and targets to .npy files

split(self, force=False, method='stratified')

split spectra into train/val/test subsets

two splitting methods are available:

‘stratified’ - split in-domain spectra in pseudo-stratified fashion by randomly selecting subsets from bins
‘randomized’ - random selection

Parameters:

force : bool, optional: overwrite pre-existing splits if True
method : str, optional: splitting method (‘stratified’ or ‘randomized’)
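
One way to picture the ‘stratified’ method is the following NumPy sketch, which bins spectra on a 2D (phase, dm15) grid and draws validation and test members at random from each bin. The function name, the choice to start bins at the data minimum, and the per-bin rounding are assumptions made for illustration; the package's actual binning logic is not documented on this page.

```python
import numpy as np


def pseudo_stratified_split(phase, dm15, val_frac=0.1, test_frac=0.1,
                            phase_binsize=4, dm15_binsize=0.1, random_state=100):
    """Return index arrays (train, val, test) drawn bin by bin."""
    rng = np.random.default_rng(random_state)
    phase = np.asarray(phase, dtype=float)
    dm15 = np.asarray(dm15, dtype=float)
    # Label each spectrum with the 2D (phase, dm15) bin it falls in
    p_bin = np.floor((phase - phase.min()) / phase_binsize).astype(int)
    d_bin = np.floor((dm15 - dm15.min()) / dm15_binsize).astype(int)
    train, val, test = [], [], []
    for key in sorted(set(zip(p_bin.tolist(), d_bin.tolist()))):
        members = np.flatnonzero((p_bin == key[0]) & (d_bin == key[1]))
        members = rng.permutation(members)          # random order within the bin
        n_val = int(round(val_frac * len(members)))
        n_test = int(round(test_frac * len(members)))
        val.extend(members[:n_val])
        test.extend(members[n_val:n_val + n_test])
        train.extend(members[n_val + n_test:])
    return (np.array(train, dtype=int),
            np.array(val, dtype=int),
            np.array(test, dtype=int))


# Synthetic labels spanning the default domain bounds
rng = np.random.default_rng(0)
phase = rng.uniform(-10, 18, size=500)
dm15 = rng.uniform(0.85, 1.55, size=500)
train, val, test = pseudo_stratified_split(phase, dm15)
```

Because selection happens inside each bin, sparsely populated regions of the (phase, dm15) plane stay represented in the training set, which a purely random split does not guarantee.
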

process(self, status=True)

process all loaded spectra with no augmentation

Parameters:	status : bool, optional: show status bars

augprocess(self, status=True)

process all loaded spectra with augmentation

Parameters:	status : bool, optional: show status bars
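
The class-level parameters aug_num, aug_z_bounds, and aug_signal_window_angstroms suggest that each augmented spectrum gets its own randomly drawn redshift perturbation and signal window. A minimal sketch of such a draw follows; the function name is hypothetical, and both the uniform distributions and the rule that augmented spectra fill the training set up to aug_num are assumptions.

```python
import numpy as np


def draw_augmentation_plan(n_train, aug_num=5000, aug_z_bounds=(-0.004, 0.004),
                           aug_signal_window_angstroms=(50, 150), random_state=100):
    """Pick source spectra and per-spectrum (augz, sig_wA) values."""
    rng = np.random.default_rng(random_state)
    n_new = max(aug_num - n_train, 0)                 # spectra still needed
    source = rng.integers(0, n_train, size=n_new)     # training index to perturb
    augz = rng.uniform(aug_z_bounds[0], aug_z_bounds[1], size=n_new)
    sig_wA = rng.uniform(aug_signal_window_angstroms[0],
                         aug_signal_window_angstroms[1], size=n_new)
    return source, augz, sig_wA


source, augz, sig_wA = draw_augmentation_plan(n_train=1200)
```

Each (augz, sig_wA) pair would then be passed to Spectrum.augprocess for the corresponding source spectrum.
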

to_npy(self): write processed spectra and targets to .npy files