deepSIP.preprocessing
-
class deepSIP.preprocessing.Spectrum(filename, z, obsframe=True)
Bases: object
container for individual spectra
Parameters: - filename : str
name of text file to read spectrum from
- z : float
redshift of SN
- obsframe : bool, optional
indicates spectrum is in observer frame (and thus needs to be de-redshifted)
Attributes: - signal_window_angstroms : int
window size in angstroms for signal smoothing
- signal_smoothing_order : int
polynomial order for signal smoothing
- continuum_window_angstroms : int
window size in angstroms for continuum smoothing
- continuum_smoothing_order : int
polynomial order for continuum smoothing
- lwave_bounds : tuple, list, or other iterable of length 2
lower and upper limit of logarithmic wavelength array
- lwave_n_bins : int
number of bins in logarithmic wavelength grid
- lwave : np.array
logarithmic wavelength grid
- apodize_end_pct : float
percentage from each end of flux array to apodize
- aug_drop_frac : float
maximum percentage of each end of flux array to drop during augmentation
- wave : np.array
rest frame wavelength grid
- flux : np.array
fluxes on wavelength grid
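The relationship between obsframe, wave, and lwave can be sketched as follows. Dividing observed wavelengths by (1 + z) gives the rest frame, and a logarithmic grid is laid out between lwave_bounds with lwave_n_bins points. This is an illustration, not deepSIP's code: the helper names, the example bounds, the bin count, and the choice to include both endpoints are assumptions, and plain Python is used instead of NumPy so the sketch has no dependencies.

```python
import math


def deredshift(wave_obs, z):
    """Convert observer-frame wavelengths to the rest frame."""
    return [w / (1.0 + z) for w in wave_obs]


def log_wavelength_grid(bounds, n_bins):
    """Return n_bins wavelengths evenly spaced in log(wavelength).

    Assumption: both bounds are included as the first and last grid points.
    """
    lo, hi = bounds
    step = (math.log(hi) - math.log(lo)) / (n_bins - 1)
    return [math.exp(math.log(lo) + i * step) for i in range(n_bins)]


# Hypothetical values, chosen only for illustration.
rest = deredshift([4000.0, 6000.0, 8000.0], z=0.02)
grid = log_wavelength_grid((3450.0, 7500.0), 1024)
```

On such a grid the ratio between neighbouring wavelengths is constant, so a fixed velocity shift moves features by the same number of bins everywhere.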
Methods
SNID(self, **kwargs): run SNID on spectrum using pySNID (if available)
augprocess(self, augz, sig_wA): process spectrum with augmentation
plot(self[, show]): plot original or processed spectrum for quick inspection
process(self): process spectrum with no augmentation
-
process(self)
process spectrum with no augmentation
complete pre-processing: smooths spectrum, identifies and subtracts pseudo-continuum, does log binning, scales flux, and apodizes edges
Returns: - np.array
fully pre-processed flux array
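As an illustration of the final step, edge apodization can be sketched as a taper applied to apodize_end_pct percent of the samples at each end. The half-cosine ramp below is an assumption (the taper shape used by the package is not stated here), and the function name and example values are hypothetical.

```python
import math


def apodize(flux, end_pct):
    """Taper end_pct percent of the samples at each end toward zero.

    Assumption: a half-cosine ramp; deepSIP may use a different window.
    """
    n = len(flux)
    n_edge = int(n * end_pct / 100.0)
    out = list(flux)
    for i in range(n_edge):
        # Weight rises smoothly from 0 at the edge toward 1 in the interior.
        weight = 0.5 * (1.0 - math.cos(math.pi * i / n_edge))
        out[i] *= weight
        out[n - 1 - i] *= weight
    return out


# Hypothetical flat spectrum, 5 percent tapered at each end.
tapered = apodize([1.0] * 100, end_pct=5.0)
```

Tapering the ends to zero avoids sharp discontinuities at the boundaries of the flux array.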
-
augprocess(self, augz, sig_wA)
process spectrum with augmentation
perturbs wavelength by a multiplicative factor of (1 + augz), randomly drops the ends up to a maximum of aug_drop_frac, sets signal_window_angstroms to sig_wA, and then performs pre-processing according to the process method
Returns: - np.array
augmented and fully pre-processed flux array
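The first two augmentation steps described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the function name is hypothetical, and it assumes that aug_drop_frac is a fraction of the array length applied independently to each end and that the number of dropped samples is drawn uniformly.

```python
import random


def augment(wave, flux, augz, drop_frac, rng):
    """Scale wavelengths by (1 + augz) and trim random amounts from each end."""
    n = len(wave)
    max_drop = int(n * drop_frac)
    lo = rng.randint(0, max_drop)  # samples dropped from the blue end
    hi = rng.randint(0, max_drop)  # samples dropped from the red end
    new_wave = [w * (1.0 + augz) for w in wave[lo:n - hi]]
    new_flux = list(flux[lo:n - hi])
    return new_wave, new_flux


# Hypothetical spectrum: 200 samples on a 10-angstrom grid.
rng = random.Random(100)
wave = [4000.0 + 10.0 * i for i in range(200)]
flux = [1.0] * 200
aug_wave, aug_flux = augment(wave, flux, augz=0.002, drop_frac=0.1, rng=rng)
```

The augmented arrays would then go through the same pre-processing as process, with the signal smoothing window set to sig_wA.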
-
plot(self, show='processed')
plot original or processed spectrum for quick inspection
Parameters: - show : str, optional
type of plot to show (‘processed’ or ‘original’)
-
SNID(self, **kwargs)
run SNID on spectrum using pySNID (if available)
Parameters: - **kwargs
arbitrary keyword arguments for pySNID
Returns: - pySNID outputs
-
class deepSIP.preprocessing.EvaluationSpectra(spectra, savefile='eval.spectra.sav')
Bases: object
class for preparing spectra for evaluation
Parameters: - spectra : pd.DataFrame
spectra to prepare for evaluation; must have columns [SN, filename, z] and, optionally, obsframe as bool
- savefile : str, optional
name of save file
Attributes: - X : np.ndarray with dimensions (number of spectra, lwave_n_bins)
processed spectra (attribute set by process method)
- SNID_results : pd.DataFrame
pySNID results for each spectrum (attribute set by SNID method)
Methods
SNID(self[, selection, status]): run SNID on selected (via boolean array) spectra in dataset
SNID_to_csv(self): write SNID results to csv file
load(self): load from savefile
process(self[, status]): process all loaded spectra with no augmentation
save(self): save current state to savefile
to_npy(self): write processed spectra to .npy file
-
save(self)
save current state to savefile
-
load(self)
load from savefile
-
process(self, status=True)
process all loaded spectra with no augmentation
Parameters: - status : bool, optional
show status bars
-
SNID(self, selection='all', status=True, **kwargs)
run SNID on selected (via boolean array) spectra in dataset
Parameters: - selection : boolean array or 'all', optional
boolean mask selecting the spectra in the data set to run SNID on via pySNID (defaults to 'all')
- status : bool, optional
show status bars
- **kwargs
arbitrary keyword arguments for pySNID
-
SNID_to_csv(self)
write SNID results to csv file
-
to_npy(self)
write processed spectra to .npy file
-
class deepSIP.preprocessing.TVTSpectra(spectra, savefile='tvt.spectra.sav', prep=1, val_frac=0.1, test_frac=0.1, phase_bounds=(-10, 18), dm15_bounds=(0.85, 1.55), phase_binsize=4, dm15_binsize=0.1, aug_num=5000, aug_z_bounds=(-0.004, 0.004), aug_signal_window_angstroms=(50, 150), random_state=100)
Bases: deepSIP.preprocessing.EvaluationSpectra
class for preparing Training, Validation, and Testing sets
Parameters: - spectra : pd.DataFrame
spectra to prepare; must have columns [SN, filename, z] and, optionally, obsframe as bool
- savefile : str, optional
name of save file
- prep : int, optional
preparation mode (1 for all spectra, 2 for domain-restricted subset)
- val_frac : float, optional
fraction of full set to split for validation
- test_frac : float, optional
fraction of full set to split for testing
- phase_bounds : tuple, list, or other iterable of length 2, optional
lower and upper phase limits of domain
- dm15_bounds : tuple, list, or other iterable of length 2, optional
lower and upper dm15 limits of domain
Other Parameters: - phase_binsize : int or float, optional
phase bin size for pseudo-stratified splitting
- dm15_binsize : int or float, optional
dm15 bin size for pseudo-stratified splitting
- aug_num : int, optional
final size of training set after augmentation
- aug_z_bounds : tuple, list, or other iterable of length 2, optional
lower and upper limits of randomly selected redshifts for augmented spectra
- aug_signal_window_angstroms : tuple, list, or other iterable of length 2, optional
lower and upper limits of randomly selected signal windows for augmented spectra
- random_state : int, optional
seed for random number generator
Attributes: - spectra_[out,in] : pd.DataFrame
subset of spectra that are [outside,inside] the selected domain
- spectra_aug : pd.DataFrame
augmented spectra
- Ycol : list
columns in spectra that correspond to labels
- [train,aug,val,test]X : np.ndarray
processed spectra (attribute set by [aug]process method)
- [train,aug,val,test]Y : np.ndarray
targets (attribute set by [aug]process method)
Methods
SNID(self[, selection, status]): run SNID on selected (via boolean array) spectra in dataset
SNID_to_csv(self): write SNID results to csv file
augprocess(self[, status]): process all loaded spectra with augmentation
load(self): load from savefile
process(self[, status]): process all loaded spectra with no augmentation
save(self): save current state to savefile
split(self[, force, method]): split spectra into train/val/test subsets
to_npy(self): write processed spectra and targets to .npy files
-
split(self, force=False, method='stratified')
split spectra into train/val/test subsets
two splitting methods are available:
- ‘stratified’ - split in-domain spectra in pseudo-stratified fashion by randomly selecting subsets from bins
- ‘randomized’ - random selection
Parameters: - force : bool, optional
overwrite pre-existing splits if True
- method : str, optional
splitting method (‘stratified’ or ‘randomized’)
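One way to read the ‘stratified’ method is sketched below: spectra are grouped into (phase, dm15) bins, and a random subset of each bin is assigned to the validation and testing sets. This is an assumption-laden illustration rather than the package's code. The function name, the record layout, the per-bin rounding, and the synthetic data are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(records, val_frac, test_frac, phase_binsize, dm15_binsize, seed):
    """Split records into train/val/test by sampling within (phase, dm15) bins."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for rec in records:
        key = (int(rec["phase"] // phase_binsize), int(rec["dm15"] // dm15_binsize))
        bins[key].append(rec)

    train, val, test = [], [], []
    for key in sorted(bins):  # fixed bin order keeps the split reproducible
        members = bins[key][:]
        rng.shuffle(members)
        n_val = round(len(members) * val_frac)
        n_test = round(len(members) * test_frac)
        val.extend(members[:n_val])
        test.extend(members[n_val:n_val + n_test])
        train.extend(members[n_val + n_test:])
    return train, val, test


# Synthetic records: 6 bins with 20 spectra each (illustrative values only).
records = [
    {"SN": f"SN{p}_{d}_{k}", "phase": p, "dm15": d}
    for p in (-8, 0, 8) for d in (0.9, 1.2) for k in range(20)
]
train, val, test = stratified_split(records, 0.1, 0.1, 4, 0.1, seed=100)
```

Sampling within bins keeps the phase and dm15 distributions of the three subsets similar, which a purely random selection does not guarantee for small data sets.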
-
process(self, status=True)
process all loaded spectra with no augmentation
Parameters: - status : bool, optional
show status bars
-
augprocess(self, status=True)
process all loaded spectra with augmentation
Parameters: - status : bool, optional
show status bars
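The random augmentation parameters described under aug_z_bounds and aug_signal_window_angstroms can be sketched as seeded draws, one (augz, sig_wA) pair per augmented spectrum. The uniform distributions, the integer window sizes, and the function name are assumptions; only the bounds and the seed come from the documented defaults.

```python
import random


def draw_augmentation_params(n_draws, z_bounds, window_bounds, seed):
    """Draw (augz, sig_wA) pairs; uniform sampling is an assumption."""
    rng = random.Random(seed)
    params = []
    for _ in range(n_draws):
        augz = rng.uniform(z_bounds[0], z_bounds[1])
        sig_wA = rng.randint(window_bounds[0], window_bounds[1])
        params.append((augz, sig_wA))
    return params


# Bounds and seed mirror the documented defaults; 5 draws for illustration.
params = draw_augmentation_params(5, (-0.004, 0.004), (50, 150), seed=100)
```

Each pair would then be passed to Spectrum.augprocess(augz, sig_wA) for one training spectrum, repeating until the training set reaches aug_num.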
-
to_npy(self)
write processed spectra and targets to .npy files