Pipelines

Helper functions that build pipelines for creating large samples of nodule crops.

radio.pipelines.pipelines.combine_crops(cancer_set, non_cancer_set, batch_sizes=(10, 10), hu_lims=(-1000, 400))

Pipeline for generating batches of cancerous and non-cancerous crops from CT scans in a chosen proportion.

Parameters:
    cancer_set (dataset) – dataset of cancerous crops in blosc format.
    non_cancer_set (dataset) – dataset of non-cancerous crops in blosc format.
    batch_sizes (tuple, list of int) – sequence of length 2, (num_cancer_batches, num_noncancer_batches).
    hu_lims (tuple, list of float) – sequence of length 2, the limits of HU trimming in the normalize_hu action.
Returns:
Return type:	pipeline
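
The idea of mixing two sources in a fixed proportion can be sketched with plain NumPy. This is only an illustration, not the code behind combine_crops: the helper mix_crops, the placeholder arrays and the reading of batch_sizes as "number of crops of each kind per batch" are all assumptions made for the example.

```python
import numpy as np


def mix_crops(cancer, non_cancer, batch_sizes=(10, 10), seed=0):
    """Draw crops from two arrays in a fixed proportion and shuffle them.

    Illustrative helper, not part of radio.
    """
    rng = np.random.default_rng(seed)
    n_cancer, n_non_cancer = batch_sizes
    # Pick distinct crops from each source.
    cancer_idx = rng.choice(len(cancer), size=n_cancer, replace=False)
    non_cancer_idx = rng.choice(len(non_cancer), size=n_non_cancer, replace=False)
    crops = np.concatenate([cancer[cancer_idx], non_cancer[non_cancer_idx]])
    # Label 1 marks a cancerous crop, 0 a non-cancerous one.
    labels = np.concatenate([np.ones(n_cancer, dtype=int),
                             np.zeros(n_non_cancer, dtype=int)])
    # Shuffle so that the two kinds are interleaved within the batch.
    order = rng.permutation(len(crops))
    return crops[order], labels[order]


# Placeholder data: small constant crops instead of real (32, 64, 64) ones.
cancer_crops = np.ones((30, 4, 8, 8), dtype=np.float32)
non_cancer_crops = np.zeros((50, 4, 8, 8), dtype=np.float32)
batch, labels = mix_crops(cancer_crops, non_cancer_crops, batch_sizes=(10, 10))
```

With batch_sizes=(10, 10) the resulting batch holds 20 crops, half of them labelled as cancerous.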

radio.pipelines.pipelines.get_crops(nodules, fmt='raw', nodule_shape=(32, 64, 64), batch_size=20, share=0.5, histo=None, variance=(36, 144, 144), hu_lims=(-1000, 400), **kwargs)

Get pipeline that performs preprocessing and crops cancerous/non-cancerous nodules in a chosen proportion.

Parameters:
    nodules (pd.DataFrame) – contains:
        'seriesuid': index of patient or series.
        'z', 'y', 'x': coordinates of the nodule's center.
        'diameter': diameter, in mm.
    fmt (str) – can be either 'raw', 'blosc' or 'dicom'.
    nodule_shape (tuple, list or ndarray of int) – crop shape along (z, y, x).
    batch_size (int) – number of nodules in a batch generated by the pipeline.
    share (float) – share of cancer crops in the batch.
    histo (tuple) – `numpy.histogramdd()` output. Used for sampling non-cancerous crops.
    variance (tuple, list or ndarray of float) – variances of the normally distributed random shifts of nodules' start positions.
    hu_lims (tuple, list of float) – sequence of length 2, the limits of HU trimming in the normalize_hu action.
    **kwargs –
        spacing (tuple) – (z, y, x) spacing after resize.
        shape (tuple) – (z, y, x) shape after crop/pad.
        method (str) – interpolation method ('pil-simd' or 'resize'). See `resize()`.
        order (None or int) – order of scipy interpolation (<=5), if used.
        padding (str) – mode of padding, any supported by `numpy.pad()`.
Returns:
Return type:	pipeline
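
To make the role of hu_lims concrete, here is a minimal sketch of HU trimming. The function trim_hu is hypothetical and is not the normalize_hu action itself; in particular, rescaling the trimmed values to [0, 1] is an assumption chosen for the example, and the library may use a different output range.

```python
import numpy as np


def trim_hu(volume, hu_lims=(-1000, 400)):
    """Clip Hounsfield units to hu_lims, then rescale linearly.

    Illustrative helper; the [0, 1] output range is an assumption.
    """
    low, high = hu_lims
    clipped = np.clip(volume, low, high)
    return (clipped - low) / (high - low)


# Values below -1000 and above 400 are trimmed before rescaling.
scan = np.array([-2000.0, -1000.0, -300.0, 400.0, 1500.0])
normalized = trim_hu(scan)
```

Everything outside the limits collapses onto the nearest limit, so air and dense bone no longer dominate the intensity range.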

radio.pipelines.pipelines.split_dump(cancer_path, non_cancer_path, nodules, histo=None, fmt='raw', nodule_shape=(32, 64, 64), variance=(36, 144, 144), **kwargs)

Get pipeline for dumping cancerous crops in one folder and random non-cancerous crops in another.

Parameters:
    cancer_path (str) – directory to dump cancerous crops in.
    non_cancer_path (str) – directory to dump non-cancerous crops in.
    nodules (pd.DataFrame) – contains:
        'seriesuid': index of patient or series.
        'z', 'y', 'x': coordinates of the nodule's center.
        'diameter': diameter, in mm.
    histo (tuple) – `numpy.histogramdd()` output. Used for sampling non-cancerous crops.
    fmt (str) – can be either 'raw', 'blosc' or 'dicom'.
    nodule_shape (tuple, list or ndarray of int) – crop shape along (z, y, x).
    variance (tuple, list or ndarray of float) – variances of the normally distributed random shifts of nodules' start positions.
    **kwargs –
        spacing (tuple) – (z, y, x) spacing after resize.
        shape (tuple) – (z, y, x) shape after crop/pad.
        method (str) – interpolation method ('pil-simd' or 'resize'). See `resize()` for more information.
        order (None or int) – order of scipy interpolation (<=5), if used.
        padding (str) – mode of padding, any supported by `numpy.pad()`.
Returns:
Return type:	pipeline
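
The variance argument holds variances, not standard deviations, so the default (36, 144, 144) corresponds to standard deviations of (6, 12, 12) along (z, y, x). The sketch below draws such shifts with NumPy; the helper sample_shifts, the fixed seed and the rounding to whole voxels are assumptions for illustration and may differ from what the library does internally.

```python
import numpy as np


def sample_shifts(n_crops, variance=(36, 144, 144), seed=0):
    """Draw zero-mean normal shifts along (z, y, x), rounded to whole voxels.

    Illustrative helper, not part of radio.
    """
    rng = np.random.default_rng(seed)
    # Standard deviation is the square root of the variance.
    std = np.sqrt(np.asarray(variance, dtype=float))
    shifts = rng.normal(loc=0.0, scale=std, size=(n_crops, 3))
    return np.rint(shifts).astype(int)


std = np.sqrt(np.asarray((36, 144, 144), dtype=float))
shifts = sample_shifts(1000)
```

Each row of shifts is one (z, y, x) offset that could be added to a crop's start position, so a larger variance spreads the nodule further from the crop center.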

radio.pipelines.pipelines.update_histo(nodules, histo, fmt='raw', **kwargs)

Pipeline for updating a histogram using information from a dataset of scans.

Parameters:
    nodules (pd.DataFrame) – contains:
        'seriesuid': index of patient or series.
        'z', 'y', 'x': coordinates of the nodule's center.
        'diameter': diameter, in mm.
    histo (tuple) – `numpy.histogramdd()` output. Used for sampling non-cancerous crops (compare the latter with the tuple (bins, edges) returned by `numpy.histogramdd()`).
    fmt (str) – can be either 'raw', 'blosc' or 'dicom'.
    **kwargs –
        spacing (tuple) – (z, y, x) spacing after resize.
        shape (tuple) – (z, y, x) shape after crop/pad.
        method (str) – interpolation method ('pil-simd' or 'resize'). See `resize()` for more information.
        order (None or int) – order of scipy interpolation (<=5), if used.
        padding (str) – mode of padding, any supported by `numpy.pad()`.
Returns:
Return type:	pipeline
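
For readers unfamiliar with the histo tuple, the sketch below shows what `numpy.histogramdd()` returns, how counts from a new batch can be accumulated on the same bin edges, and how positions can be sampled from the result. The coordinates, bin counts and ranges are made up, and the accumulation and sampling steps are one plausible reading of "updating" and "sampling", not necessarily what update_histo does.

```python
import numpy as np

rng = np.random.default_rng(0)
extent = (300, 400, 400)  # hypothetical scan extent along (z, y, x)

# Hypothetical positions in (z, y, x) voxel coordinates.
first = rng.uniform(low=(0, 0, 0), high=extent, size=(500, 3))
histo = np.histogramdd(first, bins=(4, 4, 4),
                       range=((0, 300), (0, 400), (0, 400)))
counts, edges = histo  # counts: (4, 4, 4) array; edges: list of 3 arrays

# "Update" the histogram: bin a new batch on the same edges and add counts.
second = rng.uniform(low=(0, 0, 0), high=extent, size=(200, 3))
counts = counts + np.histogramdd(second, bins=edges)[0]

# Sample 10 positions: choose bins proportionally to their counts,
# then draw a uniform point inside each chosen bin.
probs = counts.ravel() / counts.sum()
flat_idx = rng.choice(counts.size, size=10, p=probs)
bin_idx = np.unravel_index(flat_idx, counts.shape)
lows = np.stack([edges[d][bin_idx[d]] for d in range(3)], axis=1)
highs = np.stack([edges[d][bin_idx[d] + 1] for d in range(3)], axis=1)
positions = rng.uniform(lows, highs)
```

Keeping the edges fixed is what makes counts from different batches addable bin by bin.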
