ImagesBatch¶
- class ImagesBatch(index, dataset=None, pipeline=None, preloaded=None, copy=False, *args, **kwargs)[source]¶
Bases:
batchflow.batch_image.BaseImagesBatch
Batch class for 2D images.
Images are stored as numpy arrays of PIL.Image.
PIL.Image has the following system of coordinates:
                 X
0 -------------- >
|
|
|  image's pixels
|
|
Y v
Pixel’s position is defined as (x, y)
Note that if any class method is wrapped with the @apply_parallel decorator, then inner calls (i.e. calls from other class methods) should use the version of the desired method with underscores. (For example, if there is a decorated method, then you need to call _method_ from inside other_method.) The same applies to all child classes of batch.Batch.
- add(image, term=1.0, clip=False, preserve_type=False)[source]¶
Add term to each pixel.
- Parameters
term (float, sequence) –
clip (bool) – whether to force image’s pixels to be in [0, 255] or [0, 1.]
preserve_type (bool) – Whether to preserve dtype of transformed images. If False is given then the resulting type will be np.float.
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
- add_components(components, init=None)¶
Add new components
- Parameters
- Raises
ValueError – If a component or an attribute with the given name already exists
- additive_noise(image, noise, clip=False, preserve_type=False)[source]¶
Add additive noise to an image.
- Parameters
noise (callable) – Distribution. Must have size parameter.
clip (bool) – whether to force image’s pixels to be in [0, 255] or [0, 1.]
preserve_type (bool) – Whether to preserve dtype of transformed images. If False is given then the resulting type will be np.float.
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
- apply_defaults = {'dst': 'images', 'post': '_assemble', 'src': 'images', 'target': 'for'}¶
- apply_parallel(func, init=None, post=None, src=None, dst=None, *args, p=None, target='for', requires_rng=False, rng_seeds=None, **kwargs)¶
Apply a function to each item in the container, returned by init, and assemble results by post. Depending on the target parameter, different parallelization engines may be used: for, threads, MPC, async.
- Roughly, under the hood we perform the following:
- compute parameters, individual for each worker. Currently, these are:
p to indicate whether the function should be applied
worker id and a seed for random generator, if required
call init function, which outputs a container of items, passed directly to the func.
The simplest example is the init function that returns batch indices, and the function works off of each.
- wrap the func call into parallelization engine of choice.
- compute results of func calls for each item, returned by init.
- assemble results by post function, e.g. stack the obtained numpy arrays.
In the simplest possible case of init=None, src=images, dst=images_transformed, post=None, this function is almost equivalent to:
If src is a list and dst is a list, then this function is applied recursively to each pair of src, dst. If src is a tuple, then this tuple is used as a whole. This makes it possible to write functions that work on multiple components.
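The simplest case described above (init=None, post=None, one source and one destination component) can be sketched in plain Python. This is only an illustration, not the library's implementation: ToyBatch, apply_sequential and double are hypothetical names, and plain lists stand in for batch components.

```python
class ToyBatch:
    """Hypothetical stand-in for a batch with named components."""

    def __init__(self, images):
        self.images = images
        self.images_transformed = None

    def apply_sequential(self, func, src, dst, *args, **kwargs):
        # Sequential analogue of the simplest case: call func on every
        # item of the source component and store the collected results
        # in the destination component.
        items = getattr(self, src)
        results = [func(item, *args, **kwargs) for item in items]
        setattr(self, dst, results)
        return self


def double(item, factor=2):
    """Hypothetical per-item function."""
    return item * factor


batch = ToyBatch(images=[1, 2, 3])
batch.apply_sequential(double, src="images", dst="images_transformed")
print(batch.images_transformed)  # [2, 4, 6]
```

The source component is left untouched here because dst differs from src.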
- Parameters
func (callable) – A function to apply to each item from the source. Should accept src and dst parameters, or be written in a way that accepts variable args.
target (str) –
- Parallelization engine:
’f’, ‘for’ for executing each worker sequentially, like in a for-loop.
’t’, ‘threads’ for using threads.
’m’, ‘mpc’ for using processes. Note the bigger overhead for process initialization.
’a’, ‘async’ for asynchronous execution.
init (str, callable or container) –
Function to init data for individual workers: must return a container of items.
If ‘data’, then use src components as the init. If other str, then must be a name of the attribute of the batch to use as the init. If callable or any previous returned a callable, then result of this callable is used as the init. Note that in the last case callable should accept src and dst parameters, and kwargs are also passed. If not any of the above, then the object is used directly, for example, np.ndarray.
post (str or callable) – Function to apply to the results of function evaluation on each item. Must accept src and dst parameters, as well as kwargs.
src (str, sequence, list of str) – The source to get data from:
- None
- str - a component name, e.g. ‘images’ or ‘masks’
- tuple or list of str - several component names
- sequence - data as a numpy-array, data frame, etc
dst (str or array) – The destination to put the result in, can be:
- None - in this case dst is set to be same as src
- str - a component name, e.g. ‘images’ or ‘masks’
- tuple or list of str, e.g. [‘images’, ‘masks’]
If not provided, uses src.
p (float or None) – Probability of applying func to an element in the batch.
requires_rng (bool) – Whether the func requires RNG. Should be used for correctly initialized seeds for reproducibility. If True, then a pre-initialized RNG will be passed to the function call as rng keyword parameter.
args – Other parameters passed to func.
kwargs – Other parameters passed to func.
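The roles of p and requires_rng can be pictured with a small sequential sketch. It is an illustration under stated assumptions: apply_with_probability, jitter and the way the seed is derived from the worker id are all hypothetical, and Python's random module is used in place of the library's RNG machinery.

```python
import random


def apply_with_probability(func, items, p=1.0, seed=0):
    """Apply func to each item with probability p, using a per-item RNG."""
    results = []
    for worker_id, item in enumerate(items):
        # Hypothetical seeding scheme: one reproducible RNG per worker.
        rng = random.Random(seed * 1_000_003 + worker_id)
        if rng.random() < p:
            # The RNG is handed to func as the `rng` keyword argument.
            results.append(func(item, rng=rng))
        else:
            # The item is passed through unchanged.
            results.append(item)
    return results


def jitter(item, rng):
    """Hypothetical transform that needs randomness."""
    return item + rng.randint(-1, 1)


unchanged = apply_with_probability(jitter, [10, 20, 30], p=0.0)
first = apply_with_probability(jitter, [10, 20, 30], p=1.0, seed=7)
second = apply_with_probability(jitter, [10, 20, 30], p=1.0, seed=7)
print(unchanged)        # [10, 20, 30]
print(first == second)  # True
```

Giving each worker its own seeded generator is what makes repeated runs with the same seed produce the same result.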
Notes
apply_parallel does the following (but in parallel):
    for item in range(len(batch)):
        self.dst[item] = func(self.src[item], *args, **kwargs)
apply_parallel(func, src=[‘images’, ‘masks’]) is equal to apply_parallel(func, src=[‘images’, ‘masks’], dst=[‘images’, ‘masks’]), which in turn is equivalent to two successive calls:
    images = func(images)
    masks = func(masks)
However, named expressions will be evaluated only once before the first call.
Whereas apply_parallel(func, src=(‘images’, ‘masks’)) (i.e. when src takes a tuple of component names, not the list as in the previous example) passes both components data into func simultaneously:
images, masks = func((images, masks))
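The difference between a list and a tuple for src can be sketched with a toy dispatcher. This is a simplified illustration, not the library's code: apply, reverse and swap are hypothetical, components live in a plain dict, and func receives whole components rather than individual items.

```python
def apply(func, components, src, dst=None):
    """Toy dispatcher that mimics the list-versus-tuple rule for src."""
    dst = src if dst is None else dst
    if isinstance(src, list):
        # A list: handle each (src, dst) pair independently.
        for s, d in zip(src, dst):
            apply(func, components, s, d)
    elif isinstance(src, tuple):
        # A tuple: hand all named components to func at once.
        result = func(tuple(components[name] for name in src))
        if isinstance(dst, tuple):
            for name, value in zip(dst, result):
                components[name] = value
        else:
            components[dst] = result
    else:
        components[dst] = func(components[src])
    return components


def reverse(data):
    return data[::-1]


def swap(pair):
    first, second = pair
    return second, first


by_list = apply(reverse, {"images": [1, 2, 3], "masks": [0, 0, 1]},
                src=["images", "masks"])
by_tuple = apply(swap, {"images": [1, 2, 3], "masks": [0, 0, 1]},
                 src=("images", "masks"))
print(by_list)   # {'images': [3, 2, 1], 'masks': [1, 0, 0]}
print(by_tuple)  # {'images': [0, 0, 1], 'masks': [1, 2, 3]}
```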
Examples
    apply_parallel(make_masks_fn, src='images', dst='masks')
    apply_parallel(apply_mask, src=('images', 'masks'), dst='images_with_masks')
    apply_parallel(rotate, src=['images', 'masks'], dst=['images', 'masks'], p=.2)
    apply_parallel(MyBatch.some_static_method, p=.5)
    apply_parallel(B.some_method, src='features', p=.5)
TODO: move logic of applying post function from inbatch_parallel here, as well as remove use_self arg.
- property array_of_nones¶
NumPy array with None values.
- Type
1-D ndarray
- as_dataset(dataset=None, copy=False)¶
Makes a new dataset from batch data
- Parameters
dataset – an instance or a subclass of Dataset
copy (bool) – whether to copy batch data to allow for further inplace transformations
- Returns
- Return type
an instance of a class specified by dataset arg, preloaded with this batch data
- clip(image, low=0, high=255)[source]¶
Truncate image’s pixels.
- Parameters
low (int, float, sequence) – Actual pixel’s value is equal max(value, low). If sequence is given, then its length must coincide with the number of channels in an image and each channel is thresholded separately
high (int, float, sequence) – Actual pixel’s value is equal min(value, high). If sequence is given, then its length must coincide with the number of channels in an image and each channel is thresholded separately
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
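The thresholding rule for low and high, including the per-channel form, can be sketched on a single pixel. This is a minimal illustration in plain Python: clip_pixel is a hypothetical helper, and a pixel is assumed to be a sequence of channel values.

```python
def clip_pixel(pixel, low=0, high=255):
    """Clip one pixel given as a sequence of channel values."""
    n = len(pixel)
    # A scalar bound applies to every channel; a sequence gives one bound per channel.
    lows = [low] * n if isinstance(low, (int, float)) else list(low)
    highs = [high] * n if isinstance(high, (int, float)) else list(high)
    if len(lows) != n or len(highs) != n:
        raise ValueError("sequence bounds must match the number of channels")
    # Each value becomes min(max(value, low), high) for its own channel.
    return [min(max(v, lo), hi) for v, lo, hi in zip(pixel, lows, highs)]


print(clip_pixel([-5, 100, 300]))  # [0, 100, 255]
print(clip_pixel([10, 100, 250], low=(20, 0, 0), high=(255, 90, 240)))  # [20, 90, 240]
```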
- components = ('images', 'labels', 'masks')¶
- create_attrs(**kwargs)¶
Create attributes from kwargs
- crop(image, origin, shape, crop_boundaries=False, src=None, dst=None)[source]¶
Crop an image.
Extract image data from the window of the size given by shape and placed at origin.
- Parameters
origin (sequence, str) – Location of the cropping box. See ImagesBatch._calc_origin() for details.
shape (sequence) – crop size in the form of (rows, columns)
crop_boundaries (bool) – If True, the crop is taken only from within the image’s area, so the shape of the crop may differ from the requested one.
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
Notes
Using ‘random’ origin with src as list with multiple elements will not result in same crop for each element, as origin will be sampled independently for each src element. To randomly sample same origin for a number of components, use R named expression for origin argument.
- cutout(image, origin, shape, color)[source]¶
Fills given areas with color
Note
It is assumed that origins, shapes and colors have the same length.
- Parameters
origin (sequence, str) – Location of the cropping box. See ImagesBatch._calc_origin() for details.
shape (sequence, int) –
- Shape of a filled box. Can be one of:
sequence - crop size in the form of (rows, columns)
int - the box is square, with sides of this length
color (sequence, number) –
Color of a filled box. Can be one of:
sequence - (r,g,b) form
number - grayscale
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
Notes
Using ‘random’ origin with src as list with multiple elements will not result in same crop for each element, as origin will be sampled independently for each src element. To randomly sample same origin for a number of components, use R named expression for origin argument.
- property data¶
tuple or named components - batch data
- property data_setter¶
tuple or named components - batch data
- property dataset¶
Dataset - a dataset the batch has been taken from
- deepcopy()¶
Return a deep copy of the batch.
- do_nothing(*args, **kwargs)¶
An empty action (might be convenient in complicated pipelines)
- dump(*args, dst=None, fmt=None, components='images', **kwargs)¶
Dump data.
Note
If fmt=’images’ then dst must be a single component (str).
Note
All parameters must be named only.
- Parameters
- Returns
- Return type
self
- elastic_transform(image, alpha, sigma, **kwargs)[source]¶
Deformation of images as described by Simard, Steinkraus and Platt, Best Practices for Convolutional Neural Networks applied to Visual Document Analysis (http://cognitivemedium.com/assets/rmnist/Simard.pdf).
Code slightly differs from https://gist.github.com/chsasank/4d8f68caf01f041a6453e67fb30f8f5a.
- enhance(image, layout='hcbs', factor=(1, 1, 1, 1))[source]¶
Apply enhancements from PIL.ImageEnhance to the image.
- filter(image, mode, *args, **kwargs)[source]¶
Filters an image. Calls image.filter(getattr(PIL.ImageFilter, mode)(*args, **kwargs)).
For more details see ImageFilter (http://pillow.readthedocs.io/en/stable/reference/ImageFilter.html).
- classmethod from_data(index=None, data=None)¶
Create a batch from data given
- get(item=None, component=None)¶
Return an item from the batch or the component
- get_attrs()¶
Return additional attrs as kwargs
- get_errors(all_res)¶
Return a list of errors from a parallel action
- property image_shape¶
tuple - shape of the image
- property indices¶
numpy array - an array with the indices
- property items¶
list - batch items
- load(*args, src=None, fmt=None, dst=None, **kwargs)¶
Load data.
Note
If fmt=’images’ then components must be a single component (str).
Note
All parameters must be named only.
- classmethod merge(batches, batch_size=None, components=None, batch_class=None)¶
Merge several batches to form a new batch of a given size
- Parameters
batches (tuple of batches) –
batch_size (int or None) – if None, just merge all batches into one batch (the rest will be None), if int, then make one batch of batch_size and a batch with the rest of data.
components (str, tuple or None) – if None, all components from initial batches will be created, if str or tuple, then create these components in new batches.
batch_class (Batch or None) – if None, created batches will be of the same class as initial batch, if Batch, created batches will be of that class.
- Returns
batch, rest
- Return type
tuple of two batches
- Raises
ValueError – If component is None in some batches and not None in others.
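The effect of batch_size on the returned pair can be sketched with plain lists standing in for batches. This is an illustration only: merge_lists is a hypothetical helper, and returning an empty list when nothing is left over is an assumption, not documented behaviour.

```python
def merge_lists(batches, batch_size=None):
    """Concatenate list-based 'batches' and split off the first batch_size items."""
    merged = [item for batch in batches for item in batch]
    if batch_size is None:
        # Everything goes into one batch; there is no rest.
        return merged, None
    # One batch of batch_size items and a second one with the remainder.
    return merged[:batch_size], merged[batch_size:]


print(merge_lists([[1, 2], [3, 4, 5]]))                # ([1, 2, 3, 4, 5], None)
print(merge_lists([[1, 2], [3, 4, 5]], batch_size=3))  # ([1, 2, 3], [4, 5])
```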
- classmethod merge_component(component=None, data=None)¶
Merge the same component data from several batches
- multiplicative_noise(image, noise, clip=False, preserve_type=False)[source]¶
Add multiplicative noise to an image.
- Parameters
noise (callable) – Distribution. Must have size parameter.
clip (bool) – whether to force image’s pixels to be in [0, 255] or [0, 1.]
preserve_type (bool) – Whether to preserve dtype of transformed images. If False is given then the resulting type will be np.float.
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
- multiply(image, multiplier=1.0, clip=False, preserve_type=False, src=None, dst=None)[source]¶
Multiply each pixel by the given multiplier.
- Parameters
multiplier (float, sequence) –
clip (bool) – whether to force image’s pixels to be in [0, 255] or [0, 1.]
preserve_type (bool) – Whether to preserve dtype of transformed images. If False is given then the resulting type will be np.float.
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
- pad(image, *args, **kwargs)[source]¶
Calls PIL.ImageOps.expand.
For more details see http://pillow.readthedocs.io/en/stable/reference/ImageOps.html#PIL.ImageOps.expand
- Parameters
offset (sequence) – Size of the borders in pixels. The order is (left, top, right, bottom).
mode ({'const', 'wrap'}) – Filling mode
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
- property pipeline¶
Pipeline - a pipeline the batch is being used in
- posterize(image, bits=4)[source]¶
Posterizes image.
More concretely, it quantizes pixels’ values so that they have 2^bits colors.
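The quantization can be illustrated on a single channel value. This is a sketch assuming 8-bit channels and a bit-mask approach that keeps only the most significant bits, so each channel ends up with 2^bits distinct levels; posterize_value is a hypothetical helper.

```python
def posterize_value(value, bits=4):
    """Keep only the `bits` most significant bits of an 8-bit channel value."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be in [1, 8]")
    # Zero out the (8 - bits) least significant bits.
    mask = 0xFF & ~((1 << (8 - bits)) - 1)
    return value & mask


levels = sorted({posterize_value(v, bits=2) for v in range(256)})
print(levels)  # [0, 64, 128, 192]
```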
- put_on_background(image, background, origin, mask=None)[source]¶
Put an image on a background at given origin
- Parameters
background (PIL.Image, np.ndarray of np.uint8) – Blank background to put image on.
origin (sequence, str) – Location of the cropping box. See ImagesBatch._calc_origin() for details.
mask (None, PIL.Image, np.ndarray of np.uint8) – mask passed to PIL.Image.paste
Notes
Using ‘random’ origin with src as list with multiple elements will not result in same crop for each element, as origin will be sampled independently for each src element. To randomly sample same origin for a number of components, use R named expression for origin argument.
- property random¶
A random number generator numpy.random.Generator. Use it instead of np.random for reproducibility.
Examples
x = self.random.normal(0, 1)
- property random_seed¶
SeedSequence for random number generation
- resize(image, size, src=None, dst=None, *args, **kwargs)[source]¶
Calls image.resize(*args, **kwargs).
For more details see https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.resize
- Parameters
size (tuple) – the resulting size of the image. If one of the components of tuple is None, corresponding dimension will be proportionally resized.
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
- rotate(image, *args, **kwargs)[source]¶
Rotates an image.
kwargs are passed to PIL.Image.rotate
- Parameters
angle (Number) – In degrees counter clockwise.
resample (int) – Interpolation order
expand (bool) – Whether to expand the output to hold the whole image. Default is False.
center ((Number, Number)) – Center of rotation. Default is the center of the image.
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
- run_once(*args, **kwargs)¶
Init function for no parallelism. Useful for async action-methods (will wait till the method finishes).
- salt(image, p_noise=0.015, color=255, size=(1, 1))[source]¶
Set random pixels of an image to a given value.
Every pixel will be set to color value with probability p_noise.
- Parameters
p_noise (float) – Probability of salting a pixel.
color (float, int, sequence, callable) –
Color’s value.
int, float, sequence – value of color
callable – color is sampled for every chosen pixel (rules are the same as for int, float and sequence)
size (int, sequence of int, callable) –
Size of salt
int – square salt with side size
sequence – rectangular salt in the form (row, columns)
callable – size is sampled for every chosen pixel (rules are the same as for int and sequence)
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
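The per-pixel rule can be sketched for the simplest configuration: a constant color and a salt size of (1, 1). This is an illustration only: the salt function below is a hypothetical plain-Python helper, the image is a grayscale list of rows, and the seed argument is an assumption added to make the sketch reproducible.

```python
import random


def salt(image, p_noise=0.015, color=255, seed=None):
    """Return a copy of a 2D grayscale image with randomly salted pixels."""
    rng = random.Random(seed)
    # Each pixel is replaced by `color` independently with probability p_noise.
    return [
        [color if rng.random() < p_noise else value for value in row]
        for row in image
    ]


image = [[0, 0, 0], [0, 0, 0]]
print(salt(image, p_noise=1.0))  # [[255, 255, 255], [255, 255, 255]]
print(salt(image, p_noise=0.0))  # [[0, 0, 0], [0, 0, 0]]
```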
- save(*args, **kwargs)¶
Save batch data to a file (an alias for dump method)
- scale(image, factor, preserve_shape=False, origin='center', resample=0)[source]¶
Scale the content of each image in the batch.
Resulting shape is obtained as original_shape * factor.
- Parameters
factor (float, sequence) –
resulting shape is obtained as original_shape * factor
float - scale all axes with the given factor
sequence (factor_1, factor_2, …) - scale each axis with the given factor separately
preserve_shape (bool) – whether to preserve the shape of the image after scaling
origin (array-like, {'center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random'}) –
Relevant only if preserve_shape is True. If scale < 1, defines position of the scaled image with respect to the original one’s shape. If scale > 1, defines position of cropping box.
Can be one of:
’center’ - place the center of the input image on the center of the background and crop the input image accordingly.
’top_left’ - place the upper-left corner of the input image on the upper-left of the background and crop the input image accordingly.
’top_right’ - crop an image such that upper-right corners of an image and the cropping box coincide
’bottom_left’ - crop an image such that lower-left corners of an image and the cropping box coincide
’bottom_right’ - crop an image such that lower-right corners of an image and the cropping box coincide
’random’ - place the upper-left corner of the input image on the randomly sampled position in the background. Position is sampled uniformly such that there is no need for cropping.
array_like - sequence of ints or sequence of floats in [0, 1) interval; place the upper-left corner of the input image on the given position in the background. If origin is a sequence of floats in [0, 1), it defines a relative position of the origin in a valid region of image.
resample (int) – Parameter passed to PIL.Image.resize. Interpolation order
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
Notes
Using ‘random’ option for origin with src as list with multiple elements will not result in same crop for each element, as origin will be sampled independently for each src element. To randomly sample same origin for a number of components, use R named expression for origin argument.
- Returns
- Return type
self
- property size¶
int - number of items in the batch
- split_to_patches(ix, patch_shape, stride=1, drop_last=False, src='images', dst=None)[source]¶
Splits image to patches.
Small images with the same shape (patch_shape) are cropped from the original one with stride stride.
- Parameters
patch_shape (int, sequence) – Patch’s shape in the form (rows, columns). If int is given then patches have square shape.
stride (int, sequence) – Step of the moving window from which patches are cropped. If int is given then the same step is used along both axes.
drop_last (bool) – Whether to drop patches whose window covers area out of the image. If False is passed then these patches are cropped from the edge of an image. See more in tutorials.
src (str) – Component to get images from. Default is ‘images’.
dst (str) – Component to write images to. Default is ‘images’.
p (float) – Probability of applying the transform. Default is 1.
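The interplay of stride and drop_last can be sketched along a single axis. This is one plausible reading of the description, not the library's implementation: patch_starts is a hypothetical helper that returns the start offsets of the windows, and the extra edge-aligned window is added only when the regular windows leave the end of the axis uncovered.

```python
def patch_starts(length, patch, stride, drop_last=False):
    """Start offsets of patches of size `patch` along an axis of size `length`."""
    # Regular windows that fit entirely inside the axis.
    starts = list(range(0, length - patch + 1, stride))
    covered_end = starts[-1] + patch if starts else 0
    if not drop_last and length >= patch and covered_end < length:
        # Take the remaining patch from the edge instead of dropping it.
        starts.append(length - patch)
    return starts


print(patch_starts(10, 4, 4))                  # [0, 4, 6]
print(patch_starts(10, 4, 4, drop_last=True))  # [0, 4]
print(patch_starts(8, 4, 4))                   # [0, 4]
```

For a 2D image the same computation would be repeated for rows and columns, and patches taken at every combination of offsets.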
- to_array(comp, dtype=<class 'numpy.float32'>, channels='last', **kwargs)¶
Converts batch components to np.ndarray format
- ImagesBatch._calc_origin(image_shape, origin, background_shape)[source]¶
Calculate coordinate of the input image with respect to the background.
- Parameters
image_shape (sequence) – shape of the input image.
origin (array_like, sequence, {'center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random'}) –
- Position of the input image with respect to the background. Can be one of:
’center’ - place the center of the input image on the center of the background and crop the input image accordingly.
’top_left’ - place the upper-left corner of the input image on the upper-left of the background and crop the input image accordingly.
’top_right’ - crop an image such that upper-right corners of an image and the cropping box coincide
’bottom_left’ - crop an image such that lower-left corners of an image and the cropping box coincide
’bottom_right’ - crop an image such that lower-right corners of an image and the cropping box coincide
’random’ - place the upper-left corner of the input image on the randomly sampled position in the background. Position is sampled uniformly such that there is no need for cropping.
other - sequence of ints or sequence of floats in [0, 1) interval; place the upper-left corner of the input image on the given position in the background. If origin is a sequence of floats in [0, 1), it defines a relative position of the origin in a valid region of image.
background_shape (sequence) – shape of the background image.
- Returns
sequence
- Return type
calculated origin in the form (column, row)
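The named origins can be illustrated with a simplified calculation. This sketch rests on several assumptions that the text above does not confirm: sizes are given as (width, height), the image fits inside the background, and the 'random' option is left out. calc_origin is a hypothetical helper, not the method itself.

```python
def calc_origin(image_size, origin, background_size):
    """Return (column, row) of the image's upper-left corner on the background."""
    # Free space along each axis; zero if the image is larger than the background.
    free = [max(0, b - i) for b, i in zip(background_size, image_size)]
    named = {
        "top_left": (0, 0),
        "top_right": (free[0], 0),
        "bottom_left": (0, free[1]),
        "bottom_right": (free[0], free[1]),
        "center": (free[0] // 2, free[1] // 2),
    }
    if isinstance(origin, str):
        return named[origin]
    if all(isinstance(v, float) for v in origin):
        # Floats in [0, 1) are treated as a relative position in the valid region.
        return tuple(int(v * f) for v, f in zip(origin, free))
    # A sequence of ints is taken as an absolute position.
    return tuple(origin)


print(calc_origin((2, 2), "center", (10, 6)))        # (4, 2)
print(calc_origin((2, 2), "bottom_right", (10, 6)))  # (8, 4)
print(calc_origin((2, 2), (0.5, 0.5), (10, 6)))      # (4, 2)
```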