LPOT supports built-in preprocessing methods on different framework backends. Refer to the HelloWorld example for how to configure a transform in a dataloader.
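As a quick orientation before the per-backend tables, the sketch below shows how transforms typically nest under a dataloader in an LPOT yaml file. The dataset choice, path, and parameter values are illustrative placeholders, not prescriptions from this document:

```yaml
quantization:
  calibration:
    dataloader:
      dataset:
        ImageFolder:
          root: /path/to/calibration/dataset   # placeholder path
      transform:
        Resize:
          size: 256
        CenterCrop:
          size: 224
        Normalize:
          mean: [0.485, 0.456, 0.406]   # placeholder values
          std: [0.229, 0.224, 0.225]
```

Each key under `transform` is one of the transform names listed in the tables that follow.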
Transforms supported on the TensorFlow backend:

Transform | Parameters | Comments | Usage (in yaml file) |
---|---|---|---|
Resize(size, interpolation) | size (list or int): size of the result <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Resize the input image to the given size | Resize: <br> &nbsp;&nbsp;size: 256 <br> &nbsp;&nbsp;interpolation: bilinear |
CenterCrop(size) | size (list or int): size of the result | Crop the given image at the center to the given size | CenterCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 |
RandomResizedCrop(size, scale, ratio, interpolation) | size (list or int): size of the result <br> scale (tuple or list, default=(0.08, 1.0)): range of the size of the cropped area relative to the original size <br> ratio (tuple or list, default=(3. / 4., 4. / 3.)): range of the aspect ratio of the cropped area relative to the original aspect ratio <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest' | Crop the given image to a random size and aspect ratio | RandomResizedCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 <br> &nbsp;&nbsp;scale: [0.08, 1.0] <br> &nbsp;&nbsp;ratio: [3. / 4., 4. / 3.] <br> &nbsp;&nbsp;interpolation: bilinear |
Normalize(mean, std) | mean (list, default=[0.0]): means for each channel; if len(mean)=1, the mean is broadcast to each channel, otherwise its length should match the number of image channels <br> std (list, default=[1.0]): standard deviations for each channel; if len(std)=1, the std is broadcast to each channel, otherwise its length should match the number of image channels | Normalize an image with mean and standard deviation | Normalize: <br> &nbsp;&nbsp;mean: [0.0, 0.0, 0.0] <br> &nbsp;&nbsp;std: [1.0, 1.0, 1.0] |
RandomCrop(size) | size (list or int): size of the result | Crop the image at a random location to the given size | RandomCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 |
Compose(transform_list) | transform_list (list of Transform objects): list of transforms to compose | Compose several transforms together | If transforms are configured in a yaml file, LPOT automatically calls Compose to group them. In user code: <br> from lpot.experimental.data import TRANSFORMS <br> preprocess = TRANSFORMS(framework, 'preprocess') <br> resize = preprocess["Resize"](**args) <br> normalize = preprocess["Normalize"](**args) <br> compose = preprocess["Compose"]([resize, normalize]) <br> sample = compose(sample) # sample: (image, label) <br> (see the runnable sketch after this table) |
CropResize(x, y, width, height, size, interpolation) | x (int): left boundary of the cropping area <br> y (int): top boundary of the cropping area <br> width (int): width of the cropping area <br> height (int): height of the cropping area <br> size (list or int): resize to the new size after cropping <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest' and 'bicubic' | Crop the input image at the given location and resize it | CropResize: <br> &nbsp;&nbsp;x: 0 <br> &nbsp;&nbsp;y: 5 <br> &nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;size: [100, 100] # or size: 100 <br> &nbsp;&nbsp;interpolation: bilinear |
RandomHorizontalFlip() | None | Horizontally flip the given image randomly | RandomHorizontalFlip: {} |
RandomVerticalFlip() | None | Vertically flip the given image randomly | RandomVerticalFlip: {} |
DecodeImage() | None | Decode a JPEG-encoded image to a uint8 tensor | DecodeImage: {} |
EncodeJped() | None | Encode an image to a tensor of type string | EncodeJped: {} |
Transpose(perm) | perm (list): a permutation of the dimensions of the input image | Transpose the image according to perm | Transpose: <br> &nbsp;&nbsp;perm: [1, 2, 0] |
CropToBoundingBox(offset_height, offset_width, target_height, target_width) | offset_height (int): vertical coordinate of the top-left corner of the result in the input <br> offset_width (int): horizontal coordinate of the top-left corner of the result in the input <br> target_height (int): height of the result <br> target_width (int): width of the result | Crop an image to the specified bounding box | CropToBoundingBox: <br> &nbsp;&nbsp;offset_height: 10 <br> &nbsp;&nbsp;offset_width: 10 <br> &nbsp;&nbsp;target_height: 224 <br> &nbsp;&nbsp;target_width: 224 |
Cast(dtype) | dtype (str, default='float32'): the dtype to convert the image to | Convert the image to the given dtype | Cast: <br> &nbsp;&nbsp;dtype: float32 |
ToArray() | None | Convert a PIL Image to a numpy array | ToArray: {} |
Rescale() | None | Scale the values of the image to [0, 1] | Rescale: {} |
AlignImageChannel(dim) | dim (int): the channel number of the result image | Align image channels; currently only [H,W]->[H,W,dim], [H,W,4]->[H,W,3] and [H,W,3]->[H,W] are supported | AlignImageChannel: <br> &nbsp;&nbsp;dim: 3 |
ParseDecodeImagenet() | None | Parse features in an Example proto | ParseDecodeImagenet: {} |
ResizeCropImagenet(height, width, random_crop, resize_side, random_flip_left_right, mean_value, scale) | height (int): height of the result <br> width (int): width of the result <br> random_crop (bool, default=False): whether to randomly crop <br> resize_side (int, default=256): desired shape after the resize operation <br> random_flip_left_right (bool, default=False): whether to randomly flip left and right <br> mean_value (list, default=[0.0, 0.0, 0.0]): means for each channel <br> scale (float, default=1.0): std value | Combination of a series of transforms applicable to ImageNet images | ResizeCropImagenet: <br> &nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;random_crop: False <br> &nbsp;&nbsp;resize_side: 256 <br> &nbsp;&nbsp;random_flip_left_right: False <br> &nbsp;&nbsp;mean_value: [123.68, 116.78, 103.94] <br> &nbsp;&nbsp;scale: 0.017 |
QuantizedInput(dtype, scale) | dtype (str): desired image dtype, supports 'uint8', 'int8' <br> scale (float, default=None): scaling ratio of each point in the image | Convert the dtype of the input to quantize it | QuantizedInput: <br> &nbsp;&nbsp;dtype: 'uint8' |
LabelShift(label_shift) | label_shift (int, default=0): number of label shift | Convert label to label - label_shift | LabelShift: <br> &nbsp;&nbsp;label_shift: 0 |
BilinearImagenet(height, width, central_fraction, mean_value, scale) | height (int): height of the result <br> width (int): width of the result <br> central_fraction (float, default=0.875): fraction of the size to crop <br> mean_value (list, default=[0.0, 0.0, 0.0]): means for each channel <br> scale (float, default=1.0): std value | Combination of a series of transforms applicable to ImageNet images | BilinearImagenet: <br> &nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;central_fraction: 0.875 <br> &nbsp;&nbsp;mean_value: [0.0, 0.0, 0.0] <br> &nbsp;&nbsp;scale: 1.0 |
SquadV1(label_file, vocab_file, n_best_size, max_seq_length, max_query_length, max_answer_length, do_lower_case, doc_stride) | label_file (str): path of the label file <br> vocab_file (str): path of the vocabulary file <br> n_best_size (int, default=20): total number of n-best predictions to generate in the nbest_predictions.json output file <br> max_seq_length (int, default=384): maximum total input sequence length after WordPiece tokenization; longer sequences are truncated and shorter ones padded <br> max_query_length (int, default=64): maximum number of tokens for the question; longer questions are truncated to this length <br> max_answer_length (int, default=30): maximum length of an answer that can be generated; needed because the start and end predictions are not conditioned on one another <br> do_lower_case (bool, default=True): whether to lower-case the input text; should be True for uncased models and False for cased models <br> doc_stride (int, default=128): stride to take between chunks when splitting a long document into chunks | Postprocess the predictions of BERT on SQuAD | SquadV1: <br> &nbsp;&nbsp;label_file: /path/to/label_file <br> &nbsp;&nbsp;vocab_file: /path/to/vocab_file <br> &nbsp;&nbsp;n_best_size: 20 <br> &nbsp;&nbsp;max_seq_length: 384 <br> &nbsp;&nbsp;max_query_length: 64 <br> &nbsp;&nbsp;max_answer_length: 30 <br> &nbsp;&nbsp;do_lower_case: True <br> &nbsp;&nbsp;doc_stride: 128 |
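The Compose entry above can be expanded into a self-contained script. This is a minimal sketch assuming the framework string 'tensorflow' and a dummy numpy image; the TRANSFORMS registry, transform names, and parameters come from the table, while the specific mean and std values are placeholders:

```python
import numpy as np
from lpot.experimental.data import TRANSFORMS

# Fetch the preprocess transform registry for one backend
# ('tensorflow' is assumed here; use your model's framework string).
preprocess = TRANSFORMS('tensorflow', 'preprocess')

# Instantiate transforms with parameters from the table above.
resize = preprocess['Resize'](size=256)
normalize = preprocess['Normalize'](mean=[123.68, 116.78, 103.94],  # placeholder values
                                    std=[58.395, 57.12, 57.375])

# Compose groups the transforms; LPOT does this automatically for
# transforms configured in a yaml file.
compose = preprocess['Compose']([resize, normalize])

# Each transform consumes and returns an (image, label) pair.
image = np.random.rand(300, 300, 3).astype(np.float32)  # dummy image
image, label = compose((image, 0))
```

Note that LabelShift and SquadV1 in this table operate on predictions rather than inputs; in a yaml file they are typically configured under the evaluation accuracy `postprocess` field rather than under a dataloader's `transform` field.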
Transforms supported on the PyTorch backend:

Transform | Parameters | Comments | Usage (in yaml file) |
---|---|---|---|
Resize(size, interpolation) | size (list or int): size of the result <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Resize the input image to the given size | Resize: <br> &nbsp;&nbsp;size: 256 <br> &nbsp;&nbsp;interpolation: bilinear |
CenterCrop(size) | size (list or int): size of the result | Crop the given image at the center to the given size | CenterCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 |
RandomResizedCrop(size, scale, ratio, interpolation) | size (list or int): size of the result <br> scale (tuple or list, default=(0.08, 1.0)): range of the size of the cropped area relative to the original size <br> ratio (tuple or list, default=(3. / 4., 4. / 3.)): range of the aspect ratio of the cropped area relative to the original aspect ratio <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Crop the given image to a random size and aspect ratio | RandomResizedCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 <br> &nbsp;&nbsp;scale: [0.08, 1.0] <br> &nbsp;&nbsp;ratio: [3. / 4., 4. / 3.] <br> &nbsp;&nbsp;interpolation: bilinear |
Normalize(mean, std) | mean (list, default=[0.0]): means for each channel; if len(mean)=1, the mean is broadcast to each channel, otherwise its length should match the number of image channels <br> std (list, default=[1.0]): standard deviations for each channel; if len(std)=1, the std is broadcast to each channel, otherwise its length should match the number of image channels | Normalize an image with mean and standard deviation | Normalize: <br> &nbsp;&nbsp;mean: [0.0, 0.0, 0.0] <br> &nbsp;&nbsp;std: [1.0, 1.0, 1.0] |
RandomCrop(size) | size (list or int): size of the result | Crop the image at a random location to the given size | RandomCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 |
Compose(transform_list) | transform_list (list of Transform objects): list of transforms to compose | Compose several transforms together | If transforms are configured in a yaml file, LPOT automatically calls Compose to group them. In user code: <br> from lpot.experimental.data import TRANSFORMS <br> preprocess = TRANSFORMS(framework, 'preprocess') <br> resize = preprocess["Resize"](**args) <br> normalize = preprocess["Normalize"](**args) <br> compose = preprocess["Compose"]([resize, normalize]) <br> sample = compose(sample) # sample: (image, label) |
RandomHorizontalFlip() | None | Horizontally flip the given image randomly | RandomHorizontalFlip: {} |
RandomVerticalFlip() | None | Vertically flip the given image randomly | RandomVerticalFlip: {} |
Transpose(perm) | perm (list): a permutation of the dimensions of the input image | Transpose the image according to perm | Transpose: <br> &nbsp;&nbsp;perm: [1, 2, 0] |
CropToBoundingBox(offset_height, offset_width, target_height, target_width) | offset_height (int): vertical coordinate of the top-left corner of the result in the input <br> offset_width (int): horizontal coordinate of the top-left corner of the result in the input <br> target_height (int): height of the result <br> target_width (int): width of the result | Crop an image to the specified bounding box | CropToBoundingBox: <br> &nbsp;&nbsp;offset_height: 10 <br> &nbsp;&nbsp;offset_width: 10 <br> &nbsp;&nbsp;target_height: 224 <br> &nbsp;&nbsp;target_width: 224 |
ToTensor() | None | Convert a PIL Image or numpy.ndarray to a tensor | ToTensor: {} |
ToPILImage() | None | Convert a tensor or an ndarray to a PIL Image | ToPILImage: {} |
Pad(padding, fill, padding_mode) | padding (int or tuple or list): padding on each border <br> fill (int or str or tuple): pixel fill value for constant fill, default is 0 <br> padding_mode (str): type of padding, should be constant, edge, reflect or symmetric; default is constant | Pad the given image on all sides with the given "pad" value | Pad: <br> &nbsp;&nbsp;padding: 0 <br> &nbsp;&nbsp;fill: 0 <br> &nbsp;&nbsp;padding_mode: constant |
ColorJitter(brightness, contrast, saturation, hue) | brightness (float or tuple of float (min, max)): how much to jitter brightness, default is 0 <br> contrast (float or tuple of float (min, max)): how much to jitter contrast, default is 0 <br> saturation (float or tuple of float (min, max)): how much to jitter saturation, default is 0 <br> hue (float or tuple of float (min, max)): how much to jitter hue, default is 0 | Randomly change the brightness, contrast, saturation and hue of an image | ColorJitter: <br> &nbsp;&nbsp;brightness: 0 <br> &nbsp;&nbsp;contrast: 0 <br> &nbsp;&nbsp;saturation: 0 <br> &nbsp;&nbsp;hue: 0 |
ToArray() | None | Convert a PIL Image to a numpy array | ToArray: {} |
CropResize(x, y, width, height, size, interpolation) | x (int): left boundary of the cropping area <br> y (int): top boundary of the cropping area <br> width (int): width of the cropping area <br> height (int): height of the cropping area <br> size (list or int): resize to the new size after cropping <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Crop the input image at the given location and resize it | CropResize: <br> &nbsp;&nbsp;x: 0 <br> &nbsp;&nbsp;y: 5 <br> &nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;size: [100, 100] # or size: 100 <br> &nbsp;&nbsp;interpolation: bilinear |
Cast(dtype) | dtype (str, default='float32'): the target data type | Convert the image to the given dtype | Cast: <br> &nbsp;&nbsp;dtype: float32 |
AlignImageChannel(dim) | dim (int): the channel number of the result image | Align image channels; currently only [H,W,4]->[H,W,3] and [H,W,3]->[H,W] are supported, and the input must be a PIL Image | AlignImageChannel: <br> &nbsp;&nbsp;dim: 3 |
Transforms supported on the MXNet backend:

Transform | Parameters | Comments | Usage (in yaml file) |
---|---|---|---|
Resize(size, interpolation) | size (list or int): size of the result <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Resize the input image to the given size | Resize: <br> &nbsp;&nbsp;size: 256 <br> &nbsp;&nbsp;interpolation: bilinear |
CenterCrop(size) | size (list or int): size of the result | Crop the given image at the center to the given size | CenterCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 |
RandomResizedCrop(size, scale, ratio, interpolation) | size (list or int): size of the result <br> scale (tuple or list, default=(0.08, 1.0)): range of the size of the cropped area relative to the original size <br> ratio (tuple or list, default=(3. / 4., 4. / 3.)): range of the aspect ratio of the cropped area relative to the original aspect ratio <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Crop the given image to a random size and aspect ratio | RandomResizedCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 <br> &nbsp;&nbsp;scale: [0.08, 1.0] <br> &nbsp;&nbsp;ratio: [3. / 4., 4. / 3.] <br> &nbsp;&nbsp;interpolation: bilinear |
Normalize(mean, std) | mean (list, default=[0.0]): means for each channel; if len(mean)=1, the mean is broadcast to each channel, otherwise its length should match the number of image channels <br> std (list, default=[1.0]): standard deviations for each channel; if len(std)=1, the std is broadcast to each channel, otherwise its length should match the number of image channels | Normalize an image with mean and standard deviation | Normalize: <br> &nbsp;&nbsp;mean: [0.0, 0.0, 0.0] <br> &nbsp;&nbsp;std: [1.0, 1.0, 1.0] |
RandomCrop(size) | size (list or int): size of the result | Crop the image at a random location to the given size | RandomCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 |
Compose(transform_list) | transform_list (list of Transform objects): list of transforms to compose | Compose several transforms together | If transforms are configured in a yaml file, LPOT automatically calls Compose to group them. In user code: <br> from lpot.experimental.data import TRANSFORMS <br> preprocess = TRANSFORMS(framework, 'preprocess') <br> resize = preprocess["Resize"](**args) <br> normalize = preprocess["Normalize"](**args) <br> compose = preprocess["Compose"]([resize, normalize]) <br> sample = compose(sample) # sample: (image, label) |
CropResize(x, y, width, height, size, interpolation) | x (int): left boundary of the cropping area <br> y (int): top boundary of the cropping area <br> width (int): width of the cropping area <br> height (int): height of the cropping area <br> size (list or int): resize to the new size after cropping <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Crop the input image at the given location and resize it | CropResize: <br> &nbsp;&nbsp;x: 0 <br> &nbsp;&nbsp;y: 5 <br> &nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;size: [100, 100] # or size: 100 <br> &nbsp;&nbsp;interpolation: bilinear |
RandomHorizontalFlip() | None | Horizontally flip the given image randomly | RandomHorizontalFlip: {} |
RandomVerticalFlip() | None | Vertically flip the given image randomly | RandomVerticalFlip: {} |
CropToBoundingBox(offset_height, offset_width, target_height, target_width) | offset_height (int): vertical coordinate of the top-left corner of the result in the input <br> offset_width (int): horizontal coordinate of the top-left corner of the result in the input <br> target_height (int): height of the result <br> target_width (int): width of the result | Crop an image to the specified bounding box | CropToBoundingBox: <br> &nbsp;&nbsp;offset_height: 10 <br> &nbsp;&nbsp;offset_width: 10 <br> &nbsp;&nbsp;target_height: 224 <br> &nbsp;&nbsp;target_width: 224 |
ToArray() | None | Convert an NDArray to a numpy array | ToArray: {} |
ToTensor() | None | Convert an image NDArray or a batch of image NDArrays to a tensor NDArray | ToTensor: {} |
Cast(dtype) | dtype (str, default='float32'): the target data type | Convert the image to the given dtype | Cast: <br> &nbsp;&nbsp;dtype: float32 |
Transpose(perm) | perm (list): a permutation of the dimensions of the input image | Transpose the image according to perm | Transpose: <br> &nbsp;&nbsp;perm: [1, 2, 0] |
AlignImageChannel(dim) | dim (int): the channel number of the result image | Align image channels; currently only [H,W]->[H,W,dim], [H,W,4]->[H,W,3] and [H,W,3]->[H,W] are supported | AlignImageChannel: <br> &nbsp;&nbsp;dim: 3 |
ToNDArray() | None | Convert a numpy array to an NDArray | ToNDArray: {} |
Transforms supported on the ONNX Runtime backend:

Transform | Parameters | Comments | Usage (in yaml file) |
---|---|---|---|
Resize(size, interpolation) | size (list or int): size of the result <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Resize the input image to the given size | Resize: <br> &nbsp;&nbsp;size: 256 <br> &nbsp;&nbsp;interpolation: bilinear |
CenterCrop(size) | size (list or int): size of the result | Crop the given image at the center to the given size | CenterCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 |
RandomResizedCrop(size, scale, ratio, interpolation) | size (list or int): size of the result <br> scale (tuple or list, default=(0.08, 1.0)): range of the size of the cropped area relative to the original size <br> ratio (tuple or list, default=(3. / 4., 4. / 3.)): range of the aspect ratio of the cropped area relative to the original aspect ratio <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest' | Crop the given image to a random size and aspect ratio | RandomResizedCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 <br> &nbsp;&nbsp;scale: [0.08, 1.0] <br> &nbsp;&nbsp;ratio: [3. / 4., 4. / 3.] <br> &nbsp;&nbsp;interpolation: bilinear |
Normalize(mean, std) | mean (list, default=[0.0]): means for each channel; if len(mean)=1, the mean is broadcast to each channel, otherwise its length should match the number of image channels <br> std (list, default=[1.0]): standard deviations for each channel; if len(std)=1, the std is broadcast to each channel, otherwise its length should match the number of image channels | Normalize an image with mean and standard deviation | Normalize: <br> &nbsp;&nbsp;mean: [0.0, 0.0, 0.0] <br> &nbsp;&nbsp;std: [1.0, 1.0, 1.0] |
RandomCrop(size) | size (list or int): size of the result | Crop the image at a random location to the given size | RandomCrop: <br> &nbsp;&nbsp;size: [10, 10] # or size: 10 |
Compose(transform_list) | transform_list (list of Transform objects): list of transforms to compose | Compose several transforms together | If transforms are configured in a yaml file, LPOT automatically calls Compose to group them. In user code: <br> from lpot.experimental.data import TRANSFORMS <br> preprocess = TRANSFORMS(framework, 'preprocess') <br> resize = preprocess["Resize"](**args) <br> normalize = preprocess["Normalize"](**args) <br> compose = preprocess["Compose"]([resize, normalize]) <br> sample = compose(sample) # sample: (image, label) |
CropResize(x, y, width, height, size, interpolation) | x (int): left boundary of the cropping area <br> y (int): top boundary of the cropping area <br> width (int): width of the cropping area <br> height (int): height of the cropping area <br> size (list or int): resize to the new size after cropping <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest' | Crop the input image at the given location and resize it | CropResize: <br> &nbsp;&nbsp;x: 0 <br> &nbsp;&nbsp;y: 5 <br> &nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;size: [100, 100] # or size: 100 <br> &nbsp;&nbsp;interpolation: bilinear |
RandomHorizontalFlip() | None | Horizontally flip the given image randomly | RandomHorizontalFlip: {} |
RandomVerticalFlip() | None | Vertically flip the given image randomly | RandomVerticalFlip: {} |
CropToBoundingBox(offset_height, offset_width, target_height, target_width) | offset_height (int): vertical coordinate of the top-left corner of the result in the input <br> offset_width (int): horizontal coordinate of the top-left corner of the result in the input <br> target_height (int): height of the result <br> target_width (int): width of the result | Crop an image to the specified bounding box | CropToBoundingBox: <br> &nbsp;&nbsp;offset_height: 10 <br> &nbsp;&nbsp;offset_width: 10 <br> &nbsp;&nbsp;target_height: 224 <br> &nbsp;&nbsp;target_width: 224 |
ToArray() | None | Convert a PIL Image to a numpy array | ToArray: {} |
Rescale() | None | Scale the values of the image to [0, 1] | Rescale: {} |
AlignImageChannel(dim) | dim (int): the channel number of the result image | Align image channels; currently only [H,W]->[H,W,dim], [H,W,4]->[H,W,3] and [H,W,3]->[H,W] are supported | AlignImageChannel: <br> &nbsp;&nbsp;dim: 3 |
ResizeCropImagenet(height, width, random_crop, resize_side, random_flip_left_right, mean_value, scale) | height (int): height of the result <br> width (int): width of the result <br> random_crop (bool, default=False): whether to randomly crop <br> resize_side (int, default=256): desired shape after the resize operation <br> random_flip_left_right (bool, default=False): whether to randomly flip left and right <br> mean_value (list, default=[0.0, 0.0, 0.0]): means for each channel <br> scale (float, default=1.0): std value | Combination of a series of transforms applicable to ImageNet images | ResizeCropImagenet: <br> &nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;random_crop: False <br> &nbsp;&nbsp;resize_side: 256 <br> &nbsp;&nbsp;random_flip_left_right: False <br> &nbsp;&nbsp;mean_value: [123.68, 116.78, 103.94] <br> &nbsp;&nbsp;scale: 0.017 |
Cast(dtype) | dtype (str, default='float32'): the target data type | Convert the image to the given dtype | Cast: <br> &nbsp;&nbsp;dtype: float32 |
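Since all four backends expose the same registry interface, switching backends is a matter of changing the framework string passed to TRANSFORMS. Below is a minimal sketch; the framework identifiers are assumptions based on LPOT's backend names and may differ across versions:

```python
from lpot.experimental.data import TRANSFORMS

# Assumed backend identifiers; verify against your LPOT version.
for framework in ('tensorflow', 'pytorch', 'mxnet', 'onnxrt_qlinearops'):
    preprocess = TRANSFORMS(framework, 'preprocess')
    # The same transform name resolves to a backend-specific implementation.
    resize = preprocess['Resize'](size=224)
    print(framework, type(resize).__name__)
```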