# Transform

LPOT supports built-in preprocessing methods on different framework backends. Refer to the HelloWorld example to see how to configure a transform in a dataloader.
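For example, transforms configured under a dataloader in a yaml file might look like the following. This is an illustrative sketch only: the surrounding section names and the dataset path are placeholders, and the exact yaml layout should be taken from the HelloWorld example.

```yaml
quantization:
  calibration:
    dataloader:
      dataset:
        ImageRecord:
          root: /path/to/calibration/dataset   # placeholder path
      transform:
        Resize:
          size: 256
        CenterCrop:
          size: 224
        Normalize:
          mean: [123.68, 116.78, 103.94]
```

LPOT automatically groups the listed transforms with `Compose` and applies them in order to each sample.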

## Transform support list

### TensorFlow

| Transform | Parameters | Comments | Usage (in yaml file) |
| :--- | :--- | :--- | :--- |
| Resize(size, interpolation) | size (list or int): size of the result <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Resize the input image to the given size | Resize: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: 256 <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| CenterCrop(size) | size (list or int): size of the result | Crop the given image at the center to the given size | CenterCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 |
| RandomResizedCrop(size, scale, ratio, interpolation) | size (list or int): size of the result <br> scale (tuple or list, default=(0.08, 1.0)): range of the cropped size, relative to the original size <br> ratio (tuple or list, default=(3. / 4., 4. / 3.)): range of the aspect ratio of the crop, relative to the original aspect ratio <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest' | Crop the given image to a random size and aspect ratio | RandomResizedCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;scale: [0.08, 1.0] <br> &nbsp;&nbsp;&nbsp;&nbsp;ratio: [3. / 4., 4. / 3.] <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| Normalize(mean, std) | mean (list, default=[0.0]): mean for each channel; if len(mean)=1, the mean is broadcast to every channel, otherwise its length must match the number of image channels <br> std (list, default=[1.0]): std for each channel; if len(std)=1, the std is broadcast to every channel, otherwise its length must match the number of image channels | Normalize an image with mean and standard deviation | Normalize: <br> &nbsp;&nbsp;&nbsp;&nbsp;mean: [0.0, 0.0, 0.0] <br> &nbsp;&nbsp;&nbsp;&nbsp;std: [1.0, 1.0, 1.0] |
| RandomCrop(size) | size (list or int): size of the result | Crop the image at a random location to the given size | RandomCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 |
| Compose(transform_list) | transform_list (list of Transform objects): list of transforms to compose | Compose several transforms together. When transforms are configured in a yaml file, LPOT automatically calls Compose to group them. | In user code: <br> `from lpot.experimental.data import TRANSFORMS` <br> `preprocess = TRANSFORMS(framework, 'preprocess')` <br> `resize = preprocess["Resize"](**args)` <br> `normalize = preprocess["Normalize"](**args)` <br> `compose = preprocess["Compose"]([resize, normalize])` <br> `sample = compose(sample)` <br> `# sample: image, label` |
| CropResize(x, y, width, height, size, interpolation) | x (int): left boundary of the cropping area <br> y (int): top boundary of the cropping area <br> width (int): width of the cropping area <br> height (int): height of the cropping area <br> size (list or int): size to resize to after cropping <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest' and 'bicubic' | Crop the input image at the given location and resize it | CropResize: <br> &nbsp;&nbsp;&nbsp;&nbsp;x: 0 <br> &nbsp;&nbsp;&nbsp;&nbsp;y: 5 <br> &nbsp;&nbsp;&nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [100, 100] # or size: 100 <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| RandomHorizontalFlip() | None | Horizontally flip the given image randomly | RandomHorizontalFlip: {} |
| RandomVerticalFlip() | None | Vertically flip the given image randomly | RandomVerticalFlip: {} |
| DecodeImage() | None | Decode a JPEG-encoded image to a uint8 tensor | DecodeImage: {} |
| EncodeJped() | None | Encode an image to a tensor of type string | EncodeJped: {} |
| Transpose(perm) | perm (list): a permutation of the dimensions of the input image | Transpose the image according to perm | Transpose: <br> &nbsp;&nbsp;&nbsp;&nbsp;perm: [1, 2, 0] |
| CropToBoundingBox(offset_height, offset_width, target_height, target_width) | offset_height (int): vertical coordinate of the top-left corner of the result in the input <br> offset_width (int): horizontal coordinate of the top-left corner of the result in the input <br> target_height (int): height of the result <br> target_width (int): width of the result | Crop an image to a specified bounding box | CropToBoundingBox: <br> &nbsp;&nbsp;&nbsp;&nbsp;offset_height: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;offset_width: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;target_height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;target_width: 224 |
| Cast(dtype) | dtype (str, default='float32'): the dtype to convert the image to | Convert the image to the given dtype | Cast: <br> &nbsp;&nbsp;&nbsp;&nbsp;dtype: float32 |
| ToArray() | None | Convert a PIL Image to a numpy array | ToArray: {} |
| Rescale() | None | Scale the values of the image to [0, 1] | Rescale: {} |
| AlignImageChannel(dim) | dim (int): the channel number of the result image | Align the image channels; currently only [H,W]->[H,W,dim], [H,W,4]->[H,W,3] and [H,W,3]->[H,W] are supported | AlignImageChannel: <br> &nbsp;&nbsp;&nbsp;&nbsp;dim: 3 |
| ParseDecodeImagenet() | None | Parse features in an Example proto | ParseDecodeImagenet: {} |
| ResizeCropImagenet(height, width, random_crop, resize_side, random_flip_left_right, mean_value, scale) | height (int): height of the result <br> width (int): width of the result <br> random_crop (bool, default=False): whether to crop at a random location <br> resize_side (int, default=256): desired shape after the resize operation <br> random_flip_left_right (bool, default=False): whether to randomly flip left and right <br> mean_value (list, default=[0.0, 0.0, 0.0]): mean for each channel <br> scale (float, default=1.0): std value | A combination of transforms applicable to ImageNet images | ResizeCropImagenet: <br> &nbsp;&nbsp;&nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;random_crop: False <br> &nbsp;&nbsp;&nbsp;&nbsp;resize_side: 256 <br> &nbsp;&nbsp;&nbsp;&nbsp;random_flip_left_right: False <br> &nbsp;&nbsp;&nbsp;&nbsp;mean_value: [123.68, 116.78, 103.94] <br> &nbsp;&nbsp;&nbsp;&nbsp;scale: 0.017 |
| QuantizedInput(dtype, scale) | dtype (str): desired image dtype, supports 'uint8', 'int8' <br> scale (float, default=None): scaling ratio of each point in the image | Convert the dtype of the input so it can be quantized | QuantizedInput: <br> &nbsp;&nbsp;&nbsp;&nbsp;dtype: 'uint8' |
| LabelShift(label_shift) | label_shift (int, default=0): number to shift labels by | Convert label to label - label_shift | LabelShift: <br> &nbsp;&nbsp;&nbsp;&nbsp;label_shift: 0 |
| BilinearImagenet(height, width, central_fraction, mean_value, scale) | height (int): height of the result <br> width (int): width of the result <br> central_fraction (float, default=0.875): fraction of the size to crop <br> mean_value (list, default=[0.0, 0.0, 0.0]): mean for each channel <br> scale (float, default=1.0): std value | A combination of transforms applicable to ImageNet images | BilinearImagenet: <br> &nbsp;&nbsp;&nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;central_fraction: 0.875 <br> &nbsp;&nbsp;&nbsp;&nbsp;mean_value: [0.0, 0.0, 0.0] <br> &nbsp;&nbsp;&nbsp;&nbsp;scale: 1.0 |
| SquadV1(label_file, vocab_file, n_best_size, max_seq_length, max_query_length, max_answer_length, do_lower_case, doc_stride) | label_file (str): path of the label file <br> vocab_file (str): path of the vocabulary file <br> n_best_size (int, default=20): total number of n-best predictions to generate in the nbest_predictions.json output file <br> max_seq_length (int, default=384): maximum total input sequence length after WordPiece tokenization; longer sequences are truncated and shorter sequences are padded <br> max_query_length (int, default=64): maximum number of tokens for the question; longer questions are truncated to this length <br> max_answer_length (int, default=30): maximum length of an answer that can be generated; needed because the start and end predictions are not conditioned on one another <br> do_lower_case (bool, default=True): whether to lowercase the input text; should be True for uncased models and False for cased models <br> doc_stride (int, default=128): how much stride to take between chunks when splitting a long document | Postprocess the predictions of BERT on SQuAD | SquadV1: <br> &nbsp;&nbsp;&nbsp;&nbsp;label_file: /path/to/label_file <br> &nbsp;&nbsp;&nbsp;&nbsp;vocab_file: /path/to/vocab_file <br> &nbsp;&nbsp;&nbsp;&nbsp;n_best_size: 20 <br> &nbsp;&nbsp;&nbsp;&nbsp;max_seq_length: 384 <br> &nbsp;&nbsp;&nbsp;&nbsp;max_query_length: 64 <br> &nbsp;&nbsp;&nbsp;&nbsp;max_answer_length: 30 <br> &nbsp;&nbsp;&nbsp;&nbsp;do_lower_case: True <br> &nbsp;&nbsp;&nbsp;&nbsp;doc_stride: 128 |
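Normalize's broadcasting rule (a length-1 mean/std applies to every channel) can be sketched as follows. This is a simplified pure-Python illustration of the semantics, not the LPOT implementation, which operates on framework tensors:

```python
def normalize_pixel(pixel, mean=(0.0,), std=(1.0,)):
    """Normalize one multi-channel pixel: (value - mean) / std per channel.

    If mean/std have length 1, the single value is broadcast to every
    channel; otherwise their length must match the channel count.
    """
    channels = len(pixel)
    mean = list(mean) * channels if len(mean) == 1 else list(mean)
    std = list(std) * channels if len(std) == 1 else list(std)
    assert len(mean) == channels and len(std) == channels
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]

# Length-1 mean/std broadcast across all three channels:
print(normalize_pixel([10.0, 20.0, 30.0], mean=[10.0], std=[2.0]))  # → [0.0, 5.0, 10.0]
```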

### PyTorch

| Transform | Parameters | Comments | Usage (in yaml file) |
| :--- | :--- | :--- | :--- |
| Resize(size, interpolation) | size (list or int): size of the result <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Resize the input image to the given size | Resize: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: 256 <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| CenterCrop(size) | size (list or int): size of the result | Crop the given image at the center to the given size | CenterCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 |
| RandomResizedCrop(size, scale, ratio, interpolation) | size (list or int): size of the result <br> scale (tuple or list, default=(0.08, 1.0)): range of the cropped size, relative to the original size <br> ratio (tuple or list, default=(3. / 4., 4. / 3.)): range of the aspect ratio of the crop, relative to the original aspect ratio <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Crop the given image to a random size and aspect ratio | RandomResizedCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;scale: [0.08, 1.0] <br> &nbsp;&nbsp;&nbsp;&nbsp;ratio: [3. / 4., 4. / 3.] <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| Normalize(mean, std) | mean (list, default=[0.0]): mean for each channel; if len(mean)=1, the mean is broadcast to every channel, otherwise its length must match the number of image channels <br> std (list, default=[1.0]): std for each channel; if len(std)=1, the std is broadcast to every channel, otherwise its length must match the number of image channels | Normalize an image with mean and standard deviation | Normalize: <br> &nbsp;&nbsp;&nbsp;&nbsp;mean: [0.0, 0.0, 0.0] <br> &nbsp;&nbsp;&nbsp;&nbsp;std: [1.0, 1.0, 1.0] |
| RandomCrop(size) | size (list or int): size of the result | Crop the image at a random location to the given size | RandomCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 |
| Compose(transform_list) | transform_list (list of Transform objects): list of transforms to compose | Compose several transforms together. When transforms are configured in a yaml file, LPOT automatically calls Compose to group them. | In user code: <br> `from lpot.experimental.data import TRANSFORMS` <br> `preprocess = TRANSFORMS(framework, 'preprocess')` <br> `resize = preprocess["Resize"](**args)` <br> `normalize = preprocess["Normalize"](**args)` <br> `compose = preprocess["Compose"]([resize, normalize])` <br> `sample = compose(sample)` <br> `# sample: image, label` |
| RandomHorizontalFlip() | None | Horizontally flip the given image randomly | RandomHorizontalFlip: {} |
| RandomVerticalFlip() | None | Vertically flip the given image randomly | RandomVerticalFlip: {} |
| Transpose(perm) | perm (list): a permutation of the dimensions of the input image | Transpose the image according to perm | Transpose: <br> &nbsp;&nbsp;&nbsp;&nbsp;perm: [1, 2, 0] |
| CropToBoundingBox(offset_height, offset_width, target_height, target_width) | offset_height (int): vertical coordinate of the top-left corner of the result in the input <br> offset_width (int): horizontal coordinate of the top-left corner of the result in the input <br> target_height (int): height of the result <br> target_width (int): width of the result | Crop an image to a specified bounding box | CropToBoundingBox: <br> &nbsp;&nbsp;&nbsp;&nbsp;offset_height: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;offset_width: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;target_height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;target_width: 224 |
| ToTensor() | None | Convert a PIL Image or numpy.ndarray to a tensor | ToTensor: {} |
| ToPILImage() | None | Convert a tensor or an ndarray to a PIL Image | ToPILImage: {} |
| Pad(padding, fill, padding_mode) | padding (int or tuple or list): padding on each border <br> fill (int or str or tuple): pixel fill value for constant fill, default is 0 <br> padding_mode (str): type of padding, one of constant, edge, reflect or symmetric; default is constant | Pad the given image on all sides with the given "pad" value | Pad: <br> &nbsp;&nbsp;&nbsp;&nbsp;padding: 0 <br> &nbsp;&nbsp;&nbsp;&nbsp;fill: 0 <br> &nbsp;&nbsp;&nbsp;&nbsp;padding_mode: constant |
| ColorJitter(brightness, contrast, saturation, hue) | brightness (float or tuple of float (min, max)): how much to jitter brightness, default is 0 <br> contrast (float or tuple of float (min, max)): how much to jitter contrast, default is 0 <br> saturation (float or tuple of float (min, max)): how much to jitter saturation, default is 0 <br> hue (float or tuple of float (min, max)): how much to jitter hue, default is 0 | Randomly change the brightness, contrast, saturation and hue of an image | ColorJitter: <br> &nbsp;&nbsp;&nbsp;&nbsp;brightness: 0 <br> &nbsp;&nbsp;&nbsp;&nbsp;contrast: 0 <br> &nbsp;&nbsp;&nbsp;&nbsp;saturation: 0 <br> &nbsp;&nbsp;&nbsp;&nbsp;hue: 0 |
| ToArray() | None | Convert a PIL Image to a numpy array | ToArray: {} |
| CropResize(x, y, width, height, size, interpolation) | x (int): left boundary of the cropping area <br> y (int): top boundary of the cropping area <br> width (int): width of the cropping area <br> height (int): height of the cropping area <br> size (list or int): size to resize to after cropping <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Crop the input image at the given location and resize it | CropResize: <br> &nbsp;&nbsp;&nbsp;&nbsp;x: 0 <br> &nbsp;&nbsp;&nbsp;&nbsp;y: 5 <br> &nbsp;&nbsp;&nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [100, 100] # or size: 100 <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| Cast(dtype) | dtype (str, default='float32'): the target data type | Convert the image to the given dtype | Cast: <br> &nbsp;&nbsp;&nbsp;&nbsp;dtype: float32 |
| AlignImageChannel(dim) | dim (int): the channel number of the result image | Align the image channels; currently only [H,W,4]->[H,W,3] and [H,W,3]->[H,W] are supported, and the input must be a PIL Image | AlignImageChannel: <br> &nbsp;&nbsp;&nbsp;&nbsp;dim: 3 |
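Transpose's `perm` argument reorders image dimensions; for example, `perm: [2, 0, 1]` turns an HWC image into CHW. A minimal pure-Python sketch of the idea on a nested-list "image" (the actual transform operates on tensors or arrays):

```python
def transpose_image(image, perm):
    """Reorder the axes of a 3-D nested-list image according to perm."""
    dims = [len(image), len(image[0]), len(image[0][0])]
    new_dims = [dims[p] for p in perm]
    out = [[[None] * new_dims[2] for _ in range(new_dims[1])]
           for _ in range(new_dims[0])]
    for i in range(dims[0]):
        for j in range(dims[1]):
            for k in range(dims[2]):
                idx = (i, j, k)
                # Axis p of the input becomes position perm.index(p) of the output.
                out[idx[perm[0]]][idx[perm[1]]][idx[perm[2]]] = image[i][j][k]
    return out

# A 2x2 image with 3 channels (HWC) becomes 3 channel planes of 2x2 (CHW):
hwc = [[[1, 2, 3], [4, 5, 6]],
       [[7, 8, 9], [10, 11, 12]]]
chw = transpose_image(hwc, [2, 0, 1])  # chw[0] is the first channel plane
```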

### MXNet

| Transform | Parameters | Comments | Usage (in yaml file) |
| :--- | :--- | :--- | :--- |
| Resize(size, interpolation) | size (list or int): size of the result <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Resize the input image to the given size | Resize: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: 256 <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| CenterCrop(size) | size (list or int): size of the result | Crop the given image at the center to the given size | CenterCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 |
| RandomResizedCrop(size, scale, ratio, interpolation) | size (list or int): size of the result <br> scale (tuple or list, default=(0.08, 1.0)): range of the cropped size, relative to the original size <br> ratio (tuple or list, default=(3. / 4., 4. / 3.)): range of the aspect ratio of the crop, relative to the original aspect ratio <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Crop the given image to a random size and aspect ratio | RandomResizedCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;scale: [0.08, 1.0] <br> &nbsp;&nbsp;&nbsp;&nbsp;ratio: [3. / 4., 4. / 3.] <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| Normalize(mean, std) | mean (list, default=[0.0]): mean for each channel; if len(mean)=1, the mean is broadcast to every channel, otherwise its length must match the number of image channels <br> std (list, default=[1.0]): std for each channel; if len(std)=1, the std is broadcast to every channel, otherwise its length must match the number of image channels | Normalize an image with mean and standard deviation | Normalize: <br> &nbsp;&nbsp;&nbsp;&nbsp;mean: [0.0, 0.0, 0.0] <br> &nbsp;&nbsp;&nbsp;&nbsp;std: [1.0, 1.0, 1.0] |
| RandomCrop(size) | size (list or int): size of the result | Crop the image at a random location to the given size | RandomCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 |
| Compose(transform_list) | transform_list (list of Transform objects): list of transforms to compose | Compose several transforms together. When transforms are configured in a yaml file, LPOT automatically calls Compose to group them. | In user code: <br> `from lpot.experimental.data import TRANSFORMS` <br> `preprocess = TRANSFORMS(framework, 'preprocess')` <br> `resize = preprocess["Resize"](**args)` <br> `normalize = preprocess["Normalize"](**args)` <br> `compose = preprocess["Compose"]([resize, normalize])` <br> `sample = compose(sample)` <br> `# sample: image, label` |
| CropResize(x, y, width, height, size, interpolation) | x (int): left boundary of the cropping area <br> y (int): top boundary of the cropping area <br> width (int): width of the cropping area <br> height (int): height of the cropping area <br> size (list or int): size to resize to after cropping <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Crop the input image at the given location and resize it | CropResize: <br> &nbsp;&nbsp;&nbsp;&nbsp;x: 0 <br> &nbsp;&nbsp;&nbsp;&nbsp;y: 5 <br> &nbsp;&nbsp;&nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [100, 100] # or size: 100 <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| RandomHorizontalFlip() | None | Horizontally flip the given image randomly | RandomHorizontalFlip: {} |
| RandomVerticalFlip() | None | Vertically flip the given image randomly | RandomVerticalFlip: {} |
| CropToBoundingBox(offset_height, offset_width, target_height, target_width) | offset_height (int): vertical coordinate of the top-left corner of the result in the input <br> offset_width (int): horizontal coordinate of the top-left corner of the result in the input <br> target_height (int): height of the result <br> target_width (int): width of the result | Crop an image to a specified bounding box | CropToBoundingBox: <br> &nbsp;&nbsp;&nbsp;&nbsp;offset_height: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;offset_width: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;target_height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;target_width: 224 |
| ToArray() | None | Convert an NDArray to a numpy array | ToArray: {} |
| ToTensor() | None | Convert an image NDArray or batch of image NDArrays to a tensor NDArray | ToTensor: {} |
| Cast(dtype) | dtype (str, default='float32'): the target data type | Convert the image to the given dtype | Cast: <br> &nbsp;&nbsp;&nbsp;&nbsp;dtype: float32 |
| Transpose(perm) | perm (list): a permutation of the dimensions of the input image | Transpose the image according to perm | Transpose: <br> &nbsp;&nbsp;&nbsp;&nbsp;perm: [1, 2, 0] |
| AlignImageChannel(dim) | dim (int): the channel number of the result image | Align the image channels; currently only [H,W]->[H,W,dim], [H,W,4]->[H,W,3] and [H,W,3]->[H,W] are supported | AlignImageChannel: <br> &nbsp;&nbsp;&nbsp;&nbsp;dim: 3 |
| ToNDArray() | None | Convert a np.array to an NDArray | ToNDArray: {} |
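CropToBoundingBox keeps the `target_height` × `target_width` region whose top-left corner sits at (`offset_height`, `offset_width`). Sketched in pure Python on a nested-list image (illustrative only; the real transform works on framework arrays):

```python
def crop_to_bounding_box(image, offset_height, offset_width,
                         target_height, target_width):
    """Return the rows/columns of 'image' inside the bounding box."""
    return [row[offset_width:offset_width + target_width]
            for row in image[offset_height:offset_height + target_height]]

# Crop a 2x2 region starting at row 1, column 1 of a 4x4 "image":
image = [[r * 4 + c for c in range(4)] for r in range(4)]
print(crop_to_bounding_box(image, 1, 1, 2, 2))  # → [[5, 6], [9, 10]]
```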

### ONNXRT

| Transform | Parameters | Comments | Usage (in yaml file) |
| :--- | :--- | :--- | :--- |
| Resize(size, interpolation) | size (list or int): size of the result <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest', 'bicubic' | Resize the input image to the given size | Resize: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: 256 <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| CenterCrop(size) | size (list or int): size of the result | Crop the given image at the center to the given size | CenterCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 |
| RandomResizedCrop(size, scale, ratio, interpolation) | size (list or int): size of the result <br> scale (tuple or list, default=(0.08, 1.0)): range of the cropped size, relative to the original size <br> ratio (tuple or list, default=(3. / 4., 4. / 3.)): range of the aspect ratio of the crop, relative to the original aspect ratio <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest' | Crop the given image to a random size and aspect ratio | RandomResizedCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;scale: [0.08, 1.0] <br> &nbsp;&nbsp;&nbsp;&nbsp;ratio: [3. / 4., 4. / 3.] <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| Normalize(mean, std) | mean (list, default=[0.0]): mean for each channel; if len(mean)=1, the mean is broadcast to every channel, otherwise its length must match the number of image channels <br> std (list, default=[1.0]): std for each channel; if len(std)=1, the std is broadcast to every channel, otherwise its length must match the number of image channels | Normalize an image with mean and standard deviation | Normalize: <br> &nbsp;&nbsp;&nbsp;&nbsp;mean: [0.0, 0.0, 0.0] <br> &nbsp;&nbsp;&nbsp;&nbsp;std: [1.0, 1.0, 1.0] |
| RandomCrop(size) | size (list or int): size of the result | Crop the image at a random location to the given size | RandomCrop: <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [10, 10] # or size: 10 |
| Compose(transform_list) | transform_list (list of Transform objects): list of transforms to compose | Compose several transforms together. When transforms are configured in a yaml file, LPOT automatically calls Compose to group them. | In user code: <br> `from lpot.experimental.data import TRANSFORMS` <br> `preprocess = TRANSFORMS(framework, 'preprocess')` <br> `resize = preprocess["Resize"](**args)` <br> `normalize = preprocess["Normalize"](**args)` <br> `compose = preprocess["Compose"]([resize, normalize])` <br> `sample = compose(sample)` <br> `# sample: image, label` |
| CropResize(x, y, width, height, size, interpolation) | x (int): left boundary of the cropping area <br> y (int): top boundary of the cropping area <br> width (int): width of the cropping area <br> height (int): height of the cropping area <br> size (list or int): size to resize to after cropping <br> interpolation (str, default='bilinear'): desired interpolation type, supports 'bilinear', 'nearest' | Crop the input image at the given location and resize it | CropResize: <br> &nbsp;&nbsp;&nbsp;&nbsp;x: 0 <br> &nbsp;&nbsp;&nbsp;&nbsp;y: 5 <br> &nbsp;&nbsp;&nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;size: [100, 100] # or size: 100 <br> &nbsp;&nbsp;&nbsp;&nbsp;interpolation: bilinear |
| RandomHorizontalFlip() | None | Horizontally flip the given image randomly | RandomHorizontalFlip: {} |
| RandomVerticalFlip() | None | Vertically flip the given image randomly | RandomVerticalFlip: {} |
| CropToBoundingBox(offset_height, offset_width, target_height, target_width) | offset_height (int): vertical coordinate of the top-left corner of the result in the input <br> offset_width (int): horizontal coordinate of the top-left corner of the result in the input <br> target_height (int): height of the result <br> target_width (int): width of the result | Crop an image to a specified bounding box | CropToBoundingBox: <br> &nbsp;&nbsp;&nbsp;&nbsp;offset_height: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;offset_width: 10 <br> &nbsp;&nbsp;&nbsp;&nbsp;target_height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;target_width: 224 |
| ToArray() | None | Convert a PIL Image to a numpy array | ToArray: {} |
| Rescale() | None | Scale the values of the image to [0, 1] | Rescale: {} |
| AlignImageChannel(dim) | dim (int): the channel number of the result image | Align the image channels; currently only [H,W]->[H,W,dim], [H,W,4]->[H,W,3] and [H,W,3]->[H,W] are supported | AlignImageChannel: <br> &nbsp;&nbsp;&nbsp;&nbsp;dim: 3 |
| ResizeCropImagenet(height, width, random_crop, resize_side, random_flip_left_right, mean_value, scale) | height (int): height of the result <br> width (int): width of the result <br> random_crop (bool, default=False): whether to crop at a random location <br> resize_side (int, default=256): desired shape after the resize operation <br> random_flip_left_right (bool, default=False): whether to randomly flip left and right <br> mean_value (list, default=[0.0, 0.0, 0.0]): mean for each channel <br> scale (float, default=1.0): std value | A combination of transforms applicable to ImageNet images | ResizeCropImagenet: <br> &nbsp;&nbsp;&nbsp;&nbsp;height: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;width: 224 <br> &nbsp;&nbsp;&nbsp;&nbsp;random_crop: False <br> &nbsp;&nbsp;&nbsp;&nbsp;resize_side: 256 <br> &nbsp;&nbsp;&nbsp;&nbsp;random_flip_left_right: False <br> &nbsp;&nbsp;&nbsp;&nbsp;mean_value: [123.68, 116.78, 103.94] <br> &nbsp;&nbsp;&nbsp;&nbsp;scale: 0.017 |
| Cast(dtype) | dtype (str, default='float32'): the target data type | Convert the image to the given dtype | Cast: <br> &nbsp;&nbsp;&nbsp;&nbsp;dtype: float32 |
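Conceptually, Compose (available on every backend above) just applies each transform in order. A minimal sketch of the chaining, not the LPOT class itself:

```python
class ComposeSketch:
    """Apply a list of callables in sequence, like LPOT's Compose."""

    def __init__(self, transform_list):
        self.transform_list = transform_list

    def __call__(self, sample):
        # Each transform receives the output of the previous one.
        for transform in self.transform_list:
            sample = transform(sample)
        return sample

# Two toy "transforms" stand in for Resize, Normalize, etc.:
pipeline = ComposeSketch([lambda x: x + 1, lambda x: x * 2])
print(pipeline(3))  # → 8
```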