Reference guide¶
Data handling for input function¶
dataset_generator (csv_filename, params[, …]) |
Generates the dataset for the experiment. |
padding_inputs_width (image, target_shape, …) |
Given an input image, will pad it to return a target_shape size padded image. |
augment_data (image[, max_rotation, …]) |
Data augmentation on an image (padding, brightness, contrast, rotation) |
random_rotation (img[, max_rotation, crop, …]) |
Rotates an image with a random angle. |
Model definitions¶
ConvBlock (features, kernel_size, stride, …) |
Convolutional block class. |
get_model_train (parameters) |
Constructs the full model for training. |
get_model_inference (parameters[, weights_path]) |
Constructs the full model for prediction. |
get_crnn_output (input_images[, parameters]) |
Creates the CRNN network and returns it’s output. |
Config for training¶
Alphabet ([lookup_alphabet_file, blank_symbol]) |
Class for alphabet / symbols units. |
Params (**kwargs) |
Class for parameters of the model and the experiment |
import_params_from_json ([model_directory, …]) |
Read the exported json file with parameters of the experiment. |
Custom Callbacks¶
CustomSavingCallback (output_dir, saving_freq) |
Callback to save weights, architecture, and optimizer at the end of training. |
LRTensorBoard (log_dir, **kwargs) |
Adds learning rate to TensorBoard scalars. |
CustomLoaderCallback (loading_dir) |
Callback to load necessary weight and parameters for training, evaluation and prediction. |
CustomPredictionSaverCallback (output_dir, …) |
Callback to save prediction decoded outputs. |
Preprocessing data¶
data_preprocessing (params) |
Preporcesses the data for the experiment (training and evaluation data). |
preprocess_csv (csv_filename, parameters, …) |
Converts the original csv data to the format required by the experiment. |
-
tf_crnn.data_handler.
augment_data
(image, max_rotation=0.1, minimum_width=0)[source]¶ Data augmentation on an image (padding, brightness, contrast, rotation)
Parameters: - image (
MockObject
) – Tensor - max_rotation (
float
) – float, maximum permitted rotation (in radians) - minimum_width (
int
) – minimum width of image after data augmentation
Return type: MockObject
Returns: Tensor
- image (
-
tf_crnn.data_handler.
dataset_generator
(csv_filename, params, use_labels=True, batch_size=64, data_augmentation=False, num_epochs=None, shuffle=True)[source]¶ Generates the dataset for the experiment.
Parameters: - csv_filename (
Union
[List
[str
],str
]) – Path to csv file containing the data - params (
Params
) – parameters df the experiment (Params
) - use_labels (
bool
) – boolean to indicate dataset generation during training / evaluation (true) or prediction (false) - batch_size (
int
) – size of the generated batches - data_augmentation (
bool
) – whether to use data augmentation strategies or not - num_epochs (
Optional
[int
]) – number of epochs to repeat the dataset generation - shuffle (
bool
) – whether to suffle the data
Returns: tf.data.Dataset
- csv_filename (
-
tf_crnn.data_handler.
get_resized_width
(image, target_height, increment)[source]¶ Resizes the image according to target_height.
Parameters: - image (
MockObject
) – image to resize - target_height (
int
) – height of the resized image - increment (
int
) – reduction factor due to pooling between input width and output width, this makes sure that the final width will be a multiple of increment
Returns: resized image
- image (
-
tf_crnn.data_handler.
padding_inputs_width
(image, target_shape, increment)[source]¶ Given an input image, will pad it to return a target_shape size padded image. There are 3 cases:
- image width > target width : simple resizing to shrink the image
- image width >= 0.5*target width : pad the image
- image width < 0.5*target width : replicates the image segment and appends it
Parameters: - image (
MockObject
) – Tensor of shape [H,W,C] - target_shape (
Tuple
[int
,int
]) – final shape after padding [H, W] - increment (
int
) – reduction factor due to pooling between input width and output width, this makes sure that the final width will be a multiple of increment
Return type: Tuple
[MockObject
,MockObject
]Returns: (image padded, output width)
-
tf_crnn.data_handler.
random_rotation
(img, max_rotation=0.1, crop=True, minimum_width=0)[source]¶ Rotates an image with a random angle. See https://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders for formulae
Parameters: - img (
MockObject
) – Tensor - max_rotation (
float
) – maximum angle to rotate (radians) - crop (
bool
) – boolean to crop or not the image after rotation - minimum_width (
int
) – minimum width of image after data augmentation
Return type: MockObject
Returns: - img (
-
class
tf_crnn.config.
Alphabet
(lookup_alphabet_file=None, blank_symbol='$')[source]¶ Class for alphabet / symbols units.
Variables: - _blank_symbol (str) – Blank symbol used for CTC
- _alphabet_units (List[str]) – list of elements composing the alphabet. The units may be a single character or multiple characters.
- _codes (List[int]) – Each alphabet unit has a unique corresponding code.
- _nclasses (int) – number of alphabet units.
-
alphabet_units
¶
-
blank_symbol
¶
-
check_input_file_alphabet
(csv_filenames, discarded_chars=';|\t\n\r\x0b\x0c', csv_delimiter=';')[source]¶ Checks if labels of input files contains only characters that are in the Alphabet.
Parameters: - csv_filenames (
List
[str
]) – list of the csv filename - discarded_chars (
str
) – discarded characters - csv_delimiter (
str
) – character delimiting field in the csv file
Return type: None
Returns: - csv_filenames (
-
codes
¶
-
classmethod
create_lookup_from_labels
(csv_files, export_lookup_filename, original_lookup_filename=None)[source]¶ Create a lookup dictionary for csv files containing labels. Exports a json file with the Alphabet.
Parameters: - csv_files (
List
[str
]) – list of files to get the labels from (should be of format path;label) - export_lookup_filename (
str
) – filename to export alphabet lookup dictionary - original_lookup_filename (
Optional
[str
]) – original lookup filename to update (optional)
Returns: - csv_files (
-
classmethod
load_lookup_from_json
(json_filenames)[source]¶ Load a lookup table from a json file to a dictionnary :type json_filenames:
Union
[List
[str
],str
] :param json_filenames: either a filename or a list of filenames :rtype:dict
:return:
-
classmethod
make_json_lookup_alphabet
(string_chars=None)[source]¶ Parameters: string_chars ( Optional
[str
]) – for example string.ascii_letters, string.digitsReturn type: dict
Returns:
-
classmethod
map_lookup
(lookup_table, unique_entry=True)[source]¶ Converts an existing lookup table with minimal range code ([1, len(lookup_table)-1]) and avoids multiple instances of the same code label (bijectivity)
Parameters: - lookup_table (
dict
) – dictionary to be mapped {alphabet_unit : code label} - unique_entry (
bool
) – If each alphabet unit has a unique code and each code a unique alphabet unique (‘bijective’), only True is implemented for now
Return type: dict
Returns: a mapped dictionary
- lookup_table (
-
n_classes
¶
-
class
tf_crnn.config.
Params
(**kwargs)[source]¶ Class for parameters of the model and the experiment
Variables: - input_shape (Tuple[int, int]) – input shape of the image to batch (this is the shape after data augmentation). The original will either be resized or pad depending on its original size
- input_channels (int) – number of color channels for input image (default: 1)
- cnn_features_list (List(int)) – a list of length n_layers containing the number of features for each convolutionl layer (default: [16, 32, 64, 96, 128])
- cnn_kernel_size (List(int)) – a list of length n_layers containing the size of the kernel for each convolutionl layer (default: [3, 3, 3, 3, 3])
- cnn_stride_size (List((int, int))) – a list of length n_layers containing the stride size each convolutionl layer (default: [(1, 1), (1, 1), (1, 1), (1, 1), (1, 1)])
- cnn_pool_size (List((int, int))) – a list of length n_layers containing the pool size each MaxPool layer default: ([(2, 2), (2, 2), (2, 2), (2, 2), (1, 1)])
- cnn_batch_norm (List(bool)) – a list of length n_layers containing a bool that indicated wether or not to use batch normalization (default: [False, False, False, False, False])
- rnn_units (List(int)) – a list containing the number of units per rnn layer (default: 256)
- num_beam_paths (int) – number of paths (transcriptions) to return for ctc beam search (only used when predicting)
- csv_delimiter (str) – character to delimit csv input files (default: ‘;’)
- string_split_delimiter (str) – character that delimits each alphabet unit in the labels (default: ‘|’)
- csv_files_train (str) – csv filename which contains the (path;label) of each training sample
- csv_files_eval (str) – csv filename which contains the (path;label) of each eval sample
- lookup_alphabet_file (str) – json file that contains the mapping alphabet units <-> codes
- blank_symbol (str) – symbol for to be considered as blank by the CTC decoder (default: ‘$’)
- max_chars_per_string (int) – maximum characters per sample (to avoid CTC decoder errors) (default: 75)
- data_augmentation (bool) – if True augments data on the fly (default: true)
- data_augmentation_max_rotation (float) – max permitted roation to apply to image during training in radians (default: 0.005)
- data_augmentation_max_slant (float) – maximum angle for slant augmentation (default: 0.7)
- n_epochs (int) – numbers of epochs to run the training (default: 50)
- train_batch_size (int) – batch size during training (default: 64)
- eval_batch_size (int) – batch size during evaluation (default: 128)
- learning_rate (float) – initial learning rate (default: 1e-4)
- evaluate_every_epoch (int) – evaluate every ‘evaluate_every_epoch’ epoch (default: 5)
- save_interval (int) – save the model every ‘save_interval’ epoch (default: 20)
- optimizer (str) – which optimizer to use (‘adam’, ‘rms’, ‘ada’) (default: ‘adam’)
- output_model_dir (str) – output directory where the model will be saved and exported
- restore_model (bool) – boolean to continue training with saved weights (default: False)
-
classmethod
from_json_file
(json_file)[source]¶ Given a json file, creates a
Params
object.Parameters: json_file ( str
) – path to the json fileReturns: Params
object
-
tf_crnn.config.
import_params_from_json
(model_directory=None, json_filename=None)[source]¶ Read the exported json file with parameters of the experiment.
Parameters: - model_directory (
Optional
[str
]) – Direcoty where the odel was exported - json_filename (
Optional
[str
]) – filename of the file
Return type: dict
Returns: a dictionary containing the parameters of the experiment
- model_directory (
-
class
tf_crnn.model.
ConvBlock
(features, kernel_size, stride, cnn_padding, pool_size, batchnorm, **kwargs)[source]¶ Convolutional block class. It is composed of a Conv2D layer, a BatchNormaization layer (optional), a MaxPool2D layer (optional) and a ReLu activation.
Variables: - features (int) – number of features of the convolutional layer
- kernel_size (int) – size of the convolutional kernel
- stride (int, int) – stride of the convolutional layer
- cnn_padding – padding of the convolution (‘same’ or ‘valid’)
- pool_size (int, int) – size of the maxpooling
- batchnorm (bool) – use batch norm or not
-
tf_crnn.model.
get_crnn_output
(input_images, parameters=None)[source]¶ Creates the CRNN network and returns it’s output. Passes the input_images through the network and returns its output
Parameters: - input_images – images to process (B, H, W, C)
- parameters (
Optional
[Params
]) – parameters of the model (Params
)
Return type: MockObject
Returns: the output of the CRNN model
-
tf_crnn.model.
get_model_inference
(parameters, weights_path=None)[source]¶ Constructs the full model for prediction. Defines inputs and outputs, and loads the weights.
Parameters: - parameters (
Params
) – parameters of the model (Params
) - weights_path (
Optional
[str
]) – path to the weights (.h5 file)
Returns: the model (
tf.Keras.Model
)- parameters (
-
tf_crnn.model.
get_model_train
(parameters)[source]¶ Constructs the full model for training. Defines inputs and outputs, loss function and metric (CER).
Parameters: parameters ( Params
) – parameters of the model (Params
)Returns: the model ( tf.Keras.Model
)
-
class
tf_crnn.callbacks.
CustomLoaderCallback
(loading_dir)[source]¶ Callback to load necessary weight and parameters for training, evaluation and prediction.
Variables: loading_dir (str) – path to directory to save logs
-
class
tf_crnn.callbacks.
CustomPredictionSaverCallback
(output_dir, parameters)[source]¶ Callback to save prediction decoded outputs. This will save the decoded outputs into a file.
Variables: - output_dir (str) – path to directory to save logs
- parameters (Params) – parameters of the experiment (
Params
)
-
class
tf_crnn.callbacks.
CustomSavingCallback
(output_dir, saving_freq, save_best_only=False, keep_max_models=5)[source]¶ Callback to save weights, architecture, and optimizer at the end of training. Inspired by ModelCheckpoint.
Variables: - output_dir (str) – path to the folder where files will be saved
- saving_freq (int) – save every n epochs
- save_best_only (bool) – wether to save a model if it is best thant the last saving
- keep_max_models (int) – number of models to keep, the older ones will be deleted
-
class
tf_crnn.callbacks.
LRTensorBoard
(log_dir, **kwargs)[source]¶ Adds learning rate to TensorBoard scalars.
Variables: logdir (str) – path to directory to save logs
-
tf_crnn.preprocessing.
data_preprocessing
(params)[source]¶ Preporcesses the data for the experiment (training and evaluation data). Exports the updated csv files into <output_model_dir>/preprocessed/updated_{eval,train}.csv
Parameters: params ( Params
) – parameters of the experiment (Params
)Return type: (<class ‘str’>, <class ‘str’>, <class ‘int’>, <class ‘int’>) Returns: output path files, number of samples (for train and evaluation data)
-
tf_crnn.preprocessing.
preprocess_csv
(csv_filename, parameters, output_csv_filename)[source]¶ Converts the original csv data to the format required by the experiment. Removes the samples which labels have too many characters. Computes the widths of input images and removes the samples which have more characters per label than image width. Converts the string labels to dense codes. The output csv file contains the path to the image, the dense list of codes corresponding to the alphabets units (which are padded with 0 if len(label) < max_len) and the length of the label sequence.
Parameters: - csv_filename (
str
) – path to csv file - parameters (
Params
) – parameters of the experiment (Params
) - output_csv_filename (
str
) – path to the output csv file
Return type: int
Returns: number of samples in the output csv file
- csv_filename (