Thursday, December 12, 2019
Optical Character Recognition for Cursive Handwriting free essay sample
In this paper, a new analytic scheme, which uses a sequence of segmentation and recognition algorithms, is proposed for the offline cursive handwriting recognition problem. First, some global parameters, such as slant angle, baselines, and stroke width and height, are estimated. Second, a segmentation method finds character segmentation paths by combining gray-scale and binary information. Third, a Hidden Markov Model (HMM) is employed for shape recognition to label and rank the character candidates. For this purpose, a string of codes is extracted from each segment to represent the character candidates. The estimation of the feature space parameters is embedded in the HMM training stage together with the estimation of the HMM model parameters. Finally, the lexicon information and HMM ranks are combined in a graph optimization problem for word-level recognition. This method corrects most of the errors produced by the segmentation and HMM ranking stages by maximizing an information measure in an efficient graph search algorithm. The experiments indicate higher recognition rates compared to the available methods reported in the literature.
Index Terms: Handwritten word recognition, preprocessing, segmentation, optical character recognition, cursive handwriting, hidden Markov model, search, graph, lexicon matching.
1 INTRODUCTION
The most difficult problem in the field of Optical Character Recognition (OCR) is the recognition of unconstrained cursive handwriting. The present tools for modeling the almost infinitely many variations of human handwriting are not yet sufficient. The similarities of distinct character shapes, and the overlaps and interconnections of neighboring characters, further complicate the problem. Additionally, when observed in isolation, characters are often ambiguous and require context information to reduce the classification error.
Thus, current research aims at developing constrained systems for limited domain applications such as postal address reading [21], check sorting [8], tax reading [20], and office automation for text entry [7]. A well-defined lexicon plus a well-constrained syntax help provide a feasible solution to the problem [11].
Handwritten word recognition techniques use either holistic or analytic strategies for the training and recognition stages. Holistic strategies employ top-down approaches for recognizing the whole word, thus eliminating the segmentation problem [9]. In this strategy, global features extracted from the entire word image are used to recognize words from a limited-size lexicon. As the size of the lexicon gets larger, the complexity of the algorithms increases linearly due to the need for a larger search space and a more complex pattern representation. Additionally, the recognition rates decrease rapidly due to the decrease in between-class variances in the feature space.
The analytic strategies, on the other hand, employ bottom-up approaches, starting from the stroke or character level and going towards producing a meaningful text. Explicit [23] or implicit [16] segmentation of the word into characters or strokes is required for this strategy. With the cooperation of the segmentation stage, the problem is reduced to the recognition of simple isolated characters or strokes, which can be handled for an unlimited vocabulary. However, there is no segmentation algorithm available in the literature for correctly extracting the characters from a given word image. The popular techniques are based on over-segmenting the words and applying a search algorithm for grouping segments to make up characters [14], [10]. If a lexicon of limited size is given, dynamic programming is used to rank every word in the lexicon. The word with the highest rank is chosen as the recognition hypothesis.
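The over-segment-then-rank idea described above can be illustrated with a small dynamic program. The following is only a minimal sketch under stated assumptions, not the method of the cited systems: the function names (`align_word`, `rank_lexicon`, `toy_logp`), the `max_merge` limit, and the toy score table are all hypothetical, and a real system would obtain the character scores from a trained classifier.

```python
import math

NEG_INF = float("-inf")


def align_word(word, n_segments, char_logp, max_merge=3):
    """Best log-score for explaining all segments with the characters of `word`.

    char_logp(start, end, ch) is a hypothetical scorer: the log-probability that
    segments start..end-1, merged together, form the character ch. Each character
    may consume between 1 and max_merge consecutive segments.
    """
    # best[i][j]: best score using the first i characters and the first j segments
    best = [[NEG_INF] * (n_segments + 1) for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for i, ch in enumerate(word, start=1):
        for j in range(1, n_segments + 1):
            for k in range(1, min(max_merge, j) + 1):
                prev = best[i - 1][j - k]
                if prev == NEG_INF:
                    continue  # no valid alignment reaches this cell
                cand = prev + char_logp(j - k, j, ch)
                if cand > best[i][j]:
                    best[i][j] = cand
    return best[len(word)][n_segments]


def rank_lexicon(lexicon, n_segments, char_logp, max_merge=3):
    """Score every lexicon word (flat representation) and sort best first."""
    scored = [(align_word(w, n_segments, char_logp, max_merge), w) for w in lexicon]
    return sorted(scored, reverse=True)


# Toy scores (assumed, for illustration only): a word image cut into 4 segments,
# where the letter "a" was split into two segments by over-segmentation.
TOY_TABLE = {
    (0, 1, "c"): math.log(0.9),
    (1, 3, "a"): math.log(0.8),
    (3, 4, "t"): math.log(0.9),
    (0, 2, "d"): math.log(0.5),
    (2, 3, "o"): math.log(0.3),
}
FLOOR = math.log(1e-6)  # score for any (segment group, character) pair not listed


def toy_logp(start, end, ch):
    return TOY_TABLE.get((start, end, ch), FLOOR)


if __name__ == "__main__":
    for score, word in rank_lexicon(["dot", "cart", "cat"], 4, toy_logp):
        print(f"{word}: {score:.3f}")
```

Because every lexicon word is aligned separately, the cost of this sketch grows linearly with the lexicon size, which is the behavior of a flat lexicon representation.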
The complexity of the search process for this strategy also increases linearly with the lexicon size if a flat representation of the lexicon is used. More efficient representations, such as tries and hash tables, can be used to reduce the search space.
Applying preprocessing techniques to a given image may introduce unexpected distortions (closing loops, breaking characters, spurious branches, etc.) to the data, which may cause unrecoverable errors in the recognition system. Most of the existing character recognition systems threshold the gray-level image and normalize the slant angle and baseline skew in the preprocessing stage. Then, they employ the normalized binary image in the segmentation and recognition stages [10], [16], [3]. However, in some cases, normalization may severely deform the writing, generating improper character shapes. Furthermore, useful information is lost through the binarization of the gray-scale document image. In order to avoid the limitations of the binary image, some recent methods use the gray-level image [13]. There, however, the insignificant details suppress important shape information.
The scheme developed in this study employs an analytic approach on the gray-level image, which is supported by the binary image and a set of global features. The document image is not preprocessed for noise reduction and normalization. However, global parameters, such as the lower and upper baselines and the slant angle, are estimated and then incorporated to improve the accuracy of the segmentation and recognition stages. The scheme makes concurrent use of the binary and gray-level images to extract the maximum amount of information for both segmentation and recognition.
The authors are with the Computer Engineering Department, Middle East Technical University, Ankara, Turkey. E-mail: {nafiz, vural}@ceng.metu.edu.tr.
Fig. 1. System overview.
The segmentation algorithm proposed in this study segments the whole word into strokes, each of which corresponds mostly to a character or, rarely, to a portion of a character. Recognition of each segment is accomplished in three stages. In the first stage, characters are labeled in three classes as ascending, descending, and normal characters. In the second stage, a Hidden Markov Model (HMM) is employed for shape recognition. The features extracted from the strokes of each segment are fed to a left-right HMM. The parameters of the feature space are also estimated in the training stage of the HMM. Finally, an efficient word-level recognition algorithm resolves handwriting strings by combining lexicon information and the HMM probabilities.
The proposed system receives the gray-level word image as input, assuming that the segmentation of the input image into individual words has already been performed. Although the system is designed for cursive handwriting, the methodologies used in the system are easily applicable to machine-printed or hand-printed characters. The system overview is summarized by the block diagram in Fig. 1. The global parameter estimation, segmentation, and feature extraction stages employ both gray-level and binary images. The parameters for the HMM and the feature space are estimated by using the correctly segmented character images in training. These parameters are then used in feature extraction and HMM ranking of character segments. Finally, the word-level recognition algorithm maximizes an information measure, using the HMM probabilities and lexicon information, resulting in ASCII strings. If the input image consists of isolated characters, the segmentation stage is omitted.
Global Parameter Estimation. The output of the global parameter estimation stage is the set of word-level features, such as average stroke width/height, baselines, skew, and slant angles (see Section 3).
First Level Character Classification.
The baselines and character size information estimated in the HMM training stage are used to decide on the ascending and descending character thresholds in a given word image. The character size information contains the height-to-width ratios of ascending, descending, and normal characters (see Section 4).
Segmentation. Initially, the word image is divided into segmentation regions, each of which contains a segmentation path. Then, a search process finds the segmentation path in each region in order to split the connected characters. The algorithm performs the search process by combining the characteristics of the gray-scale and binary images. The proposed method slightly over-segments the word image (see Section 5).
Feature Extraction and HMM Training. Since the HMM is most successful in the recognition of one-dimensional strings of codes, it is critical to represent the two-dimensional information of character images as one-dimensional strings. A feature extraction scheme proposed by the authors of this study [1] is employed in this stage, where a set of directional skeletons is extracted by scanning a fixed-size window in various directions (see Section 6.1). HMM training is performed on the selected output of the segmentation stage for the estimation of both the HMM parameters and the parameters of the feature space. These parameters are composed of the character window size, the number of scanning directions, and the number of regions in each scanning direction. The parameters that give the maximum recognition rate for the training set are then used to form the feature space of the recognition stage (see Section 6.2).
HMM Ranking. Each string of codes extracted from a character segment is fed to the HMM recognizer.
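To make the ranking step concrete, the sketch below scores a string of codes against several discrete left-right HMMs with the standard scaled forward algorithm and sorts the character labels by log-likelihood. It is a minimal illustration under stated assumptions: the function names, the two-state toy models, and the two-symbol code alphabet are hypothetical and are not the trained models or feature codes of the paper.

```python
import math


def forward_log_likelihood(codes, start, trans, emit):
    """Log P(codes | model) for a discrete HMM, using the scaled forward algorithm.

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
                  (left-right model: zero whenever s < r)
    emit[s][c]  : probability of emitting code c in state s
    """
    if not codes:
        raise ValueError("codes must contain at least one symbol")
    n = len(start)
    alpha = [start[s] * emit[s][codes[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(codes)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][codes[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot generate this string
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


def rank_characters(codes, models):
    """Return (log-likelihood, label) pairs sorted from best to worst."""
    scored = [
        (forward_log_likelihood(codes, *params), label)
        for label, params in models.items()
    ]
    return sorted(scored, reverse=True)


# Toy left-right models (assumed values): state 0 can stay or advance, state 1 is final.
LEFT_RIGHT = [[0.5, 0.5], [0.0, 1.0]]
TOY_MODELS = {
    "a": ([1.0, 0.0], LEFT_RIGHT, [[0.9, 0.1], [0.1, 0.9]]),  # favors code 0, then 1
    "b": ([1.0, 0.0], LEFT_RIGHT, [[0.1, 0.9], [0.9, 0.1]]),  # favors code 1, then 0
}

if __name__ == "__main__":
    for score, label in rank_characters([0, 0, 1, 1], TOY_MODELS):
        print(f"{label}: {score:.3f}")
```

The ranked list, rather than only the top label, is what a word-level stage would combine with lexicon information.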