As I've said
here, I'm developing a LebGeek special anti-speeding detector. As a first step, I'm posting the algorithm I adopted for license plate extraction, processing, and character segmentation. The original image can be captured directly in grayscale or as an RGB image that is then converted to grayscale; either way, all processing is performed on the grayscale form. The image is either a frontal or a rear view of the car with the license plate visible.
In order to understand the code, it is imperative to explain the main concepts behind the algorithm's development. Further details can be found in the original IEEE publications. The sources are:
Main Algorithm Source
Title
“A License Plate-Recognition Algorithm for Intelligent Transportation System Applications”
Authors
“Christos Nikolaos E. Anagnostopoulos”
“Ioannis E. Anagnostopoulos”
“Vassili Loumos”
“Eleftherios Kayafas”
Publication Source
IEEE Transactions on Intelligent Transportation Systems, Vol. 7, No. 3, September 2006
Digital Object Identifier
10.1109/TITS.2006.880641
Slant Correction Algorithm
Title
“Slant Correction of Vehicle License Plate Based on Feature Point and Principal Component Analysis”
Authors
“Wu Guo-ping”
“Cheng Shi”
“Ao Min-si”
“Lei Hui”
Publication Source
2008 IEEE International Conference on Computer Science and Software Engineering
Digital Object Identifier
10.1109/CSSE.2008.1648
Explaining the Main Algorithm
The process, as implied in the title, involves three major parts: license plate extraction, license plate processing, and character segmentation of the license plate.
License Plate Extraction
As the above image indicates, there are 7 steps in the process of license plate recognition (LPR):

1. Sliding concentric window (SCW) segmentation: indicates the region within which the license plate (LP) lies.
2. Image masking: isolates the region of the image where the LP is located; other regions can also be falsely isolated.
3. Sauvola binarization (SB): applies thresholding over the isolated regions in a way that accounts for the ambient illumination source, like the sun or the camera flash. It precedes the next step and is used to remove undesirable regions that were falsely isolated.
4. Connected component analysis (CCA): identifies which of the isolated regions is actually the LP region.
5. Image inversion, if required: sometimes the later steps yield no result even though LPs do exist in the captured image. In that case the original grayscale image is inverted and the last 4 steps are repeated; 85% of the time the LPs are then detected correctly.
6. Object number tracking: some images contain multiple LPs, so each is tracked by an object number. This step serves no purpose in the anti-speeding detector in most cases.
7. Storing the coordinates of the LP boundaries: used to extract the LP from the original image. Note that the isolated regions are in binary form, while what we actually need is the grayscale form of the LP, to be used in the later license plate processing section.
Sliding Concentric Windows or SCW
SCW is an adaptive segmentation technique that identifies the LP region by means of local statistical measures. Review the above image for more details (please post questions if a certain point is not clear).
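To make the idea concrete, here is a minimal pure-Python sketch of the SCW principle: two concentric windows slide over the image, and a pixel is flagged when the ratio of a local statistic (standard deviation here) in the outer window to that in the inner window exceeds a threshold. The window sizes and the threshold T below are illustrative assumptions, not the paper's tuned values.

```python
# Sliding Concentric Windows (SCW), minimal sketch.
# The image is a list of lists of grayscale intensities (0-255).

def std_dev(vals):
    m = sum(vals) / len(vals)
    return (sum((v - m) ** 2 for v in vals) / len(vals)) ** 0.5

def window(img, y, x, half_h, half_w):
    """Collect the pixels of a (2*half_h+1) x (2*half_w+1) window, clipped to the image."""
    h, w = len(img), len(img[0])
    return [img[r][c]
            for r in range(max(0, y - half_h), min(h, y + half_h + 1))
            for c in range(max(0, x - half_w), min(w, x + half_w + 1))]

def scw_segment(img, inner=(1, 2), outer=(2, 4), T=1.2):
    """Flag pixels where the outer/inner standard-deviation ratio exceeds T.

    inner/outer are (half_height, half_width) of the two concentric windows;
    the exact sizes and T are assumptions for illustration.
    """
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s_in = std_dev(window(img, y, x, *inner))
            s_out = std_dev(window(img, y, x, *outer))
            if s_in > 0 and s_out / s_in > T:
                mask[y][x] = 1
    return mask
```

On a flat (textureless) region both deviations vanish and nothing is flagged, which is exactly why the technique responds to the busy texture of an LP's characters.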
Image Masking and Sauvola Binarization or SB
After performing the SCW segmentation, all the candidate regions (one of which contains the LP) are extracted from the original grayscale image. First, image masking is applied: a simple logical AND of the candidate regions with the original image, which yields a grayscale image where the candidate regions keep their grayscale values and the rest of the picture is pitch black. Then SB is applied. SB is a thresholding technique that adapts to ambient lighting: when an LP is subjected to extra light (e.g. the camera flash), its reflective material causes it to shine even in broad daylight. This allows the elimination of the falsely isolated regions where no LP exists. SB is also considered an edge detection method; edge detection reduces an image to a group of contours of the objects lying within it.
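For reference, Sauvola's local threshold is t = m · (1 + k · (s/R − 1)), where m and s are the mean and standard deviation of the local window, R is the dynamic range of the standard deviation (typically 128 for 8-bit images), and k is a tuning constant. A minimal sketch, with the window handling reduced to a flat list of pixels and k = 0.5 as an assumed setting:

```python
def sauvola_threshold(mean, std, k=0.5, R=128.0):
    """Sauvola's local threshold: t = m * (1 + k * (s/R - 1))."""
    return mean * (1 + k * (std / R - 1))

def binarize_window(pixels, k=0.5, R=128.0):
    """Binarize a small window of pixels against its own Sauvola threshold."""
    m = sum(pixels) / len(pixels)
    s = (sum((p - m) ** 2 for p in pixels) / len(pixels)) ** 0.5
    t = sauvola_threshold(m, s, k, R)
    return [1 if p > t else 0 for p in pixels]
```

Because the threshold is recomputed per window from local statistics, a brightly lit plate and a shadowed one both binarize cleanly, which is the property the algorithm relies on.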
Connected Component Analysis or CCA
After SB edge detection, we label the objects within the image to identify how many objects (in this case, LP candidates) exist within the isolated regions. The labeling is based on connected component analysis, where pixels lying within the neighborhood of a given pixel are recognized as belonging to the same object; this topic has been discussed already
here. 8-connectivity is adopted for this application. After connecting the pixels of each object together, we identify how many objects in the isolated regions match certain characteristics of an LP. (Here we assume that only Lebanese LPs are studied: this is our first limitation and it needs solving. Without this restriction, the remaining steps and sections become very complex; even the authors restricted their detection to Greek license plates.) In each country, LPs have specific parameters: a constant height-to-width ratio (within a marginal error), a characteristic orientation (not of the LP itself, but of the alphanumeric characters and other objects like the Lebanese cedar, which are uniquely oriented with respect to the borders), and a characteristic Euler number (the number of connected components minus the number of holes; since the count of alphanumeric characters on a Lebanese LP is fixed, its Euler number is also distinctive). These conditions are used to identify the LP objects among the isolated regions.
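The two building blocks, 8-connected labeling and a plate-likeness test, can be sketched as follows. The labeling is a standard breadth-first flood fill; the aspect-ratio bounds in `plate_like` are illustrative placeholders, not the calibrated values for Lebanese plates.

```python
from collections import deque

def label_components(mask):
    """8-connected component labeling of a binary mask (list of lists of 0/1).

    Returns {label: [(row, col), ...]} for each connected object.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    comps, next_label = {}, 1
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                q = deque([(y, x)])
                labels[y][x] = next_label
                pixels = []
                while q:
                    r, c = q.popleft()
                    pixels.append((r, c))
                    for dr in (-1, 0, 1):        # 8-neighborhood scan
                        for dc in (-1, 0, 1):
                            rr, cc = r + dr, c + dc
                            if 0 <= rr < h and 0 <= cc < w \
                                    and mask[rr][cc] and not labels[rr][cc]:
                                labels[rr][cc] = next_label
                                q.append((rr, cc))
                comps[next_label] = pixels
                next_label += 1
    return comps

def plate_like(pixels, ratio_range=(2.0, 6.0)):
    """Keep components whose bounding-box width/height ratio looks like a plate.

    The (2.0, 6.0) bounds are assumptions for illustration.
    """
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    hgt = max(ys) - min(ys) + 1
    wid = max(xs) - min(xs) + 1
    return ratio_range[0] <= wid / hgt <= ratio_range[1]
```

A full implementation would also check the orientation and Euler-number criteria described above before declaring a component to be the LP.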
An example of how the original image is reduced to after applying the aforementioned steps is:
Image Inversion and Object Number Tracking
Sometimes the surrounding luminosity is not enough (e.g. in the dark), so the reflective material of the LP shines brightly, which can lead to the false conclusion that the captured image contains no LP object. In that case the inverted grayscale image is treated as the original image instead, and the 4 steps mentioned so far are repeated on this inverted form. This also covers license plates with a dark background and white characters (like those of Qatar). The method is only effective 85% of the time, which is our second limitation. As for object number tracking, it simply recognizes exactly how many LPs exist within an image (I don't think this step is required for an anti-speeding detector).
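The inversion fallback is just a retry on the photographic negative. A minimal sketch, where `detect` stands in for the 4-step detection chain described above:

```python
def invert(img):
    """Photographic negative of an 8-bit grayscale image (list of lists)."""
    return [[255 - p for p in row] for row in img]

def detect_with_fallback(img, detect):
    """Run the detection steps; if nothing is found, retry on the inverted image.

    `detect` is any callable returning a (possibly empty) list of found plates;
    here it is a placeholder for the SCW -> masking -> SB -> CCA chain.
    """
    plates = detect(img)
    return plates if plates else detect(invert(img))
```

Note that this doubles the worst-case run time only for the images where the first pass fails, which is why the paper applies it conditionally rather than always.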
Storing LP Boundary Coordinates
Finally, the coordinates of the desired LPs are stored for later analysis (sections 2 and 3).
An image of how the algorithm can be applied for the LP with dark background and white characters is shown below:
For those of you who prefer a flowchart to reading, here you go: the flowchart of the LPR stage:
License Plate Processing
This stage is preceded by slant correction, which for now will not be explained or implemented; I will do that later on, but rest assured that it will be explained before posting the code and the results of the implementation on Lebanese LPs.
This section involves 8 steps:

1. Specifying the sub-image: to process each LP alone.
2. Cropping the sub-image: to resize the extracted regions into a standard form before conducting further analysis.
3. SCW segmentation: performed on the alphanumeric characters.
4. Image inversion: for better CCA results.
5. CCA: connected-component labeling of the alphanumeric characters in the LP using the orientation and height standards.
6. Character enclosing: separating the alphanumeric characters into regions of their own.
7. Character number tracking: the number of enclosing boxes equals the number of alphanumeric characters per LP. I think this is where we can modify the code to study all possible LPs that can be encountered: if no boxes are yielded, we can refer the image to another part of the algorithm tailored to LPs from other countries; if a different number of boxes is yielded, we refer the image to the part where the country whose LPs have that number of characters is analyzed by its own tailored code.
8. Character resizing: resize the characters for the next section, character recognition.
Specifying and Cropping Sub-images, SCW Segmentation, Image Inversion, and CCA
Character Enclosing, Tracking and Resizing
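One simple way to sketch character enclosing is a vertical-projection scan: columns containing no foreground pixels separate the character boxes, and the number of boxes found is exactly the count used by character number tracking. This is a simplification of the CCA-based enclosing described above, shown only to convey the idea.

```python
def enclose_characters(binary):
    """Split a binarized plate (list of rows of 0/1) into character boxes.

    Returns a list of (start_col, end_col) ranges, one per character,
    found by scanning for empty columns between runs of foreground pixels.
    """
    w = len(binary[0])
    cols = [any(row[x] for row in binary) for x in range(w)]
    boxes, start = [], None
    for x, filled in enumerate(cols):
        if filled and start is None:
            start = x                      # a new character run begins
        elif not filled and start is not None:
            boxes.append((start, x - 1))   # the run just ended
            start = None
    if start is not None:
        boxes.append((start, w - 1))       # run touching the right edge
    return boxes
```

With this in hand, character number tracking reduces to `len(enclose_characters(plate))`, and the dispatch-by-country idea from step 7 becomes a lookup on that count.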
By PNN we mean a Probabilistic Neural Network: a neural network that uses a Bayes decision rule and needs fewer training cycles than regular networks like back-propagation to make its decisions (more on NNs
here). Although it requires cumbersome coding and has a rather slow execution time, it is good at analyzing noisy, degraded LPs and continuously improves its learning curve for better character recognition. Its success rate is 95% to 98%.
An example of how the final LP appears is:
The flowchart for character segmentation or CS is:
Optical Character Recognition or OCR
This section involves developing a PNN for character recognition. We discuss the topology and the learning mechanism of the PNN.
Topology
An example of what the PNN topology looks like is:
The topology used for our Lebanese LPs, however, requires fewer output nodes. The 108 input-layer nodes correspond to the number of pixels that compose a single LP alphanumeric character. The 180 middle-layer nodes correspond to the number of output nodes (36) times the number of image patterns (i.e. possible views of a character) stored for each output node (in this case, 5). The 36 output-layer nodes correspond to the 10 digits (0 to 9) plus the 26 letters of the English alphabet (A to Z).
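The forward pass of such a PNN can be sketched in a few lines: each pattern-layer node applies a Gaussian kernel to the distance between the input and one stored exemplar (5 per class in the topology above), the summation layer averages the kernels per class, and the output is the class with the highest score. The spread `sigma` and the toy 2-pixel vectors are illustrative assumptions; real inputs would be the 108-pixel character vectors.

```python
import math

def pnn_classify(x, exemplars, sigma=0.3):
    """Classify input vector x with a Probabilistic Neural Network sketch.

    exemplars maps a class label to its list of stored pattern vectors
    (the pattern layer); sigma is the Gaussian spread, an assumed value.
    """
    scores = {}
    for label, patterns in exemplars.items():
        s = 0.0
        for p in patterns:
            d2 = sum((a - b) ** 2 for a, b in zip(x, p))  # squared distance
            s += math.exp(-d2 / (2 * sigma ** 2))          # Gaussian kernel
        scores[label] = s / len(patterns)                  # summation layer
    return max(scores, key=scores.get)                     # Bayes-style argmax
```

For the topology in the text, `exemplars` would hold 36 labels with 5 patterns of 108 pixels each, giving the 180 pattern-layer nodes and 36 output nodes.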
Competitive and Conscience Learning Mechanism
The learning mechanism enhances the chances of recognizing a character correctly; it works by updating, in the stored database of alphanumeric characters, how an encountered character might look (i.e. its characteristics). That database is compared (using cross-correlation) with the analyzed input characters. Note that not all the alphabet letters are used on Lebanese LPs: only the letters corresponding to the cities that issue the LPs are considered, such as B for Beirut, Z for Zahle, T for Tripoli, etc. Note also that the detected characters are not only English alphanumeric characters but Arabic ones as well. On the other hand, it is possible to identify the LP correctly by analyzing only the English section of the LP; when that analysis yields no result, the Arabic section is analyzed. These restrictions allow not only reducing the number of output nodes of the PNN, but also generating 2 PNN networks, one for the English section and one for the Arabic section, giving less complexity and a faster execution run-time. A similar restriction analysis can be conducted on the LPs of each country. The mathematical formulas will be given later on.
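The cross-correlation comparison mentioned above can be sketched as a normalized cross-correlation between a flattened input character and a stored template, where +1 means a perfect match and −1 a perfect inversion. This is the standard NCC formula, shown on flattened pixel vectors for simplicity.

```python
def ncc(a, b):
    """Normalized cross-correlation between two equal-length pixel vectors.

    Returns a value in [-1, 1]; both vectors are mean-centered first, so
    the measure is insensitive to uniform brightness shifts.
    """
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = sum((x - ma) ** 2 for x in a) ** 0.5
    db = sum((y - mb) ** 2 for y in b) ** 0.5
    return num / (da * db) if da and db else 0.0
```

In the recognition loop, the input character would be scored against each stored template and the best-correlated class kept, with the learning mechanism refreshing the templates over time.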
To those who are interested in trying to write the code for the algorithm: I'll give you a week starting from now. Just implement the first 2 sections for now. The next post will explain the slant correction. Have fun!!