OCR Part IV
The dynamic technique recognizes a character more easily and runs faster than the static technique. Static OCR is based on an alphanumeric table (A..Z, a..z, 0..9 and special characters such as # * @ ...). An incoming letter has to be compared with every character in the table, one by one, and the result is the best match. The probability that a letter is recognized correctly depends on the quality of the letters retrieved from the image, which in turn depends on how the image was compressed. PNG is lossless and works best; the lossy compression of JPG or the reduced color palette of GIF can leave blurred or altered pixels that easily falsify the comparison result.
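The "compare with every table entry and keep the best match" idea can be sketched as follows. This is only an illustration under simplifying assumptions: glyphs are small boolean masks instead of images, and the names BestMatchSketch, score, bestMatch and mask are made up for this sketch. The StaticOCR class further down works differently in detail (it accepts the first table letter that passes its checks).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BestMatchSketch {
    // Fraction of cells that agree between two equally sized masks.
    public static double score(boolean[][] a, boolean[][] b) {
        int matched = 0, total = 0;
        for (int y = 0; y < a.length; ++y) {
            for (int x = 0; x < a[y].length; ++x) {
                if (a[y][x] == b[y][x]) ++matched;
                ++total;
            }
        }
        return total == 0 ? 0.0 : (double) matched / total;
    }

    // Compare the incoming mask with every table entry and keep the highest score.
    public static String bestMatch(boolean[][] incoming, Map<String, boolean[][]> table) {
        String best = null;
        double bestScore = -1.0;
        for (Map.Entry<String, boolean[][]> e : table.entrySet()) {
            double s = score(incoming, e.getValue());
            if (s > bestScore) {
                bestScore = s;
                best = e.getKey();
            }
        }
        return best;
    }

    // Helper: turn rows like "#.#" into a mask ('#' = opaque pixel).
    public static boolean[][] mask(String... rows) {
        boolean[][] m = new boolean[rows.length][];
        for (int y = 0; y < rows.length; ++y) {
            m[y] = new boolean[rows[y].length()];
            for (int x = 0; x < m[y].length; ++x) m[y][x] = rows[y].charAt(x) == '#';
        }
        return m;
    }

    public static void main(String[] args) {
        Map<String, boolean[][]> table = new LinkedHashMap<>();
        table.put("I", mask(".#.", ".#.", ".#."));
        table.put("L", mask("#..", "#..", "###"));
        table.put("T", mask("###", ".#.", ".#."));
        // a "T" with one falsified pixel, as a blurred image might deliver it
        boolean[][] noisyT = mask("###", ".#.", ".##");
        System.out.println(bestMatch(noisyT, table)); // prints T
    }
}
```

In this toy table the noisy "T" still scores 8/9 against the clean "T", 6/9 against "I" and 4/9 against "L", so one wrong pixel does not change the result. With more blurred pixels the scores move closer together, which is why image quality matters so much.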
Image comparison is tricky work. The reason is the huge number of possible pixel values (alpha, red, green and blue): 4294967296 of them (0x00000000 through 0xFFFFFFFF). Comparing pixels color by color is the best way to get lost in the maze of colors. In OCR, however, the letters usually have a single color (black or white). As described in the previous part, a string of letters can be isolated and pipelined for letter-by-letter comparison. The isolation is based on the specified string color (black or white) and on making every unneeded pixel transparent.
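A minimal sketch of both ideas, assuming a packed ARGB int per pixel as returned by BufferedImage.getRGB: splitting a pixel into its four 8-bit channels, and isolating the letters by keeping only pixels of the string color while making everything else transparent. The names IsolateColorSketch, channels and isolate are hypothetical, and the exact color match is a simplification (it is not the ImageTools code of this series).

```java
import java.awt.image.BufferedImage;

class IsolateColorSketch {
    static final int ALPHA_MASK = 0xFF000000; // assumption: alpha sits in the top 8 bits
    static final int RGB_MASK = 0x00FFFFFF;

    // Split a packed ARGB value into {alpha, red, green, blue}, each 0..255.
    public static int[] channels(int argb) {
        return new int[] {
            (argb >>> 24) & 0xFF, (argb >> 16) & 0xFF, (argb >> 8) & 0xFF, argb & 0xFF
        };
    }

    // Keep only opaque pixels whose RGB equals textRgb; all others become fully transparent.
    public static BufferedImage isolate(BufferedImage src, int textRgb) {
        int w = src.getWidth(), h = src.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < h; ++y) {
            for (int x = 0; x < w; ++x) {
                int p = src.getRGB(x, y);
                boolean isText = (p & ALPHA_MASK) != 0 && (p & RGB_MASK) == (textRgb & RGB_MASK);
                out.setRGB(x, y, isText ? p : 0);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(3, 1, BufferedImage.TYPE_INT_ARGB);
        img.setRGB(0, 0, 0xFF000000); // black letter pixel
        img.setRGB(1, 0, 0xFFFFFFFF); // white background
        img.setRGB(2, 0, 0xFF808080); // gray, e.g. a blurred edge
        BufferedImage out = isolate(img, 0x000000);
        System.out.printf("%08X %08X %08X%n", out.getRGB(0, 0), out.getRGB(1, 0), out.getRGB(2, 0));
        // prints FF000000 00000000 00000000
    }
}
```

After this step only the alpha channel needs to be compared, which reduces the problem from billions of colors to "opaque or not". Note that the gray edge pixel is dropped here; with blurred input a tolerance around the string color would probably be needed.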
Like dynamic OCR, static OCR relies on a few clues that indicate how similar two images are. Examples:
- Resize the images so that they have the same size (width and height)
- The dispersion of opaque and transparent pixels along the X and Y coordinates
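One way to read the second clue is as a small "signature": the number of opaque runs along a few rows and columns of the glyph. The sketch below assumes glyphs given as rows of '#' (opaque) and '.' (transparent); DispersionSketch, rowRuns, colRuns and signature are names invented for this illustration.

```java
import java.util.Arrays;

class DispersionSketch {
    // Number of opaque runs ('#') along row y.
    public static int rowRuns(String[] glyph, int y) {
        int cnt = 0;
        boolean inRun = false;
        for (int x = 0; x < glyph[y].length(); ++x) {
            boolean on = glyph[y].charAt(x) == '#';
            if (on && !inRun) ++cnt; // a new run starts here
            inRun = on;
        }
        return cnt;
    }

    // Number of opaque runs ('#') along column x.
    public static int colRuns(String[] glyph, int x) {
        int cnt = 0;
        boolean inRun = false;
        for (int y = 0; y < glyph.length; ++y) {
            boolean on = glyph[y].charAt(x) == '#';
            if (on && !inRun) ++cnt;
            inRun = on;
        }
        return cnt;
    }

    // Runs on the top, middle and bottom row, then on the left, middle and right column.
    public static int[] signature(String[] g) {
        int h = g.length, w = g[0].length();
        return new int[] {
            rowRuns(g, 0), rowRuns(g, h >> 1), rowRuns(g, h - 1),
            colRuns(g, 0), colRuns(g, w >> 1), colRuns(g, w - 1)
        };
    }

    public static void main(String[] args) {
        String[] H = { "#...#", "#...#", "#####", "#...#", "#...#" };
        String[] O = { ".###.", "#...#", "#...#", "#...#", ".###." };
        System.out.println(Arrays.toString(signature(H))); // prints [2, 1, 2, 1, 1, 1]
        System.out.println(Arrays.toString(signature(O))); // prints [1, 2, 1, 1, 2, 1]
    }
}
```

Two glyphs of the same size can share most of their pixels and still differ clearly in this signature, which is why it is a useful second check next to the plain pixel match.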
Example implementation:
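import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.List;

// OCR, ImageTools and the ALPHA mask are not defined in this part;
// they are assumed to come from the earlier parts of this series
public class StaticOCR implements OCR {
    private int[] xy;             // coordinates of the letter inside img
    private BufferedImage img;    // source image
    private Font font;
    private String fontName;
    private List<String> list;    // the alphanumeric table as a list
    private int idx;              // index of "a": where the small letters start

    public StaticOCR(BufferedImage img, int[] xy) {
        this.xy = xy;
        this.img = img;
        font = img.getGraphics().getFont();
        fontName = font.getName().toLowerCase();
        list = java.util.Arrays.asList(letters);
        idx = list.indexOf("a");
    }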
    // returns the code of the recognized letter, ' ' if nothing matches, 0 on error
    public int ocrLetter() {
        try {
            // uppercase or tall letter
            boolean up = (xy[5] > xy[1]);
            BufferedImage dImg, sImg, aImg, bImg;
            sImg = ImageTools.extractImage(img, xy[0], xy[1], xy[10], xy[11]);
            int width = sImg.getWidth(), height = sImg.getHeight(), alp = getAlphas(sImg);
            // tall letters are searched in letters[0..idx), small letters in letters[idx..end)
            for (int ix = up ? 0 : idx, mx = up ? idx : list.size(); ix < mx; ++ix) {
                String letter = letters[ix];
                dImg = ImageTools.createFontImage(letter, fontName, Font.BOLD, 30);
                if (dImg == null) continue;
                // skip if only one of the two images has transparent pixels
                int dAlp = getAlphas(dImg);
                if ((alp > 0 && dAlp == 0) || (alp == 0 && dAlp > 0)) continue;
                // bring both images to the same size by shrinking the larger one
                int dW = dImg.getWidth(), dH = dImg.getHeight();
                if (dW > width || dH > height) {
                    aImg = sImg;
                    bImg = resize(dImg, width, height);
                } else if (dW < width || dH < height) {
                    bImg = dImg;
                    aImg = resize(sImg, dW, dH);
                } else {
                    aImg = sImg;
                    bImg = dImg;
                }
                if (compare(aImg, bImg)) {
                    return (int) letter.charAt(0);
                }
            }
        } catch (Exception ex) {
            ex.printStackTrace();
            return 0;
        }
        return (int) ' ';
    }
    // source image: sImg, letter from the alphanumeric table: dImg
    private boolean compare(BufferedImage sImg, BufferedImage dImg) {
        int width = sImg.getWidth(), height = sImg.getHeight();
        int dWidth = dImg.getWidth(), dHeight = dImg.getHeight();
        if (width != dWidth || height != dHeight || width < 2 || height < 2) return false;
        float matched = 0f, total = 0f;
        // count the pixels whose alpha values agree
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                if ((sImg.getRGB(x, y) & ALPHA) == (dImg.getRGB(x, y) & ALPHA)) ++matched;
                ++total;
            }
        }
        // last and middle row/column of each image
        int esW = width - 1, edW = dWidth - 1, esH = height - 1, edH = dHeight - 1;
        int smW = width >> 1, dmW = dWidth >> 1, smH = height >> 1, dmH = dHeight >> 1;
        // rows: middle, top and bottom must have the same number of opaque runs
        boolean xOK = xCount(sImg, smH) == xCount(dImg, dmH) && xCount(sImg, 0) == xCount(dImg, 0) &&
                      xCount(sImg, esH) == xCount(dImg, edH);
        // columns: the middle column must match, or both the left and the right column
        boolean yOK = yCount(sImg, smW) == yCount(dImg, dmW) ||
                      (yCount(sImg, 0) == yCount(dImg, 0) && yCount(sImg, esW) == yCount(dImg, edW));
        // more than 75% of the pixels must match, and so must the dispersions of opaque and transparent pixels
        return (matched / total) > 0.75f && xOK && yOK;
    }
    // count all fully transparent pixels
    private int getAlphas(BufferedImage img) {
        int alp = 0;
        int width = img.getWidth(), height = img.getHeight();
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                if ((img.getRGB(x, y) & ALPHA) == 0) ++alp;
            }
        }
        return alp;
    }
    // resize the image to the given width and height
    private BufferedImage resize(BufferedImage image, int width, int height) {
        try {
            if (image.getWidth() == width && image.getHeight() == height) return image;
            BufferedImage letterImg = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
            Graphics2D graphics2D = letterImg.createGraphics();
            // white background, then draw the scaled image with bilinear interpolation
            graphics2D.setBackground(Color.WHITE);
            graphics2D.setPaint(Color.WHITE);
            graphics2D.fillRect(0, 0, width, height);
            graphics2D.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            graphics2D.drawImage(image, 0, 0, width, height, null);
            graphics2D.dispose();
            return ImageTools.filterColor(0, 0, letterImg);
        } catch (Exception ex) {
            // fall through and return the unscaled image
        }
        return image;
    }
    // the X dispersion: number of opaque runs along row y
    private int xCount(BufferedImage img, int y) {
        int cnt = 0, width = img.getWidth();
        LOOP: for (int x = 0; x < width; ++x) if ((img.getRGB(x, y) & ALPHA) != 0) {
            // a run starts here: count it and skip to its end
            for (++cnt, ++x; x < width; ++x) if ((img.getRGB(x, y) & ALPHA) == 0) break;
            if (x < width) continue LOOP;
            return cnt;
        }
        return cnt;
    }
    // the Y dispersion: number of opaque runs along column x
    private int yCount(BufferedImage img, int x) {
        int cnt = 0, height = img.getHeight();
        LOOP: for (int y = 0; y < height; ++y) if ((img.getRGB(x, y) & ALPHA) != 0) {
            // a run starts here: count it and skip to its end
            for (++cnt, ++y; y < height; ++y) if ((img.getRGB(x, y) & ALPHA) == 0) break;
            if (y < height) continue LOOP;
            return cnt;
        }
        return cnt;
    }
    // the alphanumeric table: tall entries first, small letters from "a" on
    private String[] letters = {
        "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P",
        "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z",
        "b", "d", "f", "h", "k", "l", "t",
        "1", "2", "3", "4", "5", "6", "7", "8", "9", "0",
        "!", "§", "$", "%", "&", "/", "(", ")", "'",
        "\\", "@", "€", "{", "}", "[", "]", "|",
        "a", "c", "e",
        "g", "i", "j", "m", "n", "o", "p", "q", "r", "s", "u",
        "v", "w", "x", "y", "z",
        "\"", "=", "?", "*", "+", "#", ";",
        ",", ":", "_", "-", "~", "°", "^", "<", ">"
    };
}
The letter "r" fails the comparison because of blurring: before a comparison can be made, one of the two images (either the source or the letter from the table) has to be resized to the size of the other, and the resizing blurs it...
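Why resizing can break the comparison can be shown with a toy model. The sketch below is not the Graphics2D bilinear filter used in resize; it simply halves one row of alpha values by averaging neighboring pixels, which is a rough stand-in for what a smoothing interpolation does. ResizeBlurSketch, halve and runs are hypothetical names.

```java
class ResizeBlurSketch {
    // Halve a row of alpha values (0..255) by averaging each pair of neighbors.
    public static int[] halve(int[] alpha) {
        int[] out = new int[alpha.length / 2];
        for (int i = 0; i < out.length; ++i) {
            out[i] = (alpha[2 * i] + alpha[2 * i + 1]) / 2;
        }
        return out;
    }

    // Count runs of pixels whose alpha is at least the given threshold.
    public static int runs(int[] alpha, int threshold) {
        int cnt = 0;
        boolean inRun = false;
        for (int a : alpha) {
            boolean on = a >= threshold;
            if (on && !inRun) ++cnt;
            inRun = on;
        }
        return cnt;
    }

    public static void main(String[] args) {
        int[] row = { 255, 0, 0, 255, 255, 0, 0, 0 }; // two separate strokes
        int[] small = halve(row);                     // {127, 127, 127, 0}
        System.out.println(runs(row, 1));     // prints 2
        System.out.println(runs(small, 1));   // prints 1: the strokes have merged
        System.out.println(runs(small, 128)); // prints 0: the strokes have vanished
    }
}
```

In this model the two strokes of the original row either merge into one run or disappear, depending on where "opaque" is cut off, so a run count taken after resizing no longer agrees with the one taken before. A thin letter with a small detached stroke, such as "r", is plausibly affected in the same way.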







