Optical Character Recognition (OCR) PART I

Written by Nobody on 09/01/2022


(This article is an introduction to the OCR topic covered on this site HERE.)

During the 1980s and 1990s the IT world was hit by a sudden tsunami triggered by Apple and Co.: WYSIWYG. What is it? It stands for What You See Is What You Get or, loosely put in today's parlance, the ICONS. Icons were brand-new at a time when people had to struggle with cryptic texts and weird menus. They were the "ground-breaking" revolution in the text-oriented world of IT, a world of boffins and eggheads. Today a computer without graphical icons, regardless of what kind of computer, is practically unsellable. Could you imagine your iPhone or Android smartphone being driven only by "texts" and "menus"? Even Confucius, the ancient Chinese sage, is said to have admitted that a picture is easier to remember than a narrative.

I am neither a revolutionary nor a dictator who could shake up the IT world the way Steve Jobs did with his WYSIWYG. What I want to show is a little wave: ITOC-ICAW (If The Other Can, I Can As Well). What is OCR really? Instead of trying to explain OCR myself, I took the liberty of quoting a passage from Wikipedia:

"Optical character recognition
or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image..."

More HERE. In short: getting editable text out of an image.

And as Master Confucius is said to have put it: if he did it, he would understand it. So I will show you how to implement OCR in JAVA and let you try to implement it yourself, so that you remember and understand OCR. I have already shown you how to process images and pixels in JAVA and mentioned OCR in connection with fonts. Let us refresh our memory:

  // imports required by this method
  import java.awt.Color;
  import java.awt.Font;
  import java.awt.FontMetrics;
  import java.awt.Graphics2D;
  import java.awt.image.BufferedImage;

  public static BufferedImage createFontImage(String string, String fontName, int fontAtt, int size) {
    // create a scratch BufferedImage with width = 1 and height = 1, only to obtain the FontMetrics
    BufferedImage image = new BufferedImage(1, 1, BufferedImage.TYPE_INT_ARGB);
    Graphics2D g = image.createGraphics();
    // create a font with the given fontName (e.g. TimesRoman), attribute (e.g. Font.BOLD) and size (e.g. 15)
    Font font = new Font(fontName, fontAtt, size);
    // get the FontMetrics of this font
    FontMetrics metrics = g.getFontMetrics(font);
    int height = metrics.getHeight();
    int width  = metrics.stringWidth(string);
    g.dispose();
    // create an image with this width and height
    image = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
    // draw or write the given letter (or String) in BLACK with the background WHITE
    g = image.createGraphics();
    g.setColor(Color.WHITE);
    g.fillRect(0, 0, width, height);
    g.setColor(Color.BLACK);
    g.setFont(font);
    // the y coordinate of drawString is the baseline, so use the ascent to keep descenders inside the image
    g.drawString(string, 0, metrics.getAscent());
    g.dispose();
    // eliminate the "void": find the first row that contains a non-white pixel
    int y = 0;
    int white = Color.WHITE.getRGB();
    LOOP: for (; y < height; ++y) {
      for (int x = 0; x < width; ++x) {
        if (image.getRGB(x, y) != white) break LOOP;
      }
    }
    // no need to rectify if there is no void on top (or if nothing was drawn at all)
    if (y == 0 || y == height) return image;
    int H = height - y;
    int[] pixels  = image.getRGB(0, y, width, H, null, 0, width);
    BufferedImage img = new BufferedImage(width, H, BufferedImage.TYPE_INT_ARGB);
    img.setRGB(0, 0, width, H, pixels, 0, width);
    return img;
  }

and we get this image: (image)

The image consists of black pixels on a white background. It is hard to recognize any "special features" of the kind we know from Facial Recognition, for example like this:
(image)
(source: click HERE)

As you see, there is an algorithm that leads you in the right direction. OCR is not as complicated as Optical Face Recognition; it is simpler if you know how and where to start. I have already shown you how the pixels of an image can be processed and altered, for example to change the color of a sentence in an image from black to cyan: (image)
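
The black-to-cyan recoloring mentioned here can be sketched roughly as follows. This is only a minimal sketch, not the code from the earlier tutorial: the class `Recolor` and the method `replaceColor` are hypothetical names, and the sketch assumes the text was drawn without anti-aliasing, so that every letter pixel is exactly black.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration; not part of the original tutorial.
class Recolor {
  /** Returns a copy of src in which every pixel that exactly equals 'from' is replaced by 'to'. */
  static BufferedImage replaceColor(BufferedImage src, Color from, Color to) {
    int w = src.getWidth(), h = src.getHeight();
    int fromArgb = from.getRGB(), toArgb = to.getRGB();
    BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
    for (int y = 0; y < h; ++y) {
      for (int x = 0; x < w; ++x) {
        int p = src.getRGB(x, y);
        // only exact matches are recolored; all other pixels are copied unchanged
        out.setRGB(x, y, p == fromArgb ? toArgb : p);
      }
    }
    return out;
  }
}
```

A call such as `Recolor.replaceColor(image, Color.BLACK, Color.CYAN)` would then return the cyan version. With anti-aliased text the edge pixels are shades of gray and would need a tolerance instead of an exact comparison.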
The problem with OCR on an image is the "ghosting" (or, to be more precise, fogging) between the letters and the background. Modern cars (e.g. AUDI, BMW, Mercedes) let their users write a destination address on a little screen, or they can "recognize" traffic signs. All of that relies on a very distinct separation between the letters and the background (i.e. its color). The little screen acts as a transparent background. Example: a traffic sign.
(image)
OCR is a lot easier to work with here. To achieve a result similar to that of Optical Face Recognition, we need to create an equally distinct environment for the letters in an image, like the (preconditioned) environments that the OCR of modern cars works in. The steps are:

  • Focusing on the sentence to reduce unnecessary scanning work, and removing the interfering colors by "translucentifying" the background
  • Isolating letter by letter
  • "DOTifying" the letter
  • Fixing the distinctive features (like Facial features)

For example:

Step 1: focus on the image
(image)
and extracted: (image)
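
One way Step 1 could look in code is sketched below. It is an assumption about what "translucentifying" and "focusing" amount to, not the author's implementation: every pixel that equals a known background color is made fully transparent, and the image is then cropped to the smallest rectangle that still contains visible pixels. The class `Focus` and both method names are hypothetical.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration; not part of the original tutorial.
class Focus {
  /** Copy of src in which every pixel equal to backgroundArgb becomes fully transparent. */
  static BufferedImage clearBackground(BufferedImage src, int backgroundArgb) {
    int w = src.getWidth(), h = src.getHeight();
    BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
    for (int y = 0; y < h; ++y) {
      for (int x = 0; x < w; ++x) {
        int p = src.getRGB(x, y);
        out.setRGB(x, y, p == backgroundArgb ? 0 : p); // 0 = alpha 0, i.e. transparent
      }
    }
    return out;
  }

  /** Crops src to the smallest rectangle that contains all non-transparent pixels. */
  static BufferedImage cropToContent(BufferedImage src) {
    int w = src.getWidth(), h = src.getHeight();
    int minX = w, minY = h, maxX = -1, maxY = -1;
    for (int y = 0; y < h; ++y) {
      for (int x = 0; x < w; ++x) {
        if ((src.getRGB(x, y) >>> 24) != 0) { // alpha > 0: a visible pixel
          minX = Math.min(minX, x); maxX = Math.max(maxX, x);
          minY = Math.min(minY, y); maxY = Math.max(maxY, y);
        }
      }
    }
    if (maxX < 0) return src; // nothing but background, nothing to crop
    int cw = maxX - minX + 1, ch = maxY - minY + 1;
    int[] pixels = src.getRGB(minX, minY, cw, ch, null, 0, cw);
    BufferedImage out = new BufferedImage(cw, ch, BufferedImage.TYPE_INT_ARGB);
    out.setRGB(0, 0, cw, ch, pixels, 0, cw);
    return out;
  }
}
```

For black text on a white background this would be used as `Focus.cropToContent(Focus.clearBackground(image, Color.WHITE.getRGB()))`. A real photo has no single background color, so an exact comparison like this one would have to be replaced by a color tolerance.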

Step 2: pick letter by letter
(image)
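
A common way to pick out letter by letter, shown here only as a sketch under assumptions and not as the author's method, is to scan the image column by column: a run of columns that contain "ink" is taken as one letter, and a column without ink is taken as a gap. The class `LetterSplitter`, its methods, and the definition of ink (any pixel that is neither transparent nor white) are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the original tutorial.
class LetterSplitter {
  /** Assumption: "ink" is any pixel that is neither transparent nor pure white. */
  static boolean isInk(int argb) {
    return (argb >>> 24) != 0 && (argb & 0x00FFFFFF) != 0x00FFFFFF;
  }

  /** Returns {startX, endX} pairs (endX exclusive), one per run of columns that contain ink. */
  static List<int[]> letterColumns(BufferedImage img) {
    List<int[]> ranges = new ArrayList<>();
    int start = -1; // -1 means: currently inside a gap
    for (int x = 0; x < img.getWidth(); ++x) {
      boolean ink = false;
      for (int y = 0; y < img.getHeight() && !ink; ++y) {
        ink = isInk(img.getRGB(x, y));
      }
      if (ink && start < 0) start = x;             // a letter begins
      if (!ink && start >= 0) {                    // a letter ends
        ranges.add(new int[] {start, x});
        start = -1;
      }
    }
    if (start >= 0) ranges.add(new int[] {start, img.getWidth()});
    return ranges;
  }

  /** Cuts the image into one sub-image per detected letter (full height). */
  static List<BufferedImage> split(BufferedImage img) {
    List<BufferedImage> letters = new ArrayList<>();
    for (int[] r : letterColumns(img)) {
      letters.add(img.getSubimage(r[0], 0, r[1] - r[0], img.getHeight()));
    }
    return letters;
  }
}
```

This simple scan breaks down when neighboring letters touch or overlap (italics, tight kerning) and when one character consists of horizontally separated parts, such as a double quote.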

Step 3: dotifying the letter
(image)
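
The article does not define "DOTifying" at this point, so the following is purely a guess at one plausible reading: the letter image is reduced to a coarse grid of dots, where a dot is set when any ink pixel falls into its cell. The class `Dotifier`, the methods `dotify` and `render`, and the definition of ink (neither transparent nor white) are all hypothetical.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration; one possible reading of "DOTifying".
class Dotifier {
  /** Maps the letter image onto a rows x cols grid; a cell is true if any ink pixel falls into it. */
  static boolean[][] dotify(BufferedImage letter, int rows, int cols) {
    int w = letter.getWidth(), h = letter.getHeight();
    boolean[][] dots = new boolean[rows][cols];
    for (int y = 0; y < h; ++y) {
      for (int x = 0; x < w; ++x) {
        int argb = letter.getRGB(x, y);
        // assumption: ink is any pixel that is neither transparent nor pure white
        boolean ink = (argb >>> 24) != 0 && (argb & 0x00FFFFFF) != 0x00FFFFFF;
        if (ink) dots[y * rows / h][x * cols / w] = true;
      }
    }
    return dots;
  }

  /** Renders the grid as text, '*' for a dot and '.' for an empty cell, one line per row. */
  static String render(boolean[][] dots) {
    StringBuilder sb = new StringBuilder();
    for (boolean[] row : dots) {
      for (boolean d : row) sb.append(d ? '*' : '.');
      sb.append('\n');
    }
    return sb.toString();
  }
}
```

Printing `Dotifier.render(Dotifier.dotify(letter, 8, 8))` gives a quick visual check of the grid. A fixed grid size has the side effect that letters of different pixel sizes end up in the same shape, which makes them easier to compare.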
Step 4: set the distinctive features:
(image)
