OCR Part III
Java

Written by Nobody on 19/01/2022

The so-called DOTifying of a letter (or character) is, in other words, the visualization of its pixels, so that we can single out the distinctive features of each individual letter. To do that we have to create a tool that does the job, based on our knowledge of Image and Pixel Processing (see the tutorial on Image and Pixel Processing). Example:

    JButton dot = new JButton("DOT");
    dot.addActionListener(e -> {
      try {
        BufferedImage img = ImageTools.createFontImage(letter, font, t, si ); // see Tutorial: Image and Pixel Processing
        img = ImageTools.createDOTImage(ImageTools.filterColor(0, 0, img), font, t, si); // call the ImageTools
        if (img != null) {
          String sav = JOptionPane.showInputDialog(jf,"Save Pixel File?", "yes");
          if ("yes".equalsIgnoreCase(sav)) {
            // try-with-resources closes the stream even if the write fails
            try (FileOutputStream fou = new FileOutputStream("./images/"+LET+"_"+font+"_"+type+"_"+size+"_DOT.png", false)) {
              ImageIO.write(img, "png", fou);
            }
          }
          // show the DOT image together with the letter it belongs to
          dis.setIcon(new ImageIcon(img));
          dis.setText(letter);
        } else dis.setText("Invalid Image for:"+letter);
        jf.pack();
      } catch (Exception ex) {
        ex.printStackTrace();
      }
    });

The ImageTools API:

  /**
   @param image     BufferedImage, the image to be DOTified
   @param fontName  String, e.g. TimesRoman, Courier, etc. (Case Sensitive)
   @param fontAtt   int, FontAttribute, e.g. Font.BOLD, etc.
   @param size      int, FontSize in points (e.g. 18)
   @return BufferedImage showing the DOT rendering of the given image, drawn with the given font
   @exception Exception thrown by JAVA
  */
  public static BufferedImage createDOTImage(BufferedImage image, String fontName, int fontAtt, int size) throws Exception {
    int dy, dot = 0;
    int width = image.getWidth();
    int height = image.getHeight();
    ByteArrayOutputStream bao = new ByteArrayOutputStream(width*height);
    Graphics2D g = image.createGraphics();
    int pixel = g.getColor().getRGB();
    // the X axis (uppermost)
    bao.write("     ".getBytes());
    for (int i = 0; i < width; ++i) bao.write(String.format(" %02X", i).getBytes());
    bao.write("\n".getBytes());
    // the DOTifying of Pixels and Y scale on the leftmost side
    for (int y = 0; y < height; ++y) {
      bao.write(String.format("%04X ", y).getBytes());
      for (int x = 0; x < width; ++x) { // Black color
        dot =  image.getRGB(x, y); // get the Pixel
        if ((dot & 0xFFFFFF) == 0 || dot == pixel)
             bao.write(" . ".getBytes());
        else bao.write("   ".getBytes());
      }
      bao.write("\n".getBytes());
    }
    bao.write("\n".getBytes());
    String[] lines = (new String(bao.toByteArray())).split("\n");
    // create the Font based on the given specification
    Font font = new Font(fontName, fontAtt, size);
    FontMetrics metrics = g.getFontMetrics(font);
    dy = metrics.getHeight();
    for (String line:lines) {
      int l = metrics.stringWidth(line);
      if (l > width) width = l;
    }
    width += 10;
    int rows = lines.length; // number of text rows to draw
    height = dy * rows;
    g.dispose(); // release the graphics context of the source image
    image = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
    g = image.createGraphics();
    // generate an image
    g.setFont(font);
    // Background: c0c0c0 Silver
    g.setColor(new Color(0xC0C0C0));
    g.fillRect(0, 0, width, height);
    g.setColor(Color.BLACK);
    for (int i = 0, y = dy; i < rows; ++i) {
      g.drawString(lines[i], 10, y);
      y += dy;
    }
    g.dispose(); // release the graphics context of the DOT image
    return image;
  }

And this is what we get (the infamous WYSIWYG :))):
[Image: DOT rendering of a letter]
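
The core of the DOTifying step can also be tried without any font rendering. The following is only a minimal sketch, not the ImageTools code: the class `DotSketch` and its helpers `dotify` and `fromPattern` are hypothetical names, and the test image is painted by hand instead of coming from `createFontImage`. It uses the same black-pixel test as `createDOTImage`, namely `(rgb & 0xFFFFFF) == 0`.

```java
import java.awt.image.BufferedImage;

class DotSketch {
  // One text row per pixel row: '.' for a black pixel, ' ' for anything else.
  static String dotify(BufferedImage img) {
    StringBuilder sb = new StringBuilder();
    for (int y = 0; y < img.getHeight(); ++y) {
      for (int x = 0; x < img.getWidth(); ++x) {
        boolean black = (img.getRGB(x, y) & 0xFFFFFF) == 0;
        sb.append(black ? '.' : ' ');
      }
      sb.append('\n');
    }
    return sb.toString();
  }

  // Hypothetical helper: '#' becomes a black pixel, every other character a white one.
  static BufferedImage fromPattern(String... rows) {
    BufferedImage img = new BufferedImage(rows[0].length(), rows.length, BufferedImage.TYPE_INT_RGB);
    for (int y = 0; y < rows.length; ++y)
      for (int x = 0; x < rows[y].length(); ++x)
        img.setRGB(x, y, rows[y].charAt(x) == '#' ? 0x000000 : 0xFFFFFF);
    return img;
  }

  public static void main(String[] args) {
    // a tiny hand-painted "T": a full top row and a stem in the middle column
    System.out.print(dotify(fromPattern("#####", "--#--", "--#--")));
  }
}
```

Unlike `createDOTImage`, this sketch returns plain text and leaves out the hexadecimal X and Y scales, so it only shows the pixel-to-dot mapping itself.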

The hardest part is isolating each individual letter from the image of a string. Of course, it can only be done under some preconditions. For example, handwritten letters are NOT supported here, and mixing different fonts is not supported either. With some predefined rules, isolating the letters becomes more meaningful and more precise. However, if you want to cover everything, you have to turn this work into a real job. And if I did that here I would blast this forum into pieces. Example:
[Image: letters isolated from a string]
The letters within a string are separated and put in a pipeline so that they can be recognized (OCR) and then converted individually (from left to right) to their corresponding characters: A....Z a....z 0....9 and all the special characters ($ % & € etc.). This means that we have to implement an OCR method for each letter. Some letters are very similar to each other (e.g. uppercase I and lowercase l, V and Y, 5 and S, etc.), so they can be grouped together and share the same implementation.
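
How the letters are separated is not shown in this part. One common way, given here only as an assumption and not as the method used for the images above, is to scan the image column by column and to cut wherever a column contains no black pixel. In the sketch below the class `LetterSplitSketch` and the helpers `hasInk`, `split` and `fromPattern` are hypothetical names.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

class LetterSplitSketch {
  // True if column x contains at least one black pixel.
  static boolean hasInk(BufferedImage img, int x) {
    for (int y = 0; y < img.getHeight(); ++y)
      if ((img.getRGB(x, y) & 0xFFFFFF) == 0) return true;
    return false;
  }

  // Column ranges {start, endExclusive} of consecutive inked columns, from left to right.
  static List<int[]> split(BufferedImage img) {
    List<int[]> ranges = new ArrayList<>();
    int start = -1; // -1 means: currently inside a blank gap
    for (int x = 0; x < img.getWidth(); ++x) {
      boolean ink = hasInk(img, x);
      if (ink && start < 0) start = x;            // a letter begins
      else if (!ink && start >= 0) {              // a letter ends
        ranges.add(new int[] {start, x});
        start = -1;
      }
    }
    if (start >= 0) ranges.add(new int[] {start, img.getWidth()}); // letter touching the right border
    return ranges;
  }

  // Hypothetical helper: '#' becomes a black pixel, every other character a white one.
  static BufferedImage fromPattern(String... rows) {
    BufferedImage img = new BufferedImage(rows[0].length(), rows.length, BufferedImage.TYPE_INT_RGB);
    for (int y = 0; y < rows.length; ++y)
      for (int x = 0; x < rows[y].length(); ++x)
        img.setRGB(x, y, rows[y].charAt(x) == '#' ? 0x000000 : 0xFFFFFF);
    return img;
  }

  public static void main(String[] args) {
    // three "letters" of width 2, 1 and 3, separated by blank columns
    for (int[] r : split(fromPattern("##-#--###")))
      System.out.println(r[0] + ".." + r[1]);
  }
}
```

Each range can then be handed to the pipeline as one letter. Such a split breaks down when two letters touch each other or when one character consists of side-by-side parts (a double quote, for example), which is one reason why the preconditions above matter.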

Because white spaces (blanks, tabs, newlines, etc.) have no color, they are unseen or, to be more precise, transparent. Hence the OCR of such a white space is an arbitrary interpretation. Here I put 1 space between 2 words, regardless of how many spaces there are, and insert a new line at the end of each string. The result is as follows:

[Image: OCR result with single spaces between the words]
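
The white-space rule above can be sketched on top of the column ranges of the isolated letters. This is only an illustration under assumptions: the class `SpaceSketch`, the method `join` and the pixel threshold `spaceGap` are hypothetical, and the ranges are given as {start, endExclusive} columns.

```java
class SpaceSketch {
  // Joins the recognized letters; a gap of at least spaceGap columns between
  // two letters yields exactly one space, however wide the gap is.
  static String join(char[] letters, int[][] ranges, int spaceGap) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < letters.length; ++i) {
      if (i > 0 && ranges[i][0] - ranges[i - 1][1] >= spaceGap) sb.append(' ');
      sb.append(letters[i]);
    }
    return sb.append('\n').toString(); // a new line closes each string
  }

  public static void main(String[] args) {
    char[] letters = "HIYOU".toCharArray();
    int[][] ranges = { {0, 5}, {6, 8}, {20, 25}, {26, 31}, {32, 37} };
    System.out.print(join(letters, ranges, 4)); // prints "HI YOU" followed by a new line
  }
}
```

A sensible value for `spaceGap` depends on the font and its size, so it would have to be tuned per font.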

To materialize our OCR approach we need to design some Lines of Feature (LOF), just like the lines used in Facial Recognition. The LOF are based on the reduced X-Y coordinate system of the image. For example, take the following image:
[Image: the original image]
Reduced X-Y Coordinate System:
[Image: the same image in the reduced X-Y coordinate system]
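
What the reduced X-Y coordinate system looks like is only visible in the images. Assuming that it means the image cropped to the bounding box of its black pixels, so that (0, 0) becomes the top-left corner of the text, the reduction could be sketched as follows. The class `ReduceSketch` and the helpers `inkBounds`, `reduce` and `fromPattern` are hypothetical names.

```java
import java.awt.image.BufferedImage;

class ReduceSketch {
  // Returns {minX, minY, maxX, maxY} of all black pixels, or null if there are none.
  static int[] inkBounds(BufferedImage img) {
    int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE, maxX = -1, maxY = -1;
    for (int y = 0; y < img.getHeight(); ++y) {
      for (int x = 0; x < img.getWidth(); ++x) {
        if ((img.getRGB(x, y) & 0xFFFFFF) != 0) continue; // not a black pixel
        minX = Math.min(minX, x);
        maxX = Math.max(maxX, x);
        minY = Math.min(minY, y);
        maxY = Math.max(maxY, y);
      }
    }
    return maxX < 0 ? null : new int[] {minX, minY, maxX, maxY};
  }

  // Crops the image to the bounding box of its black pixels; null for a blank image.
  static BufferedImage reduce(BufferedImage img) {
    int[] b = inkBounds(img);
    if (b == null) return null;
    return img.getSubimage(b[0], b[1], b[2] - b[0] + 1, b[3] - b[1] + 1);
  }

  // Hypothetical helper: '#' becomes a black pixel, every other character a white one.
  static BufferedImage fromPattern(String... rows) {
    BufferedImage img = new BufferedImage(rows[0].length(), rows.length, BufferedImage.TYPE_INT_RGB);
    for (int y = 0; y < rows.length; ++y)
      for (int x = 0; x < rows[y].length(); ++x)
        img.setRGB(x, y, rows[y].charAt(x) == '#' ? 0x000000 : 0xFFFFFF);
    return img;
  }

  public static void main(String[] args) {
    BufferedImage reduced = reduce(fromPattern("-----", "-#---", "---#-", "-----"));
    System.out.println(reduced.getWidth() + " x " + reduced.getHeight()); // prints "3 x 2"
  }
}
```

Note that `getSubimage` shares its pixel data with the original image, so drawing into the reduced image also changes the original.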

From the reduced image we only need to separate and pipeline each individual letter out of the text and then visualize it pixel by pixel (DOTifying), so that we can finally set the LOF for each of the letters.
[Image: a DOTified letter with its LOF]

An implementation of the upperSegment:

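  // Upper Segment: the left and right strokes run upward and meet at yU
  // ALPHA is a constant defined elsewhere in the class (not shown here), presumably the alpha mask 0xFF000000
  private boolean upperSegment(BufferedImage img, int yU, int y, int xL, int xR) {
    // walk upward from row y to row yU; the left edge moves right, the right edge moves left
    for (int a = xL, b = xR; y >= yU; --y, ++a, --b) {
      // a non-transparent pixel on either stroke: this row matches
      if ((img.getRGB(a, y) & ALPHA) != 0 || (img.getRGB(b, y) & ALPHA) != 0) continue;
      // tolerance: accept a pixel that lies one column further inside
      if (((a+1) < xR && (img.getRGB(a+1, y) & ALPHA) != 0) || ((b-1) > xL && (img.getRGB(b-1, y) & ALPHA) != 0)) continue;
      return false;
    }
    return true;
  }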
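
To see the method at work, it can be run against a hand-painted shape. The sketch below is a small test harness under assumptions: `upperSegment` is copied as a static method, `ALPHA` is assumed to be the alpha mask `0xFF000000` (its definition is not shown in the text), and the class `UpperSegmentDemo` with its helper `caret` is hypothetical.

```java
import java.awt.image.BufferedImage;

class UpperSegmentDemo {
  static final int ALPHA = 0xFF000000; // assumption: mask of the alpha channel

  // Static copy of upperSegment from the text.
  static boolean upperSegment(BufferedImage img, int yU, int y, int xL, int xR) {
    for (int a = xL, b = xR; y >= yU; --y, ++a, --b) {
      if ((img.getRGB(a, y) & ALPHA) != 0 || (img.getRGB(b, y) & ALPHA) != 0) continue;
      if (((a + 1) < xR && (img.getRGB(a + 1, y) & ALPHA) != 0)
          || ((b - 1) > xL && (img.getRGB(b - 1, y) & ALPHA) != 0)) continue;
      return false;
    }
    return true;
  }

  // Paints two strokes that start at (0, 4) and (8, 4) and meet at (4, 0),
  // like the upper part of an "A". All other pixels stay transparent.
  static BufferedImage caret() {
    BufferedImage img = new BufferedImage(9, 5, BufferedImage.TYPE_INT_ARGB);
    for (int y = 0; y <= 4; ++y) {
      img.setRGB(4 - y, y, 0xFF000000); // left stroke, opaque black
      img.setRGB(4 + y, y, 0xFF000000); // right stroke, opaque black
    }
    return img;
  }

  public static void main(String[] args) {
    BufferedImage blank = new BufferedImage(9, 5, BufferedImage.TYPE_INT_ARGB);
    System.out.println(upperSegment(caret(), 0, 4, 0, 8)); // prints "true"
    System.out.println(upperSegment(blank, 0, 4, 0, 8));   // prints "false"
  }
}
```

The caller has to choose `xL`, `xR`, `y` and `yU` so that both edges stay inside the image, because the method itself does no bounds checking.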

If we were able to complete the necessary LOF (with some tolerance, or Safety Gap), we would get a result similar to this:
[Image: OCR result]
