zkanji under-the-hood – handwriting recognition 2.

DISCLAIMER: This entry is about the inner workings of zkanji. It might not be suitable for non-programmers while programmers might find it too trivial. Don’t expect a mathematically elegant explanation either. My primary aim is to make the explanation easy to understand.

I have been working hard over the past two weeks to improve the handwriting recognition in zkanji, which I ruined a bit in the latest version. (It wasn’t completely broken, but it didn’t work as well as before.) In a previous post I explained how the algorithm works for single strokes. That explanation was simplified compared to the real algorithm at the time, though it is more or less close to how the algorithm worked when I first made it.

The reason behind the change was the inclusion of hiragana symbols in the recognition data. Kanji are mainly composed of strokes made up of short straight lines with sharp corners. Because of this, the original approach of comparing the angles of specific parts of strokes was straightforward, but when hiragana were added to the mix, this kind of comparison failed half the time. At least it seemed to fail at first. After a little experimenting I found that I only needed a little trick when measuring the angles of hiragana strokes, and everything would work as usual. Because most hiragana strokes are round, the only thing I had to do was to decide on artificial corner points inside the strokes, and make the long sections between these corner points give a unified angle. That is, I made the round shapes less round.
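To make the "less round" idea a bit more concrete, here is a minimal Python sketch. It is not zkanji's actual code: the point format, the function name `section_angles`, and the way the corner indices are supplied are all assumptions made for this example. Each section between two chosen corner points is reduced to a single angle, taken from the straight line that joins the two corners.

```python
import math

def section_angles(points, corner_indices):
    """Return one angle (in degrees) per section between consecutive corners.

    points: list of (x, y) tuples describing a stroke (hypothetical format).
    corner_indices: sorted indices into points, including the first and last point.
    """
    angles = []
    for start, end in zip(corner_indices, corner_indices[1:]):
        x0, y0 = points[start]
        x1, y1 = points[end]
        # The curve between the two corners is ignored; only the straight
        # line from corner to corner decides the angle of the section.
        angles.append(math.degrees(math.atan2(y1 - y0, x1 - x0)))
    return angles

# A rounded stroke sampled at five points, with one artificial corner in the middle.
arc = [(0, 0), (300, 100), (1000, 1000), (1100, 1700), (1000, 2000)]
print(section_angles(arc, [0, 2, 4]))
```

With the corner placed at the third point, the bulging curve collapses into two straight sections of about 45 and 90 degrees, which can then be compared the same way as the sections of a kanji stroke. How the artificial corners are chosen is the hard part, and this sketch does not attempt it.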

After using it for some time I realized that it just doesn’t work as well as I would like it to. The algorithm started confusing kanji more often than before, so I decided to dive into it once more to figure out what went wrong, and while I was at it, I could just as well make it better at telling apart one specific kind of difference between kanji too.

When writing in the handwriting recognizer I tend to be on the lazy side, leaving out some parts of the strokes, especially the short line endings that look like a hook: 亅
Unfortunately, when I did that with kanji like 捕, I often got 埔 in first place on the candidate list, with the real one nowhere to be found. The stroke count is the same for both kanji, and even the stroke models are the same, especially if I leave the hook off the end of the second stroke. The only difference is that the second stroke is longer for 捕 and crosses the third one, instead of ending just above it. At first I tried to fine-tune the penalty for placing strokes at an unwanted position or with an unwanted length, but playing a bit with the numbers made me realize that it just won’t be that easy. If the penalty was too high, even kanji that had worked perfectly fine before became unrecognizable.

The solution to this problem was VERY complicated, at least while I was figuring out how to do it, but I got it eventually. When computing the difference between strokes, hooks are simply removed first. It’s easy to say “simply”, but I also had to find a good way to find the hooks. Currently a part is considered a hook if it is short in both absolute and relative length: not longer than 2000 units (each kanji fits in a 10000*10000 square) and taking up less than a given percent of the stroke itself.
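The hook test described above can be sketched roughly like this in Python. Again, this is only an illustration and not the code in zkanji: the corner-point format, the names `strip_end_hook` and `segment_length`, and especially the 25 percent ratio are assumptions, since the real percentage is not given here. Only the 2000 unit limit comes from the description.

```python
import math

MAX_HOOK_LENGTH = 2000   # absolute limit; kanji fit in a 10000*10000 square
MAX_HOOK_RATIO = 0.25    # assumed value, the real percentage is not stated

def segment_length(a, b):
    """Length of the straight segment between points a and b."""
    return math.hypot(b[0] - a[0], b[1] - a[1])

def strip_end_hook(corners):
    """Drop the last segment of a stroke if it looks like a hook.

    corners: list of (x, y) corner points of the stroke (hypothetical format).
    """
    if len(corners) < 3:
        return corners  # a single segment has no hook to remove
    lengths = [segment_length(a, b) for a, b in zip(corners, corners[1:])]
    total = sum(lengths)
    last = lengths[-1]
    # A hook must be short both in absolute terms and relative to the stroke.
    if last <= MAX_HOOK_LENGTH and last < MAX_HOOK_RATIO * total:
        return corners[:-1]
    return corners

# A long vertical stroke ending in a short hook to the upper left, like 亅.
stroke = [(5000, 1000), (5000, 9000), (4200, 8400)]
print(strip_end_hook(stroke))
```

In this example the last segment is 1000 units long out of 9000 in total, so it passes both checks and is dropped. If the same function is applied to the drawn stroke and to the stored model before they are compared, a lazily drawn stroke without the hook and a carefully drawn one with it end up looking the same.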

I can imagine that even what I wrote up to this point sounds a “bit” complex. I don’t want to complicate it even more so I won’t describe my algorithm in greater detail, and will leave out all the mistakes I made while finding the solution. Let’s just say that there were a lot.

By the way, there was a bug in the handwriting recognizer that made recognizing hiragana more difficult at first and messed with the recognition of normal kanji too. It had nothing to do with my trick for recognizing hiragana more easily. That was fixed as well, of course.

So there will be a new version very soon that will include:

  • A better handwriting recognition engine (the word “engine” sounds so professional and mysterious that I just couldn’t leave it out).
  • Kanji reading test at the end of long-term study sessions. (With settings to control what readings we want to test and under what condition.)
  • Click on the ON reading of kanji in the kanji information window to filter by that reading in the kanji list.
  • Kanji stroke order diagram and recognition data for all 6355 kanji in the JIS X 0208-1990 character set.
  • Smaller fixes that are required for every worthy update.

Won’t be included:

  • Undo in the long-term study test. I was thinking of implementing it, but after trying to understand my own code I realized that it wouldn’t be as easy as I first thought.
  • Kanji writing test. (Actually this is something I really want to do, but not for the next version yet.)
  • A mascot character for zkanji, even though I really want it! But at the moment I can’t think of one.