Animated kanji stroke order for everyone?

March 5, 2011 4 comments

10-01-2012 update: I have noticed that this post gets many hits from Google. If you would like to read about how to handle kanji recognition in your own program, head to this post.

I have completed the stroke order database for all JIS X 0208-1990 kanji, to be included in zkanji. That’s 6355 kanji, which most Japanese fonts contain, and it took quite some time to finish. (No wonder making such a font file takes 3-4 years.) There have already been open projects out there (or at least one, called kanjiVG – hopefully fixed soon) that attempted the same. I started out using the Taka database years ago, which contains a bit more than the Jouyou kanji. (When I started I didn’t know about kanjiVG, and the Taka database didn’t have all the Jouyou kanji, but I had to start somewhere.)

The reason my project is unique (I’m making this up) is that the data file for zkanji doesn’t only contain the coordinates of the strokes, their order and of course the kanji parts, but also some additional information about the type of each stroke. That’s why it is possible (since v0.573) to display these stroke orders with a bit more style.

I have plans to add other information to this data. For example, it would be nice if I could show which part of a kanji corresponds to a given ON reading. I hope that it can be done automatically from the data already in zkanji (partly taken from KANJIDIC, and partly from the stroke order diagrams). I might also have to specify other relations between kanji, for example when one kanji is the old form of another and is not used in modern writing anymore. This is not going to happen any time soon though.

The next step? I could release the stroke order information publicly. It is currently not free (can only be released with zkanji), but if someone asks nicely, I might write a post about the data format and even release it with some CC type license.

DISCLAIMER: zkanji doesn’t use data from the Taka database anymore. The stroke order diagrams are all my work.

zkanji under-the-hood – handwriting recognition

January 9, 2011 21 comments

DISCLAIMER: This entry is about the inner workings of zkanji. It might not be suitable for non-programmers, while programmers might find it too trivial. Don’t expect a mathematically elegant explanation either. My primary aim is to make the explanation easy to understand.

The IME (Input Method Editor) pad in Windows allows its users to write Japanese characters with the mouse, and as they add stroke after stroke, it offers a list of character candidates that resemble the drawn image. Any text editor or input box that’s prepared to use the IME can accept these characters. Zkanji is not one of them, for several reasons. When I started writing it, I didn’t have any idea how the IME works, but most of all I didn’t like the idea that I, as a user, would have to manually switch between English and Japanese when searching in the dictionary. At first I only wrote an input box where anything typed in turns into kana and which accepts Japanese characters from the clipboard. Because the only requirement for this is to have one or two Japanese fonts installed, zkanji could be run on most systems, even those that have no IME installed. (For some reason many people don’t have it.) I didn’t want to add handwriting recognition, because I thought it would be very difficult to do and a lot of work that wouldn’t be worth it, but as zkanji evolved I wanted a new challenge, something different from what I had been doing.

When I made up my mind, I started looking for a character recognition algorithm for Japanese kanji. You would think these algorithms are easy to find, because on the internet everything is free and everyone is eager to help others. After days of fruitless searching I finally found a blog entry that gave me the solution. I could not understand it (at least not fully), but it led me to the Levenshtein distance, which I’m grateful for. Understanding that, and why it works, was the first step towards writing the handwriting recognition. I won’t explain the Levenshtein distance; it’s explained quite clearly in that wiki article.

My algorithm is very similar to Levenshtein’s, with changes proposed by the above-mentioned blog (which also just quotes some other algorithm that I was too lazy to look up). Its principle is that we build a matrix from two sequences. The columns and rows represent elements of the two sequences, and every element of the matrix is the numerical distance between the corresponding two elements of the sequences, plus the smallest neighboring element from the previous row and/or column. The very last element of the matrix gives us a number that we can use as the shortest relative distance between the two sequences. After this is done the algorithm is pretty easy. We just go through every stroke of every kanji, sort the kanji by their distance from the mouse-drawn strokes, and we have a list of kanji in the order of their distance from our writing. This is what the blog suggested, and that’s exactly what I tried to implement.

The modified algorithm for the Levenshtein distance is simple. Let me show you in C++-like code:

#include <algorithm>
#include <cmath>
#include <vector>

// A single point of a (simplified) stroke.
struct Point { double x, y; };

// Euclidean distance between two points.
static double distance(const Point &a, const Point &b)
{
  return std::hypot(a.x - b.x, a.y - b.y);
}

double compare(const std::vector<Point> &a, const std::vector<Point> &b)
{
  int n = (int)a.size();
  int m = (int)b.size();
  // Create an n*m matrix of floating point numbers.
  std::vector<std::vector<double>> matrix(n, std::vector<double>(m));

  for (int x = 0; x < n; ++x) {
    for (int y = 0; y < m; ++y) {
      // Set the cell of the matrix at x,y to the distance of the sequence points.
      matrix[x][y] = distance(a[x], b[y]);

      // Add the smallest neighboring cell's value to the distance.
      if (x > 0 && y > 0)
        matrix[x][y] += std::min({matrix[x - 1][y - 1], matrix[x - 1][y], matrix[x][y - 1]});
      else if (y == 0 && x > 0)
        matrix[x][y] += matrix[x - 1][y];
      else if (x == 0 && y > 0)
        matrix[x][y] += matrix[x][y - 1];
    }
  }

  // The last element of the matrix is the shortest relative distance.
  return matrix[n - 1][m - 1];
}
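
Just to show how this is meant to be called (the values here are made up purely for illustration):

#include <cstdio>

int main()
{
  // Two short, roughly similar point sequences.
  std::vector<Point> a = { {0, 0}, {5000, 500}, {10000, 0} };
  std::vector<Point> b = { {0, 300}, {4800, 900}, {10000, 400} };

  std::printf("distance: %f\n", compare(a, b));
  return 0;
}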

Basically this is all that I got from the sources I mentioned, and I had to invent everything else. The most difficult part was building the two sequences for comparison, one from the stroke order database and one from the stroke the user draws with the mouse. All strokes are first smoothed and the points of little importance are removed. This is done relatively simply, so I won’t explain the method in detail.
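
Just to give you the general idea, a point-reduction step along the lines of the Ramer-Douglas-Peucker algorithm does something very similar (this is only a rough sketch using the Point struct from above, not the code zkanji actually uses):

#include <cmath>
#include <vector>

// Perpendicular distance of point p from the line through a and b.
static double pointLineDistance(const Point &p, const Point &a, const Point &b)
{
  double dx = b.x - a.x, dy = b.y - a.y;
  double len = std::hypot(dx, dy);
  if (len == 0)
    return std::hypot(p.x - a.x, p.y - a.y);
  return std::fabs(dy * p.x - dx * p.y + b.x * a.y - b.y * a.x) / len;
}

// Keep only the "important" points between first and last, dropping the rest.
static void simplify(const std::vector<Point> &pts, int first, int last,
                     double tolerance, std::vector<Point> &out)
{
  // Find the point farthest from the straight line between first and last.
  double maxdist = 0;
  int index = -1;
  for (int i = first + 1; i < last; ++i) {
    double d = pointLineDistance(pts[i], pts[first], pts[last]);
    if (d > maxdist) { maxdist = d; index = i; }
  }

  // If it sticks out enough it is important: keep it and recurse on both halves.
  if (index != -1 && maxdist > tolerance) {
    simplify(pts, first, index, tolerance, out);
    out.push_back(pts[index]);
    simplify(pts, index, last, tolerance, out);
  }
}

// Reduce a raw mouse-drawn stroke to its important points.
static std::vector<Point> simplifyStroke(const std::vector<Point> &pts, double tolerance)
{
  std::vector<Point> out;
  out.push_back(pts.front());
  simplify(pts, 0, (int)pts.size() - 1, tolerance, out);
  out.push_back(pts.back());
  return out;
}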

The author of the blog suggested using the coordinates of the points in the strokes for making the sequences, since the distance between two points is very easy to calculate. I tried this approach at first, but soon found that it’s not flexible enough.

Figure 1.: A stroke from the database. Figure 2.: Mouse-drawn stroke.

After cleaning them up, the data left from the two strokes is the points alone (the big red dots). Getting these points is a lot trickier, because I have to take into account curves that don’t have end points in-between, but let’s say we have it all done correctly. Because drawing with the mouse can result in quite ugly strokes, I have to recognize these two as similar. The distance would be pretty big if I just took the points and tried to calculate a distance, so I resized both strokes into 10000*10000 unit squares:

Figure 3.: The original strokes resized and placed on each other.
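
The resizing itself is nothing fancy; it boils down to a bounding-box normalization roughly like this (again a sketch, not the exact zkanji code):

#include <algorithm>
#include <vector>

// Scale and translate a stroke so it fills a 10000*10000 unit square.
static void normalizeStroke(std::vector<Point> &pts)
{
  double minx = pts[0].x, maxx = pts[0].x;
  double miny = pts[0].y, maxy = pts[0].y;
  for (const Point &p : pts) {
    minx = std::min(minx, p.x); maxx = std::max(maxx, p.x);
    miny = std::min(miny, p.y); maxy = std::max(maxy, p.y);
  }

  // Avoid division by zero for perfectly horizontal or vertical strokes.
  double w = std::max(maxx - minx, 1.0);
  double h = std::max(maxy - miny, 1.0);

  for (Point &p : pts) {
    p.x = (p.x - minx) * 10000.0 / w;
    p.y = (p.y - miny) * 10000.0 / h;
  }
}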

This worked relatively well, but you can see that if the user’s own stroke was flatter than the original, the calculated distance would still be pretty big. So much so that in many cases the algorithm found a smaller distance to strokes that were very different rather than similar, just because their size fit the drawn stroke better. After this I experimented quite a lot with many things, until I ended up using “model strokes”:

Figure 4.: Model strokes in zkanji used for handwriting recognition.

I made these in a text editor as a comma-separated list of coordinate sequences. I experimented a lot until I found the shapes that were recognized best. Of course this in itself did not solve the problem of different sizes, so I made a single segment of each model stroke “elastic”. When I resized these strokes to the exact size of the user-drawn stroke, the elastic segment didn’t hold its shape. It became shorter or longer if it didn’t fit the user-drawn stroke exactly:

Figure 5.: Resized model stroke with an elastic segment and the user drawn stroke.

This finally worked satisfactorily, but it was very complicated and still not as precise as I wanted. So I finally had enough, scrapped everything I had done so far (I kept the stroke models but removed the elastic segment) and rewrote the whole recognition engine in a single day. That engine is still used in the latest version of zkanji; I only had to fine-tune it a bit to work better with the new hiragana stroke models. Yes, most of what I wrote until now was just a little introduction to the final algorithm.

After all these experiments I realized that the less precise the comparison is, the better. All this time I had wanted it to be precise instead of the direct opposite, so I tried everything to move the stroke model points as close to the user-drawn ones as possible. In the final version, instead of calculating the distance of every point in the two sequences, I found that calculating their relative angle works much better. The relative angle is not clear in many cases, so I just find the smallest angle between segments (neighboring points of a stroke), taking their direction into account as well. This way the actual size of the strokes and their parts does not matter much, so even badly drawn mouse strokes can be matched with greater accuracy, yet less precision. The modified Levenshtein distance is still used, but I had deleted all the code so I had to write it again. (Not that it was difficult or anything.)
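
The segment comparison is roughly this simple (a sketch of the idea rather than the actual zkanji code, again using the Point struct from above); the result replaces the point distance in the matrix shown earlier:

#include <cmath>

// Direction of the segment going from point a to point b, in radians.
static double segmentAngle(const Point &a, const Point &b)
{
  return std::atan2(b.y - a.y, b.x - a.x);
}

// Smallest angle between two segments, taking their direction into account:
// 0 means they point the same way, pi means they point in opposite directions.
static double angleDistance(const Point &a1, const Point &a2,
                            const Point &b1, const Point &b2)
{
  double diff = segmentAngle(a1, a2) - segmentAngle(b1, b2);

  // Normalize the difference into the [-pi, pi] range.
  const double pi = 3.14159265358979323846;
  while (diff > pi)
    diff -= 2 * pi;
  while (diff < -pi)
    diff += 2 * pi;

  return std::fabs(diff);
}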

A number representing the distance between two strokes is not much in itself, because there can be many kanji with the same strokes in either a different order or a different position. But since the most difficult part of the algorithm (at least in theory) was finding out how to calculate the distance between two strokes, everything else was “just” work. For the algorithm to be fast, I calculated and stored beforehand which model strokes make up each kanji in the stroke order database.

  1. When the user draws a stroke with the mouse, it is checked against the model strokes, storing their distance from the user-drawn stroke.
  2. Then the program loops through each kanji and checks which model stroke is its first (or second, or whatever) stroke.
  3. The distance between the user-drawn stroke and the model stroke used in a kanji is then added to a value. This value is the distance between the user’s handwriting and the kanji. Each kanji has one.
  4. The difference between the relative position of this stroke to the other strokes in the kanji, and the relative position of the user-drawn stroke to the other user-drawn strokes, also increases the value. This does not require any special algorithm. I just add a multiplied constant, depending on which “grid position” each stroke falls into.
  5. Finally, adding these distances together for each stroke, it’s easy to calculate the distance between any kanji and the user-drawn one, and thus order every kanji in the database. (A rough sketch of this loop follows below.)
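
To make the above a bit more concrete, here is a rough sketch of that loop. The data structures and names are made up for this post; they are not the ones zkanji uses:

#include <algorithm>
#include <vector>

// Hypothetical structures used only for this illustration.
struct KanjiEntry {
  int character;                 // the kanji itself, as a code point
  std::vector<int> modelStroke;  // which model stroke each stroke of the kanji uses
  std::vector<int> gridPos;      // rough "grid position" of each stroke inside the kanji
};

struct Candidate { int character; double dist; };

// Rank every kanji by its distance from what the user has drawn so far.
// modelDist[i][m] is the precomputed distance of the i-th user-drawn stroke
// from model stroke m (step 1), drawnGrid[i] is the grid position of the
// i-th user-drawn stroke, and positionWeight is the multiplied constant
// from step 4.
std::vector<Candidate> rankKanji(const std::vector<KanjiEntry> &kanji,
                                 const std::vector<std::vector<double>> &modelDist,
                                 const std::vector<int> &drawnGrid,
                                 double positionWeight)
{
  size_t drawn = modelDist.size(); // number of strokes drawn so far
  std::vector<Candidate> result;

  for (const KanjiEntry &k : kanji) {
    if (k.modelStroke.size() < drawn)
      continue; // skip kanji that have fewer strokes than what was already drawn

    double dist = 0;
    for (size_t i = 0; i < drawn; ++i) {
      // Step 3: distance of the user's stroke from the model stroke the kanji uses.
      dist += modelDist[i][k.modelStroke[i]];
      // Step 4: penalty when the relative positions don't match.
      if (k.gridPos[i] != drawnGrid[i])
        dist += positionWeight;
    }
    result.push_back({k.character, dist});
  }

  // Step 5: order the candidates by their total distance from the handwriting.
  std::sort(result.begin(), result.end(),
            [](const Candidate &a, const Candidate &b) { return a.dist < b.dist; });
  return result;
}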

Write a comment if you want to implement your own algorithm based on mine and need help.

What takes so long? :)

December 16, 2010 4 comments

I guess this kind of thought often appears in the minds of many users, who are not programmers themselves and are waiting for a new version to pop out of nowhere. In order to be able to release the final v0.57 of the program, I have set some difficult and some not so difficult goals for myself. A single new feature sometimes requires solving a hideous but hidden problem that wasn’t obvious at first. Let’s take for example a seemingly simple addition to the long-term study list.

I have filled a few notebooks (made of paper, not plastic) while studying kanji, because I decided that I don’t just want to know how to read them, but also how to write them. Because environmentally friendly (also called green, no idea why) things are so popular these days, I thought that letting students practice inside the program, without the need to waste valuable paper, ink and money, was a good idea. I wouldn’t have thought about this in the first place if I didn’t already have a good “hand”writing recognition engine built in (made by me too, of course). Adding a single button next to the input box and popping up the recognizer window was a piece of cake. The problems surfaced soon afterward.

Take for example the word 日々. Everyone studying Japanese should know what 々 is. It is a repetition symbol for kanji, but it is not a kanji itself. Thus, the handwriting recognition engine can’t recognize it, as it was made solely for kanji. Why it was made like that and why it couldn’t handle anything else is too complicated to explain here, so don’t ask. I didn’t want to force anyone to go and hunt for the 々 character elsewhere to copy and paste it into the input box, so I either had to add a fake kanji, which would make everything more painful later, or modify the whole recognition engine to work with other characters too. I decided to do the latter. And if I’m doing it anyway, I will have to include all the kana characters too, because typing a word with the keyboard and drawing with the mouse simultaneously is very difficult when you have to type and draw 60 of them one after the other.

I can tell that modifying the engine wasn’t easy. It wasn’t too difficult either, just took a lot of time.

I wouldn’t have posted about all this if the problems stopped here. Unfortunately, what comes after this will be even more tiresome. I will have to add all the hiragana and katakana characters, and also all the kurikaeshi (kanji/kana repetition) symbols, one by one, to the stroke order database. And if my plans become reality and zkanji handles words that have all kinds of symbols and Latin characters in their written form, I will probably have to create the stroke order drawings of all the full-width Latin characters and numbers too.

I only want to finish the kana and kurikaeshi symbols by the next release, but it will still take some time.

The compulsory screenshot:

Most users will probably never find this window in any of their menus, nor will they accidentally open it. Unless you add a [Debug] section to the zkanji ini file and a debug=1 value under it, the stroke order diagram editor won’t show up. There is currently no way to share single diagrams among users, so even if someone wanted to help me by creating the remaining 1200 S.O.D. (5100 are already done!) plus all the kana, they couldn’t send them to me. (Well, they could, and I could modify the program to get the required data from the received file, but that’s more work than it would save me. (Probably not, but I don’t need the remaining 1200 S.O.D. right now and most people don’t either.))
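
For reference, the ini entry I mean looks like this:

[Debug]
debug=1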
