Fast report so you know I’m still alive.
I’ve finally given in and started working on the user interface. Without some windows, like add-to-group, it’s difficult to code and test most of the features. The new user interface will be different from previous versions, and a bit different from conventional user interfaces (although there are programs that have a similar interface). The program will have a “main window” which only shows one of the many “modules” of zkanji, but you’ll be able to select which one is shown from its toolbar. When you need interaction between these (i.e. picking words from the dictionary to add to a group), you’ll open another “main window”, switch from dictionary view to word groups view, then drag-and-drop the word from the dictionary into a group.
I want most things to work by drag-and-drop just as well as from the context menu or the window’s main menu. It’s still not clear how selection inside words will work if you can drag-and-drop them. There are many open questions but I’ll figure things out.
According to my plans, most windows will be non-modal (that is, you’ll be able to interact with other windows while a dialog window is shown.) I’ve read somewhere that this is how modern interface design should work, but not being an expert I’m not sure. This creates complexity in the code so this is just an experimental feature right now.
I’ll post pictures when there’s something to show. (Some windows are already working but with zero design…)
No, it’s not a program update yet. I just thought I should share what I’ve been up to. If you are reading this, you are probably one of the few who can’t wait for me to write something. (Or you are subscribed to the RSS feed which is obviously the majority…) Enough of the introduction.
The latest addition to the new version of zkanji is a built-in JMdict importer. You’ll have to download the JMdict/KANJIDIC etc. files manually, but after that you’ll be able to import the latest dictionary, and not just the English one. (There are other languages in the JMdict now.) I don’t remember what other stuff I added, so I’ll just write about what’s missing.
The next one I want to make is a dictionary exporter + user file importer. The latter was a big task previously too so I expect a lot of resistance from my brain cells. This is not a minor feature that I can just skip and release early, because without it we won’t be able to use our old data. (It’s kinda complicated why.)
A smaller feature I need to implement is the example sentences list for the words. It is relatively easy to do fortunately.
The word tests feature is missing as well (not the long-term study, that’s done). I’m not sure what to do with it. Should it be the exact copy of the original, or does it need changes? The copy is easier to do so for now I might just do that.
These were the bigger features still missing, but there are lots and lots of small ones. Filtering in the dictionary, auto collecting words into groups by kanji, all the small dialog windows for adding stuff to groups or to the study lists. The context-menus for everything. The kanji info window and the stroke order diagram painting. The auto updater. There are so many small things that were added over the years which makes this into a huge project for a single person.
The biggest mystery right now is the popup-dictionary. I have no idea what is possible with Qt, and I’m kinda afraid of it, but it’s the feature I use most so it will be done. If it’s not possible to do within Qt I’ll make this feature a Windows exclusive (unless people help code for other OSes.) Let’s just hope it can be done.
And finally, when all the little things are done, I’ll have a struggle with Qt itself. Making the user interface right in Qt is a challenge but I can’t release something bad. I want to finish this version before the end of the year, maybe sooner. It’s definitely possible. It all depends on whether I can make myself work on it. At least I now think that I’ll finish it.
I’m working on the handwritten kanji character recognition. At least that part doesn’t need reworking, I thought. Maybe it doesn’t, though I plan to improve on it a bit. What gives me headaches (apart from the fast changing weather) is that I don’t really understand a lot of the code now.
As I’m working on the new version, I try to describe what each part of the code does, so I’m adding comments everywhere. In the original code there was almost none. I could easily understand what the program does, so why add any? Good code explains itself? I have seen that line a lot before. Unfortunately it’s only true about code which solves a very simple, easy to understand problem with a simple algorithm. NOT for handwritten kanji character recognition.
I can’t make much out of the old code without considerable effort. Because of this, even copying the old code takes a lot of time. Not that I’m just copying it either. I’m not sure whether all of my old decisions made sense or not, but regardless of that I would probably make different ones now. It’d be good if someone reading that code (even if it’s me years from now) could figure out why those decisions were made.
I think I’ve learned my lesson, but that doesn’t mean every single line will get commented either. Some parts really do explain themselves. Others might not be obvious to everyone, but as I was solving the same problems the same way years back, they will be easy to read in the future too. I have seen code before which had more comments than code, and it was hard to read because of that. If the algorithm is complicated, explaining it in a few long paragraphs at the start of the code is better than writing a description about each line.
Just to clarify, I’m writing this because it helps me concentrate while I code. I doubt anyone will change their habits because of this blog.
Disclaimer: This is a programming topic. Not that scary though.
As the title says, it is possible to not write well optimized code, but replace it with threads. (VERY simply put, a thread is a program within a program. It runs at the same time another part of the program* runs.)
Is it bad practice or good? I think it depends. For example in previous versions of zkanji I had to store a lot of temporary data during the long-term study, to make it fast to get the next item after answering another. This is a form of optimization: you take extra care that everything works fast. This also added a lot of complexity to the code, and complex code is harder to fix if something works badly. Because of that optimization I was too scared to change the code to allow adding new items to study after you finished studying previous items for the day.
So how do we replace optimization with threads? If you can make more code run at the same time, the program will appear to be doing its task faster. (I’m not an expert at this but maybe if you are on a laptop this will also drain its batteries a bit faster.) In our case in zkanji, while you are thinking about the answer to an item, the program can do “stuff”. For example, it can find the next item to show after the current one. Finding the next item is fast while there are not many items to study, so there would be no need to add a thread just for that, but after years and years of adding new items, there can be a visible lag between giving the answer and the new item being shown.
This is probably a very simple example, but as the lag wouldn’t be more than a second on slower computers, I don’t think it’s worth my time optimizing it. And also, you’ll be able to add new items to study any time. I will consider adding new threads for small things like this if they make my life easier and the program simpler, but threads come with their own complexities and difficulties, so avoid them if possible.
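To make the idea concrete, here is a minimal sketch of looking up the next item on another thread while the user is still thinking. It is not zkanji’s actual code: StudyItem, findNextItem and prefetchNext are hypothetical names invented for this example, and the “search” is just a scan for the smallest due time.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical study item: only the due time matters for this sketch.
struct StudyItem {
    int id;
    long due;
};

// Stand-in for the slow search: returns the index of the item that is due
// soonest, skipping the item currently shown. Returns -1 if nothing is left.
int findNextItem(const std::vector<StudyItem> &items, int current)
{
    int best = -1;
    for (std::size_t i = 0; i != items.size(); ++i) {
        if (static_cast<int>(i) == current)
            continue;
        if (best == -1 || items[i].due < items[best].due)
            best = static_cast<int>(i);
    }
    return best;
}

// Starts the search on another thread while the user is still thinking.
// The caller must not modify `items` until the future has been read.
std::future<int> prefetchNext(const std::vector<StudyItem> &items, int current)
{
    return std::async(std::launch::async,
                      [&items, current] { return findNextItem(items, current); });
}
```

The program would call prefetchNext() when an item is shown and get() on the returned future once the answer is given. The comment about not touching the list in the meantime is exactly the kind of extra difficulty that threads bring with them.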
*Technically speaking, the “another part of the program” is another thread. Every program has at least one thread, even if it’s only the single main thread.
As I have to rewrite everything, I’m not only trying to achieve the same functionality of zkanji, but add some stuff I always wanted to. I never got to it because there were so many things to change for every small thing. Now that I have to rewrite zkanji anyway, it doesn’t matter anymore so I can do it as well.
For example in the new version, you can look up words with the kana in the middle of a word. There’s also a button to look for the exact kana characters. Before, if you had the button ?+ pressed and then typed あ, the results contained words ending in か or さ etc. Now if you check the “exact kana” button, only words ending in the kana あ will be found.
Another long awaited functionality was to enter kanji or kana directly with the system’s IME. This is finally possible. (But romaji is still converted as usual.)
As the title says, I’m currently working on word groups. In the past zkanji versions, single meanings are added to word groups. If you want to add several meanings of a word to a group, each of them is added as a separate item. The change here is that I want the user to be able to select several meanings, and add them as a single item to a group. I still haven’t decided how to handle the case when you try to add a meaning to a group that contains the word already with different meanings. Should it merge the meaning with the existing item or add it as a separate item? I have to think about the reasons why someone would want to do one or the other. Or rather, if there’s a reason to add a separate item. Writing zkanji is a huge task and not adding unnecessary functionality (even if it sounds cool at first) will make it happen faster.
What is your opinion?
As usual, if I get no opinions I’ll just decide one thing and if someone complains later, it’ll be already too late.
This post is not exactly about what I planned the last time, but I’m sure nobody will be angry about it.
Instead of relying on STL to handle the storage of data, Qt has its own container classes. STL wasn’t standardized at the time Qt started out so this is somewhat understandable, though it still happened in ancient times. When I started writing the Qt version of zkanji, I had to decide which set of container classes to use. I originally developed zkanji in an entirely different framework that didn’t use STL itself, so now when I’m rewriting everything, it really means rewriting everything. This put me in the position where I had to choose one set of the container classes.
Because the Qt documentation is full of QList this and QList that, and they also say that QList is the recommended container ‘cos it’s so great, at first I put QLists everywhere. (It is not like std::list, but rather a vector holding pointers.) I can’t say whether it was a mistake or not, but it turned out that the QList implementation is very slow in the VS debugger. The debugger is notorious for being slow anyway, but when it sees a lot of asserts, memory allocations and access, it’s even slower. (Also if you use iterators a lot it can be very slow.) Once I replaced QList at some key locations with QVector, my program sped up in the debugger. I don’t know the reason behind it, as I was too lazy to check out their source code.
This was the point when I started looking for comparisons of the Qt and the STL container classes, and only found a few, like this and this. Apart from these I only saw half sentences about the inefficiency of Qt and, from others, how great it is.
Qt containers are really convenient. They hold an internal pointer which is the real container, and only that pointer gets copied if you copy the containers. The real copy only happens if you modify their contents. This can also be seen as a drawback because you never know when the “hard copy” happens. A single mistake when accessing the container elements can cause this to happen too, so be careful.
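The copy-on-write behaviour can be illustrated with a small standard C++ sketch. This is not how Qt implements it; SharedInts is a made-up class that only mimics the idea, including the trap that a non-const element access forces the “hard copy” even if you only wanted to read.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical copy-on-write wrapper; Qt's real containers are more involved.
class SharedInts {
public:
    SharedInts() : data(std::make_shared<std::vector<int>>()) {}

    // Read access through const functions never copies.
    int at(std::size_t i) const { return (*data)[i]; }
    std::size_t size() const { return data->size(); }

    // Write access detaches first, so other copies stay unchanged.
    void append(int value) { detach(); data->push_back(value); }
    int &operator[](std::size_t i) { detach(); return (*data)[i]; }

    // True while both objects still point at the same storage.
    bool sharesWith(const SharedInts &other) const { return data == other.data; }

private:
    // The "hard copy": made only when the storage is shared with someone else.
    void detach()
    {
        if (data.use_count() > 1)
            data = std::make_shared<std::vector<int>>(*data);
    }

    std::shared_ptr<std::vector<int>> data;
};
```

Copying a SharedInts only copies the pointer. Reading with at() keeps the storage shared, while reading through operator[] on a non-const object already triggers the copy, which is the kind of mistake that is easy to make without noticing.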
What gave me the final push was when I made some of my classes movable but non-copyable (as it’s bad when there are multiple copies of them all over the place even if only by mistake), and removed the default constructor of others. The compiler immediately scolded me that it’s not good, how could it instantiate a QList with them this way? As it turns out, Qt is not very friendly with C++11 features, even if they claim otherwise. This unfortunate detail made me switch to STL, and if some containers are STL, it’s not good practice to mix them with Qt containers in a single codebase. For example I would have to use the at() function of Qt lists but avoid that in STL for speed (they do different things…), and that would cause a lot of confusion later.
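For comparison, this is roughly what such a class looks like when stored in an std::vector. WordGroup is a hypothetical class made up for the example; the point is only that the standard containers accept a type that can be moved but not copied and that has no default constructor.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical class: movable, not copyable, and without a default constructor.
class WordGroup {
public:
    explicit WordGroup(std::string name) : name_(std::move(name)) {}

    WordGroup(const WordGroup &) = delete;
    WordGroup &operator=(const WordGroup &) = delete;
    WordGroup(WordGroup &&) = default;
    WordGroup &operator=(WordGroup &&) = default;

    const std::string &name() const { return name_; }

private:
    std::string name_;
};

// std::vector only needs the element to be movable for these operations.
std::vector<WordGroup> makeGroups()
{
    std::vector<WordGroup> groups;
    groups.emplace_back("verbs");          // constructed in place
    groups.push_back(WordGroup("nouns"));  // moved in
    return groups;
}
```

Anything that would need a copy, such as assigning one of these vectors to another, becomes a compile error, which is the whole reason for deleting the copy functions.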
Phew, this was a long introduction!
QList, as I mentioned above, is like an std::vector holding pointers, with the difference that it leaves some space in front of the allocated data, so insertion and removal from the first half of the container is faster than with std::vector. Because it internally holds pointers, it also manages them. You pass in a value and it allocates a new object for it. No more need for smart pointers when the list does the allocation and deletion. When I switched to a pure STL implementation this convenience was lost. But worry not! (You didn’t worry? Sorry.) I just had to create a new container class derived from std::vector that works like a list of smart pointers. Erasing an element also deletes the object via its pointer that was stored there. In general, deriving from STL containers is a mistake because they don’t have virtual destructors. I don’t plan passing them as pointers to the base vector’s type when handling my “smartvector” (as I named it) so no problems there. It’s all for convenience because this way VS shows the list of elements in the debugger, and I don’t have to add a natvis extension to it.
String storage? I had wchar_t all over zkanji originally while it was Windows only. This data type holds a 2-byte wide character when compiled on Windows, and a 4-byte wide one on Linux implementations. Obviously I had to get rid of it in the new code. I was eyeing QString at first, and it’s probably a good choice usually, but I just don’t like the idea of storing unnecessary data, like reference counting, inner pointer for fast copy etc., when the average size of the strings is 3-5 characters (for kana / kanji) and maybe 20 for English definitions. std::string was out of the question in the first place, because I could at most use UTF-8 with it, and string comparison and the like would be a pain.
At first I replaced every wchar_t* with QChar*. QChar is the character type used in arrays inside QString, and an array of QChar can be easily converted to QString where the Qt GUI code needs it. I tested it myself and found that using arrays of QChar or using unsigned short (which it holds internally) has exactly the same speed. QChar also has its convenience with its built in functions so it was an obvious choice. I also found out by testing that comparing strings with the QString comparison functions is 1.3 times slower than writing my own function that compares characters one by one in a QChar array. (Yes I tried this in release code as well.)
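A character-by-character comparison of this kind might look like the sketch below. It uses char16_t as a stand-in for QChar so that it compiles without Qt, and it assumes the arrays are null-terminated. compareChars is a made-up name, and the function compares raw 16-bit values only, with no locale or case handling.

```cpp
// Compares two null-terminated arrays of 16-bit characters one by one.
// Returns a negative value, zero or a positive value, like strcmp does.
int compareChars(const char16_t *a, const char16_t *b)
{
    while (*a != 0 && *a == *b) {
        ++a;
        ++b;
    }
    return static_cast<int>(*a) - static_cast<int>(*b);
}
```

For kana and kanji lookups a plain comparison of the values is often all that is needed, which is probably where a function like this can win over a general-purpose one.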
As a real solution, I finally got off my lazy bum and wrote a container class for QChar arrays that frees up the data when it gets deleted. I also made the natvis extension for it so the debugger correctly shows my strings, and not just the first character in the array. This string is designed like an array: it shouldn’t be modified too often, so it’s purely for holding the data, but because of that it can be very fast and memory efficient as well.
With this I have everything that satisfies my (current) container needs. I can’t give a real conclusion or judge which is better, Qt or STL containers. I found Qt containers to be much more convenient than STL containers, and not so much slower that it would make them useless (when you are not handling huge amounts of data), but they are not as C++11 friendly as I wish they were, so I waved them goodbye.
Next time I’ll write about… something. I can’t see the future it seems.
As I wrote previously, v0.73 had a serious bug which can be disappointing for some long-time users, so automatic updating from v0.717 has been disabled. (Though you can still get the new version from the web site.) I also wrote that the bug has been fixed, so why no release yet?
To make it short, I wanted to update the word group study to include a kanji input box like the long-term study does, and I’m in the middle of it. This small change, although seemingly not very difficult, required changing the window where you can set the properties of word study groups. So I went ahead and redesigned the window to make it match the style of the rest of the program. UI changes always take some time, and I also have to rewrite some code for it. The test itself won’t be changed that much, though I’m thinking about redesigning its window as well to look similar to the long-term study. I haven’t decided yet.