Aug 26, 2023

Typoon

Solo Project
April 2023 - June 2023
Skills Utilized: C++, Windows Programming, doctest, WiX

Introduction

Typoon is a unique text expander designed specifically for Hangeul, the Korean script. Most text expanders out there don't work well with the Hangeul typing system, so I ended up making one myself. It supports the Latin alphabet as well. It was heavily inspired by espanso.


What is a text expander?

"A text expander is a program that detects when you type a specific keyword and replaces it with something else. This is useful in many ways:
- Save a lot of typing, expanding common sentences.
- Create system-wide code snippets."

 -- From espanso's README file

Showcase


↑ Typoon comes with a convenient system tray icon

↑ You can write your own matches intuitively

Technical Highlights

Managing the trigger strings

In a nutshell, Typoon builds a "trigger tree" in two iterations to manage all the match data.
Let's say there are 4 matches:

In the first iteration, Typoon builds an actual tree from the given data. Building it can leave behind 'stale' nodes that can no longer be reached. Those nodes are discarded in the second iteration.

In the second iteration, the tree gets flattened in level order to maximize locality of reference and minimize memory usage. This is the final form of the data that is used at runtime.
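
To make the flattening step concrete, here is a minimal sketch that assumes the trigger tree is a plain character trie. The names TreeNode, FlatNode, Insert, Flatten and Find are hypothetical and are not taken from Typoon's source. Discarding stale nodes is left out because the exact rule for what makes a node stale isn't spelled out here.

```cpp
#include <map>
#include <memory>
#include <queue>
#include <string_view>
#include <utility>
#include <vector>

// First iteration: an ordinary pointer-based trie (hypothetical layout).
struct TreeNode {
    std::map<wchar_t, std::unique_ptr<TreeNode>> children;
    int matchIndex = -1;  // index of the match whose trigger ends here, or -1
};

void Insert(TreeNode& root, std::wstring_view trigger, int matchIndex) {
    TreeNode* node = &root;
    for (wchar_t ch : trigger) {
        auto& child = node->children[ch];
        if (!child) child = std::make_unique<TreeNode>();
        node = child.get();
    }
    node->matchIndex = matchIndex;
}

// Second iteration: one contiguous array. The children of a node occupy
// the index range [firstChild, firstChild + childCount).
struct FlatNode {
    wchar_t letter = 0;
    int firstChild = -1;
    int childCount = 0;
    int matchIndex = -1;
};

std::vector<FlatNode> Flatten(const TreeNode& root) {
    std::vector<FlatNode> flat{FlatNode{}};               // index 0 is the root
    std::queue<std::pair<const TreeNode*, int>> pending;  // (tree node, flat index)
    pending.push({&root, 0});
    while (!pending.empty()) {
        const auto [node, index] = pending.front();
        pending.pop();
        flat[index].matchIndex = node->matchIndex;
        flat[index].childCount = static_cast<int>(node->children.size());
        if (node->children.empty()) continue;
        flat[index].firstChild = static_cast<int>(flat.size());
        // Appending children as their parent is dequeued yields level order.
        for (const auto& [letter, child] : node->children) {
            FlatNode flatChild;
            flatChild.letter = letter;
            pending.push({child.get(), static_cast<int>(flat.size())});
            flat.push_back(flatChild);
        }
    }
    return flat;
}

// Walks the flat array; returns the match index for `text`, or -1 if none.
int Find(const std::vector<FlatNode>& flat, std::wstring_view text) {
    int index = 0;
    for (wchar_t ch : text) {
        const FlatNode& node = flat[index];
        int next = -1;
        for (int i = 0; i < node.childCount; ++i) {
            if (flat[node.firstChild + i].letter == ch) {
                next = node.firstChild + i;
                break;
            }
        }
        if (next < 0) return -1;
        index = next;
    }
    return flat[index].matchIndex;
}
```

Because siblings end up next to each other in the array, scanning the children of a node touches one contiguous block of memory, and each node needs only two integers instead of a map of pointers.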

Detection of Korean letter composition & IME simulation

Understanding how Korean letters are composed on a computer is essential before diving into this topic. A Korean letter comprises three components: an initial, a medial, and a final. Unlike the Latin alphabet, where one key press corresponds to a single letter, forming a Korean letter may require up to five keystrokes. Hence, capturing keyboard input alone is insufficient; knowing which letters were actually composed is crucial.

Source: https://rcneira.weebly.com/learn/reading-the-korean-alphabet-hangul

You might ask, "Why not decompose all the letters and use that to match the input?" Unfortunately, this approach often leads to ambiguity. For instance, decomposing "곡" into "ㄱㅗㄱ" might inadvertently match "고가," which decomposes as "ㄱㅗㄱㅏ." This mismatch is usually undesirable. So matching against the composed letters is the only option.
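
The ambiguity can be reproduced with the standard Unicode arithmetic for precomposed Hangul syllables (code point = 0xAC00 + (initial × 21 + medial) × 28 + final). The sketch below is only an illustration: Decompose and StartsWith are hypothetical helpers, not Typoon's code.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Compatibility jamo, listed in Unicode's index order for precomposed syllables.
constexpr std::u32string_view kInitials = U"ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ";
constexpr std::u32string_view kMedials = U"ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ";
// Final index 0 means "no final", so this table covers indices 1 to 27.
constexpr std::u32string_view kFinals = U"ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ";

constexpr char32_t kFirstSyllable = 0xAC00;  // 가
constexpr char32_t kLastSyllable = 0xD7A3;   // 힣

// Splits every precomposed syllable into its jamo; other characters pass through.
std::u32string Decompose(std::u32string_view text) {
    std::u32string jamo;
    for (char32_t ch : text) {
        if (ch < kFirstSyllable || ch > kLastSyllable) {
            jamo += ch;
            continue;
        }
        const std::size_t offset = ch - kFirstSyllable;
        jamo += kInitials[offset / (21 * 28)];
        jamo += kMedials[(offset / 28) % 21];
        if (const std::size_t fin = offset % 28; fin != 0) {
            jamo += kFinals[fin - 1];
        }
    }
    return jamo;
}

bool StartsWith(std::u32string_view text, std::u32string_view prefix) {
    return text.substr(0, prefix.size()) == prefix;
}
```

With these helpers, StartsWith(Decompose(U"고가"), Decompose(U"곡")) is true, so a trigger "곡" matched on decomposed jamo would also fire while the user is typing "고가".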

There were two ways to acquire the composed letters. The first was to extract them from the Input Method Editor (IME), the layer that handles Korean composition. The second was to simulate the IME inside Typoon and compose the letters from the keyboard inputs.

Obviously, I went for the former at first. But after days of research, I came to the conclusion that there is no way to get the letters from either the processes or the IMEs attached to them. You simply cannot access other processes' IMEs, and there is no unified way to get the composed letters from all kinds of processes.

So, the only option was to simulate the IME myself. It took a pretty deep case analysis, but I managed to figure it out.
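
To give a feel for the kind of case analysis involved, here is a heavily simplified sketch of an IME simulation. It is not Typoon's implementation: the Composer class is hypothetical, it takes compatibility jamo as input rather than raw key codes, and it ignores compound vowels (ㅘ), compound finals (ㄳ) and backspace. It does cover the case behind the ambiguity above, where a final consonant moves into the next syllable once a vowel follows it.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

constexpr std::u32string_view kInitials = U"ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ";
constexpr std::u32string_view kFinals = U"ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ";
constexpr char32_t kFirstVowel = 0x314F;  // ㅏ
constexpr char32_t kLastVowel = 0x3163;   // ㅣ

class Composer {
public:
    void Feed(char32_t jamo) {
        if (jamo >= kFirstVowel && jamo <= kLastVowel) {
            FeedVowel(jamo);
        } else if (kInitials.find(jamo) != std::u32string_view::npos) {
            FeedConsonant(jamo);
        } else {
            Commit();  // anything else ends the current syllable
            committed_ += jamo;
        }
    }

    // Committed text followed by the syllable still being composed.
    std::u32string Text() const { return committed_ + Current(); }

private:
    void FeedConsonant(char32_t consonant) {
        const bool canBeFinal = kFinals.find(consonant) != std::u32string_view::npos;
        if (initial_ && medial_ && !final_ && canBeFinal) {
            final_ = consonant;  // 고 + ㄱ -> 곡
            return;
        }
        Commit();
        initial_ = consonant;
    }

    void FeedVowel(char32_t vowel) {
        if (initial_ && !medial_) {
            medial_ = vowel;  // ㄱ + ㅗ -> 고
            return;
        }
        // A pending final becomes the initial of the next syllable: 곡 + ㅏ -> 고가
        const char32_t carried = final_;
        final_ = 0;
        Commit();
        if (carried) {
            initial_ = carried;
            medial_ = vowel;
        } else {
            committed_ += vowel;  // a vowel with no consonant stays as a lone jamo
        }
    }

    std::u32string Current() const {
        if (!initial_) return {};
        if (!medial_) return std::u32string(1, initial_);
        const std::size_t i = kInitials.find(initial_);
        const std::size_t m = medial_ - kFirstVowel;
        const std::size_t f = final_ ? kFinals.find(final_) + 1 : 0;
        return std::u32string(1, static_cast<char32_t>(0xAC00 + (i * 21 + m) * 28 + f));
    }

    void Commit() {
        committed_ += Current();
        initial_ = medial_ = final_ = 0;
    }

    std::u32string committed_;
    char32_t initial_ = 0, medial_ = 0, final_ = 0;
};
```

Feeding ㄱ, ㅗ, ㄱ produces "곡", and feeding ㅏ afterwards turns the text into "고가". A matcher that looks at the composed text therefore has to account for the last syllable changing under it.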


File watcher

Typoon includes a file watcher that detects changes to the config and match files as soon as they are saved and applies them right away. It relies on two Windows-specific functions: WaitForMultipleObjectsEx() and ReadDirectoryChangesW().
In essence, you tell the OS to signal an event when a change is made to a given directory (ReadDirectoryChangesW), then put a thread to sleep and have it wake up when any of the watched events is signaled (WaitForMultipleObjectsEx).
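
A minimal sketch of that pattern follows. It makes several assumptions: one watched directory, overlapped (asynchronous) I/O, and a separate stop event for shutting the thread down. WatchDirectory, IsWatched and the onChange callback are hypothetical names, not Typoon's. The Windows part is guarded by _WIN32 so the snippet still compiles on other platforms.

```cpp
#include <algorithm>
#include <string>
#include <string_view>
#include <vector>

// Portable part: is the changed file one of the files we care about?
// (Exact comparison for brevity; Windows file names are case-insensitive.)
bool IsWatched(std::wstring_view changed, const std::vector<std::wstring>& watched) {
    return std::find(watched.begin(), watched.end(), changed) != watched.end();
}

#ifdef _WIN32
#include <windows.h>

// Blocks until stopEvent is signaled, calling onChange(fileName) whenever a
// watched file inside `directory` changes.
template <typename OnChange>
void WatchDirectory(const wchar_t* directory, const std::vector<std::wstring>& watched,
                    HANDLE stopEvent, OnChange onChange) {
    HANDLE dir = CreateFileW(directory, FILE_LIST_DIRECTORY,
                             FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                             nullptr, OPEN_EXISTING,
                             FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED, nullptr);
    if (dir == INVALID_HANDLE_VALUE) return;

    OVERLAPPED overlapped{};
    overlapped.hEvent = CreateEventW(nullptr, TRUE, FALSE, nullptr);
    if (!overlapped.hEvent) {
        CloseHandle(dir);
        return;
    }
    alignas(DWORD) unsigned char buffer[4096];
    const HANDLE events[] = {overlapped.hEvent, stopEvent};
    DWORD bytes = 0;

    for (;;) {
        // Ask the OS to signal overlapped.hEvent on the next change.
        ResetEvent(overlapped.hEvent);
        if (!ReadDirectoryChangesW(dir, buffer, sizeof(buffer), FALSE,
                                   FILE_NOTIFY_CHANGE_LAST_WRITE | FILE_NOTIFY_CHANGE_FILE_NAME,
                                   nullptr, &overlapped, nullptr)) {
            break;
        }
        // Sleep until a change arrives or we are told to stop.
        const DWORD result = WaitForMultipleObjectsEx(2, events, FALSE, INFINITE, FALSE);
        if (result != WAIT_OBJECT_0) {
            CancelIo(dir);                                      // abort the pending read
            GetOverlappedResult(dir, &overlapped, &bytes, TRUE);  // wait for the abort
            break;
        }
        if (!GetOverlappedResult(dir, &overlapped, &bytes, FALSE) || bytes == 0) continue;

        // Walk the variable-length notification records.
        const unsigned char* cursor = buffer;
        for (;;) {
            const auto* info = reinterpret_cast<const FILE_NOTIFY_INFORMATION*>(cursor);
            const std::wstring_view name{info->FileName, info->FileNameLength / sizeof(WCHAR)};
            if (IsWatched(name, watched)) onChange(name);
            if (info->NextEntryOffset == 0) break;
            cursor += info->NextEntryOffset;
        }
    }
    CloseHandle(overlapped.hEvent);
    CloseHandle(dir);
}
#endif
```

Editors often write a file in several steps, so a single save can produce more than one notification. A real watcher would likely want to debounce before reloading.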



Addressing the lack of wide character support in doctest

Typoon's focus on Hangeul and Windows necessitates the use of wide characters throughout its codebase. However, most unit test libraries fall short in handling wide characters, so test output ends up showing broken characters like '�'.
So I decided to settle on one library, doctest, and devise a workaround. By delving into doctest's result printing functions, I identified a solution: I introduced a template specialization of the class responsible for converting objects to strings. In addition, since I use several string types for wide characters, such as std::wstring, std::wstring_view, wchar_t[], and others, I made sure the specialization accommodates all of them by using the std::convertible_to concept.

Conclusion

Typoon has been a particularly enjoyable project for me, offering numerous valuable learning experiences. I gained expertise in Windows programming, realized the effectiveness and importance of unit testing, learned how to build an installer for Windows, delved into original data structures and algorithms, became familiar with multithreading, and the list goes on.
Even though I'm taking a temporary break right now, I will definitely pick it up again soon.