Development Journal, June 9

Posted on Sun 10 June 2018 in new • 2 min read

I began implementing an Ottoman translator using Finite State Transducers (FSTs) via OpenFST. Instead of using ad hoc algorithms to translate Ottoman and Turkish into each other, I'll be creating FSTs.

In the past I used FOMA and TRmorph as building blocks and a basis for Ottoman conversion. However, I saw that writing something on top of a morphological analyzer to convert Ottoman to Turkish requires almost another morphological analyzer. (This is also true for Turkish-to-Ottoman conversion, because the spelling rules of Ottoman require another layer of FSTs.)

After that initial failure, I decided to write an ad hoc algorithm that uses surface-level representations of Turkish and Ottoman. The result has been running in dervaze for two years now.

But as can be observed, it's slow. I was working to replace it with a C library that will also be used in mobile applications. I have completed the dictionary part, which is used to search for words. In this version of the library, I was using trie structures.
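To show what a trie-backed dictionary lookup looks like, here is a minimal sketch. It is written in Python for brevity rather than C, and it is not the code of the library mentioned above: the class names, method names, and sample words are all hypothetical.

```python
class TrieNode:
    """One node of the trie: outgoing edges plus an end-of-word flag."""

    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if a stored word ends at this node


class Trie:
    """Hypothetical dictionary index; names are illustrative only."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def _walk(self, prefix):
        """Follow prefix from the root; return the node reached or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def with_prefix(self, prefix):
        """Return all stored words starting with prefix, sorted."""
        start = self._walk(prefix)
        if start is None:
            return []
        results = []
        stack = [(start, prefix)]
        while stack:
            node, text = stack.pop()
            if node.is_word:
                results.append(text)
            for ch, child in node.children.items():
                stack.append((child, text + ch))
        return sorted(results)


# Sample words chosen only for illustration.
trie = Trie()
for word in ["kitap", "kalem", "kapı"]:
    trie.insert(word)

print(trie.contains("kitap"))   # True
print(trie.contains("kit"))     # False: a prefix, not a stored word
print(trie.with_prefix("ka"))   # ['kalem', 'kapı']
```

Lookup cost depends on the length of the query word rather than on the size of the dictionary, which is the main reason a trie suits this kind of word search.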

I have decided, however, that using FSTs to translate may be more robust and formal in the long run. Using tries is almost like using half of an FST, without much thought given to state. As I'm developing this solo and my main concern is extensibility, converting ad hoc rules to states seemed easier than writing a C library.
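The "half of an FST" remark can be made concrete with a toy example. A trie only reads input symbols along its edges; a transducer also writes an output on each arc. The sketch below is plain Python and does not use OpenFST. The class, its methods, and the sample symbol pairs are hypothetical, and the pairs are a simplified illustration rather than real Ottoman spelling rules.

```python
class ToyTransducer:
    """Deterministic transducer stored as a table of arcs.

    State 0 is the start state. Each arc maps (state, input symbol)
    to (output string, next state). Illustrative only.
    """

    def __init__(self):
        self.arcs = {}        # (state, in_sym) -> (out_str, next_state)
        self.finals = set()   # states where a complete input may end
        self.state_count = 1  # state 0 already exists

    def add_path(self, pairs):
        """Add a path of (input symbol, output string) pairs from the start."""
        state = 0
        for in_sym, out_str in pairs:
            key = (state, in_sym)
            if key in self.arcs:
                existing_out, next_state = self.arcs[key]
                if existing_out != out_str:
                    # Staying deterministic: one output per (state, input).
                    raise ValueError(f"conflicting output for {key}")
            else:
                next_state = self.state_count
                self.state_count += 1
                self.arcs[key] = (out_str, next_state)
            state = next_state
        self.finals.add(state)

    def translate(self, text):
        """Return the output for text, or None if text is not accepted."""
        state = 0
        output = []
        for sym in text:
            arc = self.arcs.get((state, sym))
            if arc is None:
                return None
            out_str, state = arc
            output.append(out_str)
        return "".join(output) if state in self.finals else None


fst = ToyTransducer()
# Made-up sample: the last arc rewrites "p" as "b", the rest copy the input.
fst.add_path([("k", "k"), ("i", "i"), ("t", "t"), ("a", "a"), ("p", "b")])
# An identity path: every symbol maps to itself.
fst.add_path(list(zip("kalem", "kalem")))

print(fst.translate("kitap"))  # kitab
print(fst.translate("kalem"))  # kalem
print(fst.translate("kit"))    # None: stops in a non-final state
```

If every arc's output were dropped, what remains is exactly a trie over the input words. The extra half is the output labels and the explicit treatment of states, which is where ad hoc rewriting rules would be encoded.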

I will be writing about this experience in the coming days.

I've learned that there is a specialized Common Lisp implementation for mobile devices, mocl.

It should also be possible to write C-compatible shared libraries in CL and use them on iOS and Android. Although I have come to prefer writing libraries in plain C (instead of any other language), I may switch to CL if there is enough motivation.