Tuesday, November 1, 2011

Ubuntu 11.10 Chinese Handwriting Recognition

...still is not nearly as awesome as Windows. Ah well.

Here is a paper on how the Windows tablet recognition works (and it works eerily well). They had to collect a large sample set to train their machine learning algorithm (neural networks), though, so it would be difficult to replicate in open source.

Handwriting Recognition: Tablet PC Text Input (Sept. 2007)
10.1109/MC.2007.314

  • http://chico.inf-cr.uclm.es/aimolina/IPO1/ACTIVIDADES%20ENTREGABLES/ARTICULOS%20ESTILOS%20PARADIGMAS/NUEVOS%20ESTILOS%20INTERACCION/EI_HandwritingRecognition.pdf
Works, but not very well (it can't tolerate stroke order mistakes or natural handwriting)


HanziLookup (not really meant as an IME) -- matches by stroke order and stroke length (with some tolerance for out-of-order strokes)
http://kiang.org/jordan/software/hanzilookup/
http://xpwithubuntu.blogspot.com/2010/01/aiptek-tablet-and-chinese-handwriting.html
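To make the "stroke order and stroke length" idea concrete, here is a rough toy sketch of that kind of matching. It is not HanziLookup's actual code or algorithm. The template lengths, the reorder penalty, and the function names are all made up for illustration. It compares relative stroke lengths in the order drawn, and falls back to an order-independent comparison with a small penalty so out-of-order strokes are tolerated a bit.

```python
# Toy stroke matcher: NOT HanziLookup's real algorithm, just an illustration.
# All template data below is hypothetical.

def normalize(lengths):
    """Scale stroke lengths so they sum to 1 (size-independent)."""
    total = sum(lengths)
    return [x / total for x in lengths]

def distance(a, b):
    """Sum of absolute differences between paired stroke lengths."""
    return sum(abs(x - y) for x, y in zip(a, b))

def score(drawn, template, reorder_penalty=0.1):
    """Lower is better. Stroke counts must match exactly."""
    if len(drawn) != len(template):
        return float("inf")
    d, t = normalize(drawn), normalize(template)
    in_order = distance(d, t)
    # Ignore order by sorting both sides, but charge a fixed penalty.
    any_order = distance(sorted(d), sorted(t)) + reorder_penalty
    return min(in_order, any_order)

def lookup(drawn, templates, top=3):
    """Return the best-matching characters, best first."""
    ranked = sorted(templates, key=lambda ch: score(drawn, templates[ch]))
    return ranked[:top]

# Hypothetical per-stroke lengths, listed in canonical stroke order.
TEMPLATES = {
    "十": [10, 10],
    "人": [8, 12],
    "二": [6, 10],
    "三": [6, 5, 10],
}

if __name__ == "__main__":
    print(lookup([6, 10], TEMPLATES))   # strokes in the expected order
    print(lookup([10, 6], TEMPLATES))   # same strokes, drawn out of order
```

A real engine would also look at stroke direction and position, which is probably why the real thing does better than length alone would.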

CellWriter (does not really work as a Chinese IME, since you have to train it on each character you want to write, and it does not have a full listing of characters to train with):
https://sites.google.com/a/thejonus.net/stuff/how-to/chineseinputwithawacombamboo

IbusHandwrite:
http://code.google.com/p/ibus-handwrite/

An Ubuntu forum post:
http://ubuntuforums.org/showthread.php?t=1468468


Online alternatives:
Use nciku, which has the Java lookup running (they're probably using the HanziLookup engine).
Type pinyin into the Chinese Google clone: http://www.baidu.com/

Also useful: use Google Translate to grab the pinyin for a block of text. Go to http://translate.google.com/ and click on the "A" (Read Phonetically) button. (It also has a read-aloud option now.)

Quizlet also does free pronunciation recognition for flashcards (they bought a dataset).