Tuesday, November 1, 2011

Ubuntu 11.10 Chinese Handwriting Recognition

...is still not nearly as awesome as on Windows. Ah well.

Here's a paper on how the Windows tablet recognition works (and it works eerily well). They had to collect a large sample set to train their machine learning algorithm (neural networks), though, so it would be difficult to replicate with open source.

Handwriting Recognition: Tablet PC Text Input (Sept. 2007)

  • http://chico.inf-cr.uclm.es/aimolina/IPO1/ACTIVIDADES%20ENTREGABLES/ARTICULOS%20ESTILOS%20PARADIGMAS/NUEVOS%20ESTILOS%20INTERACCION/EI_HandwritingRecognition.pdf
Works, but not very well (can't tolerate stroke order mistakes or natural handwriting well)

HanziLookup (not really meant as an IME) -- matches by stroke order and stroke length (with some tolerance for out-of-order strokes)
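To get a feel for what "stroke order and length with some tolerance" could mean, here's a toy sketch in Python. This is not HanziLookup's actual code or algorithm. The stroke representation (start-to-end direction plus length), the `window` parameter, the greedy pairing, and the two tiny templates are all made up for illustration.

```python
# Toy stroke matcher. NOT HanziLookup's real algorithm, just an illustration.
from math import atan2, hypot, pi

def describe(stroke):
    """Reduce a stroke (list of (x, y) points) to (direction, length)."""
    (x0, y0), (x1, y1) = stroke[0], stroke[-1]
    return atan2(y1 - y0, x1 - x0), hypot(x1 - x0, y1 - y0)

def stroke_cost(a, b):
    """Dissimilarity of two described strokes: angle gap plus length gap."""
    angle_gap = abs(a[0] - b[0])
    angle_gap = min(angle_gap, 2 * pi - angle_gap) / pi      # scaled to 0..1
    length_gap = abs(a[1] - b[1]) / max(a[1], b[1], 1e-9)    # scaled to 0..1
    return angle_gap + length_gap

def match_cost(drawn, template, window=1):
    """Greedily pair each drawn stroke with an unused template stroke
    at most `window` positions away from where it was drawn."""
    if len(drawn) != len(template):
        return float("inf")
    used, total = set(), 0.0
    for i, s in enumerate(drawn):
        lo, hi = max(0, i - window), min(len(template), i + window + 1)
        candidates = [j for j in range(lo, hi) if j not in used]
        if not candidates:
            return float("inf")
        best = min(candidates, key=lambda j: stroke_cost(s, template[j]))
        used.add(best)
        total += stroke_cost(s, template[best])
    return total / len(drawn)

def lookup(drawn_strokes, templates, window=1):
    """Return template characters sorted from best to worst match."""
    drawn = [describe(s) for s in drawn_strokes]
    return sorted(
        templates,
        key=lambda ch: match_cost(drawn, [describe(s) for s in templates[ch]], window),
    )

# Hypothetical templates: each stroke is just its start and end point.
TEMPLATES = {
    "十": [[(0, 5), (10, 5)], [(5, 0), (5, 10)]],   # horizontal, then vertical
    "二": [[(2, 2), (8, 2)], [(0, 8), (10, 8)]],    # short horizontal, long horizontal
}

# 十 written in the wrong order: vertical first, then horizontal.
swapped = [[(5, 0), (5, 10)], [(0, 5), (10, 5)]]
print(lookup(swapped, TEMPLATES, window=1))  # tolerant of the swap
print(lookup(swapped, TEMPLATES, window=0))  # strict stroke order
```

With `window=0` the matcher insists on exact stroke order, so the swapped 十 ends up scoring closer to 二. Allowing a window of one position lets the strokes pair up correctly.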

CellWriter (does not really work as a Chinese IME, since you have to train it on each character you want to write, and it does not have a full listing of characters to train with):


an Ubuntu forum post:

Online alternatives:
Use nciku, which has the Java lookup running (they're probably using the HanziLookup engine).
Type pinyin into the Chinese Google clone: http://www.baidu.com/

Also useful: use Google Translate to grab the pinyin for a block of text. Go to http://translate.google.com/ and click on the "A" (Read Phonetically) button. (It also has a read-aloud option now.)

Quizlet also does free pronunciation recognition for flashcards (they bought a dataset).