[espeak-ng:master] reported: Voice Chinese (Mandarin): some characters are reported two times #github


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #685 Voice Chinese (Mandarin): some characters are reported two times
By jaacoppi:

This works: espeak-ng -X -v zh 你好

Replace: 你   ni3
Translate 'ni3'
  1     n        [n]

  1     i        [i]

 22     3        [214]

Replace: 好   hao3
Translate 'hao3'
  1     h        [X]

  1     a        [A]
 22     ao       [Au]

 22     3        [214]

n'i35_| X'Au214_|




This does not work. The second character is translated twice, first as part of the two-character word and then again on its own: espeak-ng -X -v zh 貸款

Replace: 貸 款   dai4kuan3
Translate 'dai4kuan3'
  1     d        [t]

  1     a        [A]
 22     ai       [ai]

 22     4        [51]

  1     k        [kh]

  1     u        [u]
 86     k) ua (DnK [wa]
 22     ua       [wA]
 65     ua (DnK  [ua]

  1     n        [n]

 22     3        [214]

Replace: 款   kuan3
Translate 'kuan3'
  1     k        [kh]

  1     u        [u]
 86     k) ua (DnK [wa]
 22     ua       [wA]
 65     ua (DnK  [ua]

  1     n        [n]

 22     3        [214]

tai51khw'a35n_| khw'a214n_|
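To check other inputs for the same symptom without reading each trace by hand, the "Replace:" lines can be parsed out of the -X output. The following is only a rough sketch: it assumes the trace layout is exactly as quoted above (characters, whitespace, then the pinyin), and the helper names parse_replacements and find_repeats are made up for illustration, not part of espeak-ng.

```python
def parse_replacements(trace):
    """Collect (characters, pinyin) from every 'Replace:' line of a trace.

    Assumes lines look like 'Replace: 貸 款   dai4kuan3', as in the
    output quoted above; other layouts are simply skipped.
    """
    result = []
    for line in trace.splitlines():
        line = line.strip()
        if not line.startswith("Replace:"):
            continue
        rest = line[len("Replace:"):].strip()
        parts = rest.rsplit(None, 1)  # last token is the pinyin
        if len(parts) != 2:
            continue
        chars = "".join(parts[0].split())  # "貸 款" -> "貸款"
        result.append((chars, parts[1]))
    return result


def find_repeats(replacements):
    """Return entries that repeat the tail of the preceding multi-character entry."""
    repeats = []
    for prev, cur in zip(replacements, replacements[1:]):
        prev_chars, cur_chars = prev[0], cur[0]
        if (len(prev_chars) > 1
                and prev_chars != cur_chars
                and prev_chars.endswith(cur_chars)):
            repeats.append(cur)
    return repeats


if __name__ == "__main__":
    sample = "Replace: 貸 款   dai4kuan3\nReplace: 款   kuan3\n"
    print(find_repeats(parse_replacements(sample)))
```

Feeding it the output of espeak-ng -X -v zh for a list of test words would flag each input where a character is replaced a second time right after a multi-character replacement.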

Most likely this is a Unicode/multibyte issue: some characters aren't separated properly. I wonder why it only affects certain characters. Does anyone have any ideas, or should I keep digging?
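Two quick checks may help narrow this down. The sketch below first prints the UTF-8 width of each character: all four characters in the two examples are in the same 3-byte range, so byte width alone does not separate the working case from the failing one. It then runs a toy longest-match lookup to show one way the duplicated output could arise: if the position advances by one character after a two-character match, the second character gets looked up again. This is purely a hypothesis about the failure mode. TOY_DICT and translate are invented for illustration and are not espeak-ng's dictionary data or code.

```python
def utf8_widths(text):
    """Return (character, number of UTF-8 bytes) for each character."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]


# Hypothetical toy dictionary, NOT espeak-ng's real data.
# Only 貸款 has a multi-character entry, mirroring the traces above.
TOY_DICT = {
    "你": "ni3",
    "好": "hao3",
    "貸": "dai4",
    "款": "kuan3",
    "貸款": "dai4kuan3",
}


def translate(text, buggy=False):
    """Longest-match lookup over text.

    With buggy=True the position advances by one character even after a
    two-character match, which is the suspected (unconfirmed) mistake.
    """
    out = []
    i = 0
    while i < len(text):
        for n in (2, 1):  # try the longer match first
            chunk = text[i:i + n]
            if len(chunk) == n and chunk in TOY_DICT:
                out.append(TOY_DICT[chunk])
                i += 1 if buggy else n
                break
        else:
            i += 1  # unknown character: skip it
    return out


if __name__ == "__main__":
    print(utf8_widths("你好貸款"))
    print(translate("你好", buggy=True))
    print(translate("貸款", buggy=True))
    print(translate("貸款", buggy=False))
```

In this toy model 你好 comes out the same either way because it has no multi-character entry, while 貸款 produces an extra kuan3 only in the buggy mode. That would fit the observation that only certain characters are affected, but whether the real code behaves like this still needs to be confirmed in the source.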