[espeak-ng:master] new issue: Buffer overflow when compiling dictionaries #github

espeak-ng@groups.io Integration <espeak-ng@...>

[espeak-ng:master] New Issue Created by feerrenrut:
#287 Buffer overflow when compiling dictionaries


While updating the NVDA espeak-ng submodule to commit fb97d1bd7564c2ff7c305cf7fcbdd29132234846 we ran into problems compiling the dictionaries. The build system was occasionally halting with a Python crash. After some investigation I found that a missing newline at the end of 'ar_listx' was causing a buffer overrun, which I have worked around with https://github.com/nvaccess/espeak-ng/commit/0994206f710a4defc1eecfb78ab70ff57c58fcda.

Regarding the missing newline: assuming there is no technical reason that a trailing newline must be present, I suggest that dictionary compilation be modified to accept files without one. Otherwise, to save debugging time when a newline is accidentally omitted, I suggest that a missing newline be detected and reported during dictionary compilation.
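Both suggestions could be sketched roughly as follows. This is only an illustration under my own assumptions, not the existing espeak-ng code: 'missing_final_newline' and 'next_line' are hypothetical helper names, and the sketch assumes the whole dictionary source file has already been read into a buffer of known size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of espeak-ng): returns 1 if a non-empty
   buffer does not end with '\n', so the caller can report it; else 0. */
static int missing_final_newline(const char *data, size_t size)
{
    return size > 0 && data[size - 1] != '\n';
}

/* Hypothetical helper: copies the next line of data[*pos .. size) into out
   without its newline. The scan stops at '\n' OR at the end of the buffer,
   so a final line with no newline is still accepted.
   Returns 1 if a line was produced, 0 at end of input, and -1 if the line
   does not fit in out (in which case *pos is left unchanged). */
static int next_line(const char *data, size_t size, size_t *pos,
                     char *out, size_t out_cap)
{
    if (*pos >= size)
        return 0;

    size_t start = *pos;
    size_t end = start;
    while (end < size && data[end] != '\n')
        end++;

    size_t len = end - start;
    if (len + 1 > out_cap)
        return -1; /* refuse to overrun the output buffer */

    memcpy(out, data + start, len);
    out[len] = '\0';

    /* Skip the newline if there was one; otherwise we are at end of input. */
    *pos = (end < size) ? end + 1 : end;
    return 1;
}
```

The point of the sketch is that the end-of-line search is bounded by the buffer size rather than relying on a terminating newline, and the check in 'missing_final_newline' gives the compiler a place to print a warning naming the offending file.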

There still seems to be a crash, or sometimes the process runs indefinitely. Further investigation showed that in LoadDictionary, length is sometimes negative. While trying to match up the signedness of 'length' (particularly in compile_dictlist_end), I found that in some cases it is already negative when written. On my system I can reliably reproduce this when running through 'compile_dictlist_end()' for the 'bn' dictionary: at hash number 497 we eventually get a length value of -117 (signed) / 4294967179 (unsigned).
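To make the two numbers concrete: -117 and 4294967179 are the same 32-bit pattern read as signed and as unsigned. The sketch below shows that conversion and the kind of range check an assert could perform. It assumes a 32-bit length; 'as_unsigned' and 'length_is_valid' are hypothetical names, not espeak-ng functions, and the upper bound is left to the caller since I do not know the real limit.

```c
#include <stdint.h>

/* Hypothetical helper: the value a signed 32-bit length takes when it is
   converted to unsigned, as happens implicitly when signed and unsigned
   operands are mixed. The conversion is well defined (modulo 2^32). */
static uint32_t as_unsigned(int32_t length)
{
    return (uint32_t)length;
}

/* Hypothetical sanity check: a length is plausible only if it is positive
   and no larger than a caller-supplied maximum (an assumed bound). */
static int length_is_valid(int32_t length, int32_t max_len)
{
    return length > 0 && length <= max_len;
}
```

With these, 'as_unsigned(-117)' is 4294967296 - 117 = 4294967179, and 'length_is_valid(-117, max_len)' fails for any bound, which is the condition worth asserting at the point where length is written.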

I don't really understand how length comes to be -117. I added some asserts (and then built espeak with the MSVC flags /SDL /RTC1 /MDd /Od /D_DEBUG) to make debugging this a little easier; see the full diff of espeak: https://github.com/espeak-ng/espeak-ng/compare/master...nvaccessfixEspeakCrashDuringDictCompilation

The NVDA branch used to build fixEspeakCrashDuringDictCompilation is updateEspeak-debugBranch, specifically commit: https://github.com/nvaccess/nvda/commit/6bf896248798ee041c034372cc0b18efa5c611dc

Key points:

  • Missing newline at the end of 'ar_listx' causes a buffer overrun
  • Signed / unsigned mismatch of length in compile_dictlist_end / LoadDictionary
  • The length value appears to be either too large or negative, indicating a problem

To reproduce locally:

  • On a Windows machine (with the required NVDA dependencies)
  • Clone the https://github.com/nvaccess/nvda repository
  • Check out 6bf896248798ee041c034372cc0b18efa5c611dc
  • Build NVDA with: python scons.py source
