Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng/espeak-ng] Pull request opened by jbowler:

#693 Fixed UTF8 BOM and consequent damage to !v files

The six modified files all contained spurious characters, apparently introduced because the files began with the UTF-8 byte order mark (BOM), U+FEFF. The BOM is conventionally placed at the start of a text file to indicate UTF-8 encoding and is invisible under normal circumstances (e.g. when the file is opened as a text file).

None of these files is recognized by espeak-ng on Linux systems, because espeak-ng sees the 'language variant' line as starting with an extra character.

'gustave' is an uncorrupted file: it correctly starts with the BOM encoded in UTF-8 (three bytes). However, even though the file is correct, espeak-ng does not read it (this may be a separate bug!).

'marcelo' somehow got the BOM character replaced by a literal '?'. Note that 'git diff' on these changes will, indeed, show the removed character in 'gustave' as a literal '?'. Note also that the character in question, the BOM, is actually the Unicode 'zero width no-break space', so it is effectively invisible.
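
As a rough illustration of the BOM check described above, the following Python sketch tests whether a byte string starts with the three-byte UTF-8 BOM (EF BB BF) and removes it. The helper names `has_utf8_bom` and `strip_utf8_bom` and the sample content are hypothetical; they are not part of espeak-ng or of this pull request.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the three-byte UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM; unchanged if none is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


# Hypothetical sample: U+FEFF followed by some text, encoded as UTF-8.
sample = "\ufeffsome text\n".encode("utf-8")
print(has_utf8_bom(sample))    # True
print(strip_utf8_bom(sample))  # b'some text\n'
```

To inspect a real file, read it in binary mode (for example `open(path, "rb").read()`) so that no decoding step hides or alters the BOM.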

The remaining files seem to have suffered major corruption, possibly as a result of dos2unix-style conversions. The line endings, normally LF on Unix or CRLF on Windows, had been altered, and the BOM had been replaced by a different character.
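
One way to diagnose this kind of line-ending damage is to count each style of line ending in the raw bytes. This is a minimal Python sketch under that assumption; `count_line_endings` is a hypothetical helper and was not used to produce the changes in this pull request.

```python
import re


def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, lone CRs and lone LFs in raw file bytes."""
    return {
        "crlf": len(re.findall(rb"\r\n", data)),
        "cr": len(re.findall(rb"\r(?!\n)", data)),   # CR not followed by LF
        "lf": len(re.findall(rb"(?<!\r)\n", data)),  # LF not preceded by CR
    }


# Mixed endings: one CRLF, one lone LF, one lone CR.
print(count_line_endings(b"a\r\nb\nc\rd"))  # {'crlf': 1, 'cr': 1, 'lf': 1}
```

A file with consistent line endings should report a non-zero count in only one of the three categories.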

Signed-off-by: John Bowler jbowler@...


[espeak-ng:master] New Comment on Pull Request #693 Fixed UTF8 BOM and consequent damage to !v files
By jbowler:

NOTE: 'gustave' really does have a change; the github.com HTML interface does not display anything for the zero width no-break space.