Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

4 New Commits:

[espeak-ng:master] By BenTalagan <ben_talagan@...>:
94677f4af8ad: Rule alignment fixes for non compliant platforms / Fix for emscripten demo

Modified: emscripten/Makefile
Modified: emscripten/post.js
Modified: src/libespeak-ng/dictionary.c
Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/readclause.h
Modified: src/libespeak-ng/translate.c


[espeak-ng:master] By BenTalagan <ben_talagan@...>:
9fd480afbf4f: Fixing typos and naming

Modified: src/libespeak-ng/dictionary.c
Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/readclause.h
Modified: src/libespeak-ng/translate.c


[espeak-ng:master] By BenTalagan <ben_talagan@...>:
02447abde8b3: Fixing is_str_totally_null

Modified: src/libespeak-ng/readclause.c


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
050d5e498261: Merge remote-tracking branch 'BenTalagan/master'

Modified: emscripten/Makefile
Modified: emscripten/post.js
Modified: src/libespeak-ng/dictionary.c
Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/readclause.h
Modified: src/libespeak-ng/translate.c


[espeak-ng/espeak-ng] Pull request closed by rhdunn:

#676 Rule alignment fixes for non compliant platforms / Fix for emscripten demo

This is a fix for #584, but the scope of the PR may be larger: without this fix, the handling of compiled rules is not guaranteed to behave the same across platforms, since a non-aligned char* may be cast to int*, which has to be avoided.

Some minor options also have to be added to the emscripten compilation workflow to make it work again with newer versions.


[espeak-ng:master] New Comment on Pull Request #676 Rule alignment fixes for non compliant platforms / Fix for emscripten demo
By rhdunn:

That's what tests are for :).

Thanks for the fix.

[espeak-ng:master] reported: Rule alignment fixes for non compliant platforms / Fix for emscripten demo #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Pull Request #676 Rule alignment fixes for non compliant platforms / Fix for emscripten demo
By BenTalagan:

Phew! I really need some rest; you saved me from pushing some really silly code. Looks better now.

[espeak-ng:master] reported: Rule alignment fixes for non compliant platforms / Fix for emscripten demo #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Pull Request #676 Rule alignment fixes for non compliant platforms / Fix for emscripten demo
By rhdunn:

They are passing on the master branch. The failing test is https://travis-ci.org/espeak-ng/espeak-ng/jobs/613903437#L2232.

Pull Request Updated #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng/espeak-ng] Pull request updated by BenTalagan:

#676 Rule alignment fixes for non compliant platforms / Fix for emscripten demo

This is a fix for #584, but the scope of the PR may be larger: without this fix, the handling of compiled rules is not guaranteed to behave the same across platforms, since a non-aligned char* may be cast to int*, which has to be avoided.

Some minor options also have to be added to the emscripten compilation workflow to make it work again with newer versions.

[espeak-ng:master] reported: Rule alignment fixes for non compliant platforms / Fix for emscripten demo #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Pull Request #676 Rule alignment fixes for non compliant platforms / Fix for emscripten demo
By BenTalagan:

Hmm, the checks failed, but I've verified locally and it looks like they were already broken before these changes. Is that normal?

Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng/espeak-ng] Pull request opened by BenTalagan:

#676 Rule alignment fixes for non compliant platforms / Fix for emscripten demo

This is a fix for #584, but the scope of the PR may be larger: without this fix, the handling of compiled rules is not guaranteed to behave the same across platforms, since a non-aligned char* may be cast to int*, which has to be avoided.

Some minor options also have to be added to the emscripten compilation workflow to make it work again with newer versions.


[espeak-ng:master] New Comment on Issue #584 emscripten demo broken, probably highlights underlying problem linked to dictionary compilation
By BenTalagan:

@rhdunn: Thanks for your answer! I have prepared a PR (#676), and limited myself to adding a function that tests whether a run of sequential bytes is all zero. It's very close to what was originally intended and is non-intrusive (the original code only tests four bytes, but after that they are still read one by one, not 4 by 4).
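
A minimal sketch of what such a helper could look like. The name is_str_totally_null is taken from the later commit message; the signature and body below are assumptions for illustration, not the code that was merged.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical signature: returns true when the first `size` bytes at `p`
 * are all zero. Bytes are read one at a time, so `p` needs no particular
 * alignment. */
static bool is_str_totally_null(const char *p, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0)
            return false;
    }
    return true;
}
```

With a helper of this shape, the scan for the terminator could be written as `while (!is_str_totally_null(p, 4)) p++;` without any cast to unsigned int *.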

[espeak-ng:master] reported: emscripten demo broken, probably highlights underlying problem linked to dictionary compilation #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #584 emscripten demo broken, probably highlights underlying problem linked to dictionary compilation
By rhdunn:

Thanks for the analysis. It looks like a version of Read4Bytes (https://github.com/espeak-ng/espeak-ng/blob/master/src/libespeak-ng/readclause.c#L280) for a const char * is needed to fix this -- renaming Read4Bytes to fread_uint32 and creating a read_uint32 function. The code would then need to be audited to avoid direct casting to unsigned int *.
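
A rough sketch of the pair of functions suggested here. The names fread_uint32 and read_uint32 come from the comment; the signatures, the least-significant-byte-first order, and the lack of EOF handling are assumptions made for this illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: read 4 bytes from a stream, least significant byte first.
 * EOF is not checked in this sketch. */
static uint32_t fread_uint32(FILE *f)
{
    uint32_t acc = 0;
    for (int ix = 0; ix < 4; ix++) {
        int c = fgetc(f);
        acc |= (uint32_t)(c & 0xff) << (ix * 8);
    }
    return acc;
}

/* Hypothetical: the same decoding from a memory buffer. Only single bytes
 * are dereferenced, so `p` may point to any address, aligned or not. */
static uint32_t read_uint32(const char *p)
{
    const unsigned char *b = (const unsigned char *)p;
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}
```

Assembling the value from individual bytes also makes the result independent of the host byte order, which a direct cast would not be.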

[espeak-ng:master] reported: emscripten demo broken, probably highlights underlying problem linked to dictionary compilation #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #584 emscripten demo broken, probably highlights underlying problem linked to dictionary compilation
By BenTalagan:

Ok, after fixing the condition in FindReplacementChars, it seems I can get back a working generation/transcription with emscripten. I'd still need someone with more expertise to tell me whether I'm missing other, similar alignment problems.

[espeak-ng:master] reported: emscripten demo broken, probably highlights underlying problem linked to dictionary compilation #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #584 emscripten demo broken, probably highlights underlying problem linked to dictionary compilation
By BenTalagan:

After implementing a temp fix:

/* advance until four consecutive zero bytes are found */
while (p[0] != 0 || p[1] != 0 || p[2] != 0 || p[3] != 0) {
	p++;
}

the parsing of the rules looks OK now, but the translation is still messed up. I found at least one suspicious place (within commit 55c6403):

https://github.com/espeak-ng/espeak-ng/blob/48719ad642f8a27d352983ab5964463a8c1e033e/src/libespeak-ng/translate.c#L1793-L1799

Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #584 emscripten demo broken, probably highlights underlying problem linked to dictionary compilation
By BenTalagan:

After taking some time to investigate, I think I have found the problem. It comes from the following lines:

https://github.com/espeak-ng/espeak-ng/blob/48719ad642f8a27d352983ab5964463a8c1e033e/src/libespeak-ng/dictionary.c#L153-L154

They behave differently when compiled with llvm and with emscripten. Under llvm, as with gcc, this has what I would call the 'expected' behaviour: the cast to unsigned int from any position in the char* buffer takes into account the fact that we are not aligned to a multiple of 4 bytes. Under emscripten it doesn't: shifting by n+0, n+1, n+2 or n+3 bytes leads to the same result when casting to an int. One of the rules of the 'en' dictionary falls under this case, so the condition of having 4 successive zero bytes is not met and the rule parser explodes.

@rhdunn, I'd like your opinion on this issue: should we implement a simple fix (like testing the four bytes individually instead of casting to unsigned int), and are there any other parts of the code that may be affected?
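
To make the reported symptom concrete, here is a small simulation. It does not reproduce emscripten's actual code generation; load_u32_rounded_down is a hypothetical stand-in that ignores the low two bits of the offset, which matches the behaviour described above. load_u32_unaligned (also a hypothetical name) shows a portable way to read four bytes from any offset.

```c
#include <stdint.h>
#include <string.h>

/* Portable: copy the 4 bytes at `p` into a properly aligned local. */
static uint32_t load_u32_unaligned(const char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Simulation only: behaves like a load that silently rounds the offset
 * down to a multiple of 4, so n+0 .. n+3 all yield the same word.
 * `base` is assumed to be 4-byte aligned. */
static uint32_t load_u32_rounded_down(const char *base, size_t offset)
{
    uint32_t v;
    memcpy(&v, base + (offset & ~(size_t)3), sizeof v);
    return v;
}
```

With a buffer holding a non-zero byte at offset 0 followed by four zero bytes at offsets 1 to 4, the portable load at offset 1 returns 0, while the rounded-down load returns the word at offset 0 and never sees the four zero bytes.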


[espeak-ng:master] New Comment on Issue #584 emscripten demo broken, probably highlights underlying problem linked to dictionary compilation
By BenTalagan:

Addendum: after reading a bit on the net, it really looks like this should be rewritten. Some refs:

https://stackoverflow.com/questions/26995151/how-to-cast-char-array-to-int-at-non-aligned-position

https://stackoverflow.com/questions/13881487/should-i-worry-about-the-alignment-during-pointer-casting

Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

2 New Commits:

[espeak-ng:master] By BenTalagan <ben_talagan@...>:
3e0150a34fd4: Fixing ungetc bad behavior under macOS Catalina by avoiding to ungetc a different char from the last getc

Modified: src/libespeak-ng/compiledata.c


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
48719ad642f8: Merge remote-tracking branch 'BenTalagan/master'

Modified: src/libespeak-ng/compiledata.c


[espeak-ng/espeak-ng] Pull request closed by rhdunn:

#675 Fixing ungetc bad behavior under macOS Catalina

This is a fix for #674. For the record, the problem was the following: it seems that the ungetc implementation under Catalina interferes with ftell/fseek when ungetc pushes back a character that differs from the one preceding the current file position.

The fix consists in avoiding such a situation.


[espeak-ng:master] New Comment on Pull Request #675 Fixing ungetc bad behavior under macOS Catalina
By rhdunn:

Merged. Thanks.


[espeak-ng:master] Label added to issue #674 Build fails on MacOS Catalina by BenTalagan.


[espeak-ng:master] Issue #674 Build fails on MacOS Catalina closed by BenTalagan.

[espeak-ng:master] reported: Build fails on MacOS Catalina #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

PR ready (#675) :-) Thanks a lot for having taken such time to help!

Pull Request Opened #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng/espeak-ng] Pull request opened by BenTalagan:

#675 Fixing ungetc bad behavior under macOS Catalina

This is a fix for #674. For the record, the problem was the following: it seems that the ungetc implementation under Catalina interferes with ftell/fseek when ungetc pushes back a character that differs from the one preceding the current file position.

The fix consists in avoiding such a situation.
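
As an illustration of one way to avoid the situation entirely, the sketch below keeps a one-character pushback slot outside the FILE, so ungetc is never called. This is not the merged fix (which, per the discussion, only avoids ungetting a character different from the last one read); char_reader and its functions are hypothetical names.

```c
#include <stdio.h>

/* Hypothetical reader with its own one-character pushback slot. */
typedef struct {
    FILE *f;
    int pending;   /* pushed-back character, or EOF when the slot is empty */
} char_reader;

static void reader_init(char_reader *r, FILE *f)
{
    r->f = f;
    r->pending = EOF;
}

/* Return the pushed-back character if there is one, else read the stream. */
static int reader_get(char_reader *r)
{
    if (r->pending != EOF) {
        int c = r->pending;
        r->pending = EOF;
        return c;
    }
    return fgetc(r->f);
}

/* Store any character for the next reader_get(); the stream is untouched. */
static void reader_unget(char_reader *r, int c)
{
    r->pending = c;
}
```

Because the stream itself is never modified, ftell and fseek only ever see real reads. The trade-off is that ftell reports the position after the character that was pushed back, so callers that save positions have to account for a non-empty slot.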

Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

That works on my machine, so feel free to create a patch.

Are there any other problems?


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

I don't think so. I remember having a problem with emscripten a few months ago (#584): the compiled js was unable to correctly parse the bundled data. I don't know whether it is related. I will give it another try later, but I will prepare a PR for now.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Great. Thanks.

[espeak-ng:master] reported: Build fails on MacOS Catalina #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

Ok! I think I got it :

	while (isspace(c))
		c = get_char();

	item_terminator = ' ';
	if ((c == ')') || (c == '(') || (c == ','))
		item_terminator = c;

	if ((c == ')') || (c == ','))
		c = ' ';
	else if (!feof(f_in))
		unget_char(c);

This allows the full phoneme files to compile. This is my result:

Refs 4021,  Reused 3068
Compiled phonemes: 0 errors.
touch dictsource/en_extra
  DICT      espeak-ng-data/en_dict
Can't read dictionary file: '/Users/ben/poub/espeak-ng/espeak-ng-data/en_dict'
Using phonemetable: 'en'
Compiling: 'en_list'
	5458 entries
Compiling: 'en_emoji'
	1690 entries
Compiling: 'en_extra'
	0 entries
Compiling: 'en_rules'
	6743 rules, 103 groups (0)

Is it OK?

[espeak-ng:master] reported: Build fails on MacOS Catalina #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

I think it's this part :

https://github.com/espeak-ng/espeak-ng/blob/f4c2ad3b7f3c29b40cd6029c5f82f2ba1156fff7/src/libespeak-ng/compiledata.c#L790-L801

Line 798 modifies the character. I tried inverting the last line, but pushing back the ')' now results in an infinite loop. I guess pushing back a space was a trick to skip the parsing of the ')'.

Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

One additional note: I could not find the source for Catalina. But in previous versions of macOS, ungetc takes a different code path depending on whether we push back the same character or not.

https://opensource.apple.com/source/Libc/Libc-1272.250.1/stdio/FreeBSD/ungetc.c.auto.html

In the first case, it's a simple rewind of the file pointer. In the second case, a buffer is used. That could explain why I see different behaviors depending on whether we push back the same character that was read, and why it can interfere with ftell.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Interesting. Thanks.

I wonder what is causing espeak to unget a character different to the previously read character. Maybe addressing that will fix the issue you are seeing on the Mac (and possibly on other BSD-based platforms).

Github push to espeak-ng:espeak-ng #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

1 New Commit:

[espeak-ng:master] By Valdis Vitolins <valdis.vitolins@...>:
f4c2ad3b7f3c: docs: add missing languages to the list

Modified: docs/languages.md

Github push to espeak-ng:espeak-ng #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

1 New Commit:

[espeak-ng:master] By Valdis Vitolins <valdis.vitolins@...>:
656bb42c39e9: uz: add initial support for Uzbek language

Added: dictsource/uz_list
Added: dictsource/uz_rules
Added: espeak-ng-data/lang/trk/uz
Added: phsource/ph_uzbek
Modified: CHANGELOG.md
Modified: Makefile.am
Modified: phsource/phonemes

Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Will do, thanks.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

Note: one other potential problem I see is that f_in can be switched to another file in the stack (so the buffered byte for one file may interfere with another file). I don't know whether this should be taken into account.
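
A hedged sketch of how that interference could be ruled out if pushback were tracked outside the FILE: each entry in the file stack carries its own slot, so switching files never exposes another file's buffered byte. All names here (input_frame, input_stack, MAX_INCLUDE_DEPTH and the stack_* functions) are hypothetical and do not reflect the actual compiledata.c structures.

```c
#include <stdio.h>

#define MAX_INCLUDE_DEPTH 8   /* hypothetical limit */

typedef struct {
    FILE *f;
    int pending;   /* this file's pushed-back character, or EOF if none */
} input_frame;

typedef struct {
    input_frame frames[MAX_INCLUDE_DEPTH];
    int depth;     /* number of open frames */
} input_stack;

/* Make `f` the current input; returns -1 when the stack is full. */
static int stack_push(input_stack *s, FILE *f)
{
    if (s->depth >= MAX_INCLUDE_DEPTH)
        return -1;
    s->frames[s->depth].f = f;
    s->frames[s->depth].pending = EOF;
    s->depth++;
    return 0;
}

/* Return to the previous file; its pending character is preserved. */
static void stack_pop(input_stack *s)
{
    if (s->depth > 0)
        s->depth--;
}

/* Read from the current file, honouring only that file's pushback slot. */
static int stack_get(input_stack *s)
{
    if (s->depth == 0)
        return EOF;
    input_frame *top = &s->frames[s->depth - 1];
    if (top->pending != EOF) {
        int c = top->pending;
        top->pending = EOF;
        return c;
    }
    return fgetc(top->f);
}

/* Push a character back onto the current file only. */
static void stack_unget(input_stack *s, int c)
{
    if (s->depth > 0)
        s->frames[s->depth - 1].pending = c;
}
```

A character pushed back on the outer file stays invisible while an inner file is being read, and is returned again once the inner file is popped.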