Updates to GitHub #github


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

I suspect the ungetc function.

This is how I instrumented it:

static unsigned int get_char()
{
	unsigned int c;
	c = fgetc(f_in);
	if (c == '\n')
		linenum++;

	printf("Got '%c'\n", c);

	return c;
}

static void unget_char(unsigned int c)
{
	ungetc(c, f_in);
	if (c == '\n')
		linenum--;

	printf("Ungot '%c'\n", c);
}

When parsing the ? phoneme, this is what I get:

Compile phoneme: ?
Got '/'
Got '/'
Got ' '
Got 'g'
Got 'l'
Got 'o'
Got 't'
Got 't'
Got 'a'
Got 'l'
Got ' '
Got 's'
Got 't'
Got 'p'
Got '
'
Got ' '
Got ' '
Got 'v'
Got 'l'
Got 's'
Got ' '
Got 'g'
7: vls -> 'g'
Ungot 'g'
Got 'g'
Got 'l'
Got 't'
Got ' '
Got 's'
7: glt -> 's'
Ungot 's'
Got 's'
Got 't'
Got 'p'
Got '
'
Got ' '
Got ' '
Got 'l'
7: stp -> 'l'
Ungot 'l'
Got 'l'
Got 'e'
Got 'n'
Got 'g'
Got 't'
Got 'h'
Got 'm'
Got 'o'
Got 'd'
Got ' '
Got '3'
7: lengthmod -> '3'
Ungot '3'
Got '3'
Got ' '
Got ' '
Got ' '
Got '/'
3: 3 -> '/'
Ungot '/'
Got '/'
Got '/'
Got ' '
Got '?'
Got '?'
Got '
'
Got ' '
Got ' '
Got 'n'
Got 'o'
Got 'l'
Got 'i'
Got 'n'
Got 'k'
Got '
'
Got ' '
Got ' '
Got 'V'
7: nolink -> 'V'
Ungot 'V'
Got 'V'
Got 'o'
Got 'w'
Got 'e'
Got 'l'
Got 'i'
Got 'n'
Got ' '
Got ' '
Got 'g'
7: Vowelin -> 'g'
Ungot 'g'
Got 'g'
Got 'l'
Got 's'
Got 't'
Got 'o'
Got 'p'
Got '
'
Got ' '
Got ' '
Got 'V'
7: glstop -> 'V'
Ungot 'V'
Got 'V'
Got 'o'
Got 'w'
Got 'e'
Got 'l'
Got 'o'
Got 'u'
Got 't'
Got ' '
Got 'g'
7: Vowelout -> 'g'
Ungot 'g'
Got 'V'
Got 'o'
Got 'w'
Got 'e'
Got 'l'
Got 'o'
Got 'u'
Got 't'
Got ' '
Got 'g'
7: Vowelout -> 'g'
Ungot 'g'
Got 'g'
Got 'l'
Got 's'
Got 't'
Got 'o'
Got 'p'
Got '
'
Got ' '
Got ' '
Got 'W'
7: glstop -> 'W'
Ungot 'W'
Got 'W'
Got 'A'
Got 'V'
Got '('
7: WAV -> '('
Ungot '('
Got 'W'
Got 'A'
Got 'V'
Got '('
7: WAV -> '('
Ungot '('
Got '('
Got 'u'
Got 's'
Got 't'
Got 'o'
Got 'p'
Got '/'
Got 'n'
Got 'u'
Got 'l'
Got 'l'
Got ')'
2: ustop/null -> ')'
Ungot ' '
Got ' '
Got 'b'
Got 'r'
Got 'k'
Got '
'
Got ' '
Got ' '
Got 'F'
7: brk -> 'F'
Ungot 'F'
Got 'F'
Got 'M'
Got 'T'
Got '('
7: FMT -> '('
Ungot '('
Got '('
Got 'r'
Got '3'
Got '/'
Got 'r'
Got '_'
Got 't'
Got 'r'
Got 'i'
Got 'l'
Got 'l'
Got ')'
2: r3/r_trill -> ')'
Ungot ' '
Got ' '
Got ' '
Got ' '
Got ' '
Got 'E'
Got 'n'
Got 'd'
Got 'S'
Got 'w'
Got 'i'
Got 't'
Got 'c'
Got 'h'
Got '
'
Got '
'
Got ' '
Got ' '
Got ' '
Got ' '
Got 'V'
7: EndSwitch -> 'V'

We can clearly see that after Vowelout and WAV we don't get back what we ungot. Worse, after ustop/null there is a jump that sends us far beyond the current position.

What's your opinion on this?


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

Thanks! So the following sequence is what looks really suspicious:

7: WAV -> '('
Ungot '('
Got '('
Got 'u'
Got 's'
Got 't'
Got 'o'
Got 'p'
Got '/'
Got 'n'
Got 'u'
Got 'l'
Got 'l'
Got ')'
2: ustop/null -> ')'
Ungot ' '
Got ' '
Got 'b'
Got 'r'
Got 'k'
Got '
'
Got ' '
Got ' '
Got 'F'
7: brk -> 'F'

After putting a space back into the stream, we jump far ahead in the file (into the R phoneme).


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Yeah, that looks like it is the issue. I'm not sure what is going on with that yet.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Check UngetItem, which is calling fseek.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

Yes, that's exactly what I was doing :-) I get this sequence:

Compile phoneme: ?
Got '/'
Got '/'
Got ' '
Got 'g'
Got 'l'
Got 'o'
Got 't'
Got 't'
Got 'a'
Got 'l'
Got ' '
Got 's'
Got 't'
Got 'p'
Got '
'
Got ' '
Got ' '
Got 'v'
Got 'l'
Got 's'
Got ' '
Got 'g'
7: vls -> 'g'
Ungot 'g'
Got 'g'
Got 'l'
Got 't'
Got ' '
Got 's'
7: glt -> 's'
Ungot 's'
Got 's'
Got 't'
Got 'p'
Got '
'
Got ' '
Got ' '
Got 'l'
7: stp -> 'l'
Ungot 'l'
Got 'l'
Got 'e'
Got 'n'
Got 'g'
Got 't'
Got 'h'
Got 'm'
Got 'o'
Got 'd'
Got ' '
Got '3'
7: lengthmod -> '3'
Ungot '3'
Got '3'
Got ' '
Got ' '
Got ' '
Got '/'
3: 3 -> '/'
Ungot '/'
Got '/'
Got '/'
Got ' '
Got '?'
Got '?'
Got '
'
Got ' '
Got ' '
Got 'n'
Got 'o'
Got 'l'
Got 'i'
Got 'n'
Got 'k'
Got '
'
Got ' '
Got ' '
Got 'V'
7: nolink -> 'V'
Ungot 'V'
Got 'V'
Got 'o'
Got 'w'
Got 'e'
Got 'l'
Got 'i'
Got 'n'
Got ' '
Got ' '
Got 'g'
7: Vowelin -> 'g'
Ungot 'g'
Got 'g'
Got 'l'
Got 's'
Got 't'
Got 'o'
Got 'p'
Got '
'
Got ' '
Got ' '
Got 'V'
7: glstop -> 'V'
Ungot 'V'
Got 'V'
Got 'o'
Got 'w'
Got 'e'
Got 'l'
Got 'o'
Got 'u'
Got 't'
Got ' '
Got 'g'
7: Vowelout -> 'g'
Ungot 'g'
Ungot item 2252
Got 'V'
Got 'o'
Got 'w'
Got 'e'
Got 'l'
Got 'o'
Got 'u'
Got 't'
Got ' '
Got 'g'
7: Vowelout -> 'g'
Ungot 'g'
Got 'g'
Got 'l'
Got 's'
Got 't'
Got 'o'
Got 'p'
Got '
'
Got ' '
Got ' '
Got 'W'
7: glstop -> 'W'
Ungot 'W'
Got 'W'
Got 'A'
Got 'V'
Got '('
7: WAV -> '('
Ungot '('
Ungot item 2270
Got 'W'
Got 'A'
Got 'V'
Got '('
7: WAV -> '('
Ungot '('
Got '('
Got 'u'
Got 's'
Got 't'
Got 'o'
Got 'p'
Got '/'
Got 'n'
Got 'u'
Got 'l'
Got 'l'
Got ')'
2: ustop/null -> ')'
Ungot ' '
Got ' '
Got 'b'
Got 'r'
Got 'k'
Got '
'
Got ' '
Got ' '
Got 'F'
7: brk -> 'F'
Ungot 'F'
Got 'F'
Got 'M'
Got 'T'
Got '('

UngetItem is instrumented with:

printf("Ungot item %d\n", f_in_displ);

Looks like there were some calls to UngetItem, but not immediately before the jump.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

No luck :-) Unfortunately it has no effect. I have to take a break, but I'll get back to it later and try GDB for better insight.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Have a good day.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

Thanks, you too, and thanks for your help!


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

On further investigation, it looks like something really nasty is happening with the stream. If I instrument like this:

static unsigned int get_char()
{
	unsigned int c;

	long pb = ftell(f_in);	/* position before the read */
	c = fgetc(f_in);
	long pa = ftell(f_in);	/* position after the read */
	if (c == '\n')
		linenum++;

	printf("Got '%c' %ld -> %ld\n", c, pb, pa);

	return c;
}

static void unget_char(unsigned int c)
{
	long pb = ftell(f_in);	/* position before the pushback */
	ungetc(c, f_in);
	long pa = ftell(f_in);	/* position after the pushback */
	if (c == '\n')
		linenum--;

	printf("Ungot '%c' %ld -> %ld\n", c, pb, pa);
}

The presence of the ftell calls changes the behaviour of the program! (They should be non-intrusive, I guess.) The output:

7: glstop -> 'W'
Ungot 'W' 0 -> 0
Got 'W' 2270 -> 2271
Got 'A' 2271 -> 2272
Got 'V' 2272 -> 2273
Got '(' 2273 -> 2274
7: WAV -> '('
Ungot '(' 0 -> 0
Ungot item 2270
Got 'W' 2270 -> 2271
Got 'A' 2271 -> 2272
Got 'V' 2272 -> 2273
Got '(' 2273 -> 2274
7: WAV -> '('
Ungot '(' 0 -> 0
Got '(' 2273 -> 2274
Got 'u' 2274 -> 2275
Got 's' 2275 -> 2276
Got 't' 2276 -> 2277
Got 'o' 2277 -> 2278
Got 'p' 2278 -> 2279
Got '/' 2279 -> 2280
Got 'n' 2280 -> 2281
Got 'u' 2281 -> 2282
Got 'l' 2282 -> 2283
Got 'l' 2283 -> 2284
Got ')' 2284 -> 2285
2: ustop/null -> ')'
Ungot ' ' 0 -> 0
Got ')' 2284 -> 2285
7:  -> ')'
Ungot ' ' 0 -> 0
phonemes(125): The phoneme feature is not recognised: ''.
Got ')' 2284 -> 2285
7:  -> ')'
Ungot ' ' 0 -> 0
phonemes(125): The phoneme feature is not recognised: ''.
Got ')' 2284 -> 2285
7:  -> ')'
Ungot ' ' 0 -> 0
phonemes(125): The phoneme feature is not recognised: ''.
Got ')' 2284 -> 2285
7:  -> ')'
Ungot ' ' 0 -> 0
... INFINITE LOOP...

I wonder whether the reason is that we put back a character different from the one that was read (?!). We get ')' but unget ' ' after ustop/null.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

That is strange. On my machine, I get output like:

ungot 'l' 21037 => 21036

So ungetc is broken on the Mac?


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

I'm trying to narrow down the problem; could you compile and run this small program on your machine?

#include <stdio.h>
#include <stdlib.h>

FILE* f_in = NULL;

static unsigned int get_char()
{
	unsigned int c;
	c = fgetc(f_in);
	printf("Got '%c'\n", c);
	return c;
}

static void unget_char(unsigned int c)
{
	ungetc(c, f_in);
	printf("Ungot '%c'\n", c);
}

int main(int argc, char** argv) {

	f_in = fopen("testfile","rb");

	if(!f_in)
	{
		printf("Failed to open test file \n");
		return -1;
	}

	int triggered = 0;
	/* feof() only turns true after a read has failed, so the last
	   iteration prints EOF itself as a garbage character. */
	while(!feof(f_in))
	{
		int c = get_char();
		if( c== '4' && !triggered) {
			/* Deliberately push back a character that differs
			   from the one just read. */
			unget_char(' ');
			triggered = 1;

			long cpos = ftell(f_in);
			printf("FTELL : %ld\n",cpos);
		}
	}

	fclose(f_in);
	return 0;
}

The behaviour of that program is strange to me; it shows the 'jump' effect:

Got '1'
Got '
'
Got '2'
Got '
'
Got '3'
Got '
'
Got '4'
Ungot ' '
FTELL : 14
Got '8'
Got '?'


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

This is the test file "testfile" that goes alongside the program:

1
2
3
4
5
6
7
8


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Got '1'
Got '
'
Got '2'
Got '
'
Got '3'
Got '
'
Got '4'
Ungot ' '
FTELL : 6
Got ' '
Got '
'
Got '5'
Got '
'
Got '6'
Got '
'
Got '7'
Got '
'
Got '8'
Got '
'
Got '�'


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

Ouch. Looks like we have found the problem :( The behavior of ungetc + ftell is no longer compliant.
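
For reference, the comparison between the two runs can be turned into a self-checking sketch. The helper name `pos_after_unget` and the use of `tmpfile()` are illustrative assumptions, not part of espeak-ng. The C standard says that, for a binary stream, each successful ungetc decrements the file position indicator, and it does not require the pushed-back character to match the one that was read.

```c
#include <stdio.h>

/* Hypothetical helper: writes "1\n2\n...8\n" to a temporary binary stream,
 * reads up to and including '4', pushes back `pushed`, and reports the
 * position seen by ftell plus the next character read.
 * Returns 0 on success, -1 on I/O failure. */
static int pos_after_unget(int pushed, long *pos, int *next)
{
	FILE *f = tmpfile();
	if (!f)
		return -1;

	fputs("1\n2\n3\n4\n5\n6\n7\n8\n", f);
	rewind(f);

	int c;
	while ((c = fgetc(f)) != EOF && c != '4')
		;
	if (c == EOF || ungetc(pushed, f) == EOF) {
		fclose(f);
		return -1;
	}

	*pos = ftell(f);   /* one less than the position right after '4' */
	*next = fgetc(f);  /* the pushed-back character */
	fclose(f);
	return 0;
}
```

On an implementation that follows the standard, both `pos_after_unget(' ', ...)` and `pos_after_unget('4', ...)` should report position 6 and return the pushed-back character, which matches the Linux run above; the Catalina run reported 14 for the first case.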


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

I've been thinking of trying a local implementation of the unget behaviour, but haven't yet figured out how to get it working.

I'm not sure why ungetc is not working properly on Mac in this case. It is most likely a bug in their implementation.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

Funnily enough, if I replace the unget_char(' '); line with unget_char('4');, I get the same results as you. In fact, I trigger the jump when what I unget is different from what I just read.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

http://man7.org/linux/man-pages/man3/ungetc.3p.html doesn't say that the ungot character has to be the same as the character previously read.

The following simple ungetc replacement does not currently work:

diff --git a/src/libespeak-ng/compiledata.c b/src/libespeak-ng/compiledata.c
index acd62221..e55545ab 100644
--- a/src/libespeak-ng/compiledata.c
+++ b/src/libespeak-ng/compiledata.c
@@ -403,6 +403,7 @@ static FILE *f_report;
 static FILE *f_in;
 static int f_in_linenum;
 static int f_in_displ;
+static unsigned int f_in_ungetc = EOF;
 
 static int linenum;
 static int count_references = 0;
@@ -715,7 +716,11 @@ static int LookupPhoneme(const char *string, int control)
 static unsigned int get_char()
 {
        unsigned int c;
-       c = fgetc(f_in);
+       if (f_in_ungetc != EOF) {
+               c = f_in_ungetc;
+               f_in_ungetc = EOF;
+       } else
+               c = fgetc(f_in);
        if (c == '\n')
                linenum++;
        return c;
@@ -723,7 +728,7 @@ static unsigned int get_char()
 
 static void unget_char(unsigned int c)
 {
-       ungetc(c, f_in);
+       f_in_ungetc = c;
        if (c == '\n')
                linenum--;
 }

I'm getting a lot of errors, starting with:

phonemes(124): The phoneme feature is not recognised: 'gowelout'.
phonemes(359): The phoneme feature is not recognised: 'fowelout'.

That looks suspiciously similar to what you are seeing (esp. re: the line numbers), so maybe that is what the Mac implementation is doing internally.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Can you check what ungetc is returning, and if it is returning EOF then what is the errno value and associated message?
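
A minimal sketch of such a check follows. The wrapper name `checked_ungetc` is hypothetical, and the C standard does not promise that ungetc sets errno on failure, so the message falls back to a generic note when errno is still 0.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: push `c` back onto `f` and log any failure.
 * Returns the pushed-back character, or EOF if ungetc failed. */
static int checked_ungetc(int c, FILE *f)
{
	errno = 0;
	int r = ungetc(c, f);
	if (r == EOF)
		fprintf(stderr, "ungetc(%d) failed: %s\n", c,
		        errno ? strerror(errno) : "no errno set");
	return r;
}
```

Calling it with EOF is an easy way to exercise the failure path, since ungetc(EOF, f) is defined to fail and leave the stream unchanged.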


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

> Can you check what ungetc is returning, and if it is returning EOF then what is the errno value and associated message?

Ok, I've had a look at this. The result of ungetc looks ok: it always returns the value of the character that was ungot, even in the suspicious cases.


[espeak-ng:master] New Comment on Issue #652 Incorrect pronounciation of atelier
By valdisvi:

I added this as another word in en_list. Does atelier.wav.zip sound right?


[espeak-ng:master] New Comment on Issue #655 Esperanto: pronunciation of A
By valdisvi:

I changed the definition of a sound. Does this Esperanto.wav.zip sound better?


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

However, I can see some potential problems with your implementation:

  • Multiple sequential calls to ungetc will not work (the buffer depth is 1)
  • If used in conjunction with fseek (like in UngetItem) or ftell (like at the start of NextItem), this may do strange things

Maybe one possible implementation would be to work with one big buffer instead of a file stream?
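
A rough sketch of the big-buffer idea, under the assumption that the whole input fits in memory. All names here (`MemSource`, `mem_load`, `mem_getc`, `mem_ungetc`, `mem_tell`, `mem_seek`) are hypothetical and do not exist in espeak-ng. Note that this unget steps back over the last character instead of pushing an arbitrary one.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical in-memory source: the file is loaded once, and
 * get/unget/tell/seek become plain index arithmetic. */
typedef struct {
	unsigned char *data;
	size_t len;
	size_t pos;
} MemSource;

/* Load the rest of `f` into `src`. Returns 0 on success, -1 on failure. */
static int mem_load(MemSource *src, FILE *f)
{
	size_t cap = 4096, n;
	src->len = src->pos = 0;
	src->data = malloc(cap);
	if (!src->data)
		return -1;
	while ((n = fread(src->data + src->len, 1, cap - src->len, f)) > 0) {
		src->len += n;
		if (src->len == cap) {
			/* Buffer is full: double it before reading more. */
			unsigned char *grown = realloc(src->data, cap * 2);
			if (!grown) {
				free(src->data);
				src->data = NULL;
				return -1;
			}
			src->data = grown;
			cap *= 2;
		}
	}
	if (ferror(f)) {
		free(src->data);
		src->data = NULL;
		return -1;
	}
	return 0;
}

static int mem_getc(MemSource *src)
{
	return src->pos < src->len ? src->data[src->pos++] : EOF;
}

/* Stepping back never rewrites the data, so any number of ungets works. */
static void mem_ungetc(MemSource *src)
{
	if (src->pos > 0)
		src->pos--;
}

static size_t mem_tell(const MemSource *src)
{
	return src->pos;
}

static void mem_seek(MemSource *src, size_t pos)
{
	src->pos = pos <= src->len ? pos : src->len;
}
```

With this layout both concerns in the list above go away: the pushback depth is the whole buffer, and tell/seek see exactly the same position that get/unget use.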


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Will do, thanks.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

Note: another potential problem I see is that f_in can be switched to another file in the stack (so the byte buffered for one file may interfere with another file). I don't know whether this needs to be taken into account.
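
One way to picture the concern is to tie the pushback slot to each open file instead of keeping a single global. The sketch below is an assumption about how that could look; `InputFile`, `input_init`, `input_getc` and `input_ungetc` are hypothetical names, not espeak-ng code.

```c
#include <stdio.h>

/* Hypothetical per-file state: each entry of the file stack keeps its own
 * one-character pushback slot, so switching files cannot leak a character. */
typedef struct {
	FILE *fp;
	int pushed;   /* EOF means "nothing pushed back" */
} InputFile;

static void input_init(InputFile *in, FILE *fp)
{
	in->fp = fp;
	in->pushed = EOF;
}

static int input_getc(InputFile *in)
{
	if (in->pushed != EOF) {
		int c = in->pushed;
		in->pushed = EOF;
		return c;
	}
	return fgetc(in->fp);
}

static void input_ungetc(InputFile *in, int c)
{
	in->pushed = c;
}
```

This still has a pushback depth of 1, and ftell/fseek on `fp` do not know about the slot, so it only addresses the file-switching point.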


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

One additional note: I could not find the source for Catalina, but in previous versions of macOS, ungetc takes a different code path depending on whether we push back the same character or not.

https://opensource.apple.com/source/Libc/Libc-1272.250.1/stdio/FreeBSD/ungetc.c.auto.html

In the first case, it's a simple rewind of the file pointer. In the second case, a buffer is used. That could explain why I see different behaviors depending on whether we push back the same character that was read, and why it can interfere with ftell.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Interesting. Thanks.

I wonder what is causing espeak to unget a character different to the previously read character. Maybe addressing that will fix the issue you are seeing on the Mac (and possibly on other BSD-based platforms).


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

That works on my machine, so feel free to create a patch.

Are there any other problems?


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By BenTalagan:

I don't think so. I remember having a problem with emscripten a few months ago (#584): the compiled JS was unable to parse the bundled data correctly. It might be related (or not). I will give it another try later, but I will prepare a PR for now.


[espeak-ng:master] New Comment on Issue #674 Build fails on MacOS Catalina
By rhdunn:

Great. Thanks.


espeak-ng@groups.io Integration <espeak-ng@...>
 

2 New Commits:

[espeak-ng:master] By BenTalagan <ben_talagan@...>:
3e0150a34fd4: Fixing ungetc bad behavior under macOS Catalina by avoiding to ungetc a different char from the last getc

Modified: src/libespeak-ng/compiledata.c


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
48719ad642f8: Merge remote-tracking branch 'BenTalagan/master'

Modified: src/libespeak-ng/compiledata.c


[espeak-ng/espeak-ng] Pull request closed by rhdunn:

#675 Fixing ungetc bad behavior under macOS Catalina

This is a fix for #674. For the record, the problem was the following: the ungetc implementation under Catalina seems to interfere with ftell/fseek when ungetc pushes back a character that differs from the one preceding the current file pointer.

The fix consists of avoiding that situation.
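
One way to make that situation impossible by construction, which is not necessarily how the merged commit does it, is to remember the last character read and only ever push that one back. The `Reader` type and its functions below are hypothetical names used for illustration.

```c
#include <stdio.h>

/* Hypothetical reader that remembers the last character returned by fgetc,
 * so that unget always restores exactly that character. */
typedef struct {
	FILE *fp;
	int last;   /* last character read, or EOF if none or already ungot */
} Reader;

static int reader_getc(Reader *r)
{
	r->last = fgetc(r->fp);
	return r->last;
}

/* Push back the last character read. Returns it, or EOF if there is nothing
 * to push back. The caller cannot substitute a different character. */
static int reader_unget_last(Reader *r)
{
	if (r->last == EOF)
		return EOF;
	int c = ungetc(r->last, r->fp);
	r->last = EOF;
	return c;
}
```

Because the pushed-back byte always equals the byte before the file position, this stays on the path that behaved consistently in the tests earlier in the thread.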


[espeak-ng:master] New Comment on Pull Request #675 Fixing ungetc bad behavior under macOS Catalina
By rhdunn:

Merged. Thanks.


[espeak-ng:master] Label added to issue #674 Build fails on MacOS Catalina by BenTalagan.


[espeak-ng:master] Issue #674 Build fails on MacOS Catalina closed by BenTalagan.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #584 emscripten demo broken, probably highlights underlying problem linked to dictionary compilation
By BenTalagan:

After taking time to investigate, I think I have found the problem. It comes from the following lines:

https://github.com/espeak-ng/espeak-ng/blob/48719ad642f8a27d352983ab5964463a8c1e033e/src/libespeak-ng/dictionary.c#L153-L154

They behave differently when compiled with llvm and with emscripten. Under llvm, as with gcc, they show what I would call the 'expected' behaviour: casting to unsigned int from any position in the char* buffer works even when that position is not aligned to a multiple of 4 bytes. Under emscripten it does not: offsets n+0, n+1, n+2 and n+3 all give the same result when cast to an int. One of the rules of the 'en' dictionary hits this case, so the condition of having 4 successive zero bytes is not met and the rule parser explodes.

@rhdunn, I'd like your opinion on this issue: should we implement a simple fix (like testing the four bytes instead of casting to unsigned int), and are there any other parts of the code that may be affected?
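
As a sketch of one alignment-safe alternative to the cast, the four bytes can be copied into a local integer with memcpy, which is well defined for any address. The function names are hypothetical, and the exact condition tested in dictionary.c is assumed here to be "the next four bytes are zero", as described above.

```c
#include <stdint.h>
#include <string.h>

/* Read 4 bytes starting at `p` without assuming any alignment.
 * The resulting value depends on host byte order, which does not matter
 * when the only question is whether all four bytes are zero. */
static uint32_t load_u32_unaligned(const char *p)
{
	uint32_t v;
	memcpy(&v, p, sizeof v);
	return v;
}

/* Hypothetical replacement for a `*(unsigned int *)p == 0` style test. */
static int next_four_bytes_zero(const char *p)
{
	return load_u32_unaligned(p) == 0;
}
```

Compilers typically turn the memcpy into a single load where the target allows it, so this form costs little on platforms that already tolerated the cast.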


[espeak-ng:master] New Comment on Issue #584 emscripten demo broken, probably highlights underlying problem linked to dictionary compilation
By BenTalagan:

Addendum: after reading a bit on the net, it really looks like this should be rewritten. Some refs:

https://stackoverflow.com/questions/26995151/how-to-cast-char-array-to-int-at-non-aligned-position

https://stackoverflow.com/questions/13881487/should-i-worry-about-the-alignment-during-pointer-casting


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng/espeak-ng] Pull request opened by BenTalagan:

#676 Rule alignment fixes for non compliant platforms / Fix for emscripten demo

This is a fix for #584, but the scope of the PR is potentially larger: without this fix, the handling of compiled rules is not guaranteed to behave the same across platforms, since a cast to int* may happen on a non-aligned char*, which has to be avoided.

Some minor options also have to be added to the emscripten compilation workflow to make it work again with newer versions.


[espeak-ng:master] New Comment on Issue #584 emscripten demo broken, probably highlights underlying problem linked to dictionary compilation
By BenTalagan:

@rhdunn: Thanks for your answer! I have prepared a PR (#676) and limited myself to adding a function that tests whether sequential bytes are zero. It's very close to what was originally intended and non-intrusive (the original code only tests four bytes; after that they are still read one by one, not four at a time).
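
A function of that kind might look like the sketch below. The name `bytes_all_zero` and its signature are assumptions made for illustration; the helper actually added by the PR may differ.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the `n` bytes starting at `p` are all
 * zero, 0 otherwise. Reading one byte at a time has no alignment
 * requirement, so it behaves the same at any offset into the buffer. */
static int bytes_all_zero(const char *p, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (p[i] != 0)
			return 0;
	}
	return 1;
}
```

A call such as `bytes_all_zero(p, 4)` would then stand in for the four-byte test at any offset into the rule data.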


espeak-ng@groups.io Integration <espeak-ng@...>
 

4 New Commits:

[espeak-ng:master] By BenTalagan <ben_talagan@...>:
94677f4af8ad: Rule alignment fixes for non compliant platforms / Fix for emscripten demo

Modified: emscripten/Makefile
Modified: emscripten/post.js
Modified: src/libespeak-ng/dictionary.c
Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/readclause.h
Modified: src/libespeak-ng/translate.c


[espeak-ng:master] By BenTalagan <ben_talagan@...>:
9fd480afbf4f: Fixing typos and naming

Modified: src/libespeak-ng/dictionary.c
Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/readclause.h
Modified: src/libespeak-ng/translate.c


[espeak-ng:master] By BenTalagan <ben_talagan@...>:
02447abde8b3: Fixing is_str_totally_null

Modified: src/libespeak-ng/readclause.c


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
050d5e498261: Merge remote-tracking branch 'BenTalagan/master'

Modified: emscripten/Makefile
Modified: emscripten/post.js
Modified: src/libespeak-ng/dictionary.c
Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/readclause.h
Modified: src/libespeak-ng/translate.c


[espeak-ng/espeak-ng] Pull request closed by rhdunn:

#676 Rule alignment fixes for non compliant platforms / Fix for emscripten demo


[espeak-ng:master] New Comment on Pull Request #676 Rule alignment fixes for non compliant platforms / Fix for emscripten demo
By rhdunn:

That's what tests are for :).

Thanks for the fix.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng/espeak-ng] Pull request opened by BenTalagan:

#677 Fixing "Language Replace" tests under MacOS

A small PR for fixing the language-replace.test script under MacOS. grep -P is unfortunately not portable, but in this simple case grep -E will do. Any other suggestion is welcome :-)


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Pull Request #677 Fixing "Language Replace" tests under MacOS
By rhdunn:

Why not replace it directly with -E, as that is supported in GNU grep?



[espeak-ng:master] New Comment on Pull Request #677 Fixing "Language Replace" tests under MacOS
By BenTalagan:

Good point! I wasn't sure. I have made the change; just waiting for the tests to finish.


espeak-ng@groups.io Integration <espeak-ng@...>
 

1 New Commit:

[espeak-ng:master] By BenTalagan <ben_talagan@...>:
c7827df43b16: Using grep -E on all platforms

Modified: tests/language-replace.test


[espeak-ng:master] New Comment on Pull Request #677 Fixing "Language Replace" tests under MacOS
By rhdunn:

I've merged this by cherry-picking the last commit. Thanks for the fix.


[espeak-ng/espeak-ng] Pull request closed by rhdunn:

#677 Fixing "Language Replace" tests under MacOS


[espeak-ng:master] New Comment on Pull Request #677 Fixing "Language Replace" tests under MacOS
By BenTalagan:

Ok! Thanks a lot.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Pull Request #677 Fixing "Language Replace" tests under MacOS
By valdisvi:

grep -E is the same as egrep on Linux. What about MacOS?


[espeak-ng:master] New Comment on Pull Request #677 Fixing "Language Replace" tests under MacOS
By BenTalagan:

Under MacOS, it seems egrep and grep are the same binary.

➜  espeak-ng git:(master) egrep --version
egrep (BSD grep) 2.5.1-FreeBSD
➜  espeak-ng git:(master) grep --version
grep (BSD grep) 2.5.1-FreeBSD
➜  espeak-ng git:(master) ls -lat /usr/bin/egrep
-rwxr-xr-x  1 root  wheel  47136 24 oct 03:33 /usr/bin/egrep
➜  espeak-ng git:(master) ls -lat /usr/bin/grep 
-rwxr-xr-x  1 root  wheel  47136 24 oct 03:33 /usr/bin/grep
➜  espeak-ng git:(master) md5sum /usr/bin/egrep 
fa0d64532039165615fb06d6143076d9  /usr/bin/egrep
➜  espeak-ng git:(master) md5sum /usr/bin/grep 
fa0d64532039165615fb06d6143076d9  /usr/bin/grep