Updates to Github #github


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Pull Request #810 Add some patterns to ja_rules
By tset-tset-tset:

@rhdunn

Thank you for reviewing my PR. Sorry for the lack of explanation; here's why I removed "かあ", "すう", etc.:

  1. Currently, the "あ" group contains "あ", "ああ", "あぁ", and "あー".
  2. However, when "かあ" is present, "かあぁ" or "かあー" is recognized as "かあ" + "ぁ" or "かあ" + "ー". Therefore, when some words are processed now, "(en)" comes up in the output:

    $ echo 'コンピュータアート' | espeak-ng -x -v ja
    k'o N\ C'i _:(en)dZ'ap@ni:z(ja)l'et@ 'u _:(en)dZ'ap@ni:z(ja)l'et@ t'a 'a _:(en)dZ'ap@ni:z(ja)l'et@ t'o
    
    $ echo 'スウェーデン' | espeak-ng -x -v ja
    s'u 'u _:(en)dZ'ap@ni:z(ja)l'et@ _:(en)dZ'ap@ni:z(ja)l'et@ t'e _:(en)dZ'ap@ni:z(ja)l'et@ N\
    
  3. Considering the cases above, I thought it would be more appropriate for "かあ" to become "kaa" rather than "ka:".


[espeak-ng:master] New Issue Created by nesrad:
#811 Improved Interlingua voice

Hi, I've created the file ia_list to improve pronunciation for the ia voice. I'm not sure how to add it to your project, so I'll just leave it here. ia_list.zip

Since Interlingua shares its sounds with the romance languages, I was wondering what steps would be needed to use existing mbrola voices like Italian.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Issue Created by rhdunn:
#812 Add rule logic to better support Japanese vowel lengthening rules

In Japanese, a Hiragana whose pronunciation ends with the vowel a can be lengthened by adding あ (full a), ぁ (short a), or ー (lengthen indicator). The same applies with the other vowels. This results in the following set of rules for each base Hiragana:

.L22    ぁ ー // long a

.group あ
    あ       a   // a
    あ (あL22 a   // aā
    ああ      a:  // ā
    あぁ      a:  // ā
    あー      a:  // ā

It would be better to have the following rules:

.L22    あ ぁ ー   // long a

.group あ
    あ [あL22 a   // a
    あL22        a:  // ā

This requires two changes to the espeak-ng rule logic:

  1. Add a [b (or a different syntax) to mean "don't match this rule if the 'b' part matches, but if it does match then only consume the 'a' part". An equivalent a] b syntax should be added for supporting matches before the main matching segment. Maybe something like a^) b (^c?
  2. Support matching substitution groups (in this case L22) in the main matching text, not just in the pre/post sections. NOTE: This may require a special syntax to differentiate it from other text, so maybe something like ${...} to use a substitution group (A, L12, etc.)?
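For readers who don't know the rule syntax, the current rule set for the あ group (the first listing in this issue) can be mimicked in a few lines of Python. This is only a sketch under assumptions: the name transcribe_a and the explicit lookahead are made up for illustration, and espeak-ng's rule engine picks rules by its own scoring rather than by this if/else chain.

```python
# Illustrative only: mirrors the five rules currently listed for ".group あ".
LENGTHENERS = {"ぁ", "ー"}          # the current .L22 group
LONG_PARTNERS = {"あ", "ぁ", "ー"}  # second character of ああ, あぁ, あー


def transcribe_a(text):
    """Return phoneme strings for runs of あ; other characters pass through."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1:i + 2]   # empty string at end of input
        nxt2 = text[i + 2:i + 3]
        if ch != "あ":
            out.append(ch)        # not handled by this sketch
            i += 1
        elif nxt == "あ" and nxt2 in LENGTHENERS:
            out.append("a")       # rule "あ (あL22": short a, consume one character
            i += 1
        elif nxt in LONG_PARTNERS:
            out.append("a:")      # rules ああ, あぁ, あー
            i += 2
        else:
            out.append("a")       # plain あ
            i += 1
    return out


print(transcribe_a("ああぁ"))
```

With this logic, "ああぁ" comes out as a short a followed by a long a:, which is the "aā" case noted in the rule comment.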


[espeak-ng:master] New Comment on Pull Request #810 Add some patterns to ja_rules
By rhdunn:

I've raised issue #812 about making it possible to simplify the Japanese Hiragana/Katakana pronunciation rules.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Issue Created by nesrad:
#811 Improved Interlingua voice

Hi, I've created the file ia_list to improve pronunciation for the ia voice. I'm not sure how to add it to your project, so I'll just leave it here. ia_list_new.zip

Since Interlingua shares its sounds with the romance languages, I was wondering what steps would be needed to use existing mbrola voices like Italian.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Issue Created by repodiac:
#814 PR: Approx. 10,000 German loan words from DE-Wiktionary could be added to espeak-ng dictionary

Hi,

I've written a small script and tutorial to extract roughly 10k German loan words from the German Wiktionary, take their IPA transcriptions, and convert them into Kirshenbaum syntax for import into espeak-ng as a dictionary (de_extra).

If you're interested (I don't know about possible license issues with Wiktionary data), you could take this as a PR or, alternatively, link to my repo for the curious reader/user.

https://github.com/repodiac/espeak-ng_german_loan_words
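The IPA-to-Kirshenbaum step mentioned above might look roughly like the sketch below. It is not taken from the linked repository: the function name ipa_to_kirshenbaum is made up, the table is a tiny hand-picked subset of the Kirshenbaum ASCII mapping, and a real converter would need the full phoneme inventory that the German voice expects.

```python
# Hypothetical, deliberately incomplete mapping from IPA symbols to
# Kirshenbaum-style ASCII. Symbols not listed here pass through unchanged.
IPA_TO_ASCII = {
    "ˈ": "'",   # primary stress
    "ˌ": ",",   # secondary stress
    "ː": ":",   # length mark
    "ʃ": "S",
    "ʒ": "Z",
    "ŋ": "N",
    "ə": "@",
    "ɛ": "E",
}


def ipa_to_kirshenbaum(ipa):
    """Convert an IPA string character by character using the table above."""
    return "".join(IPA_TO_ASCII.get(symbol, symbol) for symbol in ipa)


# Example input only; it is not an entry from the extracted word list.
print(ipa_to_kirshenbaum("ˈʃɛf"))
```

A character-by-character lookup like this cannot handle multi-character IPA sequences or diacritics, which is one reason to treat it as a starting point only.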


[espeak-ng:master] New Comment on Issue #814 PR: Approx. 10,000 German loan words from DE-Wiktionary could be added to espeak-ng dictionary
By hozosch:

Well, this certainly sounds very exciting! I don't have the permission to merge this though.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #814 PR: Approx. 10,000 German loan words from DE-Wiktionary could be added to espeak-ng dictionary
By repodiac:

See https://foundation.wikimedia.org/wiki/Terms_of_Use/en for the Terms of use. I think attribution must be done properly but other than that there shouldn't be any problems.

Well, I am not an IP lawyer... but yes, it seems that if you stick to "share alike" in the usage (i.e., to my understanding, nobody is allowed, for instance, to make money with this data under a proprietary license...) it should be OK. What's espeak-ng's precise license on data usage, by the way?

How easy would it be to use the script for other languages?

Generally speaking, it should be possible. I parse for specific German strings coming from the markup and its surroundings. I assume other languages use the same markup, just in another language. But I don't know whether other languages in Wiktionary also specify the provenance of words (i.e. loan words)... if so, it should be straightforward.

The code is documented, but if you want to use it for another specific language, I can guide you on where to replace the respective strings. Everything from then onward stays the same.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #814 PR: Approx. 10,000 German loan words from DE-Wiktionary could be added to espeak-ng dictionary
By valdisvi:

The solution seems interesting and may be useful for other languages where pronunciation is not easily deduced from writing. But I'm wondering why it needs Docker to run a Python script.


[espeak-ng:master] New Comment on Issue #814 PR: Approx. 10,000 German loan words from DE-Wiktionary could be added to espeak-ng dictionary
By repodiac:

Docker is used for a "turn-key" solution: no manual download or anything, just hit the button when you want the most recent file update, so to speak. It is all source code, so if you are tech-savvy enough you can simply run the script (or even invoke the method) yourself ;-)

However, all this does not answer my initial request: either you might include it in your build/deploy process (as the tutorial says... there are "Compile errors" sometimes), or you might link somewhere to the repo for anyone interested in enriching their German phoneme dictionary!?


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #811 Improved Interlingua voice
By valdisvi:

Thanks for the contribution! It is included in the project with commit 5deac40. Some notes:

  1. As the word list is very long, it is put in the file ia_listx, where extended word lists are usually stored, except that
  2. the pronunciations of numbers are put into the updated ia_list file.
  3. The file had two different rules for the same word:

le	$nounf $u+
le 	$verb $verbextend $u

I kept only the first entry in the ia_list file.

  4. Written words used capital first letters; these were converted to lowercase, because espeak-ng rules don't care about letter case in the written form. If you need to care about case, an additional flag (e.g. $capital or $allcaps) should be added after the pronunciation.

To use the MBROLA Italian voice for Interlingua, see the MBROLA voices guide and the mb-it1 and it1 files as examples.
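The clean-up described in the notes above (lowercasing the written forms and keeping only the first entry for a repeated word) could be scripted along these lines. This is a hypothetical helper, not the procedure actually used for commit 5deac40. It assumes one entry per line with the word as the first whitespace-separated field, and it ignores comments and any other dictionary syntax.

```python
def normalize_entries(lines):
    """Lowercase each entry's word and drop later entries for the same word."""
    seen = set()
    result = []
    for line in lines:
        parts = line.split(None, 1)   # word, then the rest of the entry
        if not parts:
            continue                  # skip blank lines
        word = parts[0].lower()
        rest = parts[1].rstrip() if len(parts) > 1 else ""
        if word in seen:
            continue                  # keep only the first entry per word
        seen.add(word)
        result.append(f"{word}\t{rest}" if rest else word)
    return result


# The two "le" lines are the ones quoted above; the "Roma" line is invented.
entries = ["le\t$nounf $u+", "le \t$verb $verbextend $u", "Roma\tr'oma"]
for entry in normalize_entries(entries):
    print(entry)
```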


[espeak-ng:master] Issue #811 Improved Interlingua voice closed by nesrad.


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Pull Request #813 SSML cleanup
By valdisvi:

To check for regressions, I first cherry-picked only commit 959bf26 and ran make check with everything else the same. I got this error:

testing tests/ssml/references.ssml2
1c1
< l'EsDan_: gr'eIt@D,an_: 'amp@s,and t'Ik_: kw'oUts
---
> gr'eIt@D,an_: l'EsDan_: gr'eIt@D,an_: 'amp@s,and t'Ik_: kw'oUts
make: *** [Makefile:2654: tests/ssml.check] Error 1

Does this mean that the current implementation is actually broken and doesn't work as expected? Why are there unneeded spaces in tests/ssml/references.ssml2?

<speak> &lt; &gt; &amp; &apos; &quot; </speak>
<speak> B &#66;</speak>
<speak>z &#x7A;</speak>

With all commits applied, it works with these spaces removed.
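For reference, the character and entity references in the test input above decode as shown in this sketch. It uses Python's standard html.unescape as a stand-in decoder. espeak-ng's own SSML parsing is separate C code and may treat the surrounding whitespace differently, which is the point in question here.

```python
import html

# The references that appear in tests/ssml/references.ssml2 as quoted above.
REFERENCES = ["&lt;", "&gt;", "&amp;", "&apos;", "&quot;", "&#66;", "&#x7A;"]

# Decode each reference to the character it stands for.
decoded = [html.unescape(reference) for reference in REFERENCES]

for reference, character in zip(REFERENCES, decoded):
    print(f"{reference} -> {character}")
```

The first five decode to the characters spoken in the expected output (less than, greater than, ampersand, tick, quotes), and the two numeric references decode to B and z.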


espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Pull Request #813 SSML cleanup
By valdisvi:

I think it is an old problem which will be fixed. Please add a commit to format the XML more nicely, and I'll merge it. In future, it is better to start by adding a test before refactoring. That makes it possible to ensure that the refactoring either didn't make things worse or actually fixes a problem which was not found before.


espeak-ng@groups.io Integration <espeak-ng@...>
 

7 New Commits:

[espeak-ng:master] By Juho Hiltunen <jaacoppi@...>:
50f58168e100: code cleanup: remove unused *constcharptr and MakeWawe2()

Modified: src/libespeak-ng/synthesize.h
Modified: src/libespeak-ng/translate.h


[espeak-ng:master] By Juho Hiltunen <jaacoppi@...>:
75d66c89d035: code cleanup: Move self_closing checks to ProcessSsmlTag()

This is a bit slower since we don't pass n_xml_buf as an argument but
rather get it with a call to wcslen. It is much cleaner though, since
the name ProcessSsmlTag() implies that all processing should be done
there.

Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/ssml.c
Modified: src/libespeak-ng/ssml.h


[espeak-ng:master] By Juho Hiltunen <jaacoppi@...>:
34657e7ea4fb: code cleanup: move check for SSML comments and declarations to
ProcessSsmlTag()

Note the line in readclause.c:
if ((c2 == '/') || iswalpha(c2) || c2 == '!' || c2 == '?') {

It might be enough to pass everything to ProcessSsmlTag. What are the
cases that are skipped because of this?

Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/ssml.c
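To make the question in the commit message above concrete, the quoted condition can be mirrored in a few lines of Python. This is only an approximation: str.isalpha stands in for C's iswalpha, whose result depends on the locale, and passes_tag_check is a made-up name that does not exist in the espeak-ng sources.

```python
def passes_tag_check(c2):
    """Mirror of: (c2 == '/') || iswalpha(c2) || c2 == '!' || c2 == '?'"""
    return c2 == "/" or c2.isalpha() or c2 == "!" or c2 == "?"


# Characters that might follow '<' in the input; chosen for illustration.
for candidate in ["/", "s", "!", "?", " ", "1", ">"]:
    print(repr(candidate), passes_tag_check(candidate))
```

Under this approximation, a '<' followed by a space, a digit, or '>' fails the check, so those are the kinds of input the condition filters out.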


[espeak-ng:master] By Juho Hiltunen <jaacoppi@...>:
959bf26b6b1a: Add a test for XML/SSML character and entity references

Added: tests/ssml/references.expected
Added: tests/ssml/references.ssml2
Modified: tests/ssml.test


[espeak-ng:master] By Juho Hiltunen <jaacoppi@...>:
54d93cf2b4bc: code cleanup: move ssml reference handling logic to a new function ParseSsmlReference()

It's unclear why c2 needs to be set after an entity reference.

Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/ssml.c
Modified: src/libespeak-ng/ssml.h


[espeak-ng:master] By Juho Hiltunen <jaacoppi@...>:
c9dee003bf62: better explanation and nicer formatting for the ssml reference test.

Modified: tests/ssml/references.ssml2


[espeak-ng:master] By Valdis Vitolins <valdis.vitolins@...>:
5b3da5950aea: Merge pull request #813

Added: tests/ssml/references.expected
Added: tests/ssml/references.ssml2
Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/ssml.c
Modified: src/libespeak-ng/ssml.h
Modified: src/libespeak-ng/synthesize.h
Modified: src/libespeak-ng/translate.h
Modified: tests/ssml.test


[espeak-ng/espeak-ng] Pull request closed by valdisvi:

#813 SSML cleanup

Move SSML related logic from ReadClause to ProcessSsmlTag and to a new function ParseSsmlReference().

Contributes to #369 and should make locating SSML bugs easier.