
Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #216 Make it easy to combine languages in different scripts.
By rhdunn:

The later espeak versions have partial support for this with the alphabet2 features. This code should be extended and improved so that:

  • [ ] Scripts are specified with their ISO 15924 codes, not custom/language codes -- these should be packed into int32_t values like other 4 character or less values in espeak-ng.
  • [ ] The alphabet2 feature should be renamed and restricted to specifying related scripts for a top-level language/accent (e.g. sd-Deva and sd-Arab for the sd language).
  • [ ] The espeak_ng_SetVoiceForScript method sets up the appropriate ALPHABET, equivalent to using the espeak alphabet2 voice file property.
  • [ ] Extend espeak_ng_SetVoiceByName to handle the Arab=sd,Latn=fr style language specification -- this should also make the command line support this syntax.



[espeak-ng:master] New Issue Created by rhdunn:
#216 Make it easy to combine languages in different scripts.

The Problem

Languages written in non-Latin scripts fall back to English for any Latin text. For any other unrecognised script, espeak currently speaks "Script name letter" for each character in that script. The current Persian voices have variants for falling back to British or American English for Latin text. Some languages like Japanese and Sindhi can be written in multiple scripts, and can take different forms (e.g. the different styles of Romaji for writing Japanese in Latin characters).

If using MBROLA voices, or other voices specific to some languages, the user may want to switch between those when switching between languages for better intelligibility of those languages.

The Solution

Language scripts are specified using the 4-letter ISO 15924 codes such as Grek for Greek. BCP 47 supports using these in language names, e.g. sd-Deva for Sindhi in the Devanagari script or jp-Hrkt for Japanese in the Hiragana and Katakana syllabaries. Following BCP 47 convention, the script name should not be used when it is the primary script for the language (e.g. using es instead of es-Latn).
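
As a rough illustration of how a script could be read out of such a language name, the sketch below pulls the 4-letter script subtag from a tag like sd-Deva. The function name bcp47_script_subtag is hypothetical, and the parsing is deliberately simplified: it takes the first 4-letter alphabetic subtag after the language and stops at a single-character subtag (such as the x that starts a private-use section), rather than implementing the full BCP 47 grammar.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the script subtag of `tag` into `out`
 * (normalised to title case, e.g. "Deva") and returns 1, or returns 0
 * when the tag has no script subtag. */
int bcp47_script_subtag(const char *tag, char out[5])
{
    if (tag == NULL)
        return 0;

    const char *p = strchr(tag, '-'); /* skip the primary language subtag */
    while (p != NULL) {
        const char *start = p + 1;
        const char *end = strchr(start, '-');
        size_t len = end ? (size_t)(end - start) : strlen(start);

        if (len == 1)
            break; /* singleton: an extension or private-use section follows */

        if (len == 4) {
            int all_alpha = 1;
            for (size_t i = 0; i < 4; i++) {
                if (!isalpha((unsigned char)start[i]))
                    all_alpha = 0;
            }
            if (all_alpha) {
                out[0] = (char)toupper((unsigned char)start[0]);
                for (size_t i = 1; i < 4; i++)
                    out[i] = (char)tolower((unsigned char)start[i]);
                out[4] = '\0';
                return 1;
            }
        }
        p = end;
    }
    return 0;
}
```

A tag without a script subtag, such as es, returns 0; the caller would then use the primary script of the language.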

Language definition files (currently voice files, but see issue #19) should specify the script they support using a script ISO_15924 line such as script Latn. Languages should list both the language code and the language code with the script as supported languages, for example specifying both language sd and language sd-Arab. The language file containing the default script should have priority over the others, just as accents have a lower priority than the base language.

Each language dictionary should be restricted to a single Script. It should be possible to create processing chains (e.g. jp-Hant processes Traditional Chinese Han characters into Hiragana which jp-Hrkt then pronounces).

The command line should support a comma/semicolon separated list of Script=language, e.g. Arab=sd,Deva=sd,Latn=en-GB-scotland to use Sindhi for Arabic and Devanagari characters and Scottish English for Latin characters. It should also support using Script=language/voice, e.g. Latn=en-GB/mb-de1 for using the MBROLA de1 German voice to read Latin characters in British English.
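
A minimal sketch of parsing that list syntax follows. The script_voice struct, the parse_script_voices function and the buffer sizes are all assumptions made for illustration; nothing here is taken from the espeak-ng sources.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one Script=language[/voice] entry. */
typedef struct {
    char script[5];    /* ISO 15924 code, e.g. "Arab" */
    char language[32]; /* language name, e.g. "en-GB" */
    char voice[32];    /* optional voice name, empty when absent */
} script_voice;

/* Parses a comma/semicolon separated list into at most `max` entries.
 * Returns the number of entries, or -1 if the list is malformed or too long. */
int parse_script_voices(const char *spec, script_voice *out, int max)
{
    if (spec == NULL || out == NULL)
        return -1;

    int count = 0;
    const char *p = spec;
    while (*p != '\0') {
        size_t item_len = strcspn(p, ",;");
        const char *eq = memchr(p, '=', item_len);
        if (eq == NULL || eq - p != 4 || count == max)
            return -1; /* the script part must be exactly four characters */

        const char *lang = eq + 1;
        size_t rest = item_len - 5; /* characters after "Xxxx=" */
        const char *slash = memchr(lang, '/', rest);
        size_t lang_len = slash ? (size_t)(slash - lang) : rest;
        size_t voice_len = slash ? rest - lang_len - 1 : 0;
        if (lang_len == 0 || lang_len >= sizeof out->language ||
            voice_len >= sizeof out->voice)
            return -1;

        script_voice *sv = &out[count++];
        memcpy(sv->script, p, 4);
        sv->script[4] = '\0';
        memcpy(sv->language, lang, lang_len);
        sv->language[lang_len] = '\0';
        if (slash != NULL)
            memcpy(sv->voice, slash + 1, voice_len);
        sv->voice[voice_len] = '\0';

        p += item_len;
        if (*p != '\0')
            p++; /* skip the ',' or ';' separator */
    }
    return count;
}
```

Parsing into a flat array keeps the sketch simple; a real implementation would presumably also validate the script code and resolve the language and voice names.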

The existing API should support that syntax in addition to having a new API method espeak_ng_STATUS espeak_ng_SetVoiceForScript(const char *script, const char *language, const char *voice); where script can be NULL to use the default script of a language.


Github push to espeak-ng:espeak-ng #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

1 New Commit:

[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
1a8624ffdb1b: Use the language names in the 'name' field, not a shorthand or other identifier.

Modified: README.md
Modified: docs/add_language.md
Modified: docs/languages.md
Modified: espeak-ng-data/lang/aav/vi
Modified: espeak-ng-data/lang/aav/vi-VN-x-central
Modified: espeak-ng-data/lang/aav/vi-VN-x-south
Modified: espeak-ng-data/lang/art/eo
Modified: espeak-ng-data/lang/art/ia
Modified: espeak-ng-data/lang/art/jbo
Modified: espeak-ng-data/lang/art/lfn
Modified: espeak-ng-data/lang/azc/nci
Modified: espeak-ng-data/lang/bat/lt
Modified: espeak-ng-data/lang/bat/lv
Modified: espeak-ng-data/lang/bnt/sw
Modified: espeak-ng-data/lang/bnt/tn
Modified: espeak-ng-data/lang/ccs/ka
Modified: espeak-ng-data/lang/cel/cy
Modified: espeak-ng-data/lang/cel/ga
Modified: espeak-ng-data/lang/cel/gd
Modified: espeak-ng-data/lang/cus/om
Modified: espeak-ng-data/lang/dra/kn
Modified: espeak-ng-data/lang/dra/ml
Modified: espeak-ng-data/lang/dra/ta
Modified: espeak-ng-data/lang/dra/te
Modified: espeak-ng-data/lang/esx/kl
Modified: espeak-ng-data/lang/eu
Modified: espeak-ng-data/lang/gmq/da
Modified: espeak-ng-data/lang/gmq/is
Modified: espeak-ng-data/lang/gmq/no
Modified: espeak-ng-data/lang/gmq/sv
Modified: espeak-ng-data/lang/gmw/af
Modified: espeak-ng-data/lang/gmw/de
Modified: espeak-ng-data/lang/gmw/en
Modified: espeak-ng-data/lang/gmw/en-029
Modified: espeak-ng-data/lang/gmw/en-GB-scotland
Modified: espeak-ng-data/lang/gmw/en-GB-x-gbclan
Modified: espeak-ng-data/lang/gmw/en-GB-x-gbcwmd
Modified: espeak-ng-data/lang/gmw/en-GB-x-rp
Modified: espeak-ng-data/lang/gmw/en-US
Modified: espeak-ng-data/lang/gmw/nl
Modified: espeak-ng-data/lang/grk/el
Modified: espeak-ng-data/lang/grk/grc
Modified: espeak-ng-data/lang/inc/as
Modified: espeak-ng-data/lang/inc/bn
Modified: espeak-ng-data/lang/inc/gu
Modified: espeak-ng-data/lang/inc/hi
Modified: espeak-ng-data/lang/inc/kok
Modified: espeak-ng-data/lang/inc/mr
Modified: espeak-ng-data/lang/inc/ne
Modified: espeak-ng-data/lang/inc/or
Modified: espeak-ng-data/lang/inc/pa
Modified: espeak-ng-data/lang/inc/sd
Modified: espeak-ng-data/lang/inc/si
Modified: espeak-ng-data/lang/inc/ur
Modified: espeak-ng-data/lang/ine/hy
Modified: espeak-ng-data/lang/ine/hy-arevmda
Modified: espeak-ng-data/lang/ine/sq
Modified: espeak-ng-data/lang/ira/fa
Modified: espeak-ng-data/lang/ira/fa-Latn
Modified: espeak-ng-data/lang/ira/ku
Modified: espeak-ng-data/lang/itc/la
Modified: espeak-ng-data/lang/jpx/jp
Modified: espeak-ng-data/lang/poz/id
Modified: espeak-ng-data/lang/poz/ms
Modified: espeak-ng-data/lang/roa/an
Modified: espeak-ng-data/lang/roa/ca
Modified: espeak-ng-data/lang/roa/es
Modified: espeak-ng-data/lang/roa/es-419
Modified: espeak-ng-data/lang/roa/fr
Modified: espeak-ng-data/lang/roa/fr-BE
Modified: espeak-ng-data/lang/roa/it
Modified: espeak-ng-data/lang/roa/pap
Modified: espeak-ng-data/lang/roa/pt-BR
Modified: espeak-ng-data/lang/roa/pt-PT
Modified: espeak-ng-data/lang/roa/ro
Modified: espeak-ng-data/lang/sai/gn
Modified: espeak-ng-data/lang/sem/am
Modified: espeak-ng-data/lang/sem/ar
Modified: espeak-ng-data/lang/sem/mt
Modified: espeak-ng-data/lang/sit/cmn
Modified: espeak-ng-data/lang/sit/mni
Modified: espeak-ng-data/lang/sit/my
Modified: espeak-ng-data/lang/sit/yue
Modified: espeak-ng-data/lang/trk/az
Modified: espeak-ng-data/lang/trk/ky
Modified: espeak-ng-data/lang/trk/tr
Modified: espeak-ng-data/lang/trk/tt
Modified: espeak-ng-data/lang/und/und-fonipa
Modified: espeak-ng-data/lang/urj/et
Modified: espeak-ng-data/lang/urj/fi
Modified: espeak-ng-data/lang/urj/hu
Modified: espeak-ng-data/lang/zls/bg
Modified: espeak-ng-data/lang/zls/bs
Modified: espeak-ng-data/lang/zls/cs
Modified: espeak-ng-data/lang/zls/hr
Modified: espeak-ng-data/lang/zls/mk
Modified: espeak-ng-data/lang/zls/pl
Modified: espeak-ng-data/lang/zls/ru
Modified: espeak-ng-data/lang/zls/sk
Modified: espeak-ng-data/lang/zls/sl
Modified: espeak-ng-data/lang/zls/sr


Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

6 New Commits:

[espeak-ng:master] By Valdis Vitolins <valdis.vitolins@...>:
6bd656cf4aba: Limit search for jump POST rule to null byte for last word in sentence

Modified: src/libespeak-ng/dictionary.c


[espeak-ng:master] By Valdis Vitolins <valdis.vitolins@...>:
29f0b673ee15: Updated comments for POST jump rule

Modified: src/libespeak-ng/dictionary.c


[espeak-ng:master] By Valdis Vitolins <valdis.vitolins@...>:
b2057635c4f6: PRE jump rule e.g. 'xyJ)' implemented

Modified: src/libespeak-ng/dictionary.c


[espeak-ng:master] By Valdis Vitolins <valdis.vitolins@...>:
cabe5001b519: Issue #199 'xxJ)' statement as precondition implemented and documented

Modified: docs/dictionary.md


[espeak-ng:master] By Valdis Vitolins <valdis.vitolins@...>:
f4bcc119780e: Documentation: more than one can be used for skipped characters.

Modified: docs/dictionary.md


[espeak-ng:master] By Valdis Vitolins <valdis.vitolins@...>:
0c3ceb2e703d: As secondary stress is still spelled differently it is disabled

Modified: src/libespeak-ng/tr_languages.c


[espeak-ng:master] New Comment on Pull Request #219 Implementation of 'J' statement as precondition, stress flag changes for Latvian
By rhdunn:

I have cherry-picked the J statement changes into the master branch. Thanks for the PR.

Can you reset your master branch to the espeak-ng one so future PRs don't pick up the Burmese changes? It would also be useful to develop the changes on separate branches so they don't interfere with each other if one of the branches does not get merged.


[espeak-ng/espeak-ng] Pull request closed by rhdunn:

#219 Implementation of 'J' statement as precondition, stress flag changes for Latvian


[espeak-ng/espeak-ng] Pull request updated by rhdunn:

#219 Implementation of 'J' statement as precondition, stress flag changes for Latvian


Pull Request Opened #github

espeak-ng@groups.io Integration <espeak-ng@...>
 


[espeak-ng:master] new issue: Language analysis improvements #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Issue Created by ValdisVitolins:
#199 Language analysis improvements

Language analysis and spelling decisions could be improved by introducing the following new features:

  • [ ] Extend the verb follows/noun follows marks to more/arbitrary flags, which can then be used to make different pronunciation rules for homonyms.
  • [x] J statement as precondition, to allow choosing the pronunciation from the preceding word. This could help with solving names of numbers as different words (#83).
  • [ ] Possibility to go back to the start of the rules and redo the analysis again (e.g. issue #121), not only after removing prefixes/suffixes. This could be a performance drain if used improperly.
  • [ ] Extend the replace rule to replace not only characters but groups of characters, and probably also to replace using matching rules.
  • [ ] Extend _list to mark arbitrarily defined word types (e.g. $units, #115) and to compare only the root part of the word (i.e. a partial match without prefixes/suffixes).
  • [ ] Extend the output (prosody data) to mark syllables in more/arbitrarily defined ways for different pronunciations (e.g. high/low pitch for Chinese).
  • [ ] Fix issue #196: Word end mark _ doesn't work properly with ~ character group.
  • [ ] A common rule for the stress decision, applied before or after the specific spelling decision for a word is made, e.g. to put stress on the penultimate syllable in Italian (#80) as a common rule.


[espeak-ng:master] new issue: Language analysis improvements #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Issue Created by ValdisVitolins:
#199 Language analysis improvements

Language analysis and spelling decisions could be improved by introducing the following new features:

  • [ ] Extend the verb follows/noun follows marks to more/arbitrary flags, which can then be used to make different pronunciation rules for homonyms.
  • [ ] J statement as precondition, to allow choosing the pronunciation from the preceding word. This could help with solving names of numbers as different words (#83).
  • [ ] Possibility to go back to the start of the rules and redo the analysis again (e.g. issue #121), not only after removing prefixes/suffixes. This could be a performance drain if used improperly.
  • [ ] Extend the replace rule to replace not only characters but groups of characters, and probably also to replace using matching rules.
  • [ ] Extend _list to mark arbitrarily defined word types (e.g. $units, #115) and to compare only the root part of the word (i.e. a partial match without prefixes/suffixes).
  • [ ] Extend the output (prosody data) to mark syllables in more/arbitrarily defined ways for different pronunciations (e.g. high/low pitch for Chinese).
  • [ ] Fix issue #196: Word end mark _ doesn't work properly with ~ character group.
  • [ ] A common rule for the stress decision, applied before or after the specific spelling decision for a word is made, e.g. to put stress on the penultimate syllable in Italian (#80) as a common rule.


[espeak-ng:master] new issue: Create language specification files #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Issue Created by rhdunn:
#19 Create language specification files

These files should use the BCP47 code for the language (en-GB-scotland, pt-BR, da, etc.) and cover the details for defining that language's behaviour:

  1. the dictrule, etc. options in the voice definition files;
  2. the options programmatically defined in tr_languages.c.

For (2), a set of rule commands will be defined in the language files and processed in a generic processing function. For example, using letter-vowel w in languages/cel/cy instead of SetLetterVowel(tr, 'w'); in tr_languages.c.

The approach to implementing this should be to do it on a language-by-language basis. That is, define the commands needed for a language, supporting it through the generic definition file, then move to the next language.
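
To make the idea of a generic processing function concrete, here is a small sketch that handles only the letter-vowel command from the example above. The lang_settings struct and apply_language_command are hypothetical stand-ins; the real translator state and the effect of SetLetterVowel in tr_languages.c are not reproduced here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the per-language state that
 * SetLetterVowel(tr, 'w') would modify. */
typedef struct {
    unsigned char is_vowel[128]; /* indexed by ASCII letter */
} lang_settings;

/* Applies one "command argument" line from a language file.
 * Returns 1 if the command was recognised and applied, 0 otherwise. */
int apply_language_command(lang_settings *settings, const char *line)
{
    char command[32];
    char arg[32];

    if (sscanf(line, "%31s %31s", command, arg) != 2)
        return 0;

    if (strcmp(command, "letter-vowel") == 0 &&
        strlen(arg) == 1 && (unsigned char)arg[0] < 128) {
        settings->is_vowel[(unsigned char)arg[0]] = 1;
        return 1;
    }

    return 0; /* unknown command: further commands would be added per language */
}
```

Adding commands one language at a time, as the approach above suggests, would then mean adding one branch (or one table entry) per command as each language is converted.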


Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] Label added to issue #67 Clean up the portability support for espeak. by rhdunn.


[espeak-ng:master] Issue #67 Clean up the portability support for espeak. closed by rhdunn.


[espeak-ng:master] New Issue Created by rhdunn:
#218 Define the values from tr_languages.c in the language files

A set of rule commands will be defined in the language files and processed in a generic processing function. For example, using letter-vowel w in languages/cel/cy instead of SetLetterVowel(tr, 'w'); in tr_languages.c.

This should result in the tr_languages.c file being removed and all the language settings being in the language files.


[espeak-ng:master] Label added to issue #218 Define the values from tr_languages.c in the language files by rhdunn.


Github push to espeak-ng:espeak-ng #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

6 New Commits:

[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
df6a2228b73f: Use -EISDIR instead of -2 in GetFileLength for directories.

Modified: src/libespeak-ng/readclause.c
Modified: src/libespeak-ng/speech.c
Modified: src/libespeak-ng/synth_mbrola.c
Modified: src/libespeak-ng/synthdata.c
Modified: src/libespeak-ng/voices.c


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
3ce7fab7db4c: Return the actual error from GetFileLength instead of 0.

Modified: src/libespeak-ng/speech.c


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
022e4e82dda1: Don't check if the MBROLA voice data is present on Windows. With the original eSpeak Windows port, the MBROLA voices had to be present in an mbrola folder of the espeak-data. This will be different in espeak-ng, allowing the MBROLA voice files to be located elsewhere on the system.

Modified: src/libespeak-ng/voices.c


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
bb7630e4c1dc: Make len_path_voices a parameter to GetVoices.

Modified: src/libespeak-ng/voices.c


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
6eaf1d2ddc30: Look in espeak-ng-data/lang for voices. This is to support splitting language specification (dictionaries) from voice specification (phon*).

Modified: src/libespeak-ng/voices.c


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
285e88c72087: Move the language files to espeak-ng-data/lang.

Added: espeak-ng-data/lang/aav/vi
Added: espeak-ng-data/lang/aav/vi-VN-x-central
Added: espeak-ng-data/lang/aav/vi-VN-x-south
Added: espeak-ng-data/lang/art/eo
Added: espeak-ng-data/lang/art/ia
Added: espeak-ng-data/lang/art/jbo
Added: espeak-ng-data/lang/art/lfn
Added: espeak-ng-data/lang/azc/nci
Added: espeak-ng-data/lang/bat/lt
Added: espeak-ng-data/lang/bat/lv
Added: espeak-ng-data/lang/bnt/sw
Added: espeak-ng-data/lang/bnt/tn
Added: espeak-ng-data/lang/ccs/ka
Added: espeak-ng-data/lang/cel/cy
Added: espeak-ng-data/lang/cel/ga
Added: espeak-ng-data/lang/cel/gd
Added: espeak-ng-data/lang/cus/om
Added: espeak-ng-data/lang/dra/kn
Added: espeak-ng-data/lang/dra/ml
Added: espeak-ng-data/lang/dra/ta
Added: espeak-ng-data/lang/dra/te
Added: espeak-ng-data/lang/esx/kl
Added: espeak-ng-data/lang/eu
Added: espeak-ng-data/lang/gmq/da
Added: espeak-ng-data/lang/gmq/is
Added: espeak-ng-data/lang/gmq/no
Added: espeak-ng-data/lang/gmq/sv
Added: espeak-ng-data/lang/gmw/af
Added: espeak-ng-data/lang/gmw/de
Added: espeak-ng-data/lang/gmw/en
Added: espeak-ng-data/lang/gmw/en-029
Added: espeak-ng-data/lang/gmw/en-GB-scotland
Added: espeak-ng-data/lang/gmw/en-GB-x-gbclan
Added: espeak-ng-data/lang/gmw/en-GB-x-gbcwmd
Added: espeak-ng-data/lang/gmw/en-GB-x-rp
Added: espeak-ng-data/lang/gmw/en-US
Added: espeak-ng-data/lang/gmw/nl
Added: espeak-ng-data/lang/grk/el
Added: espeak-ng-data/lang/grk/grc
Added: espeak-ng-data/lang/inc/as
Added: espeak-ng-data/lang/inc/bn
Added: espeak-ng-data/lang/inc/gu
Added: espeak-ng-data/lang/inc/hi
Added: espeak-ng-data/lang/inc/kok
Added: espeak-ng-data/lang/inc/mr
Added: espeak-ng-data/lang/inc/ne
Added: espeak-ng-data/lang/inc/or
Added: espeak-ng-data/lang/inc/pa
Added: espeak-ng-data/lang/inc/sd
Added: espeak-ng-data/lang/inc/si
Added: espeak-ng-data/lang/inc/ur
Added: espeak-ng-data/lang/ine/hy
Added: espeak-ng-data/lang/ine/hy-arevmda
Added: espeak-ng-data/lang/ine/sq
Added: espeak-ng-data/lang/ira/fa
Added: espeak-ng-data/lang/ira/fa-Latn
Added: espeak-ng-data/lang/ira/fa-en-us
Added: espeak-ng-data/lang/ira/ku
Added: espeak-ng-data/lang/itc/la
Added: espeak-ng-data/lang/jpx/jp
Added: espeak-ng-data/lang/ko
Added: espeak-ng-data/lang/poz/id
Added: espeak-ng-data/lang/poz/ms
Added: espeak-ng-data/lang/roa/an
Added: espeak-ng-data/lang/roa/ca
Added: espeak-ng-data/lang/roa/es
Added: espeak-ng-data/lang/roa/es-419
Added: espeak-ng-data/lang/roa/fr
Added: espeak-ng-data/lang/roa/fr-BE
Added: espeak-ng-data/lang/roa/it
Added: espeak-ng-data/lang/roa/pap
Added: espeak-ng-data/lang/roa/pt-BR
Added: espeak-ng-data/lang/roa/pt-PT
Added: espeak-ng-data/lang/roa/ro
Added: espeak-ng-data/lang/sai/gn
Added: espeak-ng-data/lang/sem/am
Added: espeak-ng-data/lang/sem/ar
Added: espeak-ng-data/lang/sem/mt
Added: espeak-ng-data/lang/sit/cmn
Added: espeak-ng-data/lang/sit/mni
Added: espeak-ng-data/lang/sit/my
Added: espeak-ng-data/lang/sit/yue
Added: espeak-ng-data/lang/trk/az
Added: espeak-ng-data/lang/trk/ky
Added: espeak-ng-data/lang/trk/tr
Added: espeak-ng-data/lang/trk/tt
Added: espeak-ng-data/lang/und/und-fonipa
Added: espeak-ng-data/lang/urj/et
Added: espeak-ng-data/lang/urj/fi
Added: espeak-ng-data/lang/urj/hu
Added: espeak-ng-data/lang/zls/bg
Added: espeak-ng-data/lang/zls/bs
Added: espeak-ng-data/lang/zls/cs
Added: espeak-ng-data/lang/zls/hr
Added: espeak-ng-data/lang/zls/mk
Added: espeak-ng-data/lang/zls/pl
Added: espeak-ng-data/lang/zls/ru
Added: espeak-ng-data/lang/zls/sk
Added: espeak-ng-data/lang/zls/sl
Added: espeak-ng-data/lang/zls/sr
Removed: espeak-ng-data/voices/aav/vi
Removed: espeak-ng-data/voices/aav/vi-VN-x-central
Removed: espeak-ng-data/voices/aav/vi-VN-x-south
Removed: espeak-ng-data/voices/art/eo
Removed: espeak-ng-data/voices/art/ia
Removed: espeak-ng-data/voices/art/jbo
Removed: espeak-ng-data/voices/art/lfn
Removed: espeak-ng-data/voices/azc/nci
Removed: espeak-ng-data/voices/bat/lt
Removed: espeak-ng-data/voices/bat/lv
Removed: espeak-ng-data/voices/bnt/sw
Removed: espeak-ng-data/voices/bnt/tn
Removed: espeak-ng-data/voices/ccs/ka
Removed: espeak-ng-data/voices/cel/cy
Removed: espeak-ng-data/voices/cel/ga
Removed: espeak-ng-data/voices/cel/gd
Removed: espeak-ng-data/voices/cus/om
Removed: espeak-ng-data/voices/dra/kn
Removed: espeak-ng-data/voices/dra/ml
Removed: espeak-ng-data/voices/dra/ta
Removed: espeak-ng-data/voices/dra/te
Removed: espeak-ng-data/voices/esx/kl
Removed: espeak-ng-data/voices/eu
Removed: espeak-ng-data/voices/gmq/da
Removed: espeak-ng-data/voices/gmq/is
Removed: espeak-ng-data/voices/gmq/no
Removed: espeak-ng-data/voices/gmq/sv
Removed: espeak-ng-data/voices/gmw/af
Removed: espeak-ng-data/voices/gmw/de
Removed: espeak-ng-data/voices/gmw/en
Removed: espeak-ng-data/voices/gmw/en-029
Removed: espeak-ng-data/voices/gmw/en-GB-scotland
Removed: espeak-ng-data/voices/gmw/en-GB-x-gbclan
Removed: espeak-ng-data/voices/gmw/en-GB-x-gbcwmd
Removed: espeak-ng-data/voices/gmw/en-GB-x-rp
Removed: espeak-ng-data/voices/gmw/en-US
Removed: espeak-ng-data/voices/gmw/nl
Removed: espeak-ng-data/voices/grk/el
Removed: espeak-ng-data/voices/grk/grc
Removed: espeak-ng-data/voices/inc/as
Removed: espeak-ng-data/voices/inc/bn
Removed: espeak-ng-data/voices/inc/gu
Removed: espeak-ng-data/voices/inc/hi
Removed: espeak-ng-data/voices/inc/kok
Removed: espeak-ng-data/voices/inc/mr
Removed: espeak-ng-data/voices/inc/ne
Removed: espeak-ng-data/voices/inc/or
Removed: espeak-ng-data/voices/inc/pa
Removed: espeak-ng-data/voices/inc/sd
Removed: espeak-ng-data/voices/inc/si
Removed: espeak-ng-data/voices/inc/ur
Removed: espeak-ng-data/voices/ine/hy
Removed: espeak-ng-data/voices/ine/hy-arevmda
Removed: espeak-ng-data/voices/ine/sq
Removed: espeak-ng-data/voices/ira/fa
Removed: espeak-ng-data/voices/ira/fa-Latn
Removed: espeak-ng-data/voices/ira/fa-en-us
Removed: espeak-ng-data/voices/ira/ku
Removed: espeak-ng-data/voices/itc/la
Removed: espeak-ng-data/voices/jpx/jp
Removed: espeak-ng-data/voices/ko
Removed: espeak-ng-data/voices/poz/id
Removed: espeak-ng-data/voices/poz/ms
Removed: espeak-ng-data/voices/roa/an
Removed: espeak-ng-data/voices/roa/ca
Removed: espeak-ng-data/voices/roa/es
Removed: espeak-ng-data/voices/roa/es-419
Removed: espeak-ng-data/voices/roa/fr
Removed: espeak-ng-data/voices/roa/fr-BE
Removed: espeak-ng-data/voices/roa/it
Removed: espeak-ng-data/voices/roa/pap
Removed: espeak-ng-data/voices/roa/pt-BR
Removed: espeak-ng-data/voices/roa/pt-PT
Removed: espeak-ng-data/voices/roa/ro
Removed: espeak-ng-data/voices/sai/gn
Removed: espeak-ng-data/voices/sem/am
Removed: espeak-ng-data/voices/sem/ar
Removed: espeak-ng-data/voices/sem/mt
Removed: espeak-ng-data/voices/sit/cmn
Removed: espeak-ng-data/voices/sit/mni
Removed: espeak-ng-data/voices/sit/my
Removed: espeak-ng-data/voices/sit/yue
Removed: espeak-ng-data/voices/trk/az
Removed: espeak-ng-data/voices/trk/ky
Removed: espeak-ng-data/voices/trk/tr
Removed: espeak-ng-data/voices/trk/tt
Removed: espeak-ng-data/voices/und/und-fonipa
Removed: espeak-ng-data/voices/urj/et
Removed: espeak-ng-data/voices/urj/fi
Removed: espeak-ng-data/voices/urj/hu
Removed: espeak-ng-data/voices/zls/bg
Removed: espeak-ng-data/voices/zls/bs
Removed: espeak-ng-data/voices/zls/cs
Removed: espeak-ng-data/voices/zls/hr
Removed: espeak-ng-data/voices/zls/mk
Removed: espeak-ng-data/voices/zls/pl
Removed: espeak-ng-data/voices/zls/ru
Removed: espeak-ng-data/voices/zls/sk
Removed: espeak-ng-data/voices/zls/sl
Removed: espeak-ng-data/voices/zls/sr
Modified: docs/add_language.md
Modified: docs/voices.md


Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

4 New Commits:

[espeak-ng:master] By chrislm <llajta2012@...>:
59c940837613: IT: Removed automatic secondari stress IT: Unstressed final syllable is diminished IT: Preserve unstressed monosyllable words and allow secondary stress in multisillable words indicated as unstressed in it_list.

Modified: src/libespeak-ng/tr_languages.c


[espeak-ng:master] By chrislm <llajta2012@...>:
d757fd974392: IT: Reading roman numerals as ordinals.

Modified: src/libespeak-ng/tr_languages.c


[espeak-ng:master] By chrislm <llajta2012@...>:
5f423fc1b4be: IT: last improvements tested on january 2017.

Modified: dictsource/it_list
Modified: dictsource/it_listx
Modified: dictsource/it_rules
Modified: phsource/intonation
Modified: phsource/ph_italian


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
d89c6fdcc307: Add the Italian language updates to the CHANGELOG.md file.

Modified: CHANGELOG.md


[espeak-ng/espeak-ng] Pull request closed by rhdunn:

#217 Italian: secondary stress, Roman numbers and last updates
In this PR:

  • IT: Removed automatic secondary stress (#214).
  • IT: Unstressed final syllable is diminished.
  • IT: Preserve unstressed monosyllable words and allow secondary stress in multisyllable words indicated as unstressed in it_list (#214).
  • IT: Reading Roman numerals as ordinals (#215).
  • IT: Last improvements tested in January 2017.


[espeak-ng:master] New Comment on Pull Request #217 Italian: secondary stress, Roman numbers and last updates
By rhdunn:

Thanks for the PR.


Pull Request Opened #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng/espeak-ng] Pull request opened by Christianlm:

#217 Italian: secondary stress, Roman numbers and last updates
In this PR:

  • IT: Removed automatic secondary stress (#214).
  • IT: Unstressed final syllable is diminished.
  • IT: Preserve unstressed monosyllable words and allow secondary stress in multisyllable words indicated as unstressed in it_list (#214).
  • IT: Reading Roman numerals as ordinals (#215).
  • IT: Last improvements tested in January 2017.


Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Issue Created by rhdunn:
#16 Support emoticons and emoji

There are 3 types of emoticons/emoji that can be supported:

  1. special punctuation/symbol sequences like :);
  2. Unicode characters like 😃 (smiling face with open mouth);
  3. using emoji shortcodes like :smile:.

Ideally, the punctuation sequences and emoji shortcodes should be mapped to the Unicode characters, and those characters specify the text of how they are pronounced (e.g. "smiling face"). The Unicode character support should be possible, but I suspect the others would need modifications to espeak's text analysis logic to detect the emoticon/emoji sequences -- I don't currently understand this logic too well, so would need some time understanding how it works.

The other issue is sharing the punctuation sequence and emoji shortcode mappings, so the different languages don't need to duplicate those definitions. I don't currently know how possible that would be.

Pronouncing the Unicode characters can be tricky as well ("emoji" technically also cover emoticons and other characters like playing cards that are not necessarily part of the emoji block).

The Unicode characters can combine in complex ways. The flag "emoji" are an encoding of the 2-letter country code that flag represents (IT for the Italian flag, US for the American flag, GB for the British flag, etc.) -- all these permutations need supporting. Another complexity is the recent addition of skin tone modifiers.
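
The flag encoding can be sketched as follows: each flag is a pair of regional indicator symbols, which occupy U+1F1E6 (for A) through U+1F1FF (for Z). The function name flag_to_country_code is hypothetical, and the sketch assumes the text has already been decoded into code points.

```c
#include <stdint.h>

/* Regional indicator symbols run from U+1F1E6 ('A') to U+1F1FF ('Z'). */
#define REGIONAL_INDICATOR_FIRST 0x1F1E6u
#define REGIONAL_INDICATOR_LAST  0x1F1FFu

/* Hypothetical helper: converts the two code points of a flag emoji into
 * the two-letter country code they encode. Returns 1 on success, or 0 if
 * either code point is not a regional indicator symbol. */
int flag_to_country_code(uint32_t first, uint32_t second, char out[3])
{
    if (first < REGIONAL_INDICATOR_FIRST || first > REGIONAL_INDICATOR_LAST ||
        second < REGIONAL_INDICATOR_FIRST || second > REGIONAL_INDICATOR_LAST)
        return 0;

    out[0] = (char)('A' + (first - REGIONAL_INDICATOR_FIRST));
    out[1] = (char)('A' + (second - REGIONAL_INDICATOR_FIRST));
    out[2] = '\0';
    return 1;
}
```

For the Italian flag the pair is U+1F1EE U+1F1F9, which decodes to IT; the resulting code could then be looked up in a per-language list of country names.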

Resources/References:

  1. http://www.unicode.org/emoji/charts/full-emoji-list.html -- a list of emoji/emoticon characters;
  2. http://www.emoji-cheat-sheet.com/ -- a list of emoji/emoticon shortcodes;
  3. http://www.unicode.org/Public/emoji/1.0/emoji-data.txt -- information about emoji;
  4. http://www.unicode.org/reports/tr51/tr51-2.html -- Unicode emoji technical report;
  5. http://cldr.unicode.org/ -- Unicode Common Locale Data Repository (includes TTS name annotations for many emoji in several languages, including Italian);
  6. http://emojipedia.org/ -- a catalogue of emoji;
  7. http://www.unicode.org/Public/8.0.0/ucd/UnicodeData.txt (1.5Mb) -- includes the names of all the Unicode characters (including emoji);
  8. http://www.unicode.org/Public/8.0.0/charts/CodeCharts.pdf (98Mb) -- the Unicode code charts, including the emoji, emoticon and other symbol charts;
  9. https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2 -- the 2-letter country codes for the flag emoji.


[espeak-ng:master] New Comment on Issue #16 Support emoticons and emoji symbols (Zsye).
By rhdunn:

I am restricting this to just support reading the Zsye characters, instead of also supporting their ASCII equivalents. This still covers the combined emoji characters (https://en.wikipedia.org/wiki/Emoji), e.g.:

  1. flags;
  2. skin colours (Fitzpatrick skin tones);
  3. joined emoji characters (e.g. man+woman+girl = family).
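
A minimal C sketch of how the skin tone modifiers and joined emoji could be recognised in a sequence of code points. It assumes the modifiers occupy U+1F3FB to U+1F3FF and that joined emoji are linked with the zero width joiner U+200D; the function names are hypothetical and not part of espeak-ng.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ZERO_WIDTH_JOINER 0x200Du

/* Emoji skin tone modifiers: U+1F3FB .. U+1F3FF. */
bool is_skin_tone_modifier(uint32_t cp)
{
    return cp >= 0x1F3FBu && cp <= 0x1F3FFu;
}

/* Hypothetical helper: count the emoji that are joined by zero width
 * joiners in a sequence of code points. Skin tone modifiers are treated
 * as part of the emoji they follow and are not counted themselves. */
size_t count_joined_emoji(const uint32_t *cps, size_t len)
{
    size_t count = 0;
    bool expect_base = true; /* true at the start and after each joiner */
    for (size_t i = 0; i < len; i++) {
        if (cps[i] == ZERO_WIDTH_JOINER) {
            expect_base = true;
        } else if (expect_base && !is_skin_tone_modifier(cps[i])) {
            count++;
            expect_base = false;
        }
    }
    return count;
}
```

For the family example, the sequence man (U+1F468), joiner, woman (U+1F469), joiner, girl (U+1F467) is counted as three joined emoji.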


[espeak-ng:master] reported: Support emoticons and emoji #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Issue #16 Support emoticons and emoji
By rhdunn:

Issue #216 is relevant here, and should be the preferred solution in the long term. That is, the emoji would be defined in en-Zsye, de-Zsye, etc. language dictionaries that, if present, would be used to speak the emoji characters. The same applies to symbols (Zsym) and mathematical notation (Zmth), as well as reading things like Greek characters (Grek) in English.

The more complex cases are for things like the shrug character (¯\_(ツ)_/¯) that mix punctuation characters with Japanese characters (ツ).


Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] Label added to issue #216 Make it easy to combine languages in different scripts. by rhdunn.


[espeak-ng:master] New Issue Created by rhdunn:
#216 Make it easy to combine languages in different scripts.

The Problem

Languages written in non-Latin scripts fall back to English for any Latin text. For any other unrecognised script, espeak currently speaks "Script name letter" for each character in that script. The current Persian voices have variants for falling back to British or American English for Latin text. Some languages like Japanese and Sindhi can be written in multiple scripts, and can take different forms (e.g. the different styles of Romaji for writing Japanese in Latin characters).

If using MBROLA voices, or other voices specific to some languages, the user may want to switch between those when switching between languages for better intelligibility of those languages.

The Solution

Language scripts are specified using the 4-letter ISO 15924 codes such as Grek for Greek. BCP 47 supports using these in language names, e.g. sd-Deva for Sindhi in the Devanagari script or ja-Hrkt for Japanese in the Hiragana and Katakana syllabaries. Following BCP 47 convention, the script name should not be used when it is the primary script for the language (e.g. using es instead of es-Latn).
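
To make the language-Script naming concrete, here is a minimal C sketch that splits a tag such as sd-Deva into its language and script parts. The helper split_language_script is hypothetical (not an espeak-ng API), and it only handles the simple `language` and `language-Script` forms rather than the full BCP 47 grammar.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: split "sd-Deva" into lang = "sd", script = "Deva".
 * `script` is left empty when the tag has no 4-letter second subtag,
 * e.g. "es" or "en-GB". Returns false if the language part is not
 * between 2 and 8 characters long. */
bool split_language_script(const char *tag, char lang[9], char script[5])
{
    const char *dash = strchr(tag, '-');
    size_t lang_len = dash ? (size_t)(dash - tag) : strlen(tag);
    if (lang_len < 2 || lang_len > 8)
        return false;
    memcpy(lang, tag, lang_len);
    lang[lang_len] = '\0';
    script[0] = '\0';
    if (dash == NULL)
        return true;

    const char *sub = dash + 1;
    size_t sub_len = strcspn(sub, "-");
    if (sub_len == 4) {
        /* ISO 15924 codes are exactly four ASCII letters. */
        bool letters = true;
        for (size_t i = 0; i < 4; i++)
            if (!isalpha((unsigned char)sub[i]))
                letters = false;
        if (letters) {
            memcpy(script, sub, 4);
            script[4] = '\0';
        }
    }
    return true;
}
```

An empty script would then mean "use the primary script of the language", matching the convention of writing es rather than es-Latn.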

Language definition files (currently voice files, but see issue #19) should specify the script they support using a script ISO_15924 line such as script Latn. Languages should list both the language code and the language code with the script as supported languages, for example specifying both language sd and language sd-Arab. The language file containing the default script should have a higher priority than the others, just as accents have a lower priority than the base language.

Each language dictionary should be restricted to a single script. It should be possible to create processing chains (e.g. ja-Hant processes Traditional Chinese Han characters into Hiragana which ja-Hrkt then pronounces).

The command line should support a comma/semicolon separated list of Script=language, e.g. Arab=sd,Deva=sd,Latn=en-GB-scotish to use Sindhi for Arabic and Devanagari characters and Scottish English for Latin characters. It should also support using Script=language/voice, e.g. Latn=en-GB/mb-de1 for using the MBROLA de1 German voice to read Latin characters in British English.
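
The list syntax above could be parsed along the following lines. This is only a sketch under stated assumptions: the script_voice struct, the field sizes and the parse_script_voices function are all hypothetical and are not existing espeak-ng code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one Script=language[/voice] entry. */
typedef struct {
    char script[5];    /* ISO 15924 code, e.g. "Latn" */
    char language[32]; /* e.g. "en-GB" */
    char voice[32];    /* e.g. "mb-de1"; empty if not given */
} script_voice;

/* Copy `len` bytes into `dst`; fails on empty or oversized fields. */
static bool copy_field(char *dst, size_t dst_size, const char *src, size_t len)
{
    if (len == 0 || len >= dst_size)
        return false;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return true;
}

/* Parse a comma/semicolon separated list into `out` (capacity `max`).
 * Returns the number of entries parsed, or -1 on a malformed entry. */
int parse_script_voices(const char *spec, script_voice *out, int max)
{
    int count = 0;
    while (*spec != '\0') {
        size_t entry_len = strcspn(spec, ",;");
        const char *eq = memchr(spec, '=', entry_len);
        if (eq == NULL || count >= max)
            return -1;

        const char *lang = eq + 1;
        size_t lang_len = entry_len - (size_t)(lang - spec);
        const char *slash = memchr(lang, '/', lang_len);
        script_voice *sv = &out[count];
        sv->voice[0] = '\0';
        if (slash != NULL) {
            /* Everything after the '/' is the voice name. */
            size_t voice_len = lang_len - (size_t)(slash + 1 - lang);
            if (!copy_field(sv->voice, sizeof sv->voice, slash + 1, voice_len))
                return -1;
            lang_len = (size_t)(slash - lang);
        }
        if (!copy_field(sv->script, sizeof sv->script, spec, (size_t)(eq - spec)) ||
            !copy_field(sv->language, sizeof sv->language, lang, lang_len))
            return -1;

        count++;
        spec += entry_len;
        if (*spec != '\0')
            spec++; /* skip the ',' or ';' separator */
    }
    return count;
}
```

With the input `Arab=sd;Latn=en-GB/mb-de1` this yields two entries: Arab mapped to sd with no voice, and Latn mapped to en-GB with the mb-de1 voice.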

The existing API should support that syntax in addition to having a new API method espeak_ng_STATUS espeak_ng_SetVoiceForScript(const char *script, const char *language, const char *script); where script can be NULL to use the default script of a language.


[espeak-ng:master] Label added to issue #216 Make it easy to combine languages in different scripts. by rhdunn.


[espeak-ng:master] reported: Support for Indic languages #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Pull Request #213 Support for Indic languages
By rhdunn:

No problem. For small things like that, I can always correct those issues after merging.


[espeak-ng:master] reported: Support for Indic languages #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] New Comment on Pull Request #213 Support for Indic languages
By vrdhn:

Thanks for merging, and sorry for forgetting to add two languages to the Makefile and languages.md.


eSpeak NG 1.49.1 seems not to work even with vc 2015 redistributables installed

Simon Eigeldinger <simon.eigeldinger@...>
 

Hi all,

I just tried again to install eSpeak NG 1.49.1 on a 64-bit Windows 10 machine,
but it doesn't seem to work.
On the command line I get the error "access denied",
even though I have everything installed.
I wonder if someone has the same problem?

greetings,
simon




--
Simon Eigeldinger
Follow me on Twitter: http://www.twitter.com/domasofan/
E-Mail: simon.eigeldinger@vol.at
MSN: simon_eigeldinger@hotmail.com
ICQ: 121823966
Jabber: domasofan@andrelouis.com



Github push to espeak-ng:espeak-ng #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

2 New Commits:

[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
f320d349f07f: Build the kok and sd languages in the dictionary target.

Modified: Makefile.am


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
2dac2f2c49f8: Update the documentation for the new and updated languages.

Modified: CHANGELOG.md
Modified: README.md
Modified: docs/languages.md


Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

8 New Commits:

[espeak-ng:master] By Alberto Pettarin <alberto@...>:
88a588e635dd: Updated emscripten to 1.49.1. Fixed voice default and CSS in demo.html. Updated copyright strings to 2017.

Modified: .gitignore
Modified: emscripten/Makefile
Modified: emscripten/README.md
Modified: emscripten/demo.html
Modified: emscripten/espeakng_glue.cpp
Modified: emscripten/espeakng_glue.idl
Modified: emscripten/js/demo.js
Modified: emscripten/js/espeakng.js
Modified: emscripten/post.js


[espeak-ng:master] By Vardhan <vardhanvarma@...>:
49ca474b653c: Updating Gujarati ( inc/gu )

Modified: dictsource/gu_list
Modified: dictsource/gu_rules


[espeak-ng:master] By Vardhan <vardhanvarma@...>:
4fb391270df0: Updating Hindi ( inc/hi )

Modified: dictsource/hi_list
Modified: dictsource/hi_rules
Modified: phsource/ph_hindi


[espeak-ng:master] By Vardhan <vardhanvarma@...>:
639bd9f55846: Adding Konkani ( inc/kok )

Added: dictsource/kok_list
Added: dictsource/kok_rules
Added: espeak-ng-data/voices/inc/kok
Added: phsource/ph_konkani
Modified: Makefile.am
Modified: phsource/phonemes


[espeak-ng:master] By Vardhan <vardhanvarma@...>:
43e37b2c7261: Adding Manipuri ( sit/mni ) support

Added: dictsource/mni_rules
Added: espeak-ng-data/voices/sit/mni
Modified: Makefile.am
Modified: dictsource/mni_list
Modified: docs/languages.md


[espeak-ng:master] By Vardhan <vardhanvarma@...>:
ecea83d6a5f5: Updateing Oriya (inc/or)

Modified: dictsource/or_list
Modified: dictsource/or_rules


[espeak-ng:master] By Vardhan <vardhanvarma@...>:
b44995dba2c4: Adding Sindhi - Arabic (inc/sd )

Added: dictsource/sd_list
Added: dictsource/sd_rules
Added: espeak-ng-data/voices/inc/sd
Added: phsource/ph_sindhi
Modified: Makefile.am
Modified: phsource/phonemes


[espeak-ng:master] By Reece H. Dunn <msclrhd@...>:
7c57bdb357c6: Merge remote-tracking branch 'pettarin/master'

Modified: .gitignore
Modified: emscripten/Makefile
Modified: emscripten/README.md
Modified: emscripten/demo.html
Modified: emscripten/espeakng_glue.cpp
Modified: emscripten/espeakng_glue.idl
Modified: emscripten/js/demo.js
Modified: emscripten/js/espeakng.js
Modified: emscripten/post.js


[espeak-ng/espeak-ng] Pull request closed by rhdunn:

#211 Fixed voice default in demo.html and updated to 1.49.1
Also:

  1. added ``a.out`` and ``a.out.js`` (generated by emscripten) to the main ``.gitignore``;
  2. fixed the default option in ``emscripten/demo.html``, now being ``english``;
  3. fixed CSS that made the slider bars look ugly in Firefox;
  4. updated copyright strings in emscripten files to 2017.


[espeak-ng/espeak-ng] Pull request closed by rhdunn:

#213 Support for Indic languages
These are the languages updated/developed by the informal NVDA India group on the older espeak, and ported to the current espeak-ng. There are the following 6 checkins in this PR:

  1. b44995db Adding Sindhi - Arabic (inc/sd )
  2. ecea83d6 Updateing Oriya (inc/or)
  3. 43e37b2c Adding Manipuri ( sit/mni ) support
  4. 639bd9f5 Adding Konkani ( inc/kok )
  5. 4fb39127 Updating Hindi ( inc/hi )
  6. 49ca474b Updating Gujarati ( inc/gu )

Thanks, Vardhan


[espeak-ng:master] New Comment on Pull Request #211 Fixed voice default in demo.html and updated to 1.49.1
By rhdunn:

Thanks for the PR.


[espeak-ng:master] New Comment on Pull Request #213 Support for Indic languages
By rhdunn:

Thanks for the PR.


Updates to Github #github

espeak-ng@groups.io Integration <espeak-ng@...>
 

[espeak-ng:master] Label added to issue #214 Secondary stress in Italian language. by Christianlm.


[espeak-ng:master] Label added to issue #215 Roman numbers as ordinal numbers in italian language. by Christianlm.


[espeak-ng:master] Label added to issue #215 Roman numbers as ordinal numbers in italian language. by Christianlm.


[espeak-ng:master] New Issue Created by Christianlm:
#215 Roman numbers as ordinal numbers in italian language.

In Italian, Roman numerals are usually used to indicate ordinal numbers.

Expected behaviour:

  1. Treat a sequence as a Roman numeral only if it is written in upper case.
  2. Pronounce Roman numerals as ordinals.
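
The upper-case-only rule requested above could be sketched in C as follows. The function roman_to_int is a hypothetical illustration and not the existing espeak-ng number handling; it rejects lower-case input but does not check that the numeral is in canonical form.

```c
#include <stddef.h>

/* Value of a single upper-case Roman digit, or 0 for anything else
 * (including lower-case letters and the terminating '\0'). */
static int roman_digit(char c)
{
    switch (c) {
    case 'I': return 1;
    case 'V': return 5;
    case 'X': return 10;
    case 'L': return 50;
    case 'C': return 100;
    case 'D': return 500;
    case 'M': return 1000;
    default:  return 0;
    }
}

/* Hypothetical helper: returns the value of an upper-case Roman numeral,
 * or 0 if the string is empty or contains any other character.
 * Non-canonical forms such as "IIII" are accepted. */
int roman_to_int(const char *s)
{
    int total = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        int value = roman_digit(s[i]);
        if (value == 0)
            return 0;
        /* A smaller digit before a larger one is subtracted (IV, IX). */
        if (value < roman_digit(s[i + 1]))
            total -= value;
        else
            total += value;
    }
    return total;
}
```

A non-zero result would then be passed to the language's ordinal number rules, while a zero result leaves the word to be read as ordinary text.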


[espeak-ng:master] Label removed from issue #215 Roman numbers as ordinal numbers in italian language. by Christianlm.
