Arguments for disunification (was: Re: [QuikScript] Welcome)
Nathan Galt
On Nov 11, 2019, at 4:23 PM, John Cowan <cowan@...> wrote: Sweet, thanks! I’m new to the language-tagging part of the Internet — where can I watch the progress of this request? |
|
John Cowan
On Mon, Nov 11, 2019 at 6:52 PM Nathan Galt <mailinglists@...> wrote:
This seems like the proper mechanism to selectively enable proper-Quikscript letters in fonts that have both Shavian-style and Quikscript-style glyphs for the same code point. At the very least, the page’s authors seem to be open to adding not-quite-languages: they’ve already added ones for both Americanist and IPA phonetic transcriptions.

For the record, Michael is Benevolent Dictator of both the IANA language tagging group, whose page you were looking at, and the ISO 15924 Registration Authority, which is responsible for the 4-letter script codes. So he da man.

I have just filed a formal request to add Quikscript as a script with the name "Shaq". This does not refer to Mr. O'Neal, but codes Quikscript as a script variant of Shavian. The ISO standard defines a script variant as "a particular form of one script which is so distinctive a rendering as to almost be considered a unique script in itself." Existing examples are the Gaelic and Fraktur variants of Latin, the Old Church Slavonic variant of Cyrillic, and the Nastaliq style of Arabic that is used for Persian, Urdu, and other Eastern languages.

John Cowan http://vrici.lojban.org/~cowan cowan@...
There is / One art / No more / No less
To do / All things / With art- / -Lessness
--Piet Hein |
|
Nathan Galt
On Nov 10, 2019, at 7:20 AM, Michael Everson <everson@...> wrote:
The one up top on the left (𐑳) with no Quikscript counterpart looks like a Shavian letter not used in Quikscript. U+10478 through U+1047F look like pre-ligated Shavian letters that have no real Quikscript equivalent. Incidentally, I oppose encoding pre-ligated Quikscript letters for the same reasons why encoding the “fi” ligature (U+FB01) was a bad idea. This sort of thing should be handled by fonts, not typists.

Maybe “flings”? U+2E00 is a long ways from U+104xx. At any rate, putting them in the Supplemental Punctuation block seems like a good idea to me.

• In 2013, Michael Everson has a second proposal. This proposal shoves the angled parentheses into the Miscellaneous Punctuation block and uses only one block of 16 (with room to spare) for the Quikscript letters. You can view it at https://www.quikscript.net/proposals/2013.pdf.
“Shoves”? Please.

I agree that mixed Shavian/Quikscript documents are likely to stay rare except for technical discussions like this one.

• The 2013 proposal was controversial. Michael Everson seemed to think that reducing the block count would help it get accepted, especially since the Shavian proposal just squeaked in. I and other people thought that this was too unified and would make it difficult, if not impossible, to be able to develop good fonts that can display both Quikscript and Shavian at the same time.
There is in my view no chance that the UTC will accept a full disunification. Quikscript is a set of Shavian extensions, and a Quikscript font is essentially a handwriting font when compared to a more typographic Shavian font. Bi-scriptal documents written in both Shavian and Quikscript, or mixing them, would seem to be really rather rare. In fact it should be considered whether Quikscript could be considered an italic version of Shavian. In such a case it might still require extensions. 
(This is probably not a very good idea.)

I generally figure that italics have a 1:1 letter correspondence with an upright. However, this assumption doesn’t hold for Shavian:Quikscript. Kingsley Read…
- merged 𐑩/𐑳 (/ə/ and /ʌ/) into ·Utter (/ə/)
- split 𐑢 (/w/) into ·Way and ·Why (/w/ and /ʍ/)
- added letters for /ɬ/ and /x/
- added letters for /ks/ and /gz/ (·Ax and ·Exam, to be used sometimes)

Agreed.

• Another person here whose name I forget said something like “Did the Consortium not learn anything in the Greek/Coptic disunification mess?” While an inside-baseball reference, I later learned that this captured the problem (from my view) nicely. It’s a massive pain in the rear to users of somewhat-different scripts to have to share characters, especially for the minority users.
In fairness there is an ocean of both Greek and Coptic data spanning many centuries while there is barely a puddle of Shavian data and an even smaller puddle of Quikscript data.

Agreed.

While I’m not sure we’re a numerical minority compared to Shavianists, we’re Johnny-come-latelies and I expect most every-script font (Noto Sans, Segoe UI Historic, etc.), to cater to Shavian tastes in letterforms first, leaving Quikscript users unable to have letterforms they use. https://www.unicode.org/L2/L2002/02205-n2444-coptic.pdf goes into some of the hassles Coptic readers and fontmakers have; I’ve made similar arguments arguing that I shouldn’t have to read Shavian's ·𐑱 as Quikscript’s ·Eight.
Nothing written in Shavian or Quikscript has the same status as anything written in Coptic or Greek. Sorry but you have to be a realist.

“If the shapes were faithful” is the main point of contention here. 
My biggest objection to the 2013 proposal is that omnifonts, which will be the fonts used for most display of most Quikscript text, will have confusing letters.

• Between 2013 and now, I’ve developed better arguments in favor of maximally-disunified encodings, like “way more text sent/received these days isn’t sent with fonts under the sender’s or receiver’s control”. (I’ll go into this more for a later worm-can opening.)
Extensions to Shavian would require Shaw-style glyphs for the new Quikscript characters. If the shapes were faithful, you’d get used to reading a typographically rectified Quikscript if it turned up on your phone.

Specifically:
- 𐑱/𐑲 are difficult to properly map, while reading, to ·Eight/·I (although your explanation certainly helped)
- similarly, 𐑬/𐑶 are difficult to properly map to ·Out/·Oy without screwing this up

I don’t have a major objection to “doesn’t look quite right”; I do to “easily confused with a similar letter”. While I very much appreciate your assistance in toughening up our arguments in favor of standardization, do you realize you’ve entirely ignored my objection and substituted a straw-man one in its place?

If (e-)books, HTML+CSS pages, and Word documents were the only ways in which Quikscript were on the Internet, this wouldn’t be an issue. However,

Agreed, but not relevant. From the page:

Now then. What does it take to make one font that can display Shavian properly and unligated Quikscript Junior (at the very least) properly? I can think of two main ways that will do this:
Neither Shavian nor Quikscript are languages. They are scripts.

Therefore, in several respects, language system tags do not correspond in a one-to-one manner with languages. Even so, many registered tags are intended to represent typographic conventions for a particular language. 
For cases in which a correlation exists between a tag and one or more languages, the language identities are documented here by reference to ISO 639-2 and ISO 639-3.

This seems like the proper mechanism to selectively enable proper-Quikscript letters in fonts that have both Shavian-style and Quikscript-style glyphs for the same code point. At the very least, the page’s authors seem to be open to adding not-quite-languages: they’ve already added ones for both Americanist and IPA phonetic transcriptions.

The 2013 proposal disunifies nine, not even a whole row. If I can pick out sixteen (or fewer!) of the most problematic characters, does that materially reduce the chances of convincing the UTC?

• Unify no letters. Not even ·𐑩/·utter (Shavian vowels tend to be written to be skinny for whatever reason). Update omnifonts to add Quikscript glyphs.
If you can’t convince me of this there’s little chance I could convince the UTC. |
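
For readers who want to look at the code points named in this message, here is a minimal Python sketch using only the standard `unicodedata` module. It lists the character names for U+10478 through U+1047F and shows that U+FB01 is a compatibility character that NFKC normalization folds back into the two plain letters. The helper name `describe` is made up for this illustration and is not taken from either proposal; the names printed depend on the Unicode data bundled with your Python.

```python
import unicodedata


def describe(cp: int) -> str:
    """Return 'U+XXXX NAME' for a code point, or '<unassigned>' if it has no name."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp), '<unassigned>')}"


# The eight code points at the end of the Shavian block discussed above.
compound_letters = [describe(cp) for cp in range(0x10478, 0x10480)]

# U+FB01 is a compatibility ligature: NFKC turns it back into two letters.
fi_folded = unicodedata.normalize("NFKC", "\uFB01")

if __name__ == "__main__":
    for line in compound_letters:
        print(line)
    print(describe(0xFB01), "->", fi_folded)
```

The normalization step is one way to see the objection to pre-ligated letters: text containing the ligature and text containing the two separate letters only compare equal after folding.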
|
Nathan Galt
On Nov 11, 2019, at 9:30 AM, Michael Everson <everson@...> wrote:
This is fine, à la LATIN LETTER C. Also fine: SHAVIAN LETTER THEY and the close-enough QUIKSCRIPT LETTER VIE. No, it doesn’t. 𐑝 goes below the baseline (is Deep); ·He goes above the x-height (is Tall). Incidentally, Quikscript’s own ·They (Deep) and ·She (Tall) also only mostly differ by vertical placement (and therefore what they can ligate with). You’d have to have a particular kind of horrible, un-ligated handwriting to write something in Quikscript where someone else can’t tell whether you wrote “other” or “usher”. This is fine. I don’t see us getting these three through as extensions. Rather, what I see is LATIN LETTER C, which may be used for /dʒ/, /ð/, /k/, /s/, /ts/, /tʃ/, /ʃ/, /θ/, /ʕ/, and /ǀ/.
That’s what I see, too, modulo 𐑝/·He. |
|
Michael Everson
Yes, fine, but that wasn’t the point.
On 11 Nov 2019, at 17:47, John Cowan <cowan@...> wrote: |
|
Michael Everson
Here you go. This includes QS VEE HE AH and AWE but they are identical to THIGH VOW MIME and NUN
On 10 Nov 2019, at 18:39, John Cowan <cowan@...> wrote: |
|
John Cowan
Not to mention /c/, the voiceless palatal plosive. On Mon, Nov 11, 2019 at 12:30 PM Michael Everson <everson@...> wrote: The real problem are these four: |
|
Michael Everson
The real problem are these four:
SHAVIAN LETTER THIGH looks just like QUIKSCRIPT LETTER FEE
SHAVIAN LETTER VOW looks just like QUIKSCRIPT LETTER HE
SHAVIAN LETTER MIME looks just like QUIKSCRIPT LETTER AH
SHAVIAN LETTER NUN looks just like QUIKSCRIPT LETTER AWE

I don’t see us getting these three through as extensions. Rather, what I see is LATIN LETTER C, which may be used for /dʒ/, /ð/, /k/, /s/, /ts/, /tʃ/, /ʃ/, /θ/, /ʕ/, and /ǀ/. Discuss.

On 10 Nov 2019, at 18:39, John Cowan <cowan@...> wrote: |
|
John Cowan
While Michael expresses himself more emphatically than I would, I associate myself with all his arguments and conclusions. It would indeed be interesting to see a QS font in a "typographical" Shavian style like the Androcles font. Most scripts have a strong distinction between printing and handwriting, with the exceptions of Arabic and Indic scripts. On Sun, Nov 10, 2019 at 10:21 AM Michael Everson <everson@...> wrote: On 8 Nov 2019, at 20:23, Nathan Galt <mailinglists@...> wrote: |
|
Michael Everson
On 8 Nov 2019, at 20:23, Nathan Galt <mailinglists@...> wrote:
• In 2007, Michael Everson proposes Shavian Quikscript extensions. He notes that there are some letters that look identical or at least fairly similar in both Quikscript and Shavian and proposes a two-row set of extensions, plus angled parentheses. You can view it at https://www.quikscript.net/proposals/2007.pdf.
I do not remember offhand what the pink bits were. Something for discussion? Or Shavian items not used in Quikscript?

• In 2013, Michael Everson has a second proposal. This proposal shoves the angled parentheses into the Miscellaneous Punctuation block and uses only one block of 16 (with room to spare) for the Quikscript letters. You can view it at https://www.quikscript.net/proposals/2013.pdf.
“Shoves”? Please.

• The 2013 proposal was controversial. Michael Everson seemed to think that reducing the block count would help it get accepted, especially since the Shavian proposal just squeaked in. I and other people thought that this was too unified and would make it difficult, if not impossible, to be able to develop good fonts that can display both Quikscript and Shavian at the same time.
There is in my view no chance that the UTC will accept a full disunification. Quikscript is a set of Shavian extensions, and a Quikscript font is essentially a handwriting font when compared to a more typographic Shavian font. Bi-scriptal documents written in both Shavian and Quikscript, or mixing them, would seem to be really rather rare. In fact it should be considered whether Quikscript could be considered an italic version of Shavian. In such a case it might still require extensions. (This is probably not a very good idea.)

• Another person here whose name I forget said something like “Did the Consortium not learn anything in the Greek/Coptic disunification mess?” While an inside-baseball reference, I later learned that this captured the problem (from my view) nicely. It’s a massive pain in the rear to users of somewhat-different scripts to have to share characters, especially for the minority users.
In fairness there is an ocean of both Greek and Coptic data spanning many centuries while there is barely a puddle of Shavian data and an even smaller puddle of Quikscript data.

While I’m not sure we’re a numerical minority compared to Shavianists, we’re Johnny-come-latelies and I expect most every-script font (Noto Sans, Segoe UI Historic, etc.), to cater to Shavian tastes in letterforms first, leaving Quikscript users unable to have letterforms they use. https://www.unicode.org/L2/L2002/02205-n2444-coptic.pdf goes into some of the hassles Coptic readers and fontmakers have; I’ve made similar arguments arguing that I shouldn’t have to read Shavian's ·𐑱 as Quikscript’s ·Eight.
Nothing written in Shavian or Quikscript has the same status as anything written in Coptic or Greek. Sorry but you have to be a realist.

• Between 2013 and now, I’ve developed better arguments in favor of maximally-disunified encodings, like “way more text sent/received these days isn’t sent with fonts under the sender’s or receiver’s control”. (I’ll go into this more for a later worm-can opening.)
Extensions to Shavian would require Shaw-style glyphs for the new Quikscript characters. If the shapes were faithful, you’d get used to reading a typographically rectified Quikscript if it turned up on your phone. I have a number of Shavian fonts and styles (including italic) in my Shavian Alice. None of them are what were found in Androcles.

Now then. What does it take to make one font that can display Shavian properly and unligated Quikscript Junior (at the very least) properly? I can think of two main ways that will do this:
Neither Shavian nor Quikscript are languages. They are scripts. Or they are a single script, which already has an ISO 15924 code “Shaw”.

• Unify no letters. Not even ·𐑩/·utter (Shavian vowels tend to be written to be skinny for whatever reason). Update omnifonts to add Quikscript glyphs.
If you can’t convince me of this there’s little chance I could convince the UTC.

Michael |
|
Nathan Galt
I have a handful of pre-publication standardization-advocacy web pages that use glyphs from both scripts. While I suspect the Consortium will be underwhelmed by this even if I do end up publishing them, I’d argue that the document doesn’t matter as much anymore.
What do I mean by this?

I can imagine why the Consortium cares about documents. For most rarely-used scripts, there’s a large body of existing text that would be better off as text instead of images or PUA codepoints. This text tends to be collected in single documents, whether Word or OpenOffice or HTML. A document’s author has near-complete control over font selection†, and can be expected to specify a good-looking font that supports the text in it. In this view, multi-script documents are rare, modulo $SCRIPT+Latin combinations (think of genus-species names in a Chinese-language biology textbook).

A lot of things worth typing and reading aren’t in documents, though. They’re in plain-text services like SMS messages, microblog posts, and chat rooms. These services tend to be run in a language-agnostic fashion, oftentimes with no extra font support beyond what the computer/smartphone vendor provides. While there’s usually at least one good font included with the operating system (Apple Symbols, Segoe UI Historic, Noto Sans), it’s a rare communication platform that allows its users to use obscure user-chosen fonts.

Because of this, the relevant question isn’t “How frequently are these scripts mixed in the same document?”, but rather “How frequently will these two scripts be written by different people chatting over the same network/program?”. In the case of Discord, a text-and-voice-chat service, the answer is “as soon as people start typing in Quikscript on Discord, even if none of those users interact with any (preexisting) Shavianists over it”.

Now then. What does it take to make one font that can display Shavian properly and unligated Quikscript Junior (at the very least) properly? I can think of two main ways that will do this:
(2) seems like the better, least-likely-to-fail-partway option at the low, low price of two additional SMP Unicode blocks. It makes good Quikscript display much less dependent on multiple large organizations’ font-renderer teams and merely puts it on their in-house font-making teams (which, at this point, can be counted on to support just about every script in Unicode).

† Specifying fallback fonts is difficult, if not impossible, in most word-processing software I’ve worked with. HTML+CSS is much better at this.
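
To make the plain-text point concrete, here is a minimal Python sketch of what a chat service can actually know about an incoming message: only its code points. The range U+10450..U+1047F is the existing Shavian block; the function names `uses_shavian_block` and `is_supplementary` are invented for this illustration. Under any scheme that reuses Shavian code points for Quikscript letters, a check like this cannot tell the two scripts apart, so the choice of glyphs falls to whatever font the platform ships.

```python
# The existing Shavian block in the Supplementary Multilingual Plane.
SHAVIAN_BLOCK = range(0x10450, 0x10480)  # U+10450..U+1047F


def uses_shavian_block(text: str) -> bool:
    """True if any character of text lies in the Shavian block."""
    return any(ord(ch) in SHAVIAN_BLOCK for ch in text)


def is_supplementary(ch: str) -> bool:
    """True if ch lies outside the Basic Multilingual Plane (above U+FFFF)."""
    return ord(ch) > 0xFFFF


if __name__ == "__main__":
    # Two of the letters discussed in this thread, written as escapes.
    message = "\U00010471\U00010472"
    print(uses_shavian_block(message))
    print(all(is_supplementary(ch) for ch in message))
    print(uses_shavian_block("plain Latin"))
```

With separate blocks, the same kind of range check would be enough for a renderer or font-fallback list to distinguish the scripts without any markup or language tagging.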
|
|