Diacritics don't need to be used the way they are in French, i.e. to preserve the original spelling. On the contrary, most languages use them to make their spelling more phonetic.

Nor is there a need for some insane kind of diacritics to handle English. Its phonemic inventory is considerable, yes, but it can be easily organized, especially when you keep in mind that many distinct sounds are allophones (and thus don't need a separate representation). A good example is the glottal stop for "t" in words like "cat": it doesn't need its own character since it's predictable.

Let's take General American as an example. First you have the consonant phonemes:

Nasals: m,n,ŋ

Plosives: p,b,t,d,k,g

Affricates: t͡ʃ, d͡ʒ

Fricatives: f,v,θ,ð,s,z,ʃ,ʒ,h

Approximants: l,r,j,w

Right away we can see that most are actually covered by the basic Latin alphabet. Affricates can be reasonably represented as plosive-fricative pairs since English doesn't have a contrast between tʃ/t͡ʃ or between dʒ/d͡ʒ; then we can repurpose Jj for ʒ. For ŋ one can adopt a phonemic analysis which treats it as an allophone of the sequence ng that only occurs at the end of the word (with g deleted in this context) and as an allophone of n before velars.

Thus, distinct characters are only strictly needed for θ,ð,ʃ, and perhaps ʒ. All of these except for θ actually exist as extended Latin characters in their own right, with proper upper/lowercase pairs, so we could just use them as such: Ðð Ʃʃ Ʒʒ. And for θ there's the historical English thorn: Þþ. The same goes for Ŋŋ if we decide that we do want a distinct letter for it.

If one wants to hew closer to the basic Latin look, we could use diacritics. The caron is the obvious candidate for Šš=ʃ and Žž=ʒ, and we could use e.g. a crossbar for the other two: Đđ=ð and Ŧŧ=θ. If we're doing that, we might also take Čč for t͡ʃ. And if we really want a distinct letter for ŋ, we could use Ňň.

You can also consider which basic Latin letters are redundant in English when using phonemic spelling. These would be c (can always be replaced with k or s), q (can always be replaced with k), and x (can always be replaced with ks or gz). These can then be repurposed - e.g. if we go with two-letter affricates and then take c=ʃ x=ð q=θ we don't need any diacritics at all!
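To make the no-diacritics variant concrete, here is a rough Python sketch of the consonant table. The names CONSONANT_SPELLING and spell_consonants are made up for illustration. Spelling /j/ as y is an assumption (it follows from j being repurposed for ʒ), and the ŋ rule is simplified to "n before k or g, ng elsewhere".

```python
# Sketch of the no-diacritics consonant scheme: c=ʃ, x=ð, q=θ, j=ʒ,
# affricates written as plosive + fricative. All names are illustrative.
CONSONANT_SPELLING = {
    "m": "m", "n": "n",
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    "t\u0361ʃ": "tc", "d\u0361ʒ": "dj",      # t͡ʃ and d͡ʒ as two-letter pairs
    "f": "f", "v": "v", "θ": "q", "ð": "x",
    "s": "s", "z": "z", "ʃ": "c", "ʒ": "j", "h": "h",
    "l": "l", "r": "r", "w": "w",
    "j": "y",                                 # assumption: /j/ is spelled y
}


def spell_consonants(phonemes):
    """Respell a list of IPA phoneme tokens; unknown tokens pass through."""
    out = []
    for i, ph in enumerate(phonemes):
        if ph == "ŋ":
            # Simplified rule: n before a velar plosive, ng otherwise.
            nxt = phonemes[i + 1] if i + 1 < len(phonemes) else None
            out.append("n" if nxt in ("k", "g") else "ng")
        else:
            out.append(CONSONANT_SPELLING.get(ph, ph))
    return "".join(out)


print(spell_consonants(["θ", "ɪ", "ŋ", "k"]))   # "think" -> qɪnk
print(spell_consonants(["t\u0361ʃ", "ɪ", "p"]))  # "chip"  -> tcɪp
```

Vowels are passed through untouched here, so the output still contains IPA vowel symbols.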

Moving on to vowels, in GA we have:

Monophthongs: ʌ,æ,ɑ,ɛ,ə,i,ɪ,o,u,ʊ

Diphthongs: aɪ,eɪ,ɔɪ,aʊ,oʊ

R-colored: ɑ˞,ɚ,ɔ˞.

Diphthongs can be reasonably represented using the combination of vowel + y/w for the glide, thus: ay,ey,oy,aw,ow.

For monophthongs, firstly, ʌ can be treated as a stressed allophone of ə. If we do so, then all vowels (save for o which stands by itself) form natural pairs which can be expressed using diacritics: Aa=ɑ, Ää=æ, Ee=ɛ, Ëë=ə, Ii=i, Ïï=ɪ, Oo=o, Uu=u, Üü=ʊ.

For R-colored vowels, we can just adopt the phonemic analysis that treats them as vowel+r pairs: ar, er, or.
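The vowel side can be sketched the same way. This is only an illustration of the mapping described above: VOWEL_SPELLING and spell_vowels are hypothetical names, and the r-colored spellings (ar, er, or) are copied as given rather than derived from the monophthong table.

```python
# Sketch of the vowel scheme: plain letter vs. diaeresis partner,
# glides as y/w, r-colored vowels as vowel + r. Names are illustrative.
VOWEL_SPELLING = {
    # Monophthongs (ʌ is folded into ə as its stressed allophone)
    "ɑ": "a", "æ": "ä",
    "ɛ": "e", "ə": "ë", "ʌ": "ë",
    "i": "i", "ɪ": "ï",
    "o": "o",
    "u": "u", "ʊ": "ü",
    # Diphthongs: vowel + y/w for the glide
    "aɪ": "ay", "eɪ": "ey", "ɔɪ": "oy", "aʊ": "aw", "oʊ": "ow",
    # R-colored vowels: vowel + r (ɑ˞ and ɔ˞ use U+02DE for the hook)
    "ɑ\u02de": "ar", "ɚ": "er", "ɔ\u02de": "or",
}


def spell_vowels(phonemes):
    """Respell a list of IPA phoneme tokens; non-vowel tokens pass through."""
    return "".join(VOWEL_SPELLING.get(ph, ph) for ph in phonemes)


print(spell_vowels(["b", "aɪ"]))        # "buy"
print(spell_vowels(["k", "ɑ\u02de"]))   # "car"
print(spell_vowels(["k", "æ", "t"]))    # "cat"
```

Consonants pass through unchanged, so this only shows the vowel half of the spelling.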

To sum it all up, we could have a decent phonemic American English spelling using just 4 extra vowel letters with diacritics: ä,ë,ï,ü - if we're okay with repurposing existing redundant letters and spelling affricates as two-letter sequences.

And worst case - if we don't repurpose letters, and with each affricate as well as ŋ getting its own letter - we need 10: ä,č,đ,ë,ï,ň,š,ŧ,ž,ü.
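As a sanity check on the two counts (4 and 10), here is a small sketch that collects the non-ASCII letters each variant needs. The table and function names are hypothetical, and it assumes that d͡ʒ keeps plain j in the worst-case variant, which is why that affricate adds no new letter to the list.

```python
import unicodedata

# Letters with diacritics needed by the vowel scheme in either variant.
VOWEL_EXTRAS = {"æ": "ä", "ə": "ë", "ɪ": "ï", "ʊ": "ü"}

# Worst case: no repurposed letters, dedicated letters for ŋ and t͡ʃ.
CONSONANT_EXTRAS = {
    "θ": "ŧ", "ð": "đ", "ʃ": "š", "ʒ": "ž",
    "t\u0361ʃ": "č", "ŋ": "ň",
    "d\u0361ʒ": "j",  # assumption: plain j, so no extra letter
}


def extra_letters(*tables):
    """Return the sorted non-ASCII letters used across the given tables."""
    letters = set()
    for table in tables:
        for spelling in table.values():
            # Normalize so each accented letter is one code point.
            for ch in unicodedata.normalize("NFC", spelling):
                if not ch.isascii():
                    letters.add(ch)
    return sorted(letters)


minimal = extra_letters(VOWEL_EXTRAS)
maximal = extra_letters(VOWEL_EXTRAS, CONSONANT_EXTRAS)
print(len(minimal), " ".join(minimal))
print(len(maximal), " ".join(maximal))
```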

I don't think that's particularly excessive, not even the latter variant.





Now try to get close to a billion people around the world with already varied cultures to follow the "new" rules of their native language.

I'm well aware that any kind of English spelling reform is non-viable for backwards compatibility reasons.

But that is a different argument from saying that English can't use diacritic-based orthography because the phonemic inventory is too complex.



