Difference between revisions of "Features/Language-List-REQUEST"
Revision as of 20:25, 13 September 2017
Main Features
- Where on the site should 'Macrolanguage' be used in the language list?
- Find Friends page: we must be able to search by 'individual' or 'macrolanguage', but in the search box, results must show the 'macrolanguage' first in BOLD (with the number of 'individual' languages inside the macrolanguage), then the 'individual' languages. It must be clear that the individual languages belong to the macrolanguage. Example: for 'Arabic', there are 40 lines!
- Corrections, profile, translations pages: only individual language
- Lessons, videos, questions, quizzes : macrolanguage AND individual
- Type: all (easiest to keep them all)
- For "Language you can TEACH" and "Language you want to LEARN", there will be an explanation telling users that if they can't find a language, they can look up its alternative names in Wikipedia, find its ISO 639-3 code and type it. If they still can't find it that way, they can choose the "special" ISO 639-3 code "mis" and type the name. There are four "special" codes in that file, for example, Pinghua Chinese. The definition of usage is here: https://en.wikipedia.org/wiki/ISO_639-3#Special_codes There can be a list of known missing languages, and users can choose only from that list. There can also be a function to "suggest a language".
- When creating wiki lessons, the user's "Language you can TEACH" and "Language you want to LEARN" can be shown first. Vincent: OK
- Create a special page to allow an administrator to edit the 'new_lib' and the 'new_lib2' fields. Only create a page in /wiki/Features: a table with the following columns, 'iso', 'new_lib', 'new_lib2', and I'll use this table to update the main table. Only for the main 500 languages. Vincent: I started a new list with autonyms and alternative English names here: https://polyglotclub.com/wiki/Features/Language-List-autonym
- Icons of flags for languages will be removed. Forvo, Glosbe, HiNative, Wikipedia. Italki uses flags only to indicate location. Where there are only a handful of languages, flags are used. Tatoeba is an exception: it made up flags to ensure every language has one there. Vincent: In the first version, flags will still be there because there are too many changes to make. I will decide later if we need to remove them. Remember that 99% of users will select the most common languages. Tatoeba has more than 300 languages; if we can have flags for the main 300 or 500, we will cover maybe 99.9% of users. See the table below, it is interesting. Almost no other website/app has the full language list, that's why I want to do it. But I am doing all that for a very small minority; anyway, the majority is not always right... GrimPixel: Yes, it is seizing the initiative that matters. We will be the only stronghold for the minority. Vincent: true :) !
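The fallback described above for "Language you can TEACH" and "Language you want to LEARN" could be sketched as follows. This is only an illustration: the `KNOWN_LANGUAGES` sample and the `resolve_language` helper are hypothetical names, not part of the site's code, and a real list would be loaded from the full ISO 639-3 table.

```python
# Hypothetical sample; the real list would come from the full ISO 639-3 table.
KNOWN_LANGUAGES = {
    "bul": "Bulgarian",
    "rus": "Russian",
    "cmn": "Mandarin Chinese",
}

# ISO 639-3 special code used when a language has no code of its own.
SPECIAL_UNCODED = "mis"


def resolve_language(user_input: str) -> tuple[str, str]:
    """Return (code, display name) for what the user typed.

    Tries the ISO 639-3 code first, then the English name, and finally
    falls back to the special code "mis" with the typed name kept as-is.
    """
    text = user_input.strip()
    lowered = text.lower()
    if lowered in KNOWN_LANGUAGES:
        return lowered, KNOWN_LANGUAGES[lowered]
    for code, name in KNOWN_LANGUAGES.items():
        if name.lower() == lowered:
            return code, name
    return SPECIAL_UNCODED, text
```

Entries that fall through to "mis" could also feed the "suggest a language" function, since they show which names users typed but could not find.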
Translations
GrimPixel: For Chinese, there should be Traditional and Simplified. Also, Cyrillic and Latin for Serbian, and so on for some others.
GrimPixel: Yeah, new table for scripts. But there is a problem - Japanese uses both Kana and Chinese, Korean uses Hangul and sometimes Chinese. Also, both Traditional and Simplified Chinese belong to the Chinese script. So it's not strictly according to script systems.
GrimPixel: A professional site: http://scriptsource.org/cms/scripts/page.php It will require ISO 15924: https://en.wikipedia.org/wiki/ISO_15924 But it's still not good enough, because it is exceedingly precise and not practical. Some should be hidden, such as Hiragana, Katakana, Japanese Syllables, because there is a "Japanese" which includes them all.
GrimPixel: I think it's a difficult problem. It seems that dividing them by script is not enough, because people in different areas have different customs. There are differences between Portuguese (Brazil) and Portuguese (Portugal), English (USA) and English (UK), and so on. So it would be too much content if we divided by region, as Windows 10 does in Regions & languages - add a language.
GrimPixel: I think it will finally be according to scripts, though there are different customs in different areas within a script.
GrimPixel: I think it makes sense - Bulgarian can be Cyrillic, Russian can also be Cyrillic. In each entry, there is "Writing systems that use this script", which tells which languages use the script.
Language | ISO 639-3 | ISO 15924 Code | ISO 15924 Name | URL name
---|---|---|---|---
Bulgarian | bul | Cyrl | Cyrillic | bulgarian
Russian | rus | Cyrl | Cyrillic | russian
Mandarin Chinese | cmn | Hans | Han (Simplified variant) | mandarin-chinese-simplified
Mandarin Chinese | cmn | Hant | Han (Traditional variant) | mandarin-chinese-traditional
GrimPixel: Combined names are according to https://tech.lds.org/wiki/ISO_Language_Codes, I think Combined Names can be URL names. Vincent: Yes, for URLs we could use /zho-Hant instead of /mandarin-chinese-traditional, but I chose to use an understandable URL because it is better for SEO (http://digitalverge.net/seo/url-optimization-best-practices-to-create-seo-friendly-url-structure/). I will create the URL by combining the "English script name" (http://unicode.org/iso15924/iso15924-codes.html) + "Language name". Changing the URL https://polyglotclub.com/index/translate-russian to https://polyglotclub.com/index/translate-rus-cyrl would mean changing about 1,000,000 indexed pages!! For the Russian language there is only one script, so the URL does not need to change. GrimPixel: It's understandable. But I have found an interesting fact - people in Romania and Serbia use Latin for Bulgarian. Vincent: ok
Vincent: Ok, I see what you mean. How could I get the full table you are describing but with ISO 639-3? The best would be to have both ISO 639-3 and ISO 15924 in the same table, because I do not see any Bulgarian on this page http://scriptsource.org/cms/scripts/page.php?item_id=script_overview NOR on this page https://en.wikipedia.org/wiki/ISO_15924 I will also need to make simplified script names for URLs.
GrimPixel: You can see it after clicking on "Cyrillic". It's listed in Writing systems that use this script. Let me check if there is a ready-made sheet. Vincent: In case you don't find, it's not too much work to extract by hand on each page like here for cyrillic http://scriptsource.org/cms/scripts/page.php?item_id=script_detail&key=Cyrl.
Vincent: Oh my God! here there is the full Ethnologue list with the fields I needed: Alternate names and Is mainly used in: I can update our full list with this. http://scriptsource.org/cms/scripts/page.php?item_id=language_overview&var_first_letter=B&uid=22gb39ehc2
GrimPixel: Very well! If we hadn't found it, it would be worth $1000! XD Vincent: LOL. I'll also check if I can find autonyms somewhere.
GrimPixel: Some autonyms can be found here, relatively reliable: http://www.omniglot.com/language/names.htm Vincent: The idea would be to find all the autonyms from Ethnologue, but your list https://polyglotclub.com/wiki/Features/Language-List-autonym is already very good. GrimPixel: I don't know how many autonyms Ethnologue has, but I guess it's no more than Omniglot has.
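The URL naming discussed in this section (a readable language name, plus a simplified script name only when a language has more than one script) could look roughly like the sketch below. The `SIMPLE_SCRIPT_NAMES` mapping and both function names are assumptions made for illustration; the simplified script names themselves are still to be decided.

```python
import re

# Hypothetical simplified script names for URLs, keyed by ISO 15924 code.
SIMPLE_SCRIPT_NAMES = {
    "Hans": "simplified",
    "Hant": "traditional",
    "Cyrl": "cyrillic",
    "Latn": "latin",
}


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def url_name(language: str, script_code: str, script_count: int) -> str:
    """Build the URL name for one row of the language/script table.

    The script is appended only when the language is listed with several
    scripts, so single-script languages keep their existing URLs.
    """
    if script_count <= 1:
        return slugify(language)
    return slugify(f"{language} {SIMPLE_SCRIPT_NAMES[script_code]}")
```

With this rule, Russian and Bulgarian (one script each in the table) keep the names "russian" and "bulgarian", while the two Mandarin Chinese rows give "mandarin-chinese-simplified" and "mandarin-chinese-traditional".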
Search box
Here are examples:
GrimPixel: It's very clear. I have noticed that it's better to have a <hr/> between Andalusian Arabic and Judeo-Yemeni Arabic. Vincent: Ok, I'll add a separator line at the end of the family members.
GrimPixel: Why can't I see ads? Vincent: VIPs don't have ads: https://polyglotclub.com/trust GrimPixel: Then why can you? :^) Vincent: except me ;)
GrimPixel: I have just realized that the word "family" is a linguistic term, and it is not appropriate here. Vincent: 99.9% of our users are not linguists and the word family seemed understandable to ordinary people. "Macro language" does not mean anything to people. Do you know any other simple word? "Group"? GrimPixel: It's nearly impossible to replace "macrolanguage", because a macrolanguage is both one language and many languages. I think the word "family" or "group" should be omitted, because it changes "Arabic" from a noun to an adjective. Vincent: I'll keep 'family' for now even if it's wrong from the linguist's point of view. It will bother only 0.01% of people, whereas if I write 'macrolanguage' it will bother 99.9% of people. GrimPixel: I still think attaching no word is best. If you still want to use a word, then "cluster" is the best one, introduced in the Handbook of African Languages. "Language cluster" seems to be the only alternative to "macrolanguage". Vincent: I did not get that you were saying 'no word'; ok, I will not use any word then. Vincent: You say: "a macrolanguage is both one language and many languages", so does it mean a member can add a macrolanguage as "Language you can teach" or "Language you want to learn"? GrimPixel: A macrolanguage can be considered as a language, because its dialects are very similar in writing, and one written dialect can be understood by people of other dialects. It is used in ISO 639-2. But in ISO 639-3, a macrolanguage is considered as many languages (instead of dialects), the reason being that those languages are mutually unintelligible for their native speakers (when speaking, and in many cases even when writing). So a member shouldn't select a macrolanguage as a "Language you can teach" or "Language you want to learn". Vincent: Here we are using ISO 639-3, so a macrolanguage is not a language. For me, it's simply a group/category of languages.
GrimPixel: I think there are too many unnecessary "Arabic". So, that "Arabic" can be pinned to the top of the list while scrolling down, until it meets the separator line. And then, those "Arabic" ("Arabic family" on the image) after each item can be removed. Vincent: Ok GrimPixel: When typing an individual language, the macrolanguage it belongs to (if there is one) and a separator line should also be displayed.
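The search box behaviour agreed on in this section (macrolanguage first with its member count, then the matching individual languages, then a separator line, with no extra word such as "family") could be sketched like this. The sample data and the `search` function are hypothetical, and the row kinds ("macro", "individual", "separator") are only an assumed way to tell the page what to render in bold or as a line.

```python
# Hypothetical sample data: macrolanguage name -> member individual languages.
MACROLANGUAGES = {
    "Arabic": ["Egyptian Arabic", "Moroccan Arabic", "Standard Arabic"],
}
# Individual languages that belong to no macrolanguage in this sample.
STANDALONE = ["Aragonese"]


def search(query: str) -> list[tuple[str, str]]:
    """Return (kind, text) rows for the search box.

    Typing a macrolanguage lists all its members; typing an individual
    language still shows the macrolanguage it belongs to above it.
    """
    q = query.strip().lower()
    rows: list[tuple[str, str]] = []
    for macro, members in MACROLANGUAGES.items():
        if q in macro.lower():
            hits = list(members)
        else:
            hits = [m for m in members if q in m.lower()]
        if hits:
            # Header row, meant to be rendered in bold with the member count.
            rows.append(("macro", f"{macro} ({len(members)})"))
            rows.extend(("individual", m) for m in hits)
            rows.append(("separator", ""))
    rows.extend(("individual", s) for s in STANDALONE if q in s.lower())
    return rows
```

For example, `search("egypt")` gives the "Arabic (3)" header, then "Egyptian Arabic", then a separator row, so the user sees which macrolanguage the individual language belongs to.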
Page Name Changes
When a language is 'renamed' or 'merged', many URLs will be updated and a redirection created for each. Old URLs will be automatically redirected to the new ones (for example, in case the old page still exists in Google search results, which will be the case for several months) so the user doesn't get a 404 error page.
I'm listing here the types of pages to test when the new version is online.
Automatic update
- https://polyglotclub.com/index/translate-chinese-mandarin will be redirected to /index/translate-chinese-simplified (translate language list)
- https://polyglotclub.com/find/language-Chinese,_Mandarin will be redirected to /find/language-mandarin-chinese (main list)
- https://polyglotclub.com/language/chinese-mandarin/question will be redirected to /language/mandarin-chinese/question (main list)
- https://polyglotclub.com/language/chinese-mandarin/note (...)
- https://polyglotclub.com/language/chinese-mandarin/video
- https://polyglotclub.com/language/chinese-mandarin/post
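The automatic redirections listed above can be thought of as a table of old path prefixes and their replacements. The sketch below only illustrates that idea; the `RENAMES` table reuses the example paths from the list, while the `redirect_target` function is a hypothetical helper and not the site's actual code.

```python
from typing import Optional

# Old path prefix -> new path prefix, taken from the examples listed above.
RENAMES = [
    ("/index/translate-chinese-mandarin", "/index/translate-chinese-simplified"),
    ("/find/language-Chinese,_Mandarin", "/find/language-mandarin-chinese"),
    ("/language/chinese-mandarin/", "/language/mandarin-chinese/"),
]


def redirect_target(path: str) -> Optional[str]:
    """Return the new path for a renamed page, or None if nothing changed."""
    for old, new in RENAMES:
        if path.startswith(old):
            # Keep whatever follows the renamed part (question, note, video...).
            return new + path[len(old):]
    return None
```

A page such as /language/chinese-mandarin/video then maps to /language/mandarin-chinese/video, and paths that are not in the table (for example /index/translate-russian) return None and are served as before. This also gives a simple checklist for testing once the new version is online.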
Manual update
- https://polyglotclub.com/wiki/Language/Chinese-mandarin/Pronunciation/Accents will be redirected to /wiki/Language/Mandarin-chinese/Pronunciation/Accents (main list)
The wiki will need to be updated page by page using the MediaWiki 'redirect' button (for admins only). It cannot be done automatically. There are not many pages to change, so it will be OK.