Difference between revisions of "Language/Multiple-languages/Culture/Internet-Vocabularies"

 
(6 intermediate revisions by 2 users not shown)
Line 10: Line 10:


In progress.
Visit https://codeberg.org/GrimPixel/standard-character-lists to download standard lists in TSV format.


== Common word/character list ==
Line 18: Line 20:
** https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8
** https://zh.wiktionary.org/zh/Appendix:%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8
* https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8
* 常用國字標準字體表 https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8
* http://input.foruto.com/ccc/gongbiu/dzijingbiu/index.htm


English https://github.com/HK-SHAO/English-Dictionary/blob/master/word/words.txt
Line 34: Line 35:
** https://kanji.jitenon.jp/cat/jimmei.html
** https://kanjitisiki.com/zinmei/
Korean https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217


Thai https://github.com/nv23/thai-wordlist
Line 53: Line 51:
Kannada https://github.com/kakashi/kannada_IN_dictionary


Korean https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%EC%9E%90%EC%A3%BC_%EC%93%B0%EC%9D%B4%EB%8A%94_%ED%95%9C%EA%B5%AD%EC%96%B4_%EB%82%B1%EB%A7%90_5800
Korean 자주 쓰이는 한국어 낱말 https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%EC%9E%90%EC%A3%BC_%EC%93%B0%EC%9D%B4%EB%8A%94_%ED%95%9C%EA%B5%AD%EC%96%B4_%EB%82%B1%EB%A7%90_5800


== Graded list ==
Chinese
* https://www.tw.org/tocfl/
* T.O.C.F.L. Word lists https://www.tw.org/tocfl/
* http://www.chinesetest.cn/godownload.do#list_1
* 新汉语水平考试(HSK)词汇(2012年修订版) http://www.chinesetest.cn/godownload.do#list_1


Japanese  
* https://ja.wikipedia.org/wiki/%E5%AD%A6%E5%B9%B4%E5%88%A5%E6%BC%A2%E5%AD%97%E9%85%8D%E5%BD%93%E8%A1%A8
* 学年別漢字配当表 https://ja.wikipedia.org/wiki/%E5%AD%A6%E5%B9%B4%E5%88%A5%E6%BC%A2%E5%AD%97%E9%85%8D%E5%BD%93%E8%A1%A8
* https://www.tanos.co.uk/jlpt/skills/vocab/
* JLPT Vocabulary https://www.tanos.co.uk/jlpt/skills/vocab/


Korean
* https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800
* 한문 교육용 기초 한자 https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800
* https://www.topikguide.com/korean-frequency-list-top-6000-words/
* Korean Frequency List -Top 6000 Words https://www.topikguide.com/korean-frequency-list-top-6000-words/


Russian https://en.openrussian.org/vocab/A1
Line 83: Line 81:


Kazakh https://github.com/taem/hunspell-kk
==Other Lessons==
* [[Language/Multiple-languages/Culture/Different-ways-to-greet-in-the-world|Different ways to greet in the world]]
* [[Language/Multiple-languages/Culture/Texts-and-Audios-under-a-Public-License|Texts and Audios under a Public License]]
* [[Language/Multiple-languages/Culture/Producing-dictionaries-with-web-scraping|Producing dictionaries with web scraping]]
* [[Language/Multiple-languages/Culture/Websites-with-Multilingual-Articles|Websites with Multilingual Articles]]
* [[Language/Multiple-languages/Culture/How-to-locate-the-origin-of-a-video-or-a-photo|How to locate the origin of a video or a photo]]
* [[Language/Multiple-languages/Culture/Elements-of-Traditional-Architectures:-Eastern-Asia|Elements of Traditional Architectures: Eastern Asia]]
* [[Language/Multiple-languages/Culture/Good-Memories|Good Memories]]
* [[Language/Multiple-languages/Culture/Cities-with-the-best-quality-of-life|Cities with the best quality of life]]
* [[Language/Multiple-languages/Culture/The-Polyglot-Club-Team|The Polyglot Club Team]]
* [[Language/Multiple-languages/Culture/Important-Technologies|Important Technologies]]
<span links></span>

Latest revision as of 17:24, 22 May 2023


This page lists vocabularies to memorise. It is not to be confused with Language/Multiple-languages/Culture/Internet-Dictionaries. The word lists here have at least one of the following features:

  • they contain frequency or grading information
  • they contain no translations, definitions or pronunciations

Computer programs are included, too.

To put these lists to use, merge them with meaning data from a dictionary. This may require web scraping.
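The merging step can be sketched in Python as follows. This is only a minimal illustration, not a method prescribed by any of the lists on this page: the function name `merge_with_definitions`, the sample words and the sample definitions are all hypothetical, and it assumes a plain word list ordered by frequency (one word per entry) plus dictionary data already loaded into a mapping from word to meaning.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def merge_with_definitions(
    words: Iterable[str],
    definitions: Dict[str, str],
) -> List[Tuple[int, str, Optional[str]]]:
    """Pair each word with its position in the list and its meaning, if known.

    Hypothetical helper: `words` is assumed to be ordered by frequency,
    and `definitions` maps a word to a meaning taken from a dictionary.
    """
    merged: List[Tuple[int, str, Optional[str]]] = []
    for raw in words:
        word = raw.strip()
        if not word:
            continue  # skip blank lines in the word list
        rank = len(merged) + 1  # rank counts only non-blank entries
        merged.append((rank, word, definitions.get(word)))
    return merged


# Made-up sample data, for illustration only.
sample_words = ["the", "of", "", "and"]
sample_definitions = {"the": "definite article", "and": "conjunction"}

rows = merge_with_definitions(sample_words, sample_definitions)
for rank, word, meaning in rows:
    print(rank, word, meaning or "(no definition found)")
```

Words without a dictionary entry are kept with an empty meaning rather than dropped, so the gaps that still need scraping or manual lookup remain visible.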

In progress.

Visit https://codeberg.org/GrimPixel/standard-character-lists to download standard lists in TSV format.

Common word/character list

Multiple languages https://en.wiktionary.org/wiki/Appendix:Swadesh_lists

Chinese

English https://github.com/HK-SHAO/English-Dictionary/blob/master/word/words.txt

Japanese

Thai https://github.com/nv23/thai-wordlist

Vietnamese https://www.chunom.org/

Frequency list

Multiple languages https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists

Chinese

Kannada https://github.com/kakashi/kannada_IN_dictionary

Korean 자주 쓰이는 한국어 낱말 https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%EC%9E%90%EC%A3%BC_%EC%93%B0%EC%9D%B4%EB%8A%94_%ED%95%9C%EA%B5%AD%EC%96%B4_%EB%82%B1%EB%A7%90_5800

Graded list

Chinese

Japanese

Korean

Russian https://en.openrussian.org/vocab/A1

Spell checker

The word lists are part of the spell checkers' source code: CWL files in GNU Aspell, which can be opened with TeXstudio, and DIC files in Hunspell, which can be opened with a text editor.
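Since a Hunspell DIC file is plain text, its words can also be extracted with a short script. The sketch below is a simplified reading of the format and rests on several assumptions: the first line is the entry count, each following line holds one word optionally followed by `/` and affix flags, any morphological fields are separated by a tab, and no word contains an escaped slash. The function name `words_from_dic` and the sample lines are hypothetical.

```python
from typing import Iterable, List


def words_from_dic(lines: Iterable[str]) -> List[str]:
    """Extract bare words from the lines of a Hunspell .dic file.

    Simplified sketch: ignores escaped slashes and assumes that
    morphological fields, if present, follow a tab character.
    """
    words: List[str] = []
    for index, raw in enumerate(lines):
        line = raw.strip()
        if not line:
            continue  # skip empty lines
        if index == 0 and line.isdigit():
            continue  # header line holding the approximate entry count
        entry = line.split("\t", 1)[0]  # drop morphological fields
        word = entry.split("/", 1)[0]   # drop affix flags
        if word:
            words.append(word)
    return words


# Made-up sample lines, for illustration only.
sample = ["3", "hello", "try/B", "work/AB\tpo:verb"]
print(words_from_dic(sample))
```

When reading a real file, open it with the character encoding declared by the `SET` line of the accompanying AFF file, for example `open(path, encoding="utf-8")` if that line says UTF-8.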

Multiple languages

Croatian https://github.com/spideyfusion/elasticsearch-croatian

Indonesian https://github.com/shuLhan/hunspell-id

Kazakh https://github.com/taem/hunspell-kk

Other Lessons