Difference between revisions of "Language/Multiple-languages/Culture/Internet-Vocabularies"

From Polyglot Club WIKI
 
(27 intermediate revisions by 4 users not shown)
Line 1: Line 1:
[[Category:Free-Resources]]
[[Category:Free-Resources]]
{{Multiple-languages-flag}}
On this page you will find vocabularies to memorise. This page is not to be confused with [[Language/Multiple-languages/Culture/Internet-Dictionaries]]. Here are word lists that possess one of the following features:
* contain frequency or grading information
* no translations, definitions or pronunciations


On this page you will find vocabularies to memorise. This page is not to be confused with [[Language/Multiple-languages/Culture/Internet-Dictionaries]]. Here are word lists that do not have translations, definitions or pronunciations, and programs that apply such word lists.
Computer programs are included, too.


They can be made use of by [https://polyglotclub.com/wiki/Language/Multiple-languages/Culture/How-to-make-a-TSV-file#How_to_combine_data_with_same_column_from_two_spreadsheets merging with dictionary meaning data]. This may require [https://polyglotclub.com/wiki/Language/Multiple-languages/Culture/Producing-dictionaries-with-web-scraping web scraping].
They can be made use of by [https://polyglotclub.com/wiki/Language/Multiple-languages/Culture/How-to-make-a-TSV-file#How_to_combine_data_with_same_column_from_two_spreadsheets merging with dictionary meaning data]. This may require [https://polyglotclub.com/wiki/Language/Multiple-languages/Culture/Producing-dictionaries-with-web-scraping web scraping].


In progress.
In progress.
Visit https://codeberg.org/GrimPixel/standard-character-lists to download standard lists in TSV format.


== Common word/character list ==
== Common word/character list ==
Multiple languages https://en.wiktionary.org/wiki/Appendix:Swadesh_lists
Multiple languages https://en.wiktionary.org/wiki/Appendix:Swadesh_lists
Chinese
* 通用规范汉字表
** https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8
** https://zh.wiktionary.org/zh/Appendix:%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8
* 常用國字標準字體表 https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8


English https://github.com/HK-SHAO/English-Dictionary/blob/master/word/words.txt
English https://github.com/HK-SHAO/English-Dictionary/blob/master/word/words.txt


Japanese
Japanese
* https://ja.wiktionary.org/wiki/%E4%BB%98%E9%8C%B2:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7
* 常用漢字
* https://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7
** https://ja.wiktionary.org/wiki/%E4%BB%98%E9%8C%B2:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7
 
** https://joyokanji.info/list.html
Korean https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217
** https://kanji.jitenon.jp/cat/joyo.html
 
** https://kanjitisiki.com/zyouyou/
Mandarin Chinese
* 人名用漢字
* https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8
** https://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7
* https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8
** https://joyokanji.info/jinmei.html
** https://kanji.jitenon.jp/cat/jimmei.html
** https://kanjitisiki.com/zinmei/


Thai https://github.com/nv23/thai-wordlist
Thai https://github.com/nv23/thai-wordlist
Line 29: Line 43:
Multiple languages https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists
Multiple languages https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists


Chinese https://humanum.arts.cuhk.edu.hk/Lexis/chifreq/
Chinese
* https://humanum.arts.cuhk.edu.hk/Lexis/chifreq/
* https://humanum.arts.cuhk.edu.hk/Lexis/lexi-can/faq.php
* https://lingua.mtsu.edu/chinese-computing/statistics/index.html
* http://technology.chtsai.org/charfreq/


Kannada https://github.com/kakashi/kannada_IN_dictionary
Kannada https://github.com/kakashi/kannada_IN_dictionary


Mandarin Chinese
Korean 자주 쓰이는 한국어 낱말 https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%EC%9E%90%EC%A3%BC_%EC%93%B0%EC%9D%B4%EB%8A%94_%ED%95%9C%EA%B5%AD%EC%96%B4_%EB%82%B1%EB%A7%90_5800
* https://lingua.mtsu.edu/chinese-computing/statistics/index.html
* http://technology.chtsai.org/charfreq/


Yue Chinese https://humanum.arts.cuhk.edu.hk/Lexis/lexi-can/faq.php
== Graded list ==
Chinese
* T.O.C.F.L. Word lists https://www.tw.org/tocfl/
* 新汉语水平考试(HSK)词汇(2012年修订版) http://www.chinesetest.cn/godownload.do#list_1


== Graded word/character list ==
Japanese
Japanese https://ja.wikipedia.org/wiki/%E5%AD%A6%E5%B9%B4%E5%88%A5%E6%BC%A2%E5%AD%97%E9%85%8D%E5%BD%93%E8%A1%A8
* 学年別漢字配当表 https://ja.wikipedia.org/wiki/%E5%AD%A6%E5%B9%B4%E5%88%A5%E6%BC%A2%E5%AD%97%E9%85%8D%E5%BD%93%E8%A1%A8
* JLPT Vocabulary https://www.tanos.co.uk/jlpt/skills/vocab/


Korean https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800
Korean
* 한문 교육용 기초 한자 https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800
* Korean Frequency List -Top 6000 Words https://www.topikguide.com/korean-frequency-list-top-6000-words/


Mandarin Chinese
Russian https://en.openrussian.org/vocab/A1
* http://www.chinesetest.cn/godownload.do#list_1
* http://www.tw.org/tocfl/


== Spell checker ==
== Spell checker ==
''Some require knowledge about [http://aspell.net/ GNU Aspell] and [https://hunspell.github.io/ Hunspell]. The word lists are in the spell checkers' source code.''
''The word lists are in the spell checkers' source code: CWL files in GNU Aspell, can be opened with [https://www.texstudio.org/ TeXstudio]; DIC files in Hunspell, can be opened with a text editor.''


Multiple languages
Multiple languages
Line 61: Line 81:


Kazakh https://github.com/taem/hunspell-kk
Kazakh https://github.com/taem/hunspell-kk
==Other Lessons==
* [[Language/Multiple-languages/Culture/Different-ways-to-greet-in-the-world|Different ways to greet in the world]]
* [[Language/Multiple-languages/Culture/Texts-and-Audios-under-a-Public-License|Texts and Audios under a Public License]]
* [[Language/Multiple-languages/Culture/Producing-dictionaries-with-web-scraping|Producing dictionaries with web scraping]]
* [[Language/Multiple-languages/Culture/Websites-with-Multilingual-Articles|Websites with Multilingual Articles]]
* [[Language/Multiple-languages/Culture/How-to-locate-the-origin-of-a-video-or-a-photo|How to locate the origin of a video or a photo]]
* [[Language/Multiple-languages/Culture/Elements-of-Traditional-Architectures:-Eastern-Asia|Elements of Traditional Architectures: Eastern Asia]]
* [[Language/Multiple-languages/Culture/Good-Memories|Good Memories]]
* [[Language/Multiple-languages/Culture/Cities-with-the-best-quality-of-life|Cities with the best quality of life]]
* [[Language/Multiple-languages/Culture/The-Polyglot-Club-Team|The Polyglot Club Team]]
* [[Language/Multiple-languages/Culture/Important-Technologies|Important Technologies]]
<span links></span>

Latest revision as of 17:24, 22 May 2023

This page collects vocabularies to memorise. It is not to be confused with Language/Multiple-languages/Culture/Internet-Dictionaries. The word lists here have at least one of the following features:

  • they contain frequency or grading information
  • they have no translations, definitions or pronunciations

Computer programs that apply such word lists are included, too.

To put the lists to use, merge them with dictionary data that supplies the meanings. Obtaining that data may require web scraping.
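
A minimal sketch of such a merge, assuming both the word list and the dictionary data are TSV files with a header row and share a column named `word`. The column names, the sample rows and the helper names `read_tsv` and `merge_on_key` are hypothetical, chosen only for illustration; real lists will need their own column names.

```python
import csv
import io


def read_tsv(text):
    """Parse TSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def merge_on_key(word_rows, dict_rows, key="word"):
    """Left-join word_rows with dict_rows on `key`, keeping the word-list order.

    Words missing from the dictionary data are kept without the extra columns.
    """
    lookup = {row[key]: row for row in dict_rows}
    merged = []
    for row in word_rows:
        extra = lookup.get(row[key], {})
        # Values from the word list win if both sources share a column name.
        merged.append({**extra, **row})
    return merged


# Hypothetical sample data standing in for two downloaded files.
frequency_tsv = "word\trank\nthe\t1\nof\t2\nzyzzyva\t3\n"
dictionary_tsv = (
    "word\tmeaning\n"
    "the\tdefinite article\n"
    "of\tpreposition expressing belonging\n"
)

merged = merge_on_key(read_tsv(frequency_tsv), read_tsv(dictionary_tsv))
for row in merged:
    print(row["word"], row["rank"], row.get("meaning", ""), sep="\t")
```

The join is a left join on purpose: every entry of the word list survives, so words without dictionary data are easy to spot and look up elsewhere.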

In progress.

Visit https://codeberg.org/GrimPixel/standard-character-lists to download standard lists in TSV format.

Common word/character list

Multiple languages https://en.wiktionary.org/wiki/Appendix:Swadesh_lists

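Chinese

  • 通用规范汉字表
      • https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8
      • https://zh.wiktionary.org/zh/Appendix:%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8
  • 常用國字標準字體表 https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8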

English https://github.com/HK-SHAO/English-Dictionary/blob/master/word/words.txt

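Japanese

  • 常用漢字
      • https://ja.wiktionary.org/wiki/%E4%BB%98%E9%8C%B2:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7
      • https://joyokanji.info/list.html
      • https://kanji.jitenon.jp/cat/joyo.html
      • https://kanjitisiki.com/zyouyou/
  • 人名用漢字
      • https://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7
      • https://joyokanji.info/jinmei.html
      • https://kanji.jitenon.jp/cat/jimmei.html
      • https://kanjitisiki.com/zinmei/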

Thai https://github.com/nv23/thai-wordlist

Vietnamese https://www.chunom.org/

Frequency list

Multiple languages https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists

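Chinese

  • https://humanum.arts.cuhk.edu.hk/Lexis/chifreq/
  • https://humanum.arts.cuhk.edu.hk/Lexis/lexi-can/faq.php
  • https://lingua.mtsu.edu/chinese-computing/statistics/index.html
  • http://technology.chtsai.org/charfreq/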

Kannada https://github.com/kakashi/kannada_IN_dictionary

Korean 자주 쓰이는 한국어 낱말 https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%EC%9E%90%EC%A3%BC_%EC%93%B0%EC%9D%B4%EB%8A%94_%ED%95%9C%EA%B5%AD%EC%96%B4_%EB%82%B1%EB%A7%90_5800

Graded list

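Chinese

  • T.O.C.F.L. Word lists https://www.tw.org/tocfl/
  • 新汉语水平考试(HSK)词汇(2012年修订版) http://www.chinesetest.cn/godownload.do#list_1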

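Japanese

  • 学年別漢字配当表 https://ja.wikipedia.org/wiki/%E5%AD%A6%E5%B9%B4%E5%88%A5%E6%BC%A2%E5%AD%97%E9%85%8D%E5%BD%93%E8%A1%A8
  • JLPT Vocabulary https://www.tanos.co.uk/jlpt/skills/vocab/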

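Korean

  • 한문 교육용 기초 한자 https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800
  • Korean Frequency List -Top 6000 Words https://www.topikguide.com/korean-frequency-list-top-6000-words/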

Russian https://en.openrussian.org/vocab/A1

Spell checker

The word lists are in the spell checkers' source code: GNU Aspell uses CWL files, which can be opened with TeXstudio; Hunspell uses DIC files, which can be opened with a text editor.
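
Because a Hunspell DIC file is plain text, its entries can also be extracted with a short script. The sketch below is simplified and rests on a few assumptions: the first line holds an approximate entry count, each following line is a stem optionally followed by `/` and affix flags, and any morphological fields come after a tab. The helper name `parse_dic_lines` and the sample entries are hypothetical. Escaped slashes and the expansion of stems through the companion AFF file are not handled, so the result is a list of stems, not of every inflected form.

```python
def parse_dic_lines(lines):
    """Extract bare stems from the lines of a Hunspell .dic file (simplified)."""
    words = []
    for index, line in enumerate(lines):
        line = line.strip()
        if not line:
            continue
        if index == 0 and line.isdigit():
            continue  # the first line is the entry count, not a word
        # Drop morphological fields after a tab, then affix flags after "/".
        stem = line.split("\t", 1)[0].split("/", 1)[0]
        if stem:
            words.append(stem)
    return words


# Hypothetical sample standing in for the contents of a real .dic file.
sample = ["3", "hello", "work/DGS", "try/B\tpo:verb"]
words = parse_dic_lines(sample)
print(words)
```

For a real file, open it with the encoding named by the `SET` line of the matching AFF file and pass the file object to `parse_dic_lines`.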

Multiple languages

Croatian https://github.com/spideyfusion/elasticsearch-croatian

Indonesian https://github.com/shuLhan/hunspell-id

Kazakh https://github.com/taem/hunspell-kk

Other Lessons