Difference between revisions of "Language/Multiple-languages/Culture/Internet-Vocabularies"

From Polyglot Club WIKI
Line 12: Line 12:
English https://github.com/HK-SHAO/English-Dictionary/blob/master/word/words.txt
English https://github.com/HK-SHAO/English-Dictionary/blob/master/word/words.txt


Mandarin Chinese https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8
Mandarin Chinese
* https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8
* https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8


Mandarin Chinese https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8
Japanese
* https://ja.wiktionary.org/wiki/%E4%BB%98%E9%8C%B2:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7
* https://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7


Japanese https://ja.wiktionary.org/wiki/%E4%BB%98%E9%8C%B2:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7
Korean https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217
 
Japanese https://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7


Thai https://github.com/nv23/thai-wordlist
Thai https://github.com/nv23/thai-wordlist
Line 31: Line 33:
Kannada https://github.com/kakashi/kannada_IN_dictionary
Kannada https://github.com/kakashi/kannada_IN_dictionary


Mandarin Chinese https://lingua.mtsu.edu/chinese-computing/statistics/index.html
Mandarin Chinese
 
* https://lingua.mtsu.edu/chinese-computing/statistics/index.html
Mandarin Chinese http://technology.chtsai.org/charfreq/
* http://technology.chtsai.org/charfreq/


Yue Chinese https://humanum.arts.cuhk.edu.hk/Lexis/lexi-can/faq.php
Yue Chinese https://humanum.arts.cuhk.edu.hk/Lexis/lexi-can/faq.php
Line 42: Line 44:
Korean https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800
Korean https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800


Korean https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217
Mandarin Chinese
 
* http://www.chinesetest.cn/godownload.do#list_1
Mandarin Chinese http://www.chinesetest.cn/godownload.do#list_1
* http://www.tw.org/tocfl/
 
Mandarin Chinese http://www.tw.org/tocfl/


== Spell checker ==
== Spell checker ==
''Some require knowledge about [http://aspell.net/ GNU Aspell] and [https://hunspell.github.io/ Hunspell]. The word lists are in the spell checkers' source code.''
''Some require knowledge about [http://aspell.net/ GNU Aspell] and [https://hunspell.github.io/ Hunspell]. The word lists are in the spell checkers' source code.''


Multiple languages https://addons.mozilla.org/en-US/firefox/language-tools/
Multiple languages
 
* https://addons.mozilla.org/en-US/firefox/language-tools/
Multiple languages https://ftp.gnu.org/gnu/aspell/dict/0index.html
* https://ftp.gnu.org/gnu/aspell/dict/0index.html
 
* https://wiki.documentfoundation.org/Language_support_of_LibreOffice
Multiple languages https://wiki.documentfoundation.org/Language_support_of_LibreOffice


Croatian https://github.com/spideyfusion/elasticsearch-croatian
Croatian https://github.com/spideyfusion/elasticsearch-croatian

Revision as of 11:47, 17 April 2021


This page collects vocabularies to memorise. It is not to be confused with Language/Multiple-languages/Culture/Internet-Dictionaries: the resources here are word lists without translations, definitions or pronunciations, together with programs that apply such word lists.

The word lists become useful once merged with dictionary data, which may require web scraping.
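As a rough illustration of that merging step, the sketch below pairs each word in a plain word list with an entry from dictionary data. It assumes the word list has one word per line and that the dictionary data has already been obtained as a simple word-to-definition mapping. The function names and the sample data are hypothetical and are not taken from any of the linked resources.

```python
def load_word_list(text):
    """Read one word per line, skipping blank lines and repeated words."""
    seen = set()
    words = []
    for line in text.splitlines():
        word = line.strip()
        if word and word not in seen:
            seen.add(word)
            words.append(word)
    return words


def merge_word_list(words, dictionary):
    """Pair each word with its dictionary entry, or None if there is none."""
    return [(word, dictionary.get(word)) for word in words]


# Hypothetical sample data standing in for a downloaded word list
# and for dictionary data gathered elsewhere.
sample_list = "water\nfire\n\nwater\nstone\n"
sample_dict = {"water": "a clear liquid", "stone": "a piece of rock"}

words = load_word_list(sample_list)
merged = merge_word_list(words, sample_dict)

# Words that still need a definition from another source.
missing = [word for word, entry in merged if entry is None]
```

Keeping the unmatched words in a separate list makes it easy to see which entries still have to be looked up or scraped elsewhere.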

In progress.

Common word/character list

Multiple languages https://en.wiktionary.org/wiki/Appendix:Swadesh_lists

English https://github.com/HK-SHAO/English-Dictionary/blob/master/word/words.txt

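Mandarin Chinese
* https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8
* https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8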

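Japanese
* https://ja.wiktionary.org/wiki/%E4%BB%98%E9%8C%B2:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7
* https://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7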

Korean https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217

Thai https://github.com/nv23/thai-wordlist

Vietnamese https://www.chunom.org/

Frequency list

Multiple languages https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists

Chinese https://humanum.arts.cuhk.edu.hk/Lexis/chifreq/

Kannada https://github.com/kakashi/kannada_IN_dictionary

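Mandarin Chinese
* https://lingua.mtsu.edu/chinese-computing/statistics/index.html
* http://technology.chtsai.org/charfreq/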

Yue Chinese https://humanum.arts.cuhk.edu.hk/Lexis/lexi-can/faq.php

Graded word/character list

Japanese https://ja.wikipedia.org/wiki/%E5%AD%A6%E5%B9%B4%E5%88%A5%E6%BC%A2%E5%AD%97%E9%85%8D%E5%BD%93%E8%A1%A8

Korean https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800

Mandarin Chinese
* http://www.chinesetest.cn/godownload.do#list_1
* http://www.tw.org/tocfl/

Spell checker

Some require knowledge of GNU Aspell and Hunspell. The word lists are in the spell checkers' source code.

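Multiple languages
* https://addons.mozilla.org/en-US/firefox/language-tools/
* https://ftp.gnu.org/gnu/aspell/dict/0index.html
* https://wiki.documentfoundation.org/Language_support_of_LibreOffice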

Croatian https://github.com/spideyfusion/elasticsearch-croatian

Indonesian https://github.com/shuLhan/hunspell-id

Kazakh https://github.com/taem/hunspell-kk
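
Hunspell dictionaries usually ship a .dic file whose first line is an approximate entry count, with each following line holding a word optionally followed by a slash and affix flags. The sketch below shows one way a plain word list might be pulled out of such a file. It is a simplification under stated assumptions: it ignores escaped slashes and does not expand affix rules from the matching .aff file, so it yields only the stems listed in the file. The function name and sample text are hypothetical and are not taken from any of the repositories above.

```python
def parse_dic(text):
    """Extract bare words from the text of a Hunspell-style .dic file."""
    lines = text.splitlines()
    # The first line normally holds the approximate number of entries.
    if lines and lines[0].strip().isdigit():
        lines = lines[1:]
    words = []
    for line in lines:
        # Drop optional morphological fields that follow a tab.
        entry = line.split("\t", 1)[0].strip()
        if not entry:
            continue
        # Drop affix flags after the first slash (escaped slashes not handled).
        words.append(entry.split("/", 1)[0])
    return words


# Hypothetical sample in the shape of a small .dic file.
sample = "3\nrumah/AB\nmakan\nbuku/C\tpo:noun\n"
stems = parse_dic(sample)
```

Because affix rules are not applied, inflected forms that the spell checker would accept do not appear in the result.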