Difference between revisions of "Language/Multiple-languages/Culture/Internet-Vocabularies"
Revision as of 11:00, 24 April 2021
On this page you will find vocabularies to memorise. This page is not to be confused with Language/Multiple-languages/Culture/Internet-Dictionaries. The word lists collected here have at least one of the following features:
- they contain frequency or grading information
- they have no translations, definitions or pronunciations
To make use of them, merge them with dictionary meaning data. This may require web scraping.
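As a rough illustration of the merging step, the sketch below joins a word list with dictionary data on a shared column. It assumes both sources are TSV text with a header row and a common column named `word`; the column names, the function names `read_tsv` and `merge_on_word`, and the sample rows are all made up for the example and are not taken from any of the lists on this page.

```python
import csv
import io


def read_tsv(text):
    """Parse TSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def merge_on_word(word_rows, dict_rows, key="word"):
    """Keep every word-list row and add dictionary columns when a match exists."""
    lookup = {row[key]: row for row in dict_rows}
    # Dictionary columns other than the shared key (assumed from the first row).
    extra_cols = [c for c in (dict_rows[0] if dict_rows else {}) if c != key]
    merged = []
    for row in word_rows:
        match = lookup.get(row[key], {})
        combined = dict(row)
        for col in extra_cols:
            # Words missing from the dictionary get an empty cell.
            combined[col] = match.get(col, "")
        merged.append(combined)
    return merged


# Hypothetical sample data, for illustration only.
FREQ_TSV = "word\trank\nwater\t1\nfire\t2\nstone\t3\n"
DICT_TSV = "word\tmeaning\nwater\tagua\nfire\tfuego\n"

merged = merge_on_word(read_tsv(FREQ_TSV), read_tsv(DICT_TSV))
for row in merged:
    print(row["word"], row["rank"], row["meaning"], sep="\t")
```

Words that have no dictionary entry keep an empty meaning cell, which makes it easy to see which entries still need to be looked up or scraped.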
In progress.
Common word/character list
Multiple languages https://en.wiktionary.org/wiki/Appendix:Swadesh_lists
Chinese
- 通用规范汉字表
- https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8
- http://input.foruto.com/ccc/gongbiu/dzijingbiu/index.htm
English https://github.com/HK-SHAO/English-Dictionary/blob/master/word/words.txt
Japanese
- 常用漢字
- 人名用漢字
Korean https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217
Thai https://github.com/nv23/thai-wordlist
Vietnamese https://www.chunom.org/
Frequency list
Multiple languages https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists
Chinese
- https://humanum.arts.cuhk.edu.hk/Lexis/chifreq/
- https://humanum.arts.cuhk.edu.hk/Lexis/lexi-can/faq.php
- https://lingua.mtsu.edu/chinese-computing/statistics/index.html
- http://technology.chtsai.org/charfreq/
Kannada https://github.com/kakashi/kannada_IN_dictionary
Graded list
Chinese
Korean
- https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800
- https://www.topikguide.com/korean-frequency-list-top-6000-words/
Spell checker
The word lists are found in the spell checkers' source code. GNU Aspell stores them in CWL files, which can be opened with TeXstudio; Hunspell stores them in DIC files, which can be opened with a text editor.
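Since a Hunspell DIC file is plain text, its words can also be extracted with a short script. The sketch below is a minimal example, assuming the usual layout: a first line holding an approximate entry count, then one entry per line, optionally followed by `/` and affix flags. The function name `parse_hunspell_dic` and the sample entries are made up for illustration, and the sketch is a simplification that does not handle escaped slashes inside words or morphological fields separated by spaces.

```python
def parse_hunspell_dic(text):
    """Return the bare words from the text of a Hunspell .dic file."""
    lines = text.splitlines()
    # The first line normally holds an approximate entry count; skip it.
    if lines and lines[0].strip().isdigit():
        lines = lines[1:]
    words = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Drop anything after a tab, then drop affix flags after "/".
        word = line.split("\t", 1)[0].split("/", 1)[0].strip()
        if word:
            words.append(word)
    return words


# Hypothetical sample content; the flags are invented for the example.
SAMPLE_DIC = "3\nrumah/S\nbuku\nmakan/AB\tpo:verb\n"

print(parse_hunspell_dic(SAMPLE_DIC))
```

When reading a real file, the text encoding is normally declared by the `SET` line of the accompanying AFF file, so the `encoding` argument passed to `open()` should match it.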
Multiple languages
- https://addons.mozilla.org/en-US/firefox/language-tools/
- https://ftp.gnu.org/gnu/aspell/dict/0index.html
- https://wiki.documentfoundation.org/Language_support_of_LibreOffice
Croatian https://github.com/spideyfusion/elasticsearch-croatian
Indonesian https://github.com/shuLhan/hunspell-id