Difference between revisions of "Language/Multiple-languages/Culture/Licensed-Free-Databases"

From Polyglot Club WIKI
 
(124 intermediate revisions by 3 users not shown)
[[Category:Free-Resources]]
<div class="pg_page_title">Licensed-Free Databases Around Languages</div>
The items listed are databases, not applications. If you don't know how to program, they may not be of much help to you.
[[File:best-licensed-free-databases-polyglotclub.jpg|thumb]]
Hi polyglots! 😀
 
➡ On this page we have listed free databases related to languages.
 
* Those mentioned on [[Language/Multiple-languages/Culture/Internet-Dictionaries|Internet Dictionaries]] will not be mentioned again here.
 
* The items listed are raw data, so if you don't know how to program, this page might not be of much help to you.
 
== Main ==


== Multiple languages ==
=== Multiple languages ===
====https://www.ethnologue.com/codes/download-code-tables<nowiki/>====
LanguageCodes.tab lists the 7,400+ distinct language identifiers used in the current Ethnologue database.


=== https://dumps.wikimedia.org/ ===
==== https://dumps.wikimedia.org/ ====
License: https://dumps.wikimedia.org/legal.html
License: https://dumps.wikimedia.org/legal.html


Some of its users: [https://www.wikimedia.org/ Wikimedia]
Some of its users: https://www.wikimedia.org/


Wikimedia.
Wikimedia.


=== https://iate.europa.eu/download-iate/ ===
==== https://tatoeba.org/eng/downloads/ ====
License: https://iate.europa.eu/download-iate/
License: https://tatoeba.org/eng/downloads/


Some of its users: [https://iate.europa.eu/download-iate/ IATE]
Some of its users: https://tatoeba.org/, http://www.listeningpractice.org/, https://jisho.org/


Terminology dictionary of the EU.
Parallel corpora. In plain terms, collections of the same sentence in different languages.
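To give an idea of how such a parallel corpus can be used, here is a minimal Python sketch that pairs linked sentences in two languages. It assumes a tab-separated sentences file with the columns id, language code, text, and a tab-separated links file with the columns sentence id, translation id. Please check the download page for the actual file layout; the sample rows and the function names <code>read_sentences</code> and <code>parallel_pairs</code> are made up for illustration.

```python
import csv
import io


def read_sentences(handle):
    """Map sentence id -> (language code, text) from tab-separated rows."""
    sentences = {}
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip malformed or empty rows
        sentences[row[0]] = (row[1], row[2])
    return sentences


def parallel_pairs(sentences, links, source_lang, target_lang):
    """Yield (source text, target text) for linked sentences in the two languages."""
    reader = csv.reader(links, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue
        src = sentences.get(row[0])
        dst = sentences.get(row[1])
        if src and dst and src[0] == source_lang and dst[0] == target_lang:
            yield src[1], dst[1]


# Tiny in-memory stand-ins for the downloaded files (made-up sample rows).
sample_sentences = io.StringIO("1\teng\tHello.\n2\tfra\tBonjour.\n3\tdeu\tHallo.\n")
sample_links = io.StringIO("1\t2\n2\t1\n1\t3\n3\t1\n")

pairs = list(
    parallel_pairs(read_sentences(sample_sentences), sample_links, "eng", "fra")
)
print(pairs)
```

With real data you would pass open file handles instead of the <code>io.StringIO</code> objects.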


=== https://tatoeba.org/eng/downloads/ ===
==== https://wiki.documentfoundation.org/Language_support_of_LibreOffice ====
License: https://tatoeba.org/eng/downloads/
License: https://wiki.documentfoundation.org/Language_support_of_LibreOffice


Some of its users: [https://tatoeba.org/ Tatoeba], [http://www.listeningpractice.org/ ListeningPractice.org], [https://jisho.org/ Jisho.org]
Some of its users: https://www.libreoffice.org/


Parallel corpora. In plain terms, collections of the same sentence in different languages.
You can find the “Spell check dictionaries” and other useful things.


=== http://www.gutenberg.org/wiki/Gutenberg:Information_About_Robot_Access_to_our_Pages ===
==== http://www.gutenberg.org/wiki/Gutenberg:Information_About_Robot_Access_to_our_Pages ====
License: http://www.gutenberg.org/wiki/Gutenberg:Terms_of_Use
License: http://www.gutenberg.org/wiki/Gutenberg:Terms_of_Use


Some of its users: [http://www.gutenberg.org/ Gutenberg Project], [https://librivox.org/ LibriVox]
Some of its users: http://www.gutenberg.org/, https://librivox.org/ LibriVox


Ebooks.
Ebooks.


=== https://librivox.org/pages/about-librivox/ ===
==== https://librivox.org/pages/about-librivox/ ====
License: https://librivox.org/pages/about-librivox/
License: https://librivox.org/pages/about-librivox/


Some of its users: [https://librivox.org/ LibriVox], [http://www.listeningpractice.org/ ListeningPractice.org]
Some of its users: https://librivox.org/, http://www.listeningpractice.org/


Audio books
Audio books.


=== https://freedict.org/downloads/ ===
==== http://www.omegawiki.org/Help:Downloading_the_data ====
License: https://freedict.org/about/
License: http://www.omegawiki.org/Meta:Main_Page


Some of its users: [http://aarddict.org/ Aard 2]
Some of its users: http://www.omegawiki.org/Meta:Main_Page, http://dictionarymid.sourceforge.net/


Dictionaries.
Dictionaries.


=== http://www.omegawiki.org/Help:Downloading_the_data ===
==== https://ltrc.iiit.ac.in/onlineServices/Dictionaries/Dict_Frame.html ====
License: http://www.omegawiki.org/Meta:Main_Page
License: https://ltrc.iiit.ac.in/onlineServices/Dictionaries/GPLHelp.html
 
Some of its users: [http://www.omegawiki.org/Meta:Main_Page OmegaWiki], [http://dictionarymid.sourceforge.net/ DictionaryForMIDs]


Dictionaries.
Dictionaries for South Asian languages and English.


=== http://compling.hss.ntu.edu.sg/omw/ ===
==== http://compling.hss.ntu.edu.sg/omw/ ====
License: http://compling.hss.ntu.edu.sg/omw/
License: http://compling.hss.ntu.edu.sg/omw/


Some of its users: [http://compling.hss.ntu.edu.sg/omw/cgi-bin/wn-gridx.cgi?gridmode=grid Open Multilingual Wordnet]
Some of its users: http://compling.hss.ntu.edu.sg/omw/cgi-bin/wn-gridx.cgi?gridmode=grid


Wordnets.
Wordnets.


=== http://www.dicto.org.ru/xdxf.html ===
==== http://www.dicto.org.ru/xdxf.html ====
License: http://dicto.org.ru/license.html
License: http://dicto.org.ru/license.html


Some of its users: [http://dicto.org.ru/ Dicto]
Some of its users: http://dicto.org.ru/


Repository of dictionaries (from elsewhere).
Repository of dictionaries (from elsewhere).


=== http://shtooka.net/download.php ===
==== http://shtooka.net/download.php ====
License: http://shtooka.net/
License: http://shtooka.net/


Collections of audio.
Collections of audio.


=== https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists ===
==== https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists ====
License: https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists
License: https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists


Frequency lists.
Frequency lists.


=== https://lego.linguistlist.org/about#contact ===
==== https://lego.linguistlist.org/about#contact ====
License: https://lego.linguistlist.org/about#copyright
License: https://lego.linguistlist.org/about#copyright


Some of its users: [https://lego.linguistlist.org/ LEGO]
Some of its users: https://lego.linguistlist.org/


Lexicon. No download link on the website.
Lexicon. No download link on the website.


=== https://panlex.org/source-list/ ===
==== https://panlex.org/source-list/ ====
License: https://panlex.org/license/
License: https://panlex.org/license/


Some of its users: [https://glosbe.com Glosbe]
Some of its users: https://glosbe.com


Lexical database links.
Lexical database links.


=== http://cjklib.org/0.3/ ===
==== https://github.com/cburgmer/cjklib ====
License: http://cjklib.org/0.3/
License: https://github.com/cburgmer/cjklib/blob/master/COPYING


Some of its users: [https://www.skishore.me/makemeahanzi/ Make Me a Hanzi]
Some of its users: https://www.skishore.me/makemeahanzi/


Data about Han script.
Data about Han script.


=== https://www.radio-browser.info/gui/#!/ ===
==== https://www.radio-browser.info/gui/#!/ ====
License: https://www.radio-browser.info/gui/#!/
License: https://www.radio-browser.info/gui/#!/


Some of its users: [https://github.com/segler-alex/RadioDroid RadioDroid]
Some of its users: https://github.com/segler-alex/RadioDroid


Database of radio stations.
Database of radio stations.


=== https://help.archive.org/hc/en-us/articles/360017781111-How-to-download-files- ===
==== https://help.archive.org/hc/en-us/articles/360017781111-How-to-download-files- ====
License: https://www.archive.org/about/terms.php
License: https://www.archive.org/about/terms.php


Some of its users: [https://www.archive.org/ Internet Archive]
Some of its users: https://www.archive.org/


Archived Internet content.
Archived Internet content.


== Japanese ==
==== https://www.fandom.com/ ====
License: https://www.fandom.com/licensing


=== http://www.edrdg.org/wiki/index.php/Main_Page ===
Fan-made wiki.
License: https://www.edrdg.org/edrdg/licence.html


Some of its users: [https://jisho.org/ Jisho.org], [https://www.tagaini.net/ Tagaini Jisho]
=== Chinese ===


Japanese dictionaries.
==== http://lingua.mtsu.edu/chinese-computing/ ====
License: http://lingua.mtsu.edu/chinese-computing/copyright.html


=== https://www.wadoku.de/wiki/display/WAD/Downloads+und+Links ===
Character frequency lists.
License: https://www.wadoku.de/wiki/pages/viewpage.action?pageId=357


Some of its users: [https://www.wadoku.de/ 和独辞典]
==== https://github.com/gwinterstein/Cifu ====
License: https://github.com/gwinterstein/Cifu/blob/master/LICENSE


Japanese-German dictionary.
Word frequency list for Yue Chinese.


=== https://github.com/KanjiVG/kanjivg/releases/ ===
==== https://www.tanos.co.uk/hsk/ ====
License: http://kanjivg.tagaini.net/
License: https://www.tanos.co.uk/jlpt/sharing/


Some of its users: [https://www.tagaini.net/ Tagaini Jisho], [https://jisho.org/ Jisho.org]
HSK data.


Kanji strokes.
==== http://www.hskhsk.com/resources.html ====
License: http://www.hskhsk.com/resources.html


=== https://joyokanji.info/ ===
HSK data.
License: http://www.bunka.go.jp/bunkacho_homepage/index.html, http://www.mext.go.jp/b_menu/about_link.htm, http://www.moj.go.jp/term.html


Some of its users: [https://joyokanji.info/ 常用漢字情報サイト]
==== https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8 ====
License: https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8


[http://www.bunka.go.jp/seisaku/bunkashingikai/kokugo/kokugo/kokugo_45/pdf/jouyoukanjihyou_h22.pdf Jōyō Kanji], [http://www.moj.go.jp/content/001131003.pdf Jinmeiyō Kanji], [http://www.mext.go.jp/a_menu/shotou/new-cs/youryou/syo/koku/001.htm Kyōiku Kanji] in an easy-to-copy form.
Frequent characters.


== Chinese ==
==== https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8 ====
License: https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8


=== https://resources.publicense.moe.edu.tw/index.html ===
Frequent characters.
License: https://resources.publicense.moe.edu.tw/index.html


Some of its users: [https://resources.publicense.moe.edu.tw/index.html 教育部國語辭典公眾授權網], [https://www.moedict.tw/ 萌典]
==== http://input.foruto.com/ccc/gongbiu/index.htm ====
License:  


Dictionaries of ROC Mandarin Chinese written in ROC Mandarin Chinese.
Frequent characters.


=== https://cc-cedict.org/editor/editor.php ===
=== English ===
License: https://cc-cedict.org/wiki/


Some of its users: [https://www.mdbg.net/chinese/dictionary MDBG], [https://www.pleco.com/ Pleco]
==== http://gcide.gnu.org.ua/download ====
License: http://gcide.gnu.org.ua/license


Mandarin-English dictionary.
Some of its users: http://gcide.gnu.org.ua/


=== https://chine.in/mandarin/dictionnaire/CFDICT/ ===
Dictionary of definitions.
License: https://chine.in/mandarin/dictionnaire/CFDICT/


Some of its users: [https://chine.in/ Chine Informations], [https://www.pleco.com/ Pleco]
==== https://foldoc.org/source.html ====
License: https://foldoc.org/Free+On-line+Dictionary


Mandarin-French dictionary.
Some of its users: https://foldoc.org/


=== http://www.handedict.de/chinesisch_deutsch.php ===
Dictionary about computing.
License: http://www.handedict.de/chinesisch_deutsch.php?mode=dl&sid=51394be2b6d9cba75946e929b5477d55


Some of its users: [http://www.handedict.de/ HanDeDict], [https://www.pleco.com/ Pleco]
==== https://github.com/tony-mak/Eng-Chi-Dictionary/tree/master/app/src/main/assets/databases ====
License: https://github.com/tony-mak/Eng-Chi-Dictionary/blob/master/LICENSE


Mandarin-German dictionary.
Dictionary.


=== https://chdict.zydeo.net/en/download/ ===
==== https://github.com/linuxkathirvel/eng2tamildictionary/blob/master/dictionary.json ====
License: https://chdict.zydeo.net/en/download/
License: https://github.com/linuxkathirvel/eng2tamildictionary/blob/master/License.txt


Some of its users: [https://chdict.zydeo.net/hu/ CHDICT]
Dictionary.


Mandarin-Hungarian dictionary.
==== https://github.com/derekchuank/high-frequency-vocabulary ====
License: https://github.com/derekchuank/high-frequency-vocabulary/blob/master/LICENSE


=== http://cantonese.org/download.html ===
Dictionary.
License: http://cantonese.org/download.html


Some of its users: [http://cantonese.org/ CC-Canto], [https://www.pleco.com/ Pleco]
==== https://github.com/kujirahand/EJDict/tree/master/src ====
License: https://github.com/kujirahand/EJDict/blob/master/LICENSE


Cantonese-English dictionary.
Dictionary.


=== https://twblg.dict.edu.tw/holodict_new/compile1_6_1.jsp ===
=== Hindi ===
License: https://twblg.dict.edu.tw/holodict_new/compile1_6_1.jsp


Some of its users: [https://twblg.dict.edu.tw/holodict_new/default.jsp 臺灣閩南語常用詞辭典], [https://www.moedict.tw/ 萌典]
==== http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/downloaderInfo.php ====
License: http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/index.php


Taiwanese-English dictionary. It can be requested by email.
Some of its users: http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/dict_search_user.php


=== http://www.taiwanesedictionary.org/ ===
Dictionary. Application is required.
License: http://www.taiwanesedictionary.org/
 
Taiwanese-English dictionary.
 
=== http://lingua.mtsu.edu/chinese-computing/ ===
License: http://lingua.mtsu.edu/chinese-computing/copyright.html
 
Frequency lists.
 
== Korean ==
 
=== https://github.com/libhangul/libhangul/blob/master/data/hanja/hanja.txt ===
License: https://github.com/libhangul/libhangul/blob/master/data/hanja/hanja.txt
 
Words in Hangul and Hanja.
 
An introductory page is available at https://wiki.kldp.org/wiki.php/libhangul.
 
=== https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800 ===
License: http://www.suneung.re.kr/sub/info.do?m=0601&s=suneung
 
[http://www.suneung.re.kr/boardCnts/fileDown.do?fileSeq=59692112e521efa80d2af27916704082 Hanmun gyoyukyong gicho Hanja] in an easy-to-copy form.
 
== Esperanto ==
 
=== http://reta-vortaro.de/tgz/index.html ===
License: http://reta-vortaro.de/tgz/index.html
 
Some of its users: [http://reta-vortaro.de/ Reta Vortaro], [http://www.busydoingnothing.co.uk/prevo/ PReVo]
 
Dictionary in several languages.


=== http://www.denisowski.org/Esperanto/ESPDIC/espdic_readme.html ===
=== Icelandic ===
License: http://www.denisowski.org/Esperanto/ESPDIC/espdic_readme.html


Some of its users: [http://www.denisowski.org/Esperanto/ESPDIC/espdic_readme.html ESPDIC]
==== https://www.ling.upenn.edu/~kurisuto/germanic/oi_cleasbyvigfusson_about.html ====
License: http://lexicon.ff.cuni.cz/txt/oi_cleasbyvigfusson.txt


Dictionary.
Dictionary.


=== https://komputeko.net/elsxutejo-en.php ===
=== Interlingue ===
License: https://komputeko.net/index_en.php


Some of its users: [https://komputeko.net/index_en.php Komputeko]
==== https://github.com/Carmina16/hunspell-ie ====
License: https://github.com/Carmina16/hunspell-ie/blob/master/LICENSE


Computer terminology dictionary.
Spell checker with dictionary.


== Vietnamese ==
=== Japanese ===


=== http://www.informatik.uni-leipzig.de/~duc/Dict/install.html ===
==== https://github.com/KanjiVG/kanjivg/releases/ ====
License: http://www.informatik.uni-leipzig.de/~duc/Dict/install.html
License: http://kanjivg.tagaini.net/


Some of its users: [http://www.informatik.uni-leipzig.de/~duc/Dict/install.html TuDienHND]
Some of its users: https://www.tagaini.net/, https://jisho.org/


Dictionaries in several languages.
Kanji strokes.


There is a page of introduction: https://vi.wiktionary.org/wiki/Wiktionary:Ngu%E1%BB%93n_g%E1%BB%91c/FVDP
==== https://github.com/mifunetoshiro/kanjium ====
License: https://github.com/mifunetoshiro/kanjium/blob/master/LICENSE.txt


=== http://www.denisowski.org/Vietnamese/vnedict_readme.htm ===
Kanji data.
License: http://www.denisowski.org/Vietnamese/vnedict_readme.htm


Some of its users: [http://www.denisowski.org/Vietnamese/vnedict_readme.htm VNEDICT]
==== https://www.tanos.co.uk/jlpt/ ====
License: https://www.tanos.co.uk/jlpt/sharing/


Dictionary.
JLPT data.


=== https://github.com/duyetdev/vietnamese-wordlist ===
==== https://ja.wiktionary.org/wiki/%E4%BB%98%E9%8C%B2:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7 ====
License: https://github.com/duyetdev/vietnamese-wordlist/blob/master/LICENSE
License: http://www.bunka.go.jp/bunkacho_homepage/index.html


Word list.
Frequent characters.


=== https://github.com/duyetdev/vietnamese-namedb ===
==== https://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7 ====
License: https://github.com/duyetdev/vietnamese-namedb/blob/master/LICENSE
License: http://www.moj.go.jp/term.html


Name list.
Frequent characters for names.


== English ==
==== https://ja.wikipedia.org/wiki/%E5%AD%A6%E5%B9%B4%E5%88%A5%E6%BC%A2%E5%AD%97%E9%85%8D%E5%BD%93%E8%A1%A8 ====
License: http://www.mext.go.jp/b_menu/about_link.htm


=== http://gcide.gnu.org.ua/download ===
Frequent characters according to school grades.
License: http://gcide.gnu.org.ua/license


Some of its users: [http://gcide.gnu.org.ua/ GCIDE]
=== Korean ===


Dictionary of definitions.
==== https://github.com/libhangul/libhangul/blob/master/data/hanja/hanja.txt ====
License: https://github.com/libhangul/libhangul/blob/master/data/hanja/hanja.txt


=== https://foldoc.org/source.html ===
Words in Hangul and Hanja.
License: https://foldoc.org/Free+On-line+Dictionary


Some of its users: [https://foldoc.org/ FOLDOC]
An introductory page is available at https://wiki.kldp.org/wiki.php/libhangul.


Dictionary about computing.
==== https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800 ====
License: http://www.suneung.re.kr/sub/info.do?m=0601&s=suneung


== German ==
http://www.suneung.re.kr/boardCnts/fileDown.do?fileSeq=59692112e521efa80d2af27916704082 in an easy-to-copy form.


=== https://www.openthesaurus.de/about/download/ ===
==== https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217 ====
License: https://www.openthesaurus.de/about/download/
License: https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110702


Some of its users: [https://www.openthesaurus.de/about/download/ OpenThesaurus]
Word list of TOPIK.


Thesaurus.
=== Lithuanian ===


== Iranian Persian ==
==== https://github.com/ispell-lt/ispell-lt ====
License: https://github.com/ispell-lt/ispell-lt/blob/master/COPYING


=== https://github.com/amirshnll/English-Persian-Word-Database ===
Spell checker with dictionary.
License: https://github.com/amirshnll/English-Persian-Word-Database/blob/master/LICENSE


Dictionary.
=== Sanskrit ===


== Assamese ==
==== https://github.com/hemanth/sanskrit-dict/blob/master/dict.js ====
 
License: https://github.com/hemanth/sanskrit-dict/blob/master/license
=== http://www.xobdo.org/downloads/ ===
License: http://www.xobdo.org/downloads/
 
Some of its users: [http://www.xobdo.org/ XOBDO.ORG]


Dictionary.
Dictionary.


== American Sign Language ==
=== Slovak ===


=== http://www.asl-lex.org/ ===
==== http://sk-spell.sk.cx/hunspell-sk ====
License: http://www.asl-lex.org/
License: http://sk-spell.sk.cx/hunspell-sk


Lexicon.
Spell checker with dictionary.


== Interlingua ==
=== Vietnamese ===


=== http://www.denisowski.org/Interlingua/IEDICT/iedict_readme.html ===
==== https://github.com/duyetdev/vietnamese-wordlist ====
License: http://www.denisowski.org/Interlingua/IEDICT/iedict_readme.html
License: https://github.com/duyetdev/vietnamese-wordlist/blob/master/LICENSE


Some of its users: [http://www.denisowski.org/Interlingua/IEDICT/iedict_readme.html IEDICT]
Word list.


Dictionary.
==== https://github.com/duyetdev/vietnamese-namedb ====
License: https://github.com/duyetdev/vietnamese-namedb/blob/master/LICENSE


== Klingon ==
Name list.
 
=== http://klingonska.org/dict/dict.zdb ===
License: http://klingonska.org/dict/
 
Some of its users: [http://klingonska.org/dict/ Klingon Pocket Dictionary]
 
Dictionary.
 
== Hindi ==
 
=== http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/downloaderInfo.php ===
License: http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/index.php
 
Some of its users: [http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/dict_search_user.php Universal Word]
 
Dictionary. Application is required.


== Non-language ==
=== Non-language ===


=== https://unicode.org/ucd/ ===
==== https://unicode.org/ucd/ ====
License: https://www.unicode.org/copyright.html
License: https://www.unicode.org/copyright.html


Some of its users: [https://wiki.gnome.org/action/show/Apps/Gucharmap Gucharmap], [http://www.decodeunicode.org/ decodeunicode], [https://unicode-table.com/en/ Unicode Table], [https://www.fontspace.com/ FontSpace]
Some of its users: https://wiki.gnome.org/action/show/Apps/Gucharmap, http://www.decodeunicode.org/, https://unicode-table.com/en/, https://www.fontspace.com/


Unicode.
Unicode.
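As an illustration of how this data can be read, here is a minimal Python sketch for lines in the style of <code>UnicodeData.txt</code>, where fields are separated by semicolons and the first three fields are the code point (in hexadecimal), the character name and the general category. The function name <code>parse_unicode_data</code> is made up for this example, and the two sample lines stand in for the real file.

```python
def parse_unicode_data(lines):
    """Return {code point: (name, general category)} from UnicodeData.txt-style lines."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split(";")
        code_point = int(fields[0], 16)  # field 0 is hexadecimal
        table[code_point] = (fields[1], fields[2])
    return table


# Two sample lines standing in for the downloaded file.
sample = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;",
]

unicode_table = parse_unicode_data(sample)
print(unicode_table[0x41])
```

Note that large ranges such as the CJK ideographs appear only as a "First" and a "Last" line, so this sketch does not expand every code point in a range.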


=== https://www.cia.gov/library/publications/download/ ===
==== https://www.cia.gov/library/publications/download/ ====
License: https://www.cia.gov/library/publications/the-world-factbook/docs/contributor_copyright.html
License: https://www.cia.gov/library/publications/the-world-factbook/docs/contributor_copyright.html


Some of its users: [https://www.cia.gov/library/publications/resources/the-world-factbook/ The World Factbook]
Some of its users: https://www.cia.gov/library/publications/resources/the-world-factbook/


General facts about countries and regions.
General facts about countries and regions.


=== https://www.geonames.org/ ===
==== https://www.geonames.org/ ====
License: https://www.geonames.org/
License: https://www.geonames.org/


Free gazetteer and postal code data.
Free gazetteer and postal code data.


=== https://iso639-3.sil.org/code_tables/download_tables/ ===
==== https://iso639-3.sil.org/code_tables/download_tables/ ====
License: https://iso639-3.sil.org/code_tables/download_tables/
License: https://iso639-3.sil.org/code_tables/download_tables/


Some of its users: [https://iso639-3.sil.org/code_tables/639/data SIL], [https://polyglotclub.com/ Polyglot Club]
Some of its users: https://iso639-3.sil.org/code_tables/639/data, https://polyglotclub.com/


ISO 639-3 tables. The standard assigns each language a code and is updated every year.
ISO 639-3 tables. The standard assigns each language a code and is updated every year.
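A code table like this can be turned into a lookup dictionary with a few lines of Python. The sketch below assumes a tab-separated file whose first row holds the column names. The column names in the sample (<code>Id</code>, <code>Part1</code>, <code>Ref_Name</code> and so on) are an assumption, so please compare them with the header of the file you actually download. The function name <code>load_code_table</code> is made up for this example.

```python
import csv
import io


def load_code_table(handle, key_column):
    """Index the rows of a tab-separated table (with a header row) by one column."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {row[key_column]: row for row in reader}


# Made-up stand-in for the downloaded file; the column names are an assumption.
sample = io.StringIO(
    "Id\tPart1\tScope\tLanguage_Type\tRef_Name\n"
    "eng\ten\tI\tL\tEnglish\n"
    "epo\teo\tI\tC\tEsperanto\n"
)

codes = load_code_table(sample, "Id")
print(codes["epo"]["Ref_Name"])
```

The same function could index the table by another column, for example the two-letter code, by changing <code>key_column</code>.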


=== https://www.unicode.org/iso15924/codelists.html ===
==== https://www.unicode.org/iso15924/codelists.html ====
License: https://www.unicode.org/copyright.html
License: https://www.unicode.org/copyright.html


Some of its users: [http://www.unicode.org/iso15924/codelists.html Unicode]
Some of its users: http://www.unicode.org/iso15924/codelists.html


ISO 15924 lists. Codes for scripts.
ISO 15924 lists. Codes for scripts.


=== https://www.unece.org/cefact/locode/welcome.html ===
==== https://www.unece.org/cefact/locode/welcome.html ====
License: https://www.unece.org/cefact/locode/locode_since1981.html
License: https://www.unece.org/cefact/locode/locode_since1981.html


UN/LOCODE, an alternative to ISO 3166-2. It is updated twice a year.
UN/LOCODE, an alternative to ISO 3166-2. It is updated twice a year.


=== http://www.nationalanthems.info/ ===
==== http://www.nationalanthems.info/ ====
License: http://www.nationalanthems.info/
License: http://www.nationalanthems.info/


National anthems.
National anthems.
== Formats ==
=== Sheet ===
{| class="wikitable"
!database name with link
!file name
!field separator
!
!field 1
!field 2
!field 3
!field 4
!field 5
!field 6
!field 7
!field 8
!field 9
!field 10
!field 11
!field 12
!field 13
|-
! colspan="15" |dictionary
!
!
|-
|[https://github.com/tomcumming/tocfl-word-list An ordered and extended TOCFL word-list]
|tocfl.tsv
|<tab>
|
|Word
|Pinyin
|OtherPinyin
|Level
|First Translation
|Other Translation
|
|
|
|
|
|
|
|-
|[https://cantonese.org/download.html CC-Canto]
|cccanto-webdist.txt
|<space>
|
|Traditional
|Simplified
|[pin1 yin1]
|{jyut6 ping3}
|/English equivalent 1/equivalent 2/
|
|
|
|
|
|
|
|
|-
|[https://cc-cedict.org/editor/editor.php?handler=Download CC-CEDICT]
|cedict_ts.u8
|<space>
|
|Traditional
|Simplified
|[pin1 yin1]
|/English equivalent 1/equivalent 2/
|
|
|
|
|
|
|
|
|
|-
|[https://chine.in/mandarin/dictionnaire/CFDICT/ CFDICT]
|CFDICT.u8
|<space>
|
|Traditionnel
|Simplifié
|[pin1 yin1]
|/traduction 1/traduction2/
|
|
|
|
|
|
|
|
|
|-
|[https://chdict.zydeo.net/en/download/ CHDICT]
|CHDICT.u8
|<space>
|
|Tradicionális
|Egyszerűsített
|[pin1 yin1]
|/magyar egyenérték 1/ egyenérték 2
|
|
|
|
|
|
|
|
|
|-
|[https://github.com/skywind3000/ECDICT ECDICT]
|ecdict.csv
|,
|
|word
|phonetic
|definition
|translation
|pos
|collins
|oxford
|tag
|bnc
|frq
|exchange
|detail
|audio
|-
|[https://github.com/amirshnll/English-Persian-Word-Database English Persian Word Database]
|EnglishPersianWordDatabase.xlsx
|
|
|EnglishWord
|PersianWord
|
|
|
|
|
|
|
|
|
|
|
|-
|[http://www.denisowski.org/Esperanto/ESPDIC/espdic_readme.html ESPDIC]
|espdict.txt
| :
|
|Esperanto
|English
|
|
|
|
|
|
|
|
|
|
|
|-
|[http://www.handedict.de/chinesisch_deutsch.php?mode=dl&sid=d80e36eefdb05750bd130ae1f322ca09 HanDeDict]
|handedict.u8
|<space>
|
|Traditionel
|Vereinfacht
|[pin1 yin1]
|/deutsche Entsprechung 1 /Entsprechung 2/
|
|
|
|
|
|
|
|
|
|-
|[https://github.com/libhangul/libhangul/tree/master/data/hanja libhangul]
|hanja.txt
|<nowiki>:</nowiki>
|
|Hangul
|Hanja
|note
|
|
|
|
|
|
|
|
|
|
|-
|[http://www.denisowski.org/Interlingua/IEDICT/iedict_readme.html IEDICT]
|iedict.txt
| :
|
|Interlingua
|English
|
|
|
|
|
|
|
|
|
|
|
|-
|[https://www.eki.ee/litsents/vaba/dl.cgi?D=ies Inglise-eesti sõnaraamat]
|eestiinglise.txt
|<tab>
|
|eeste
|inglise
|
|
|
|
|
|
|
|
|
|
|
|-
|[https://www.tanos.co.uk/jlpt/skills/vocab/ JLPT Vocabulary]
|VocabList.N1.doc
VocabList.N2.doc
VocabList.N3.doc
VocabList.N4.doc
VocabList.N5.doc
|
|
|Kanji
|Hiragana
|English
|
|
|
|
|
|
|
|
|
|
|-
|[https://github.com/garfieldnate/kengdic kengdic]
|kengdic_2011.tsv
|<tab>
|
|wordid
|word
|?
|def
|?
|?
|submitter
|doe
|?
|hanja
|?
|?
|
|-
|[http://www.taiwanesedictionary.org/ The Maryknoll Taiwanese-English Dictionary & English-Taiwanese Dictionary 2013 edition]
|Mkdictionary.xls
|
|
|Sort
|Taiwanese
|Chinese
|English
|
|
|
|
|
|
|
|
|
|-
|[http://www.denisowski.org/Vietnamese/vnedict_readme.htm VNEDICT]
|vnedict.txt
| :
|
|Vietnamese
|English
|
|
|
|
|
|
|
|
|
|
|
|-
! colspan="15" |word list
|
|
|-
|[https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217 한국어능력시험 어휘목록]
|토픽 어휘 목록_공개 목록.xlsx
|
|
|수준
|어휘
|길잡이말
|품사
|
|
|
|
|
|
|
|
|
|-
|[https://lingua.mtsu.edu/chinese-computing/statistics/index.html 古汉语单字字频: Character frequency list of Classical Chinese]
|CharFreq-Classical.xls
|
|
|Serial number; 序号
|Character; 汉字
|
|
|
|
|
|
|
|
|
|
|
|-
|[https://lingua.mtsu.edu/chinese-computing/statistics/index.html 现代汉语单字字频: Character frequency list of Modern Chinese]
|CharFreq.txt
|<tab>
|
|Serial number; 序号
|Character; 汉字
|Individual raw frequency; 频率
|Cumulative frequency in percentile; 累计频率
|Pinyin; 拼音
|English translation; 英文翻译
|
|
|
|
|
|
|
|-
|[https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8 通用规范汉字表]
|
|
|
|编号
|字形
|
|
|
|
|
|
|
|
|
|
|
|-
|[https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8 常用國字標準字體表]
|
|
|
|流水序
|教育部字號
|Unicode
|常用字
|
|
|
|
|
|
|
|
|
|-
|[http://www.chinesetest.cn/godownload.do 新汉语水平考试(HSK)词汇(2012年修订版)]
|HSK-2012.xls
|
|
|单词(等级)
|
|
|
|
|
|
|
|
|
|
|
|
|}
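The field separators listed above matter when reading the files in a program. For a comma-separated file such as <code>ecdict.csv</code>, splitting each line on ',' is risky because a quoted field may itself contain commas. A minimal Python sketch using the standard <code>csv</code> module is shown below. The function name <code>csv_to_tsv</code> and the sample rows are made up for illustration, and the sketch assumes the fields themselves contain no tab characters.

```python
import csv
import io


def csv_to_tsv(source, target):
    """Copy comma-separated rows to tab-separated rows, honouring quoted fields."""
    reader = csv.reader(source)
    writer = csv.writer(target, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)


# Made-up sample: the second field of the data row contains a comma inside quotes.
sample_csv = io.StringIO('word,definition\nabandon,"to give up, to leave"\n')
tsv_output = io.StringIO()

csv_to_tsv(sample_csv, tsv_output)
print(tsv_output.getvalue())
```

If a field does contain a tab or a quotation mark, the writer will wrap it in quotes, so check the output before feeding it to tools that expect plain TSV.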
==== Manually convert to TSV ====
{| class="wikitable"
!file name
!process (on Linux)
|-
|cccanto-webdist.txt
|
# [https://stackoverflow.com/questions/8206280/delete-all-lines-beginning-with-a-from-a-file Delete lines starting with '#'];
# [https://stackoverflow.com/questions/47010412/replace-first-space-on-each-line-by-a-tab Replace the first ' ' in each line with '\t'];
# Replace the first ' [' in each line with '\t';
# Replace '] {' with '\t';
# Replace '} /' with '\t';
# Replace ' # adapted from cc-cedict' with <nowiki>''</nowiki>;
# Replace '/\n' with '\n';
# Add 'Traditional\tSimplified\tpin1 yin1\tjyut6 ping3\tEnglish equivalent 1/equivalent 2\n' at the beginning;
|-
|cedict_ts.u8
|
# Delete lines starting with '#';
# Replace the first ' ' in each line with '\t';
# Replace the first ' [' in each line with '\t';
# Replace '] /' with '\t';
# Replace '/\n' with '\n';
# Add 'Traditional\tSimplified\tpin1 yin1\tEnglish equivalent 1/equivalent 2\n' at the beginning;
|-
|CharFreq.txt
|
# Delete lines starting with '/';
# [https://stackoverflow.com/questions/15361632/delete-a-column-with-awk-or-sed Delete fields 3, 4];
# Add '序列号\t汉字\t拼音\t英文翻译' at the beginning;
|-
|CharFreq-Classical.xls
|
# Delete the first row;
# Delete fields 3, 4;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
|-
|CHDICT.u8
|
# Delete lines starting with '#';
# Replace '\n\n' with '\n';
# Replace the first ' ' in each line with '\t';
# Replace the first ' [' in each line with '\t';
# Replace '] /' with '\t';
# Replace '/\n' with '\n';
# Add 'Tradicionális\tEgyszerűsített\tpin1 yin1\tmagyar egyenérték 1/ egyenérték 2\n' at the beginning;
|-
|ecdict.csv
|
# Open with a spreadsheet program;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
|-
|eestiinglise.txt
|
# Add 'eeste\tinglise\n' at the beginning;
|-
|EnglishPersianWordDatabase.xlsx
|
# Open with a spreadsheet program;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
|-
|espdict.txt
|
# Delete the line starting with '#';
# Replace ' : ' with '\t';
# Add 'Esperanto\tEnglish\n' at the beginning;
|-
|handedict.u8
|
# Delete lines starting with '#';
# Replace '\n\n' with '\n';
# Replace the first ' ' in each line with '\t';
# Replace the first ' [' in each line with '\t';
# Replace '] /' with '\t';
# Replace '/\n' with '\n';
# Add 'Traditionel\tVereinfacht\tpin1 yin1\tdeutsche Entsprechung 1/Entsprechung 2\n' at the beginning;
|-
|hanja.txt
|
# Delete lines starting with ' #';
# Replace '<nowiki>:</nowiki>' with '\t';
# Add 'Hangul\tHanja\tnote' at the beginning;
|-
|HSK-2012.xls
|
# Open with a spreadsheet program;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
# Open the new file;
# Replace '(' with '\t';
# Replace ')' with <nowiki>''</nowiki>;
# Add '单词\t等级\n' at the beginning;
|-
|iedict.txt
|
# Delete the line starting with ' #';
# Replace ' : ' with '\t';
# Add 'Interlingua\tEnglish\n' at the beginning;
|-
|kengdic_2011.tsv
|
# Delete fields 1, 3, 5, 6, 7, 8, 9, 11, 12;
# Add 'word\tdef\thanja\n' at the beginning;
|-
|Mkdictionary.xls
|
# Open with a spreadsheet program;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
|-
|tocfl.tsv
|
# Replace '"\t"' with '\t';
# Replace '"\n"' with '\n';
# Replace the first '"' with <nowiki>''</nowiki>;
# Replace the last '"' with <nowiki>''</nowiki>;
|-
|vnedict.txt
|
# Delete the line starting with '#';
# Replace ' : ' with '\t';
# Add 'Vietnamese\tEnglish\n' at the beginning;
|-
|토픽 어휘 목록_공개 목록.xlsx
|
# Open with a spreadsheet program;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
# Click on the other tab of sheet;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
|}
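The steps listed for <code>cedict_ts.u8</code> can also be sketched in Python. Instead of applying the replacements one after another, this sketch matches each entry line with a single regular expression, which is a different technique that aims at the same result. The function name <code>cedict_to_tsv</code> is made up for illustration, and lines that do not match the expected pattern are simply skipped, which is an assumption about how you want to treat them.

```python
import re

# Traditional, Simplified, [pinyin], /equivalent 1/equivalent 2/
ENTRY = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/\s*$")
HEADER = "Traditional\tSimplified\tpin1 yin1\tEnglish equivalent 1/equivalent 2"


def cedict_to_tsv(lines):
    """Convert CC-CEDICT-style lines to tab-separated rows, header first."""
    rows = [HEADER]
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # drop comment lines and blank lines
        match = ENTRY.match(line)
        if match is None:
            continue  # assumption: skip lines that do not fit the pattern
        rows.append("\t".join(match.groups()))
    return rows


# Made-up sample lines in the style of the file.
sample_lines = [
    "# this is a comment line\n",
    "中國 中国 [Zhong1 guo2] /China/Middle Kingdom/\n",
]

rows = cedict_to_tsv(sample_lines)
print("\n".join(rows))
```

The files CFDICT.u8, CHDICT.u8 and handedict.u8 follow the same steps in the table above, so a similar pattern with a different header line may work for them. cccanto-webdist.txt has an extra {jyut6 ping3} field and would need an adapted pattern.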
=== Others ===
{| class="wikitable"
!database name with link
!format
|-
|[https://freedict.org/downloads/ FreeDict]
|slob
|-
|[http://www.informatik.uni-leipzig.de/~duc/Dict/install.html Free Vietnamese Dictionary Project]
|dict.dz
|-
|[http://www.xobdo.org/downloads/ XOBDO.ORG]
|db
|}
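For the <code>dict.dz</code> format, it may help to know that dictzip files are compatible with gzip, so the Python <code>gzip</code> module can decompress them. The sketch below writes a small gzip file as a stand-in and reads it back. The function name <code>read_dictzip_text</code>, the sample file name and the UTF-8 encoding are assumptions made for illustration.

```python
import gzip
import os
import tempfile


def read_dictzip_text(path, encoding="utf-8"):
    """Decompress a gzip-compatible .dict.dz file and return its text."""
    with gzip.open(path, "rt", encoding=encoding) as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in file: a plain gzip file with made-up content.
    sample_path = os.path.join(tmp, "sample.dict.dz")
    with gzip.open(sample_path, "wt", encoding="utf-8") as handle:
        handle.write("xin chào\nhello\n")
    dict_text = read_dictzip_text(sample_path)

print(dict_text)
```

The decompressed text is usually just the definitions written one after another. Finding a particular headword normally requires the companion index file that comes with the dictionary.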
[[Category:Free-Resources]]
[[Category:Computer-Knowledge]]
==Other Lessons==
* [[Language/Multiple-languages/Culture/Wiki-Notice-Board|Wiki Notice Board]]
* [[Language/Multiple-languages/Culture/Cultural-differences-by-country|Cultural differences by country]]
* [[Language/Multiple-languages/Culture/Most-Famous-Non–Contemporary-Artists|Most Famous Non–Contemporary Artists]]
* [[Language/Multiple-languages/Culture/IRFP-in-brief|IRFP in brief]]
* [[Language/Multiple-languages/Culture/Introduction-to-Sci–Tech-Index|Introduction to Sci–Tech Index]]
* [[Language/Multiple-languages/Culture/Online-Specialized-Dictionaries|Online Specialized Dictionaries]]
* [[Language/Multiple-languages/Culture/How-to-contribute-to-wiki-lessons-(FAQ)|How to contribute to wiki lessons (FAQ)]]
* [[Language/Multiple-languages/Culture/Cities-with-the-best-quality-of-life|Cities with the best quality of life]]
* [[Language/Multiple-languages/Culture/Techniques-for-learning-languages|Techniques for learning languages]]
* [[Language/Multiple-languages/Culture/Countries-and-Flag-Emoji-by-Languages|Countries and Flag Emoji by Languages]]
<span links></span>

Latest revision as of 22:17, 26 March 2023

Licensed-Free Databases Around Languages
Hi polyglots! 😀

➡ On this page we have listed free databases related to languages.

  • The listed items are data, so if you don't know programming, this page might not be of much help to you.

Main

Multiple languages

https://www.ethnologue.com/codes/download-code-tables

LanguageCodes.tab lists the 7,400+ distinct language identifiers used in the current Ethnologue database.

https://dumps.wikimedia.org/

License: https://dumps.wikimedia.org/legal.html

Some of its users: https://www.wikimedia.org/

Database dumps of the Wikimedia projects.

https://tatoeba.org/eng/downloads/

License: https://tatoeba.org/eng/downloads/

Some of its users: https://tatoeba.org/, http://www.listeningpractice.org/, https://jisho.org/

Parallel corpora, that is, collections of sentences with their translations in different languages.

https://wiki.documentfoundation.org/Language_support_of_LibreOffice

License: https://wiki.documentfoundation.org/Language_support_of_LibreOffice

Some of its users: https://www.libreoffice.org/

Spell check dictionaries and other language support data.

http://www.gutenberg.org/wiki/Gutenberg:Information_About_Robot_Access_to_our_Pages

License: http://www.gutenberg.org/wiki/Gutenberg:Terms_of_Use

Some of its users: http://www.gutenberg.org/, https://librivox.org/

Ebooks.

https://librivox.org/pages/about-librivox/

License: https://librivox.org/pages/about-librivox/

Some of its users: https://librivox.org/, http://www.listeningpractice.org/

Audio books.

http://www.omegawiki.org/Help:Downloading_the_data

License: http://www.omegawiki.org/Meta:Main_Page

Some of its users: http://www.omegawiki.org/Meta:Main_Page, http://dictionarymid.sourceforge.net/

Dictionaries.

https://ltrc.iiit.ac.in/onlineServices/Dictionaries/Dict_Frame.html

License: https://ltrc.iiit.ac.in/onlineServices/Dictionaries/GPLHelp.html

Dictionaries for South Asian languages and English.

http://compling.hss.ntu.edu.sg/omw/

License: http://compling.hss.ntu.edu.sg/omw/

Some of its users: http://compling.hss.ntu.edu.sg/omw/cgi-bin/wn-gridx.cgi?gridmode=grid

Wordnets.

http://www.dicto.org.ru/xdxf.html

License: http://dicto.org.ru/license.html

Some of its users: http://dicto.org.ru/

Repository of dictionaries collected from other sources.

http://shtooka.net/download.php

License: http://shtooka.net/

Collections of audio.

https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists

License: https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists

Frequency lists.

https://lego.linguistlist.org/about#contact

License: https://lego.linguistlist.org/about#copyright

Some of its users: https://lego.linguistlist.org/

Lexicon. No download link on the website.

https://panlex.org/source-list/

License: https://panlex.org/license/

Some of its users: https://glosbe.com

Lexical database links.

https://github.com/cburgmer/cjklib

License: https://github.com/cburgmer/cjklib/blob/master/COPYING

Some of its users: https://www.skishore.me/makemeahanzi/

Data about Han script.

https://www.radio-browser.info/gui/#!/

License: https://www.radio-browser.info/gui/#!/

Some of its users: https://github.com/segler-alex/RadioDroid

Database of radio stations.

https://help.archive.org/hc/en-us/articles/360017781111-How-to-download-files-

License: https://www.archive.org/about/terms.php

Some of its users: https://www.archive.org/

Archived Internet content.

https://www.fandom.com/

License: https://www.fandom.com/licensing

Fan-made wiki.

Chinese

http://lingua.mtsu.edu/chinese-computing/

License: http://lingua.mtsu.edu/chinese-computing/copyright.html

Character frequency lists.

https://github.com/gwinterstein/Cifu

License: https://github.com/gwinterstein/Cifu/blob/master/LICENSE

Word frequency list for Yue Chinese.

https://www.tanos.co.uk/hsk/

License: https://www.tanos.co.uk/jlpt/sharing/

HSK data.

http://www.hskhsk.com/resources.html

License: http://www.hskhsk.com/resources.html

HSK data.

https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8

License: https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8

Frequent characters.

https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8

License: https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8

Frequent characters.

http://input.foruto.com/ccc/gongbiu/index.htm

License:

Frequent characters.

English

http://gcide.gnu.org.ua/download

License: http://gcide.gnu.org.ua/license

Some of its users: http://gcide.gnu.org.ua/

Dictionary of definitions.

https://foldoc.org/source.html

License: https://foldoc.org/Free+On-line+Dictionary

Some of its users: https://foldoc.org/

Dictionary about computing.

https://github.com/tony-mak/Eng-Chi-Dictionary/tree/master/app/src/main/assets/databases

License: https://github.com/tony-mak/Eng-Chi-Dictionary/blob/master/LICENSE

Dictionary.

https://github.com/linuxkathirvel/eng2tamildictionary/blob/master/dictionary.json

License: https://github.com/linuxkathirvel/eng2tamildictionary/blob/master/License.txt

Dictionary.

https://github.com/derekchuank/high-frequency-vocabulary

License: https://github.com/derekchuank/high-frequency-vocabulary/blob/master/LICENSE

Dictionary.

https://github.com/kujirahand/EJDict/tree/master/src

License: https://github.com/kujirahand/EJDict/blob/master/LICENSE

Dictionary.

Hindi

http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/downloaderInfo.php

License: http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/index.php

Some of its users: http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/dict_search_user.php

Dictionary. Downloading requires submitting an application.

Icelandic

https://www.ling.upenn.edu/~kurisuto/germanic/oi_cleasbyvigfusson_about.html

License: http://lexicon.ff.cuni.cz/txt/oi_cleasbyvigfusson.txt

Dictionary.

Interlingue

https://github.com/Carmina16/hunspell-ie

License: https://github.com/Carmina16/hunspell-ie/blob/master/LICENSE

Spell checker with dictionary.

Japanese

https://github.com/KanjiVG/kanjivg/releases/

License: http://kanjivg.tagaini.net/

Some of its users: https://www.tagaini.net/, https://jisho.org/

Kanji strokes.

https://github.com/mifunetoshiro/kanjium

License: https://github.com/mifunetoshiro/kanjium/blob/master/LICENSE.txt

Kanji data.

https://www.tanos.co.uk/jlpt/

License: https://www.tanos.co.uk/jlpt/sharing/

JLPT data.

https://ja.wiktionary.org/wiki/%E4%BB%98%E9%8C%B2:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7

License: http://www.bunka.go.jp/bunkacho_homepage/index.html

Frequent characters.

https://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7

License: http://www.moj.go.jp/term.html

Frequent characters for names.

https://ja.wikipedia.org/wiki/%E5%AD%A6%E5%B9%B4%E5%88%A5%E6%BC%A2%E5%AD%97%E9%85%8D%E5%BD%93%E8%A1%A8

License: http://www.mext.go.jp/b_menu/about_link.htm

Frequent characters according to school grades.

Korean

https://github.com/libhangul/libhangul/blob/master/data/hanja/hanja.txt

License: https://github.com/libhangul/libhangul/blob/master/data/hanja/hanja.txt

Words in Hangul and Hanja.

An introduction page is available at https://wiki.kldp.org/wiki.php/libhangul.

https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800

License: http://www.suneung.re.kr/sub/info.do?m=0601&s=suneung

http://www.suneung.re.kr/boardCnts/fileDown.do?fileSeq=59692112e521efa80d2af27916704082 in an easy-to-copy form.

https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217

License: https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110702

Word list of TOPIK.

Lithuanian

https://github.com/ispell-lt/ispell-lt

License: https://github.com/ispell-lt/ispell-lt/blob/master/COPYING

Spell checker with dictionary.

Sanskrit

https://github.com/hemanth/sanskrit-dict/blob/master/dict.js

License: https://github.com/hemanth/sanskrit-dict/blob/master/license

Dictionary.

Slovak

http://sk-spell.sk.cx/hunspell-sk

License: http://sk-spell.sk.cx/hunspell-sk

Spell checker with dictionary.

Vietnamese

https://github.com/duyetdev/vietnamese-wordlist

License: https://github.com/duyetdev/vietnamese-wordlist/blob/master/LICENSE

Word list.

https://github.com/duyetdev/vietnamese-namedb

License: https://github.com/duyetdev/vietnamese-namedb/blob/master/LICENSE

Name list.

Non-language

https://unicode.org/ucd/

License: https://www.unicode.org/copyright.html

Some of its users: https://wiki.gnome.org/action/show/Apps/Gucharmap, http://www.decodeunicode.org/, https://unicode-table.com/en/, https://www.fontspace.com/

Unicode.

https://www.cia.gov/library/publications/download/

License: https://www.cia.gov/library/publications/the-world-factbook/docs/contributor_copyright.html

Some of its users: https://www.cia.gov/library/publications/resources/the-world-factbook/

General facts about countries and regions.

https://www.geonames.org/

License: https://www.geonames.org/

Free gazetteer and postal code data.

https://iso639-3.sil.org/code_tables/download_tables/

License: https://iso639-3.sil.org/code_tables/download_tables/

Some of its users: https://iso639-3.sil.org/code_tables/639/data, https://polyglotclub.com/

ISO 639-3 tables. The standard assigns each language a code and is updated every year.

https://www.unicode.org/iso15924/codelists.html

License: https://www.unicode.org/copyright.html

Some of its users: http://www.unicode.org/iso15924/codelists.html

ISO 15924 lists. Codes for scripts.

https://www.unece.org/cefact/locode/welcome.html

License: https://www.unece.org/cefact/locode/locode_since1981.html

UN/LOCODE, an alternative to ISO 3166-2. It is updated twice a year.

http://www.nationalanthems.info/

License: http://www.nationalanthems.info/

National anthems.

Formats

Sheet

{| class="wikitable"
!database name with link !! file name !! field separator !! field 1 !! field 2 !! field 3 !! field 4 !! field 5 !! field 6 !! field 7 !! field 8 !! field 9 !! field 10 !! field 11 !! field 12 !! field 13
|-
! colspan="16" |dictionary
|-
|An ordered and extended TOCFL word-list || tocfl.tsv || <tab> || Word || Pinyin || OtherPinyin || Level || First Translation || Other Translation
|-
|CC-Canto || cccanto-webdist.txt || <space> || Traditional || Simplified || [pin1 yin1] || {jyut6 ping3} || /English equivalent 1/equivalent 2/
|-
|CC-CEDICT || cedict_ts.u8 || <space> || Traditional || Simplified || [pin1 yin1] || /English equivalent 1/equivalent 2/
|-
|CFDICT || CFDICT.u8 || <space> || Traditionnel || Simplifié || [pin1 yin1] || /traduction 1/traduction2/
|-
|CHDICT || CHDICT.u8 || <space> || Tradicionális || Egyszerűsített || [pin1 yin1] || /magyar egyenérték 1/ egyenérték 2
|-
|ECDICT || ecdict.csv || , || word || phonetic || definition || translation || pos || collins || oxford || tag || bnc || frq || exchange || detail || audio
|-
|English Persian Word Database || EnglishPersianWordDatabase.xlsx || || EnglishWord || PersianWord
|-
|ESPDIC || espdict.txt || : || Esperanto || English
|-
|HanDeDict || handedict.u8 || <space> || Traditionel || Vereinfacht || [pin1 yin1] || /deutsche Entsprechung 1 /Entsprechung 2/
|-
|libhangul || hanja.txt || : || Hangul || Hanja || note
|-
|IEDICT || iedict.txt || : || Interlingua || English
|-
|Inglise-eesti sõnaraamat || eestiinglise.txt || <tab> || eeste || inglise
|-
|JLPT Vocabulary || VocabList.N1.doc<br>VocabList.N2.doc<br>VocabList.N3.doc<br>VocabList.N4.doc<br>VocabList.N5.doc || || Kanji || Hiragana || English
|-
|kengdic || kengdic_2011.tsv || <tab> || wordid || word || ? || def || ? || ? || submitter || doe || ? || hanja || ? || ?
|-
|The Maryknoll Taiwanese-English Dictionary & English-Taiwanese Dictionary 2013 edition || Mkdictionary.xls || || Sort || Taiwanese || Chinese || English
|-
|VNEDICT || vnedict.txt || : || Vietnamese || English
|-
! colspan="16" |word list
|-
|한국어능력시험 어휘목록 || 토픽 어휘 목록_공개 목록.xlsx || || 수준 || 어휘 || 길잡이말 || 품사
|-
|古汉语单字字频: Character frequency list of Classical Chinese || CharFreq-Classical.xls || || Serial number; 序号 || Character; 汉字
|-
|现代汉语单字字频: Character frequency list of Modern Chinese || CharFreq.txt || <tab> || Serial number; 序号 || Character; 汉字 || Individual raw frequency; 频率 || Cumulative frequency in percentile; 累计频率 || Pinyin; 拼音 || English translation; 英文翻译
|-
|通用规范汉字表 || || || 编号 || 字形
|-
|常用國字標準字體表 || || || 流水序 || 教育部字號 || Unicode || 常用字
|-
|新汉语水平考试(HSK)词汇(2012年修订版) || HSK-2012.xls || || 单词(等级)
|}
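
The bracketed layout shared by cedict_ts.u8, CFDICT.u8, CHDICT.u8 and handedict.u8 (Traditional, Simplified, [pinyin], /gloss 1/gloss 2/) can also be split with a regular expression instead of by hand. The following is a minimal Python sketch under that assumption; the function name parse_cedict_line and the sample line are made up for illustration and are not taken from any of the files.

```python
import re

# Layout assumed from the table: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
CEDICT_LINE = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/\s*$")


def parse_cedict_line(line):
    """Return (traditional, simplified, pinyin, glosses), or None for
    comments, blank lines and lines that do not fit the layout."""
    if not line.strip() or line.startswith("#"):
        return None
    match = CEDICT_LINE.match(line)
    if match is None:
        return None
    traditional, simplified, pinyin, glosses = match.groups()
    return traditional, simplified, pinyin, glosses.split("/")


# Illustrative line written for this example.
sample = "中國 中国 [Zhong1 guo2] /China/Middle Kingdom/\n"
print(parse_cedict_line(sample))
```

cccanto-webdist.txt has an extra {jyut6 ping3} field, so the pattern would need one more group for that file.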

Manually convert to TSV

file name process (on Linux)
cccanto-webdist.txt
  1. Delete lines starting with '#';
  2. Replace the first ' ' in each line with '\t';
  3. Replace the first ' [' in each line with '\t';
  4. Replace '] {' with '\t';
  5. Replace '} /' with '\t';
  6. Replace ' # adapted from cc-cedict' with '';
  7. Replace '/\n' with '\n';
  8. Add 'Traditional\tSimplified\tpin1 yin1\tjyut6 ping3\tEnglish equivalent 1/equivalent 2\n' at the beginning;
cedict_ts.u8
  1. Delete lines starting with '#';
  2. Replace the first ' ' in each line with '\t';
  3. Replace the first ' [' in each line with '\t';
  4. Replace '] /' with '\t';
  5. Replace '/\n' with '\n';
  6. Add 'Traditional\tSimplified\tpin1 yin1\tEnglish equivalent 1/equivalent 2\n' at the beginning;
CharFreq.txt
  1. Delete lines starting with '/';
  2. Delete fields 3, 4;
  3. Add '序列号\t汉字\t拼音\t英文翻译\n' at the beginning;
CharFreq-Classical.xls
  1. Delete the first row;
  2. Delete fields 3, 4;
  3. Save as TSV file or save as CSV file and select '<tab>' as field separator;
CHDICT.u8
  1. Delete lines starting with '#';
  2. Replace '\n\n' with '\n';
  3. Replace the first ' ' in each line with '\t';
  4. Replace the first ' [' in each line with '\t';
  5. Replace '] /' with '\t';
  6. Replace '/\n' with '\n';
  7. Add 'Tradicionális\tEgyszerűsített\tpin1 yin1\tmagyar egyenérték 1/ egyenérték 2\n' at the beginning;
ecdict.csv
  1. Open with a spreadsheet program;
  2. Save as TSV file or save as CSV file and select '<tab>' as field separator;
eestiinglise.txt
  1. Add 'eeste\tinglise\n' at the beginning;
EnglishPersianWordDatabase.xlsx
  1. Open with a spreadsheet program;
  2. Save as TSV file or save as CSV file and select '<tab>' as field separator;
espdict.txt
  1. Delete the line starting with '#';
  2. Replace ' : ' with '\t';
  3. Add 'Esperanto\tEnglish\n' at the beginning;
handedict.u8
  1. Delete lines starting with '#';
  2. Replace '\n\n' with '\n';
  3. Replace the first ' ' in each line with '\t';
  4. Replace the first ' [' in each line with '\t';
  5. Replace '] /' with '\t';
  6. Replace '/\n' with '\n';
  7. Add 'Traditionel\tVereinfacht\tpin1 yin1\tdeutsche Entsprechung 1/Entsprechung 2\n' at the beginning;
hanja.txt
  1. Delete lines starting with ' #';
  2. Replace ':' with '\t';
  3. Add 'Hangul\tHanja\tnote\n' at the beginning;
HSK-2012.xls
  1. Open with a spreadsheet program;
  2. Save as TSV file or save as CSV file and select '<tab>' as field separator;
  3. Open the new file;
  4. Replace '(' with '\t';
  5. Replace ')' with '';
  6. Add '单词\t等级\n' at the beginning;
iedict.txt
  1. Delete the line starting with ' #';
  2. Replace ' : ' with '\t';
  3. Add 'Interlingua\tEnglish\n' at the beginning;
kengdic_2011.tsv
  1. Delete fields 1, 3, 5, 6, 7, 8, 9, 11, 12;
  2. Add 'word\tdef\thanja\n' at the beginning;
Mkdictionary.xls
  1. Open with a spreadsheet program;
  2. Save as TSV file or save as CSV file and select '<tab>' as field separator;
tocfl.tsv
  1. Replace '"\t"' with '\t';
  2. Replace '"\n"' with '\n';
  3. Replace the first '"' with '';
  4. Replace the last '"' with '';
vnedict.txt
  1. Delete the line starting with '#';
  2. Replace ' : ' with '\t';
  3. Add 'Vietnamese\tEnglish\n' at the beginning;
토픽 어휘 목록_공개 목록.xlsx
  1. Open with a spreadsheet program;
  2. Save as TSV file or save as CSV file and select '<tab>' as field separator;
  3. Switch to the other sheet tab;
  4. Save as TSV file or save as CSV file and select '<tab>' as field separator;
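
The three-step recipe shared by espdict.txt, iedict.txt and vnedict.txt (drop the comment line, turn ' : ' into a tab, add a header) can be scripted. This is a minimal Python sketch; the function name colon_dict_to_tsv and the sample entries are made up for illustration. Unlike a plain search-and-replace, it splits only at the first ' : ' in each line, so a definition that itself contains ' : ' stays in one column.

```python
def colon_dict_to_tsv(lines, header):
    """Convert 'headword : definition' lines into tab-separated text."""
    rows = ["\t".join(header)]
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines (with or without leading spaces).
        if not line or line.lstrip().startswith("#"):
            continue
        headword, separator, definition = line.partition(" : ")
        if not separator:
            continue  # the line has no ' : ' separator
        rows.append(f"{headword}\t{definition}")
    return "\n".join(rows) + "\n"


# Illustrative entries written for this example.
sample = ["# comment line\n", "saluton : hello\n", "akvo : water\n"]
print(colon_dict_to_tsv(sample, ("Esperanto", "English")), end="")
```

For a real file, the lines could come from open("espdict.txt", encoding="utf-8"), assuming the file is UTF-8 encoded.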

Others

{| class="wikitable"
!database name with link
!format
|-
|FreeDict
|slob
|-
|Free Vietnamese Dictionary Project
|dict.dz
|-
|XOBDO.ORG
|db
|}

Other Lessons