Editing Language/Multiple-languages/Culture/Licensed-Free-Databases

<div class="pg_page_title">Licensed-Free Databases Around Languages</div>
[[File:best-licensed-free-databases-polyglotclub.jpg|thumb]]
Hi polyglots! 😀

➡ On this page we have listed free databases related to languages.

* Those mentioned on [[Language/Multiple-languages/Culture/Internet-Dictionaries|Internet Dictionaries]] will not be mentioned again here.
* The listed items are data, not applications, so if you don't know programming, this page might not be of much help to you.

== Main ==

=== Multiple languages ===
====https://www.ethnologue.com/codes/download-code-tables<nowiki/>====
LanguageCodes.tab lists the 7,400+ distinct language identifiers used in the current Ethnologue database.


==== https://dumps.wikimedia.org/ ====
License: https://dumps.wikimedia.org/legal.html

Some of its users: https://www.wikimedia.org/

Database dumps of Wikimedia projects.


==== https://tatoeba.org/eng/downloads/ ====
License: https://tatoeba.org/eng/downloads/

Some of its users: https://tatoeba.org/, http://www.listeningpractice.org/, https://jisho.org/

Parallel corpora: in plain terms, collections of the same sentence in different languages.

==== https://iate.europa.eu/download-iate/ ====
License: https://iate.europa.eu/download-iate/

Terminology dictionary of the EU.

==== https://wiki.documentfoundation.org/Language_support_of_LibreOffice ====
License: https://wiki.documentfoundation.org/Language_support_of_LibreOffice

Some of its users: https://www.libreoffice.org/

You can find the “Spell check dictionaries” and other useful things.


==== http://www.gutenberg.org/wiki/Gutenberg:Information_About_Robot_Access_to_our_Pages ====
License: http://www.gutenberg.org/wiki/Gutenberg:Terms_of_Use

Some of its users: http://www.gutenberg.org/, https://librivox.org/

Ebooks.

==== https://librivox.org/pages/about-librivox/ ====
License: https://librivox.org/pages/about-librivox/

Some of its users: https://librivox.org/, http://www.listeningpractice.org/

Audio books.


==== http://www.omegawiki.org/Help:Downloading_the_data ====
License: http://www.omegawiki.org/Meta:Main_Page

Some of its users: http://www.omegawiki.org/Meta:Main_Page, http://dictionarymid.sourceforge.net/

Dictionaries.

==== https://ltrc.iiit.ac.in/onlineServices/Dictionaries/Dict_Frame.html ====
License: https://ltrc.iiit.ac.in/onlineServices/Dictionaries/GPLHelp.html

Dictionaries for South Asian languages and English.

==== http://compling.hss.ntu.edu.sg/omw/ ====
License: http://compling.hss.ntu.edu.sg/omw/

Some of its users: http://compling.hss.ntu.edu.sg/omw/cgi-bin/wn-gridx.cgi?gridmode=grid

Wordnets.

==== http://www.dicto.org.ru/xdxf.html ====
License: http://dicto.org.ru/license.html

Some of its users: http://dicto.org.ru/

Repository of dictionaries (from elsewhere).

==== http://shtooka.net/download.php ====
License: http://shtooka.net/

Collections of audio.

==== https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists ====
License: https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists

Frequency lists.


==== https://lego.linguistlist.org/about#contact ====
License: https://lego.linguistlist.org/about#copyright

Some of its users: https://lego.linguistlist.org/

Lexicon. No download link on the website.

==== https://panlex.org/source-list/ ====
License: https://panlex.org/license/

Some of its users: https://glosbe.com

Lexical database links. Not every link has a license.

==== https://github.com/cburgmer/cjklib ====
License: https://github.com/cburgmer/cjklib/blob/master/COPYING

Some of its users: https://www.skishore.me/makemeahanzi/

Data about Han script.

==== https://www.radio-browser.info/gui/#!/ ====
License: https://www.radio-browser.info/gui/#!/

Some of its users: https://github.com/segler-alex/RadioDroid

Database of radio stations.

==== https://help.archive.org/hc/en-us/articles/360017781111-How-to-download-files- ====
License: https://www.archive.org/about/terms.php

Some of its users: https://www.archive.org/

Archived Internet content.


==== https://www.fandom.com/ ====
License: https://www.fandom.com/licensing

Fan-made wiki.

=== Chinese ===

==== http://lingua.mtsu.edu/chinese-computing/ ====
License: http://lingua.mtsu.edu/chinese-computing/copyright.html

Character frequency lists.

==== https://github.com/gwinterstein/Cifu ====
License: https://github.com/gwinterstein/Cifu/blob/master/LICENSE

Word frequency list for Yue Chinese.

==== https://www.tanos.co.uk/hsk/ ====
License: https://www.tanos.co.uk/jlpt/sharing/

HSK data.

==== http://www.hskhsk.com/resources.html ====
License: http://www.hskhsk.com/resources.html

HSK data.

==== https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8 ====
License: https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8

Frequent characters.

==== https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8 ====
License: https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8

Frequent characters.

==== http://input.foruto.com/ccc/gongbiu/index.htm ====
License:

Frequent characters.

==== https://resources.publicense.moe.edu.tw/index.html ====
License: https://resources.publicense.moe.edu.tw/index.html

Some of its users: [https://www.moedict.tw/ 萌典]

Dictionaries of ROC Mandarin Chinese written in ROC Mandarin Chinese.

==== https://cc-cedict.org/editor/editor.php ====
License: https://cc-cedict.org/wiki/

Some of its users: [https://www.mdbg.net/chinese/dictionary MDBG], [https://www.pleco.com/ Pleco]

Mandarin-English dictionary.

==== https://chine.in/mandarin/dictionnaire/CFDICT/ ====
License: https://chine.in/mandarin/dictionnaire/CFDICT/

Some of its users: [https://chine.in/ Chine Informations], [https://www.pleco.com/ Pleco]

Mandarin-French dictionary.

==== http://www.handedict.de/chinesisch_deutsch.php ====
License: http://www.handedict.de/chinesisch_deutsch.php?mode=dl&sid=51394be2b6d9cba75946e929b5477d55

Some of its users: [http://www.handedict.de/ HanDeDict], [https://www.pleco.com/ Pleco]

Mandarin-German dictionary.

==== https://chdict.zydeo.net/en/download/ ====
License: https://chdict.zydeo.net/en/download/

Some of its users: [https://chdict.zydeo.net/hu/ CHDICT]

Mandarin-Hungarian dictionary.

==== http://cantonese.org/download.html ====
License: http://cantonese.org/download.html

Some of its users: [http://cantonese.org/ CC-Canto], [https://www.pleco.com/ Pleco]

Cantonese-English dictionary.

==== https://twblg.dict.edu.tw/holodict_new/compile1_6_1.jsp ====
License: https://twblg.dict.edu.tw/holodict_new/compile1_6_1.jsp

Some of its users: [https://www.moedict.tw/ 萌典]

Taiwanese-English dictionary. An application must be filed before it can be downloaded.

==== http://www.taiwanesedictionary.org/ ====
License: http://www.taiwanesedictionary.org/

Taiwanese-English dictionary.

=== English ===

==== http://gcide.gnu.org.ua/download ====
License: http://gcide.gnu.org.ua/license

Some of its users: http://gcide.gnu.org.ua/

Dictionary of definitions.

==== https://foldoc.org/source.html ====
License: https://foldoc.org/Free+On-line+Dictionary

Some of its users: https://foldoc.org/

Dictionary about computing.

==== https://github.com/tony-mak/Eng-Chi-Dictionary/tree/master/app/src/main/assets/databases ====
License: https://github.com/tony-mak/Eng-Chi-Dictionary/blob/master/LICENSE

Dictionary.

==== https://github.com/linuxkathirvel/eng2tamildictionary/blob/master/dictionary.json ====
License: https://github.com/linuxkathirvel/eng2tamildictionary/blob/master/License.txt

Dictionary.

==== https://github.com/derekchuank/high-frequency-vocabulary ====
License: https://github.com/derekchuank/high-frequency-vocabulary/blob/master/LICENSE

Dictionary.

==== https://github.com/kujirahand/EJDict/tree/master/src ====
License: https://github.com/kujirahand/EJDict/blob/master/LICENSE

Dictionary.

==== https://wordnet.princeton.edu/download/current-version/ ====
License: https://wordnet.princeton.edu/license-and-commercial-use/

Lexical database.

=== Hindi ===

==== http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/downloaderInfo.php ====
License: http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/index.php

Some of its users: http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/dict_search_user.php

Dictionary. Application is required.

=== Icelandic ===

==== https://www.ling.upenn.edu/~kurisuto/germanic/oi_cleasbyvigfusson_about.html ====
License: http://lexicon.ff.cuni.cz/txt/oi_cleasbyvigfusson.txt

Dictionary.

=== Interlingue ===

==== https://github.com/Carmina16/hunspell-ie ====
License: https://github.com/Carmina16/hunspell-ie/blob/master/LICENSE

Spell checker with dictionary.

=== Japanese ===

==== http://www.edrdg.org/wiki/index.php/Main_Page ====
License: https://www.edrdg.org/edrdg/licence.html

Some of its users: [https://jisho.org/ Jisho.org], [https://www.tagaini.net/ Tagaini Jisho]

Japanese dictionaries.

==== https://joyokanji.info/ ====
License: http://www.bunka.go.jp/bunkacho_homepage/index.html, http://www.mext.go.jp/b_menu/about_link.htm, http://www.moj.go.jp/term.html

[http://www.bunka.go.jp/seisaku/bunkashingikai/kokugo/kokugo/kokugo_45/pdf/jouyoukanjihyou_h22.pdf Jōyō Kanji], [http://www.moj.go.jp/content/001131003.pdf Jinmeiyō Kanji], [http://www.mext.go.jp/a_menu/shotou/new-cs/youryou/syo/koku/001.htm Kyōiku Kanji] in an easy-to-copy form.
 
==== https://github.com/KanjiVG/kanjivg/releases/ ====
License: http://kanjivg.tagaini.net/
 
Some of its users: https://www.tagaini.net/, https://jisho.org/
 
Kanji strokes.
 
==== https://github.com/mifunetoshiro/kanjium ====
License: https://github.com/mifunetoshiro/kanjium/blob/master/LICENSE.txt
 
Kanji data.
 
==== https://www.tanos.co.uk/jlpt/ ====
License: https://www.tanos.co.uk/jlpt/sharing/
 
JLPT data.
 
==== https://ja.wiktionary.org/wiki/%E4%BB%98%E9%8C%B2:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7 ====
License: http://www.bunka.go.jp/bunkacho_homepage/index.html
 
Frequent characters.
 
==== https://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7 ====
License: http://www.moj.go.jp/term.html
 
Frequent characters for names.
 
==== https://ja.wikipedia.org/wiki/%E5%AD%A6%E5%B9%B4%E5%88%A5%E6%BC%A2%E5%AD%97%E9%85%8D%E5%BD%93%E8%A1%A8 ====
License: http://www.mext.go.jp/b_menu/about_link.htm
 
Frequent characters according to school grades.
 
=== Korean ===
 
==== https://github.com/libhangul/libhangul/blob/master/data/hanja/hanja.txt ====
License: https://github.com/libhangul/libhangul/blob/master/data/hanja/hanja.txt
 
Words in Hangul and Hanja.


There is a page of introduction: https://wiki.kldp.org/wiki.php/libhangul.

==== https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800 ====
License: http://www.suneung.re.kr/sub/info.do?m=0601&s=suneung

[http://www.suneung.re.kr/boardCnts/fileDown.do?fileSeq=59692112e521efa80d2af27916704082 Hanmun gyoyukyong gicho Hanja] in an easy-to-copy form.


==== https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217 ====
License: https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110702

Word list of TOPIK.

=== Lithuanian ===

==== https://github.com/ispell-lt/ispell-lt ====
License: https://github.com/ispell-lt/ispell-lt/blob/master/COPYING

Spell checker with dictionary.

=== Sanskrit ===

==== https://github.com/hemanth/sanskrit-dict/blob/master/dict.js ====
License: https://github.com/hemanth/sanskrit-dict/blob/master/license

Dictionary.

=== Slovak ===

==== http://sk-spell.sk.cx/hunspell-sk ====
License: http://sk-spell.sk.cx/hunspell-sk

Spell checker with dictionary.

=== Vietnamese ===

==== https://github.com/duyetdev/vietnamese-wordlist ====
License: https://github.com/duyetdev/vietnamese-wordlist/blob/master/LICENSE

Word list.

==== https://github.com/duyetdev/vietnamese-namedb ====
License: https://github.com/duyetdev/vietnamese-namedb/blob/master/LICENSE

Name list.

==== http://www.informatik.uni-leipzig.de/~duc/Dict/install.html ====
License: http://www.informatik.uni-leipzig.de/~duc/Dict/install.html

Some of its users: [http://www.informatik.uni-leipzig.de/~duc/Dict/install.html TuDienHND]

Dictionaries in several languages: English, Vietnamese, French, German, Norwegian.
 
=== Non-language ===
 
==== https://unicode.org/ucd/ ====
License: https://www.unicode.org/copyright.html
 
Some of its users: https://wiki.gnome.org/action/show/Apps/Gucharmap, http://www.decodeunicode.org/, https://unicode-table.com/en/, https://www.fontspace.com/
 
Unicode.
 
==== https://www.cia.gov/library/publications/download/ ====
License: https://www.cia.gov/library/publications/the-world-factbook/docs/contributor_copyright.html
 
Some of its users: https://www.cia.gov/library/publications/resources/the-world-factbook/
 
General facts about countries and regions.
 
==== https://www.geonames.org/ ====
License: https://www.geonames.org/
 
Gazetteer and postal code data for free.
 
==== https://iso639-3.sil.org/code_tables/download_tables/ ====
License: https://iso639-3.sil.org/code_tables/download_tables/
 
Some of its users: https://iso639-3.sil.org/code_tables/639/data, https://polyglotclub.com/
 
ISO 639-3 tables. It assigns each language a code and is updated every year.
 
==== https://www.unicode.org/iso15924/codelists.html ====
License: https://www.unicode.org/copyright.html
 
Some of its users: http://www.unicode.org/iso15924/codelists.html
 
ISO 15924 lists. Codes for scripts.
 
==== https://www.unece.org/cefact/locode/welcome.html ====
License: https://www.unece.org/cefact/locode/locode_since1981.html
 
UN/LOCODE, an alternative to ISO 3166-2. It is updated twice a year.
 
==== http://www.nationalanthems.info/ ====
License: http://www.nationalanthems.info/
 
National anthems.
 
== Formats ==
 
=== Sheet ===
{| class="wikitable"
!database name with link
!file name
!field separator
!
!field 1
!field 2
!field 3
!field 4
!field 5
!field 6
!field 7
!field 8
!field 9
!field 10
!field 11
!field 12
!field 13
|-
! colspan="15" |dictionary
!
!
|-
|[https://github.com/tomcumming/tocfl-word-list An ordered and extended TOCFL word-list]
|tocfl.tsv
|<tab>
|
|Word
|Pinyin
|OtherPinyin
|Level
|First Translation
|Other Translation
|
|
|
|
|
|
|
|-
|[https://cantonese.org/download.html CC-Canto]
|cccanto-webdist.txt
|<space>
|
|Traditional
|Simplified
|[pin1 yin1]
|{jyut6 ping3}
|/English equivalent 1/equivalent 2/
|
|
|
|
|
|
|
|
|-
|[https://cc-cedict.org/editor/editor.php?handler=Download CC-CEDICT]
|cedict_ts.u8
|<space>
|
|Traditional
|Simplified
|[pin1 yin1]
|/English equivalent 1/equivalent 2/
|
|
|
|
|
|
|
|
|
|-
|[https://chine.in/mandarin/dictionnaire/CFDICT/ CFDICT]
|CFDICT.u8
|<space>
|
|Traditionnel
|Simplifié
|[pin1 yin1]
|/traduction 1/traduction 2/
|
|
|
|
|
|
|
|
|
|-
|[https://chdict.zydeo.net/en/download/ CHDICT]
|CHDICT.u8
|<space>
|
|Tradicionális
|Egyszerűsített
|[pin1 yin1]
|/magyar egyenérték 1/egyenérték 2/
|
|
|
|
|
|
|
|
|
|-
|[https://github.com/skywind3000/ECDICT ECDICT]
|ecdict.csv
|,
|
|word
|phonetic
|definition
|translation
|pos
|collins
|oxford
|tag
|bnc
|frq
|exchange
|detail
|audio
|-
|[https://github.com/amirshnll/English-Persian-Word-Database English Persian Word Database]
|EnglishPersianWordDatabase.xlsx
|
|
|EnglishWord
|PersianWord
|
|
|
|
|
|
|
|
|
|
|
|-
|[http://www.denisowski.org/Esperanto/ESPDIC/espdic_readme.html ESPDIC]
|espdict.txt
| :
|
|Esperanto
|English
|
|
|
|
|
|
|
|
|
|
|
|-
|[http://www.handedict.de/chinesisch_deutsch.php?mode=dl&sid=d80e36eefdb05750bd130ae1f322ca09 HanDeDict]
|handedict.u8
|<space>
|
|Traditionell
|Vereinfacht
|[pin1 yin1]
|/deutsche Entsprechung 1/Entsprechung 2/
|
|
|
|
|
|
|
|
|
|-
|[https://github.com/libhangul/libhangul/tree/master/data/hanja libhangul]
|hanja.txt
|<nowiki>:</nowiki>
|
|Hangul
|Hanja
|note
|
|
|
|
|
|
|
|
|
|
|-
|[http://www.denisowski.org/Interlingua/IEDICT/iedict_readme.html IEDICT]
|iedict.txt
| :
|
|Interlingua
|English
|
|
|
|
|
|
|
|
|
|
|
|-
|[https://www.eki.ee/litsents/vaba/dl.cgi?D=ies Inglise-eesti sõnaraamat]
|eestiinglise.txt
|<tab>
|
|eesti
|inglise
|
|
|
|
|
|
|
|
|
|
|
|-
|[https://www.tanos.co.uk/jlpt/skills/vocab/ JLPT Vocabulary]
|VocabList.N1.doc
VocabList.N2.doc
 
VocabList.N3.doc
 
VocabList.N4.doc
 
VocabList.N5.doc
|
|
|Kanji
|Hiragana
|English
|
|
|
|
|
|
|
|
|
|
|-
|[https://github.com/garfieldnate/kengdic kengdic]
|kengdic_2011.tsv
|<tab>
|
|wordid
|word
|?
|def
|?
|?
|submitter
|doe
|?
|hanja
|?
|?
|
|-
|[http://www.taiwanesedictionary.org/ The Maryknoll Taiwanese-English Dictionary & English-Taiwanese Dictionary 2013 edition]
|Mkdictionary.xls
|
|
|Sort
|Taiwanese
|Chinese
|English
|
|
|
|
|
|
|
|
|
|-
|[http://www.denisowski.org/Vietnamese/vnedict_readme.htm VNEDICT]
|vnedict.txt
| :
|
|Vietnamese
|English
|
|
|
|
|
|
|
|
|
|
|
|-
! colspan="15" |word list
|
|
|-
|[https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217 한국어능력시험 어휘목록]
|토픽 어휘 목록_공개 목록.xlsx
|
|
|수준
|어휘
|길잡이말
|품사
|
|
|
|
|
|
|
|
|
|-
|[https://lingua.mtsu.edu/chinese-computing/statistics/index.html 古汉语单字字频: Character frequency list of Classical Chinese]
|CharFreq-Classical.xls
|
|
|Serial number; 序号
|Character; 汉字
|
|
|
|
|
|
|
|
|
|
|
|-
|[https://lingua.mtsu.edu/chinese-computing/statistics/index.html 现代汉语单字字频: Character frequency list of Modern Chinese]
|CharFreq.txt
|<tab>
|
|Serial number; 序号
|Character; 汉字
|Individual raw frequency; 频率
|Cumulative frequency in percentile; 累计频率
|Pinyin; 拼音
|English translation; 英文翻译
|
|
|
|
|
|
|
|-
|[https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8 通用规范汉字表]
|
|
|
|编号
|字形
|
|
|
|
|
|
|
|
|
|
|
|-
|[https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8 常用國字標準字體表]
|
|
|
|流水序
|教育部字號
|Unicode
|常用字
|
|
|
|
|
|
|
|
|
|-
|[http://www.chinesetest.cn/godownload.do 新汉语水平考试(HSK)词汇(2012年修订版)]
|HSK-2012.xls
|
|
|单词(等级)
|
|
|
|
|
|
|
|
|
|
|
|
|}
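
As a rough illustration of the space-separated layout that the table gives for CC-CEDICT, CFDICT, CHDICT and HanDeDict, the Python sketch below splits one entry into its fields. The function name <code>parse_cedict_line</code> and the sample entry are made up for this example, and real files may contain variations that this pattern does not handle.

```python
import re

# Layout from the table: Traditional Simplified [pin1 yin1] /equivalent 1/equivalent 2/
CEDICT_LINE = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/\s*$")


def parse_cedict_line(line):
    """Return (traditional, simplified, pinyin, [equivalents]), or None if no entry."""
    if line.startswith("#"):  # comment lines carry no dictionary entry
        return None
    match = CEDICT_LINE.match(line)
    if match is None:  # line does not follow the assumed layout
        return None
    traditional, simplified, pinyin, glosses = match.groups()
    return traditional, simplified, pinyin, glosses.split("/")


# Hypothetical sample entry, written in the layout described above.
print(parse_cedict_line("中國 中国 [Zhong1 guo2] /China/Middle Kingdom/"))
```

Lines that do not match are returned as <code>None</code> rather than guessed at, so they can be inspected by hand.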
 
==== Manually convert to TSV ====
{| class="wikitable"
!file name
!process (on Linux)
|-
|cccanto-webdist.txt
|
# [https://stackoverflow.com/questions/8206280/delete-all-lines-beginning-with-a-from-a-file Delete lines starting with '#'];
# [https://stackoverflow.com/questions/47010412/replace-first-space-on-each-line-by-a-tab Replace the first ' ' in each line with '\t'];
# Replace the first ' [' in each line with '\t';
# Replace '] {' with '\t';
# Replace '} /' with '\t';
# Replace ' # adapted from cc-cedict' with <nowiki>''</nowiki>;
# Replace '/\n' with '\n';
# Add 'Traditional\tSimplified\tpin1 yin1\tjyut6 ping3\tEnglish equivalent 1/equivalent 2\n' at the beginning;
|-
|cedict_ts.u8
|
# Delete lines starting with '#';
# Replace the first ' ' in each line with '\t';
# Replace the first ' [' in each line with '\t';
# Replace '] /' with '\t';
# Replace '/\n' with '\n';
# Add 'Traditional\tSimplified\tpin1 yin1\tEnglish equivalent 1/equivalent 2\n' at the beginning;
|-
|CharFreq.txt
|
# Delete lines starting with '/';
# [https://stackoverflow.com/questions/15361632/delete-a-column-with-awk-or-sed Delete fields 3, 4];
# Add '序列号\t汉字\t拼音\t英文翻译' at the beginning;
|-
|CharFreq-Classical.xls
|
# Delete the first row;
# Delete fields 3, 4;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
|-
|CHDICT.u8
|
# Delete lines starting with '#';
# Replace '\n\n' with '\n';
# Replace the first ' ' in each line with '\t';
# Replace the first ' [' in each line with '\t';
# Replace '] /' with '\t';
# Replace '/\n' with '\n';
# Add 'Tradicionális\tEgyszerűsített\tpin1 yin1\tmagyar egyenérték 1/ egyenérték 2\n' at the beginning;
|-
|ecdict.csv
|
# Open with a spreadsheet program;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
|-
|eestiinglise.txt
|
# Add 'eesti\tinglise\n' at the beginning;
|-
|EnglishPersianWordDatabase.xlsx
|
# Open with a spreadsheet program;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
|-
|espdict.txt
|
# Delete the line starting with '#';
# Replace ' : ' with '\t';
# Add 'Esperanto\tEnglish\n' at the beginning;
|-
|handedict.u8
|
# Delete lines starting with '#';
# Replace '\n\n' with '\n';
# Replace the first ' ' in each line with '\t';
# Replace the first ' [' in each line with '\t';
# Replace '] /' with '\t';
# Replace '/\n' with '\n';
# Add 'Traditionell\tVereinfacht\tpin1 yin1\tdeutsche Entsprechung 1/Entsprechung 2\n' at the beginning;
|-
|hanja.txt
|
# Delete lines starting with ' #';
# Replace '<nowiki>:</nowiki>' with '\t';
# Add 'Hangul\tHanja\tnote' at the beginning;
|-
|HSK-2012.xls
|
# Open with a spreadsheet program;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
# Open the new file;
# Replace '(' with '\t';
# Replace ')' with <nowiki>''</nowiki>;
# Add '单词\t等级\n' at the beginning;
|-
|iedict.txt
|
# Delete the line starting with ' #';
# Replace ' : ' with '\t';
# Add 'Interlingua\tEnglish\n' at the beginning;
|-
|kengdic_2011.tsv
|
# Delete fields 1, 3, 5, 6, 7, 8, 9, 11, 12;
# Add 'word\tdef\thanja\n' at the beginning;
|-
|Mkdictionary.xls
|
# Open with a spreadsheet program;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
|-
|tocfl.tsv
|
# Replace '"\t"' with '\t';
# Replace '"\n"' with '\n';
# Replace the first '"' with <nowiki>''</nowiki>;
# Replace the last '"' with <nowiki>''</nowiki>;
|-
|vnedict.txt
|
# Delete the line starting with '#';
# Replace ' : ' with '\t';
# Add 'Vietnamese\tEnglish\n' at the beginning;
|-
|토픽 어휘 목록_공개 목록.xlsx
|
# Open with a spreadsheet program;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
# Click on the other tab of sheet;
# Save as TSV file or save as CSV file and select '<tab>' as field separator;
|}
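
The steps listed for espdict.txt, iedict.txt and vnedict.txt could be scripted instead of done by hand. The Python sketch below is only an illustration under assumptions: the function name <code>colon_dict_to_tsv</code> and the sample lines are made up, and only the first ' : ' in a line is treated as the field separator, which differs slightly from replacing every occurrence.

```python
def colon_dict_to_tsv(lines, header):
    """Turn 'headword : translation' lines into TSV text with a header row."""
    rows = ["\t".join(header)]
    for line in lines:
        # Drop a possible byte order mark and the line ending.
        line = line.lstrip("\ufeff").rstrip("\r\n")
        # Skip empty lines and comment lines starting with '#'.
        if not line or line.lstrip().startswith("#"):
            continue
        headword, separator, translation = line.partition(" : ")
        if not separator:  # no ' : ' found: leave the line out
            continue
        rows.append(headword + "\t" + translation)
    return "\n".join(rows) + "\n"


# Hypothetical sample lines in the ' : ' layout described in the table.
sample = ["# sample header comment\n", "hundo : dog\n", "kato : cat\n"]
print(colon_dict_to_tsv(sample, ["Esperanto", "English"]), end="")
```

For a real file, the lines could come from <code>open("espdict.txt", encoding="utf-8")</code>, assuming the file is UTF-8 encoded.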
 
=== Others ===
{| class="wikitable"
!database name with link
!format
|-
|[https://freedict.org/downloads/ FreeDict]
|slob
|-
|[http://www.informatik.uni-leipzig.de/~duc/Dict/install.html Free Vietnamese Dictionary Project]
|dict.dz
|-
|[http://www.xobdo.org/downloads/ XOBDO.ORG]
|db
|}
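
Of the formats above, dict.dz is produced by dictzip, whose output is compatible with gzip, so its text can usually be read with an ordinary gzip reader. The Python sketch below is a minimal example under assumptions: the helper name <code>read_dictzip_text</code> is hypothetical, the file is assumed to be UTF-8 encoded, and the accompanying index file needed to locate individual entries is not handled.

```python
import gzip


def read_dictzip_text(path, encoding="utf-8"):
    """Return the whole decompressed text of a gzip-compatible .dict.dz file."""
    with gzip.open(path, "rt", encoding=encoding) as handle:
        return handle.read()
```

Reading the whole file at once is acceptable for small dictionaries; for large ones, iterating over the open handle line by line would use less memory.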
 
[[Category:Free-Resources]]
[[Category:Computer-Knowledge]]


==Other Lessons==
There is a page of introduction to the Free Vietnamese Dictionary Project: https://vi.wiktionary.org/wiki/Wiktionary:Ngu%E1%BB%93n_g%E1%BB%91c/FVDP
* [[Language/Multiple-languages/Culture/Wiki-Notice-Board|Wiki Notice Board]]
* [[Language/Multiple-languages/Culture/Cultural-differences-by-country|Cultural differences by country]]
* [[Language/Multiple-languages/Culture/Most-Famous-Non–Contemporary-Artists|Most Famous Non–Contemporary Artists]]
* [[Language/Multiple-languages/Culture/IRFP-in-brief|IRFP in brief]]
* [[Language/Multiple-languages/Culture/Introduction-to-Sci–Tech-Index|Introduction to Sci–Tech Index]]
* [[Language/Multiple-languages/Culture/Online-Specialized-Dictionaries|Online Specialized Dictionaries]]
* [[Language/Multiple-languages/Culture/How-to-contribute-to-wiki-lessons-(FAQ)|How to contribute to wiki lessons (FAQ)]]
* [[Language/Multiple-languages/Culture/Cities-with-the-best-quality-of-life|Cities with the best quality of life]]
* [[Language/Multiple-languages/Culture/Techniques-for-learning-languages|Techniques for learning languages]]
* [[Language/Multiple-languages/Culture/Countries-and-Flag-Emoji-by-Languages|Countries and Flag Emoji by Languages]]
<span links></span>
