From Polyglot Club WIKI

Latest revision as of 15:00, 30 June 2020

This page lists free language databases (organized collections of data related to languages).

The listed items are data sources, not software that uses the data (such as database-management systems). If you don't know programming, this page might not be of much help to you.


Main

Multiple languages

https://www.ethnologue.com/codes/download-code-tables

LanguageCodes.tab lists the 7,400+ distinct language identifiers used in the current Ethnologue database.

https://dumps.wikimedia.org/

License: https://dumps.wikimedia.org/legal.html

Some of its users: https://www.wikimedia.org/

Database dumps of Wikimedia projects.

https://iate.europa.eu/download-iate/

License: https://iate.europa.eu/download-iate/

Some of its users: https://iate.europa.eu/download-iate/

Terminology dictionary of the EU.

https://tatoeba.org/eng/downloads/

License: https://tatoeba.org/eng/downloads/

Some of its users: https://tatoeba.org/, http://www.listeningpractice.org/, https://jisho.org/

Parallel corpora, that is, collections of sentences together with their translations in different languages.
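
To illustrate what a parallel corpus looks like in practice, the sketch below pairs sentences with their translations. It assumes two tab-separated exports: one with sentence id, language code, and text, and one with pairs of linked sentence ids. The column layout, the sample rows, and the function name `translation_pairs` are assumptions made for illustration, so check the actual files on the download page before relying on it.

```python
import csv
import io

# Hypothetical sample data: (id, language code, text) rows and (id, translation id) links.
SENTENCES_TSV = "1\teng\tHello.\n2\tfra\tBonjour.\n3\tdeu\tHallo.\n"
LINKS_TSV = "1\t2\n1\t3\n2\t1\n"


def translation_pairs(sentences_file, links_file, source_lang, target_lang):
    """Return (source text, target text) pairs for the two given languages."""
    sentences = {}
    for sent_id, lang, text in csv.reader(
        sentences_file, delimiter="\t", quoting=csv.QUOTE_NONE
    ):
        sentences[sent_id] = (lang, text)

    pairs = []
    for src_id, dst_id in csv.reader(
        links_file, delimiter="\t", quoting=csv.QUOTE_NONE
    ):
        src = sentences.get(src_id)
        dst = sentences.get(dst_id)
        # Keep only links that go from the source language to the target language.
        if src and dst and src[0] == source_lang and dst[0] == target_lang:
            pairs.append((src[1], dst[1]))
    return pairs


pairs = translation_pairs(
    io.StringIO(SENTENCES_TSV), io.StringIO(LINKS_TSV), "eng", "fra"
)
print(pairs)
```

With real files, the two `io.StringIO` objects would be replaced by files opened with `encoding="utf-8"` and `newline=""`.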

https://wiki.documentfoundation.org/Language_support_of_LibreOffice

License: https://wiki.documentfoundation.org/Language_support_of_LibreOffice

Some of its users: https://www.libreoffice.org/

Spell-check dictionaries and other language support resources.

http://www.gutenberg.org/wiki/Gutenberg:Information_About_Robot_Access_to_our_Pages

License: http://www.gutenberg.org/wiki/Gutenberg:Terms_of_Use

Some of its users: http://www.gutenberg.org/, https://librivox.org/ (LibriVox)

Ebooks.

https://librivox.org/pages/about-librivox/

License: https://librivox.org/pages/about-librivox/

Some of its users: https://librivox.org/, http://www.listeningpractice.org/

Audio books.

https://freedict.org/downloads/

License: https://freedict.org/about/

Some of its users: http://aarddict.org/

Dictionaries.

http://www.omegawiki.org/Help:Downloading_the_data

License: http://www.omegawiki.org/Meta:Main_Page

Some of its users: http://www.omegawiki.org/Meta:Main_Page, http://dictionarymid.sourceforge.net/

Dictionaries.

http://www.xobdo.org/downloads/

License: http://www.xobdo.org/downloads/

Some of its users: http://www.xobdo.org/

Dictionaries for South Asian languages and English.

https://ltrc.iiit.ac.in/onlineServices/Dictionaries/Dict_Frame.html

License: https://ltrc.iiit.ac.in/onlineServices/Dictionaries/GPLHelp.html

Dictionaries for South Asian languages and English.

http://compling.hss.ntu.edu.sg/omw/

License: http://compling.hss.ntu.edu.sg/omw/

Some of its users: http://compling.hss.ntu.edu.sg/omw/cgi-bin/wn-gridx.cgi?gridmode=grid

Wordnets.

http://www.dicto.org.ru/xdxf.html

License: http://dicto.org.ru/license.html

Some of its users: http://dicto.org.ru/

Repository of dictionaries collected from other sources.

http://shtooka.net/download.php

License: http://shtooka.net/

Collections of audio.

https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists

License: https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists

Frequency lists.

https://lego.linguistlist.org/about#contact

License: https://lego.linguistlist.org/about#copyright

Some of its users: https://lego.linguistlist.org/

Lexicon. No download link on the website.

https://panlex.org/source-list/

License: https://panlex.org/license/

Some of its users: https://glosbe.com

Lexical database links.

https://github.com/cburgmer/cjklib

License: https://github.com/cburgmer/cjklib/blob/master/COPYING

Some of its users: https://www.skishore.me/makemeahanzi/

Data about Han script.

https://www.radio-browser.info/gui/#!/

License: https://www.radio-browser.info/gui/#!/

Some of its users: https://github.com/segler-alex/RadioDroid

Database of radio stations.

https://help.archive.org/hc/en-us/articles/360017781111-How-to-download-files-

License: https://www.archive.org/about/terms.php

Some of its users: https://www.archive.org/

Archived Internet content.

https://www.fandom.com/

License: https://www.fandom.com/licensing

Fan-made wiki.

American Sign Language

http://www.asl-lex.org/

License: http://www.asl-lex.org/

Lexicon.

Burmese

https://github.com/saturngod/ornagai-V2

License: https://github.com/saturngod/ornagai-V2/blob/master/License

Some of its users: https://www.ornagai.com/#/

Dictionary.

Catalan

http://www.catalandictionary.org/en/search/

License: http://www.catalandictionary.org/en/search/

Dictionary. The license text on the site is set in a very small font.

Chinese

https://resources.publicense.moe.edu.tw/index.html

License: https://resources.publicense.moe.edu.tw/index.html

Some of its users: https://resources.publicense.moe.edu.tw/index.html, https://www.moedict.tw/

Monolingual dictionaries of Mandarin Chinese as used in the ROC (Taiwan).

https://cc-cedict.org/editor/editor.php

License: https://cc-cedict.org/wiki/

Some of its users: https://www.mdbg.net/chinese/dictionary, https://www.pleco.com/

Mandarin-English dictionary.

https://chine.in/mandarin/dictionnaire/CFDICT/

License: https://chine.in/mandarin/dictionnaire/CFDICT/

Some of its users: https://chine.in/, https://www.pleco.com/

Mandarin-French dictionary.

https://handedict.zydeo.net/de/download

License: https://handedict.zydeo.net/de/download

Some of its users: https://www.pleco.com/

Mandarin-German dictionary.

https://chdict.zydeo.net/en/download/

License: https://chdict.zydeo.net/en/download/

Some of its users: https://chdict.zydeo.net/hu/

Mandarin-Hungarian dictionary.

http://cantonese.org/download.html

License: http://cantonese.org/download.html

Some of its users: http://cantonese.org/, https://www.pleco.com/

Cantonese-English dictionary.

https://twblg.dict.edu.tw/holodict_new/compile1_6_1.jsp

License: https://twblg.dict.edu.tw/holodict_new/compile1_6_1.jsp

Some of its users: https://twblg.dict.edu.tw/holodict_new/default.jsp, https://www.moedict.tw/

Taiwanese-English dictionary. The data can be requested by email.

http://www.taiwanesedictionary.org/

License: http://www.taiwanesedictionary.org/

Taiwanese-English dictionary.

http://lingua.mtsu.edu/chinese-computing/

License: http://lingua.mtsu.edu/chinese-computing/copyright.html

Frequency lists.

https://www.tanos.co.uk/hsk/

License: https://www.tanos.co.uk/jlpt/sharing/

HSK data.

http://www.hskhsk.com/resources.html

License: http://www.hskhsk.com/resources.html

HSK data.

https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8

License: https://zh.wikisource.org/wiki/%E9%80%9A%E7%94%A8%E8%A7%84%E8%8C%83%E6%B1%89%E5%AD%97%E8%A1%A8

Frequent characters.

https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8

License: https://zh.wikisource.org/wiki/%E5%B8%B8%E7%94%A8%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94%E8%A1%A8

Frequent characters.

http://input.foruto.com/ccc/gongbiu/index.htm

License:

Frequent characters.

Esperanto

http://reta-vortaro.de/tgz/index.html

License: http://reta-vortaro.de/tgz/index.html

Some of its users: http://reta-vortaro.de/, http://www.busydoingnothing.co.uk/prevo/

Dictionary in several languages.

http://www.denisowski.org/Esperanto/ESPDIC/espdic_readme.html

License: http://www.denisowski.org/Esperanto/ESPDIC/espdic_readme.html

Some of its users: http://www.denisowski.org/Esperanto/ESPDIC/espdic_readme.html

Dictionary.

https://komputeko.net/elsxutejo-en.php

License: https://komputeko.net/index_en.php

Some of its users: https://komputeko.net/index_en.php

Computer terminology dictionary.

German Sign Language

https://signdict.org/

License: https://signdict.org/about

Some of its users: https://signdict.org/

Dictionary.

English

http://gcide.gnu.org.ua/download

License: http://gcide.gnu.org.ua/license

Some of its users: http://gcide.gnu.org.ua/

Dictionary of definitions.

https://foldoc.org/source.html

License: https://foldoc.org/Free+On-line+Dictionary

Some of its users: https://foldoc.org/

Dictionary about computing.

https://github.com/skywind3000/ECDICT

License: https://github.com/skywind3000/ECDICT/blob/master/LICENSE

Some of its users: https://github.com/program-in-chinese/webextension_english_chinese_dictionary

Dictionary.

https://github.com/tony-mak/Eng-Chi-Dictionary/tree/master/app/src/main/assets/databases

License: https://github.com/tony-mak/Eng-Chi-Dictionary/blob/master/LICENSE

Dictionary.

https://github.com/linuxkathirvel/eng2tamildictionary/blob/master/dictionary.json

License: https://github.com/linuxkathirvel/eng2tamildictionary/blob/master/License.txt

Dictionary.

https://github.com/derekchuank/high-frequency-vocabulary

License: https://github.com/derekchuank/high-frequency-vocabulary/blob/master/LICENSE

Dictionary.

https://github.com/kujirahand/EJDict/tree/master/src

License: https://github.com/kujirahand/EJDict/blob/master/LICENSE

Dictionary.

Estonian

https://www.eki.ee/litsents/

License: https://www.eki.ee/litsents/

Some of its users: http://portaal.eki.ee/sonaraamatud.html

Dictionaries. Only two are actually available.

German

https://www.openthesaurus.de/about/download/

License: https://www.openthesaurus.de/about/download/

Some of its users: https://www.openthesaurus.de/about/download/

Thesaurus.

Hindi

http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/downloaderInfo.php

License: http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/index.php

Some of its users: http://www.cfilt.iitb.ac.in/~hdict/webinterface_user/dict_search_user.php

Dictionary. Access requires submitting an application.

Icelandic

https://www.ling.upenn.edu/~kurisuto/germanic/oi_cleasbyvigfusson_about.html

License: http://lexicon.ff.cuni.cz/txt/oi_cleasbyvigfusson.txt

Dictionary.

Interlingua

http://www.denisowski.org/Interlingua/IEDICT/iedict_readme.html

License: http://www.denisowski.org/Interlingua/IEDICT/iedict_readme.html

Some of its users: http://www.denisowski.org/Interlingua/IEDICT/iedict_readme.html

Dictionary.

Interlingue

https://github.com/Carmina16/hunspell-ie

License: https://github.com/Carmina16/hunspell-ie/blob/master/LICENSE

Spell checker with dictionary.

Iranian Persian

https://github.com/amirshnll/English-Persian-Word-Database

License: https://github.com/amirshnll/English-Persian-Word-Database/blob/master/LICENSE

Dictionary.

Japanese

http://www.edrdg.org/wiki/index.php/Main_Page

License: https://www.edrdg.org/edrdg/licence.html

Some of its users: https://jisho.org/, https://www.tagaini.net/

Japanese dictionaries.

https://github.com/KanjiVG/kanjivg/releases/

License: http://kanjivg.tagaini.net/

Some of its users: https://www.tagaini.net/, https://jisho.org/

Kanji strokes.

http://dico.fj.free.fr/dico.php

License: http://dico.fj.free.fr/copyright.php

Some of its users: http://dico.fj.free.fr/traduction/index.php

Japanese-French dictionary.

https://github.com/mifunetoshiro/kanjium

License: https://github.com/mifunetoshiro/kanjium/blob/master/LICENSE.txt

Kanji data.

https://www.tanos.co.uk/jlpt/

License: https://www.tanos.co.uk/jlpt/sharing/

JLPT data.

https://ja.wiktionary.org/wiki/%E4%BB%98%E9%8C%B2:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7

License: http://www.bunka.go.jp/bunkacho_homepage/index.html

Frequent characters.

https://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7

License: http://www.moj.go.jp/term.html

Frequent characters for names.

https://ja.wikipedia.org/wiki/%E5%AD%A6%E5%B9%B4%E5%88%A5%E6%BC%A2%E5%AD%97%E9%85%8D%E5%BD%93%E8%A1%A8

License: http://www.mext.go.jp/b_menu/about_link.htm

Frequent characters according to school grades.

Jeju

https://jeju.go.kr/culture/dialect/dictionary.htm

License: https://jeju.go.kr/help/policy/copyright.htm

Some of its users: https://jeju.go.kr/culture/dialect/dictionary.htm

Dictionary.

Klingon

http://klingonska.org/dict/dict.zdb

License: http://klingonska.org/dict/

Some of its users: http://klingonska.org/dict/

Dictionary.

Korean

https://krdict.korean.go.kr/mainAction

License: https://krdict.korean.go.kr/kboardPolicy/copyRightTermsInfo

Some of its users: https://krdict.korean.go.kr/mainAction

Dictionary. Download link is unknown.

https://opendict.korean.go.kr/main

License: https://opendict.korean.go.kr/service/copyrightPolicy

Some of its users: https://opendict.korean.go.kr/main

Dictionary. Download link is unknown.

https://stdict.korean.go.kr/main/main.do

License: https://stdict.korean.go.kr/join/copyrightPolicy.do

Some of its users: https://stdict.korean.go.kr/main/main.do

Dictionary. Download link is unknown.

https://github.com/garfieldnate/kengdic

License: https://github.com/garfieldnate/kengdic

Some of its users: http://www.toktogi.com/

Dictionary.

https://github.com/libhangul/libhangul/blob/master/data/hanja/hanja.txt

License: https://github.com/libhangul/libhangul/blob/master/data/hanja/hanja.txt

Words in Hangul and Hanja.

An introduction page is available: https://wiki.kldp.org/wiki.php/libhangul

https://ko.wiktionary.org/wiki/%EB%B6%80%EB%A1%9D:%ED%95%9C%EB%AC%B8_%EA%B5%90%EC%9C%A1%EC%9A%A9_%EA%B8%B0%EC%B4%88_%ED%95%9C%EC%9E%90_1800

License: http://www.suneung.re.kr/sub/info.do?m=0601&s=suneung

The contents of http://www.suneung.re.kr/boardCnts/fileDown.do?fileSeq=59692112e521efa80d2af27916704082 in an easy-to-copy form.

https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110503&boardSeq=64217

License: https://www.topik.go.kr/usr/cmm/subLocation.do?menuSeq=2110702

Word list of TOPIK.

https://github.com/mhagiwara/cc-kedict

License: https://github.com/mhagiwara/cc-kedict

Dictionary.

Lithuanian

https://github.com/ispell-lt/ispell-lt

License: https://github.com/ispell-lt/ispell-lt/blob/master/COPYING

Spell checker with dictionary.

Nepali

https://github.com/nirooj56/Nepdict

License: https://github.com/nirooj56/Nepdict/blob/master/LICENSE

Dictionary.

Russian

https://en.openrussian.org/dictionary

License: https://en.openrussian.org/dictionary

Some of its users: https://en.openrussian.org/

Dictionary.

Sanskrit

https://github.com/hemanth/sanskrit-dict/blob/master/dict.js

License: https://github.com/hemanth/sanskrit-dict/blob/master/license

Dictionary.

Slovak

http://sk-spell.sk.cx/hunspell-sk

License: http://sk-spell.sk.cx/hunspell-sk

Spell checker with dictionary.

Vietnamese

http://www.informatik.uni-leipzig.de/~duc/Dict/install.html

License: http://www.informatik.uni-leipzig.de/~duc/Dict/install.html

Some of its users: https://www.informatik.uni-leipzig.de/~duc/Dict/

Dictionaries in several languages.

An introduction page is available: https://vi.wiktionary.org/wiki/Wiktionary:Ngu%E1%BB%93n_g%E1%BB%91c/FVDP

http://www.denisowski.org/Vietnamese/vnedict_readme.htm

License: http://www.denisowski.org/Vietnamese/vnedict_readme.htm

Some of its users: http://www.denisowski.org/Vietnamese/vnedict_readme.htm

Dictionary.

https://github.com/duyetdev/vietnamese-wordlist

License: https://github.com/duyetdev/vietnamese-wordlist/blob/master/LICENSE

Word list.

https://github.com/duyetdev/vietnamese-namedb

License: https://github.com/duyetdev/vietnamese-namedb/blob/master/LICENSE

Name list.

Non-language

https://unicode.org/ucd/

License: https://www.unicode.org/copyright.html

Some of its users: https://wiki.gnome.org/action/show/Apps/Gucharmap, http://www.decodeunicode.org/, https://unicode-table.com/en/, https://www.fontspace.com/

Unicode.

https://www.cia.gov/library/publications/download/

License: https://www.cia.gov/library/publications/the-world-factbook/docs/contributor_copyright.html

Some of its users: https://www.cia.gov/library/publications/resources/the-world-factbook/

General facts about countries and regions.

https://www.geonames.org/

License: https://www.geonames.org/

Free gazetteer and postal code data.

https://iso639-3.sil.org/code_tables/download_tables/

License: https://iso639-3.sil.org/code_tables/download_tables/

Some of its users: https://iso639-3.sil.org/code_tables/639/data, https://polyglotclub.com/

ISO 639-3 tables. The standard assigns each language a code and is updated every year.

https://www.unicode.org/iso15924/codelists.html

License: https://www.unicode.org/copyright.html

Some of its users: http://www.unicode.org/iso15924/codelists.html

ISO 15924 lists. Codes for scripts.

https://www.unece.org/cefact/locode/welcome.html

License: https://www.unece.org/cefact/locode/locode_since1981.html

UN/LOCODE, an alternative to ISO 3166-2. It is updated twice a year.

http://www.nationalanthems.info/

License: http://www.nationalanthems.info/

National anthems.

Formats

Sheet

Each entry gives the database name, followed by its file name, field separator, and fields in order.

dictionary

An ordered and extended TOCFL word-list
  file name: tocfl.tsv
  field separator: <tab>
  fields: Word, Pinyin, OtherPinyin, Level, First Translation, Other Translation
CC-Canto
  file name: cccanto-webdist.txt
  field separator: <space>
  fields: Traditional, Simplified, [pin1 yin1], {jyut6 ping3}, /English equivalent 1/equivalent 2/
CC-CEDICT
  file name: cedict_ts.u8
  field separator: <space>
  fields: Traditional, Simplified, [pin1 yin1], /English equivalent 1/equivalent 2/
CFDICT
  file name: CFDICT.u8
  field separator: <space>
  fields: Traditionnel, Simplifié, [pin1 yin1], /traduction 1/traduction2/
CHDICT
  file name: CHDICT.u8
  field separator: <space>
  fields: Tradicionális, Egyszerűsített, [pin1 yin1], /magyar egyenérték 1/ egyenérték 2
ECDICT
  file name: ecdict.csv
  field separator: ,
  fields: word, phonetic, definition, translation, pos, collins, oxford, tag, bnc, frq, exchange, detail, audio
English Persian Word Database
  file name: EnglishPersianWordDatabase.xlsx
  fields: EnglishWord, PersianWord
ESPDIC
  file name: espdict.txt
  field separator: <space>:<space>
  fields: Esperanto, English
HanDeDict
  file name: handedict.u8
  field separator: <space>
  fields: Traditionel, Vereinfacht, [pin1 yin1], /deutsche Entsprechung 1 /Entsprechung 2/
libhangul
  file name: hanja.txt
  field separator: :
  fields: Hangul, Hanja, note
IEDICT
  file name: iedict.txt
  field separator: <space>:<space>
  fields: Interlingua, English
Inglise-eesti sõnaraamat
  file name: eestiinglise.txt
  field separator: <tab>
  fields: eeste, inglise
JLPT Vocabulary
  file names: VocabList.N1.doc, VocabList.N2.doc, VocabList.N3.doc, VocabList.N4.doc, VocabList.N5.doc
  fields: Kanji, Hiragana, English
kengdic
  file name: kengdic_2011.tsv
  field separator: <tab>
  fields: wordid, word, ?, def, ?, ?, submitter, doe, ?, hanja, ?, ?
The Maryknoll Taiwanese-English Dictionary & English-Taiwanese Dictionary 2013 edition
  file name: Mkdictionary.xls
  fields: Sort, Taiwanese, Chinese, English
VNEDICT
  file name: vnedict.txt
  field separator: <space>:<space>
  fields: Vietnamese, English

word list

한국어능력시험 어휘목록
  file name: 토픽 어휘 목록_공개 목록.xlsx
  fields: 수준, 어휘, 길잡이말, 품사
古汉语单字字频: Character frequency list of Classical Chinese
  file name: CharFreq-Classical.xls
  fields: Serial number; 序号, Character; 汉字
现代汉语单字字频: Character frequency list of Modern Chinese
  file name: CharFreq.txt
  field separator: <tab>
  fields: Serial number; 序号, Character; 汉字, Individual raw frequency; 频率, Cumulative frequency in percentile; 累计频率, Pinyin; 拼音, English translation; 英文翻译
通用规范汉字表
  fields: 编号, 字形
常用國字標準字體表
  fields: 流水序, 教育部字號, Unicode, 常用字
新汉语水平考试(HSK)词汇(2012年修订版)
  file name: HSK-2012.xls
  fields: 单词(等级)
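
The bracketed line layout listed for CC-CEDICT (also used, with translated field names, by CFDICT, CHDICT and HanDeDict) can be read directly into a structured record. The following is a minimal sketch, assuming each entry sits on one line exactly in the form `Traditional Simplified [pinyin] /definition 1/definition 2/`; the `Entry` class, the `parse_entry` function, and the sample line are hypothetical names and data chosen for illustration.

```python
import re
from dataclasses import dataclass

# Layout assumed: Traditional Simplified [pinyin] /definition 1/definition 2/
ENTRY_RE = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/$")


@dataclass
class Entry:
    traditional: str
    simplified: str
    pinyin: str
    definitions: list


def parse_entry(line):
    """Parse one dictionary line; return None for comments or malformed lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    traditional, simplified, pinyin, rest = match.groups()
    # The definitions are separated by '/' inside the final field.
    return Entry(traditional, simplified, pinyin, rest.split("/"))


# Hypothetical sample line in the assumed layout.
entry = parse_entry("中國 中国 [Zhong1 guo2] /China/Middle Kingdom/")
print(entry)
```

Returning `None` for lines that do not match keeps the parser tolerant of header comments and blank lines; a stricter tool might raise an error instead.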

Manually convert to TSV

file name process (on Linux)
cccanto-webdist.txt
  1. Delete lines starting with '#';
  2. Replace the first ' ' in each line with '\t';
  3. Replace the first ' [' in each line with '\t';
  4. Replace '] {' with '\t';
  5. Replace '} /' with '\t';
  6. Replace ' # adapted from cc-cedict' with '';
  7. Replace '/\n' with '\n';
  8. Add 'Traditional\tSimplified\tpin1 yin1\tjyut6 ping3\tEnglish equivalent 1/equivalent 2\n' at the beginning;
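
For readers who prefer a script to manual search-and-replace, the eight steps for cccanto-webdist.txt can be sketched in Python as follows. This is only an illustration under the assumption that every entry line has the layout described on this page; the function name `cccanto_to_tsv` and the sample entry are hypothetical.

```python
HEADER = (
    "Traditional\tSimplified\tpin1 yin1\tjyut6 ping3\t"
    "English equivalent 1/equivalent 2\n"
)


def cccanto_to_tsv(text):
    """Apply steps 1 to 8 to the full text of a CC-Canto style file."""
    out = [HEADER]                                             # step 8
    for line in text.splitlines():
        if not line or line.startswith("#"):                   # step 1 (blank lines are skipped too)
            continue
        line = line.replace(" ", "\t", 1)                      # step 2
        line = line.replace(" [", "\t", 1)                     # step 3
        line = line.replace("] {", "\t")                       # step 4
        line = line.replace("} /", "\t")                       # step 5
        line = line.replace(" # adapted from cc-cedict", "")   # step 6
        if line.endswith("/"):                                 # step 7
            line = line[:-1]
        out.append(line + "\n")
    return "".join(out)


# Hypothetical sample in the layout described above.
sample = (
    "# comment line\n"
    "你好 你好 [ni3 hao3] {nei5 hou2} /hello/hi/ # adapted from cc-cedict\n"
)
print(cccanto_to_tsv(sample), end="")
```

Leaving out the Jyutping replacement (step 5) and the comment removal (step 6), and changing step 4 to replace `] /`, gives the corresponding procedure for the CEDICT-style files listed here.
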
cedict_ts.u8
  1. Delete lines starting with '#';
  2. Replace the first ' ' in each line with '\t';
  3. Replace the first ' [' in each line with '\t';
  4. Replace '] /' with '\t';
  5. Replace '/\n' with '\n';
  6. Add 'Traditional\tSimplified\tpin1 yin1\tEnglish equivalent 1/equivalent 2\n' at the beginning;
CharFreq.txt
  1. Delete lines starting with '/';
  2. Delete fields 3, 4;
  3. Add '序列号\t汉字\t拼音\t英文翻译\n' at the beginning;
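
The CharFreq.txt steps amount to filtering comment lines and dropping two columns. A minimal sketch, assuming the six tab-separated fields listed in the Sheet section; the function name `charfreq_to_tsv` and the sample row (including its numbers) are made up for illustration.

```python
HEADER = "序列号\t汉字\t拼音\t英文翻译\n"


def charfreq_to_tsv(lines):
    """Skip '/' comment lines, drop fields 3 and 4, and prepend the header."""
    out = [HEADER]                                   # step 3
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("/"):         # step 1 (blank lines are skipped too)
            continue
        fields = line.split("\t")
        # Step 2: fields 3 and 4 (1-based) are the raw and cumulative frequencies.
        out.append("\t".join(fields[:2] + fields[4:]) + "\n")
    return "".join(out)


# Hypothetical sample: one comment line and one six-field row with invented values.
sample = [
    "/* comment line */\n",
    "1\t的\t100\t4.0\tde/di2/di4\t(possessive particle)\n",
]
print(charfreq_to_tsv(sample), end="")
```
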
CharFreq-Classical.xls
  1. Delete the first row;
  2. Delete fields 3, 4;
  3. Save as TSV file or save as CSV file and select '<tab>' as field separator;
CHDICT.u8
  1. Delete lines starting with '#';
  2. Replace '\n\n' with '\n';
  3. Replace the first ' ' in each line with '\t';
  4. Replace the first ' [' in each line with '\t';
  5. Replace '] /' with '\t';
  6. Replace '/\n' with '\n';
  7. Add 'Tradicionális\tEgyszerűsített\tpin1 yin1\tmagyar egyenérték 1/ egyenérték 2\n' at the beginning;
ecdict.csv
  1. Open with a spreadsheet program;
  2. Save as TSV file or save as CSV file and select '<tab>' as field separator;
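
As an alternative to opening ecdict.csv in a spreadsheet program, the same conversion can be sketched with Python's standard `csv` module. The function name `csv_to_tsv` and the sample rows are hypothetical. Note that the writer still wraps a field in double quotes when it contains a quote, a tab, or a line break, so the output is not guaranteed to be quote-free.

```python
import csv
import io


def csv_to_tsv(src, dst):
    """Re-write comma-separated rows from src as tab-separated rows in dst."""
    reader = csv.reader(src)
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)


# Hypothetical sample: a header row and one row with a quoted comma.
src = io.StringIO('word,phonetic,definition\n"a, b",x,plain text\n', newline="")
dst = io.StringIO()
csv_to_tsv(src, dst)
print(dst.getvalue(), end="")
```

For real files, open both with `encoding="utf-8"` and `newline=""`, as the `csv` module documentation recommends.
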
eestiinglise.txt
  1. Add 'eeste\tinglise\n' at the beginning;
EnglishPersianWordDatabase.xlsx
  1. Open with a spreadsheet program;
  2. Save as TSV file or save as CSV file and select '<tab>' as field separator;
espdict.txt
  1. Delete the line starting with '#';
  2. Replace ' : ' with '\t';
  3. Add 'Esperanto\tEnglish\n' at the beginning;
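
The espdict.txt steps follow a pattern shared by the other ' : ' separated dictionaries on this page, so one small helper can cover them. A minimal sketch; the function name `colon_dict_to_tsv` and the sample entries are hypothetical, and the helper skips every line starting with '#' rather than only the first one.

```python
def colon_dict_to_tsv(text, header):
    """Drop '#' lines, turn ' : ' into a tab, and prepend the given header."""
    out = [header]                                    # step 3
    for line in text.splitlines():
        if not line or line.startswith("#"):          # step 1 (blank lines are skipped too)
            continue
        out.append(line.replace(" : ", "\t") + "\n")  # step 2
    return "".join(out)


# Hypothetical sample in the 'headword : translation' layout.
sample = "# header comment\nhundo : dog\nkato : cat\n"
print(colon_dict_to_tsv(sample, "Esperanto\tEnglish\n"), end="")
```

Passing 'Vietnamese\tEnglish\n' as the header instead gives the vnedict.txt conversion.
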
handedict.u8
  1. Delete lines starting with '#';
  2. Replace '\n\n' with '\n';
  3. Replace the first ' ' in each line with '\t';
  4. Replace the first ' [' in each line with '\t';
  5. Replace '] /' with '\t';
  6. Replace '/\n' with '\n';
  7. Add 'Traditionel\tVereinfacht\tpin1 yin1\tdeutsche Entsprechung 1/Entsprechung 2\n' at the beginning;
hanja.txt
  1. Delete lines starting with ' #';
  2. Replace ':' with '\t';
  3. Add 'Hangul\tHanja\tnote\n' at the beginning;
HSK-2012.xls
  1. Open with a spreadsheet program;
  2. Save as TSV file or save as CSV file and select '<tab>' as field separator;
  3. Open the new file;
  4. Replace '(' with '\t';
  5. Replace ')' with '';
  6. Add '单词\t等级\n' at the beginning;
iedict.txt
  1. Delete the line starting with ' #';
  2. Replace ' : ' with '\t';
  3. Add 'Interlingua\tEnglish\n' at the beginning;
kengdic_2011.tsv
  1. Delete fields 1, 3, 5, 6, 7, 8, 9, 11, 12;
  2. Add 'word\tdef\thanja\n' at the beginning;
Mkdictionary.xls
  1. Open with a spreadsheet program;
  2. Save as TSV file or save as CSV file and select '<tab>' as field separator;
tocfl.tsv
  1. Replace '"\t"' with '\t';
  2. Replace '"\n"' with '\n';
  3. Replace the first '"' with '';
  4. Replace the last '"' with '';
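
The four replacements for tocfl.tsv strip the double quotes that wrap every field. The sketch below swaps in a different technique: it lets Python's `csv` module interpret the quotes instead of replacing them textually. It assumes every field is quoted and that no field contains a tab or line break; the function name `unquote_tsv` and the sample rows are hypothetical.

```python
import csv
import io


def unquote_tsv(src, dst):
    """Read a TSV whose fields are wrapped in double quotes; write plain fields."""
    reader = csv.reader(src, delimiter="\t", quotechar='"')
    for row in reader:
        dst.write("\t".join(row) + "\n")


# Hypothetical sample with every field quoted.
src = io.StringIO('"Word"\t"Pinyin"\t"Level"\n"愛"\t"ài"\t"1"\n', newline="")
dst = io.StringIO()
unquote_tsv(src, dst)
print(dst.getvalue(), end="")
```
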
vnedict.txt
  1. Delete the line starting with '#';
  2. Replace ' : ' with '\t';
  3. Add 'Vietnamese\tEnglish\n' at the beginning;
토픽 어휘 목록_공개 목록.xlsx
  1. Open with a spreadsheet program;
  2. Save as TSV file or save as CSV file and select '<tab>' as field separator;
  3. Switch to the other sheet tab;
  4. Save as TSV file or save as CSV file and select '<tab>' as field separator;

Others

database name with link: format
FreeDict: slob
Free Vietnamese Dictionary Project: dict.dz
XOBDO.ORG: db