The LOPEN project is an open project for Chinese language and knowledge resources, maintained by LOPE (Labs of Ontologies, Language Processing and e-Humanity) at National Taiwan University. We believe open resources enable not only reproducibility and creativity in empirical research but also the progress of our society.

Resources and tools

CWN Sense Tagger

In this project, we aim to solve the Chinese word sense disambiguation problem using a state-of-the-art BERT model. It yields large performance gains and reaches roughly 82% accuracy.

[link] [demo]

Deep Lexicon (DeepLEX)

A large Chinese-centered open lexicon as an alternative resource to atomic lexicon theory.

[link]

Chinese Wordnet (CWN)

CWN aims to construct a deep semantic and conceptual network. Its fine-grained semantic analysis and open relational design are conducive to understanding the structure of language and mind.

[link] [CWN v1] [CWN v2]

HanziAnalysisKit

The Hanzi Glyph Corpus Toolkit (HGCT) and lexicoR facilitate the querying and analysis of Chinese character glyphs within corpora and provide access to various Chinese lexical resources.

[link]

Chinese Word Map (CWM)

CWM is a TCSL-based (Teaching Chinese as a Second Language) word sketch engine for lexical knowledge.

[link]

Corpora Open and Search (COPENS)

An open corpus system and query tool. It supports automatic pre-processing and free annotation.

[link]

PTT Corpus

As a distinctive bulletin board system (BBS) in Taiwan, PTT records interesting social and cultural language phenomena. It also provides important empirical information on language contact and evolution.

[link]

Chinese variation

A parallel corpus of Taiwan Mandarin and Mainland Mandarin.

[link]

Toxic Talk

A toxic talk generator trained on comments from the Internet.

[link]

Collaborative learning

Collabin

A blog of learning notes written by the lab members.

[link]

Open courses

Python for Humanities (2018)

[link] [GitHub]

Corpus Linguistics (2018)

[link]

Hands-on Corpus Linguistics Workshop (2018)

[link] [GitHub]