Knowledge Centres

List of CLARIN K-centres with expertise in languages


CKLD

CLARIN Knowledge-Centre for linguistic diversity and language documentation

Areas of competence: The CLARIN Knowledge-Centre for linguistic diversity and language documentation offers researchers, including students and native speakers, expertise on data and data-related methods, technology, and background information on language resources and tools. CKLD provides information and assistance on fieldwork and data-related methodology, in particular on equipment, digital tools, methods, where to find data and information, and whom to contact for specialist information on particular regions or language families.
Audiences served: -
Types of services: Information materials, guidelines, tutorials, consultancy
Is portal for language(s): -
Other languages covered: Under-researched languages and language families (linguistic diversity). Expertise in Athabascan, Austronesian, Austro-Asiatic, Dravidian, Finno-Ugric, Papuan, etc.
Modalities covered: Text, audio-visual recordings of speech
Linguistic topics: language documentation, linguistic typology, linguistic fieldwork
Language processing: -
Data types: AV collections, typological databases
Resource families: AV collections of endangered and under-researched languages
Generic topics: linguistic fieldwork
Other keywords: -
Tour de CLARIN: - -

CLARIN-SMS

Swedish in a Multilingual Setting

Areas of competence: Offers special expertise in the processing of parallel and comparable corpora, including alignment and machine translation; cross-linguistically consistent annotation within the framework of Universal Dependencies; computation and evaluation of measures of text complexity; and language technology for Swedish Sign Language.
Audiences served: -
Types of services: -
Is portal for language(s): -
Other languages covered: Swedish
Modalities covered: Text and sign language
Linguistic topics: -
Language processing: Processing parallel corpora, machine translation, annotation, evaluation
Data types: parallel corpora
Resource families: -
Generic topics: -
Other keywords: -
Tour de CLARIN: - -

CLARIN-SPEECH

CLARIN Knowledge Centre for Speech Analysis

Areas of competence: Technical advice on speech analysis relating to all aspects of speech technology, including speech science, speech applications, and speech in interaction.
Audiences served: -
Types of services: awareness, tools, mentoring
Is portal for language(s): -
Other languages covered: Swedish, English
Modalities covered: Speech, biosignals, audiovisual data, sensor data
Linguistic topics: phonetics, pathology
Language processing: speech analysis, speech modelling, speech processing
Data types: acoustic and language models, dictionaries, vocabularies, pronunciation data, biosignals related to spoken interaction
Resource families: oral history, parliamentary records
Generic topics: deep learning, evaluation, tools, visualization, ASR, legal issues, data management
Other keywords: -
Tour de CLARIN: - -

CLASSLA

CLARIN Knowledge Centre for South Slavic languages

Areas of competence: Offers expertise on language resources and technologies for South Slavic languages
Audiences served: -
Types of services: tools, data, mentoring, dissemination, awareness, web lectures
Is portal for language(s): Slovene, Croatian, Bosnian, Montenegrin, Serbian, Macedonian, Bulgarian
Other languages covered: -
Modalities covered: Text
Linguistic topics: Applied linguistics, Dialect studies, Sociolinguistics (for South Slavic languages)
Language processing: Basic processing of South Slavic languages
Data types: training data, language models (for South Slavic languages)
Resource families: Newspapers, social media, parliamentary records, historical texts, language learner corpora (for South Slavic languages)
Generic topics: deep learning, evaluation of tools (for South Slavic languages)
Other keywords: -
Tour de CLARIN: - -

CORLI-K-centre

CORLI French CLARIN Knowledge Centre for Corpora, Languages and Interaction

Areas of competence: Corpus linguistics with a special focus on the French language and the languages of France
Audiences served: -
Types of services: -
Is portal for language(s): French
Other languages covered: -
Modalities covered: Text corpora
Linguistic topics: -
Language processing: -
Data types: -
Resource families: -
Generic topics: -
Other keywords: -
Tour de CLARIN: - -

CorpLingCz

Czech CLARIN Knowledge Centre for Corpus Linguistics

Areas of competence: Provides information, consulting and technical assistance on all topics related to corpus linguistics, including data formats, annotation, metadata encoding, corpus querying, corpus linguistics methodology and statistical methods. The centre also specializes in empirical research on the Czech language.
Audiences served: -
Types of services: Provides data, tools and technical assistance, shares expertise, and holds workshops on demand on the topics covered by the K-centre. An online helpdesk handles user requests.
Is portal for language(s): Czech
Other languages covered: -
Modalities covered: Text, Speech
Linguistic topics: Corpus linguistics (including methodology and statistics)
Language processing: Basic language processing (POS tagging, parsing)
Data types: Speech corpora, parallel corpora
Resource families: -
Generic topics: -
Other keywords: -
Tour de CLARIN: Introduction Interview

DANSK

CLARIN K-Centre DANSK - DANish helpdeSK

Areas of competence: Danish language and Danish sign language; language resources, language technology tools and services for Danish.
Audiences served: -
Types of services: -
Is portal for language(s): Danish and Danish sign language
Other languages covered: -
Modalities covered: Text, sign language
Linguistic topics: morphology, syntax, semantics, pragmatics
Language processing: Danish language processing, tokenisation, PoS tagging, lemmatization, named entity tagging, treebanks, corpus tools, multimodal corpora annotation and processing
Data types: corpora, wordnet, multimodal annotations
Resource families: historical and literary corpora, contemporary domain-specific corpora, Hansards, multimodal annotations, NLP tools
Generic topics: -
Other keywords: -
Tour de CLARIN: - -

IMPACT-CKC

IMPACT centre of competence - CLARIN K-centre in digitisation

Areas of competence: As a knowledge centre, IMPACT-CKC (IMPACT centre of competence - CLARIN K-centre in digitisation) offers expertise and resources to institutions and researchers looking for advice on digitisation and related fields. The IMPACT-CKC resources include a demonstrator platform for testing tools online, a collection of high-quality images with associated ground truth, historical lexica for 10 languages, as well as training materials and registries of tools, initiatives, datasets and competitions.
Audiences served: -
Types of services: tools, data, mentoring, dissemination, awareness, tutorials, web lectures
Is portal for language(s): -
Other languages covered: Spanish, English, Polish, French, Dutch, German, Slovene, Czech, Latin, Bulgarian
Modalities covered: Text, AV data
Linguistic topics: corpus linguistics, diachronic language resources, language learning
Language processing: basic language processing, information extraction
Data types: lexical data, language models, linked open data and ontologies
Resource families: historical texts, lexical resources, literary texts, newspapers
Generic topics: OCR, digitisation, visualisation, evaluation of tools
Other keywords: -
Tour de CLARIN: Introduction Interview

K-BLP

CLARIN Knowledge Centre for Belarusian Text and Speech Processing

Areas of competence: Knowledge about text and speech processing of Belarusian and other languages; knowledge about Belarusian language learning; tools and resources for text and speech processing for Belarusian and other languages
Audiences served: -
Types of services: -
Is portal for language(s): Belarusian
Other languages covered: -
Modalities covered: Text and speech
Linguistic topics: -
Language processing: -
Data types: -
Resource families: -
Generic topics: -
Other keywords: -
Tour de CLARIN: - -

NLP:EL

CLARIN K-Centre for Natural Language Processing in Greece

Areas of competence: NLP research for Greek, Digital readiness of Greek
Audiences served: -
Types of services: CLARIN K-Centre NLP:EL will operate a helpdesk concerning Natural Language Processing for Greek and/or developed in Greece. Besides responding to questions on these issues (reactive activities), it will provide informative material and documentation relevant to them (proactive activities). This material includes, but is not limited to, (i) scientific publications and presentations on NLP research and applications for Greek, (ii) guides and tutorials on NLP tools and services for Greek, and (iii) direct connection to the CLARIN:EL infrastructure, where users can find more detailed information and further training and dissemination material.
Is portal for language(s): Greek
Other languages covered: -
Modalities covered: Text
Linguistic topics: -
Language processing: -
Data types: -
Resource families: -
Generic topics: -
Other keywords: -
Tour de CLARIN: - -

PhA-OeAW

Phonogrammarchiv - Austrian Academy of Sciences

Areas of competence: As an audio and audiovisual archive with numerous collections of unique research recordings from all across the world, the Phonogrammarchiv offers various services. Besides providing access to its data and metadata resources (remote and onsite), it advises scholars on field research methodology and technologies of audio and audiovisual documentation, and supports them with the necessary recording equipment. It also shares its broad expertise on topics such as restoration, digitisation, format obsolescence, cataloguing, metadata, long-term preservation and storage.
Audiences served: -
Types of services: Individual advice, group training, workshops and higher education teaching, internships, practical assistance and institutional cooperation. Access to audiovisual data and metadata (remote and onsite).
Is portal for language(s): -
Other languages covered: Audio and audiovisual recordings plus accompanying documentation on a wide variety of languages and dialects from all across the world, covering a timespan of 120 years.
Modalities covered: Audio and audiovisual recordings.
Linguistic topics: Field linguistics, interview techniques (social/cultural anthropology, ethnomusicology), language documentation, oral history.
Language processing: -
Data types: Audio and audiovisual recordings.
Resource families: -
Generic topics: Archiving: physical restoration, digitisation, format migration, cataloguing, metadata, long-term preservation and storage. Research: methods and technologies of audiovisual fieldwork and documentation.
Other keywords: -
Tour de CLARIN: Introduction -

PolLinguaTec

CLARIN Knowledge Centre for Polish Language Technology

Areas of competence: Provides broad knowledge of methods of natural language analysis, with special emphasis on the analysis of the Polish language. Offers support for all types of applications of language technology for Polish, both monolingual and multilingual.
Audiences served: -
Types of services: tools, resources, dissemination, awareness, tutorials, helpdesk
Is portal for language(s): Polish
Other languages covered: -
Modalities covered: Text
Linguistic topics: -
Language processing: Polish language processing
Data types: -
Resource families: -
Generic topics: -
Other keywords: -
Tour de CLARIN: Introduction Interview

PORTULAN

CLARIN Knowledge Centre for the Science and Technology of the Portuguese Language

Areas of competence: The Science and Technology of the Portuguese Language is the thematic area of this CLARIN Knowledge Centre. With respect to the Portuguese language, it covers all topics, from Phonetics to Discourse and Dialogue; all language functions, from communicative performance to cultural expression; all disciplines, from Theoretical Linguistics to Language Technology; all language variants, from national standard varieties across the world to dialects of professional groups; and all media of representation, from audio to brain imaging recordings.
Audiences served: -
Types of services: -
Is portal for language(s): Portuguese
Other languages covered: -
Modalities covered: Text and speech
Linguistic topics: -
Language processing: Portuguese language processing
Data types: -
Resource families: -
Generic topics: Brain image recording
Other keywords: -
Tour de CLARIN: - -

SAFMORIL

Systems and Frameworks for Morphologically Rich Languages

Areas of competence: SAFMORIL brings together researchers and developers in the area of computational morphology and its NLP applications. The focus of SAFMORIL is actual, working systems and frameworks based on linguistic principles providing linguistically motivated analyses and generation outputs. Such systems are relevant in particular for languages with rich morphologies. SAFMORIL offers online courses for developing morphologies, tokenizers and spell-checkers, and a repository for storing morphologies.
Audiences served: -
Types of services: data, tools, web demos, web lectures and tutorials
Is portal for language(s): -
Other languages covered: Primarily Nordic and Baltic languages (such as Finnish, Swedish, Norwegian, Latvian, Lithuanian as well as the Sámi languages), but also more generally Fenno-Ugric languages, Inuit languages, Canadian First Nation languages and Babylonian languages
Modalities covered: Text
Linguistic topics: Morphology and Morphosyntax
Language processing: Processing of morphologically rich languages
Data types: Lexical resources containing inflectional, derivational and compounding information as well as morphosyntactic grammars and language models
Resource families: Morphological Lexicons, Grammars and Language Models
Generic topics: Primarily Finite-State Applications, but to some degree also Statistical Methods and Neural Networks
Other keywords: -
Tour de CLARIN: - -

Spanish-K-Centre

Spanish CLARIN K-Centre

Areas of competence: The Spanish CLARIN K-Centre aims to provide knowledge, services, consultancy and specialized web services to the Humanities and Social Science research communities. Its web services and consultancy cover how to use, and carry out research with, basic tools that can handle and exploit textual data in at least the four (co-)official languages (Spanish, Catalan, Galician, Basque) and English, which is one of the most important sources of information for many Humanities and Social Science disciplines.
Audiences served: -
Types of services: 1. Tools; 2. Data; 3. Mentoring; 4. Dissemination; 5. Tutorials
Is portal for language(s): Spanish, Basque, Catalan, Galician
Other languages covered: -
Modalities covered: Text
Linguistic topics: 1. general linguistics (phonology, morphology, syntax, semantics, pragmatics); 2. computational linguistics; 3. corpus linguistics; 4. applied linguistics; 5. stylistics
Language processing: Spanish, Catalan, Galician and Euskera language processing: morphology, syntax, semantics, discourse
Data types: 1. Lexical databases: general, sentiment, NERC...; 2. Syntax treebanks; 3. Discourse treebanks: coreference, relational; 4. Spoken databases; 5. Semantic annotation: semantic roles, word sense; 6. Error annotation; 7. Image bank (wikimedia); 8. Conversational QA
Resource families: -
Generic topics: 1. Grammars; 2. Finite-State Applications; 3. Statistical Methods; 4. Neural Networks
Other keywords: -
Tour de CLARIN: - -

SWELANG

CLARIN Knowledge Centre for The Languages of Sweden

Areas of competence: Information service offering advice on the use of digital language resources and tools for the Swedish language, minority languages in Sweden, Swedish Sign Language, Swedish dialects, and other parts of the intangible cultural heritage of Sweden in text and speech, as well as on language policy and planning.
Audiences served: -
Types of services: online lexica, Q&A database, map interfaces for folk tales and dialects, open data, language consulting (by telephone, email and social media)
Is portal for language(s): Swedish
Other languages covered: Finnish, Meänkieli, Romani, Yiddish, Swedish Sign Language and other languages in Sweden
Modalities covered: Text and sign language, spoken Swedish dialects
Linguistic topics: language policy and planning, language infrastructure, language technology, dialect studies, sociolinguistics, folkloristics, plain language and language comprehensibility, terminology, lexicography
Language processing: topic modelling
Data types: speech recordings, mono- and multilingual lexica and word/term collections
Resource families: -
Generic topics: language policy and planning
Other keywords: -
Tour de CLARIN: Introduction Interview