List of all 32 CLARIN K-centres with their organisation details

[This page is no longer updated; please visit our new K-Centre catalogue for up-to-date information about our K-Centres]


ACE	CLARIN Knowledge Centre for Atypical Communication Expertise
Contact name	Henk van den Heuvel
Contact email	Henk.vandenHeuvel@ru.nl
Hosted by	(1) CLST - Centre for Language and Speech Technology at the Radboud University, Nijmegen, Netherlands
City of main hub	Nijmegen
Country of main hub	NL
Date of certification	2022-07-29
Area of competence	Atypical communication encompasses language and speech as encountered during (second) language acquisition and development, and in language disorders, but also more broadly in bilingual language development and in sign language. ACE specialises in this type of research and in the related infrastructural issues of data acquisition, processing and sharing, which typically involve sensitive data. For data storage and access the centre collaborates with MPI's TLA (The Language Archive), which is a CLARIN B Centre and also based in Nijmegen. We publish in English and Dutch but have expertise in many European languages.
Audiences served	- linguists - psychologists - neuroscientists - computer scientists - speech and language therapists - education specialists
Types of services	-- how-to documents - access to document templates - Access to data - Depositing - FAQ - Helpdesk - Technical support
Is portal for language(s)	-
Other languages covered	-
Modalities covered	- Audio: speech - Text - Video: sign language
Linguistic topics	- Language acquisition (L1 and L2) - language disorders - Language learning
Language processing topics	- Language technology - speech technology - automatic speech recognition - LLMs for atypical language
Data Types	-
Resources Families	- Manually annotated corpora - Multimodal corpora - Sign language corpora - Spoken corpora - Lexica - Glossaries
Generic topics	- Critical Data Management - Legal and ethical issues - AI speech models for automatic speech recognition and evaluation
Other keywords	- Language acquisition - sign language - language pathologies
Tour de CLARIN	Introduction Interview
Last update	2024-04-26 15:37:02

CKCMC	CLARIN Knowledge Centre for Computer-Mediated Communication and Social Media Corpora
Contact name	Egon Stemle
Contact email	egon.stemle@eurac.edu
Hosted by	(1) Institute for Applied Linguistics, Eurac Research (IAL), Bolzano, Italy (2) Formal Linguistics Laboratory (LLF), Paris, France (3) Jožef Stefan Institute (IJS), Ljubljana, Slovenia (4) Leibniz-Institute for the German Language (IDS), Mannheim, Germany
City of main hub	Bolzano
Country of main hub	IT
Date of certification	2024-06-10
Area of competence	The CLARIN Knowledge Centre for Computer-Mediated Communication and Social Media offers expertise on language resources and technologies for Computer-Mediated Communication and Social Media. Its basic activities are to (1) give researchers, students, and other interested parties information about the available resources, technologies, and community activities, (2) support interested parties in producing, modifying or publishing relevant resources and technologies and (3) organize training activities.
Audiences served	- Computational linguists - Linguists - Language teachers - Sociologists - Citizen scientists
Types of services	-- Access to documentation - FAQ - Helpdesk - User assistance - User forum
Is portal for language(s)	-
Other languages covered	-- English - French - German - Italian - Slovenian - and their respective language families
Modalities covered	- Text - Multimodality
Linguistic topics	- Morphology - Syntax - Semantics - Stylistics
Language processing topics	- Corpus data representation - Basic natural language processing - Proper data deposition
Data Types	-
Resources Families	- Computer-mediated communication corpora (social media)
Generic topics	- Data management - Legal issues - Ethical issues - Standards
Other keywords	- Qualitative and quantitative linguistic analyses - Processing of non-standard language (learner language and web corpora)
Tour de CLARIN	- -
Last update	2024-06-11 12:20:49

CKL2CORPORA	CLARIN Knowledge Centre for Learner Corpora
Contact name	Magali Paquot
Contact email	magali.paquot@uclouvain.be
Hosted by	(1) Linguistic Research Unit (PLIN) - Language and Communication Institute (ILC), UCLouvain, Louvain-la-Neuve, Belgium
City of main hub	Louvain-la-Neuve
Country of main hub	BE
Date of certification	2022-11-28
Area of competence	The CLARIN Knowledge Centre for Learner Corpora offers advice and training services on the collection and use of learner corpora (i.e. electronic collections of language data produced by second or foreign language learners) for theoretical and applied purposes.
Audiences served	- Linguists - Language teachers - Computational linguists - Language learners - Citizen scientists
Types of services	-- Access to data - Access to document templates - Access to documentation - Access to tools - FAQ - Helpdesk - How-to documents - Training - Technical support - User assistance
Is portal for language(s)	-
Other languages covered	-- Dutch - English - French - German - Spanish
Modalities covered	- Audio: speech - Text
Linguistic topics	- Applied linguistics - Corpus linguistics - Second language acquisition - Computational linguistics - Translation studies - Language variation - Learner corpus design - Learner corpus compilation - Spoken data transcription - Manual annotation - Learner corpus research - Discourse - Phraseology - Linguistic complexity
Language processing topics	- Part-of-speech tagging - Parsing - Measuring readability - Automatic annotation - Natural language processing
Data Types	-
Resources Families	- Computer-mediated communication corpora (social media) - Corpora of academic texts - L2 learner corpora - Manually annotated corpora - Newspaper corpora - Parallel corpora - Reference corpora - Spoken corpora
Generic topics	- Ethical issues - Metadata
Other keywords	-
Tour de CLARIN	- -
Last update	2023-08-23 16:44:44

CKLD	CLARIN Knowledge-Centre for linguistic diversity and language documentation
Contact name	Felix Rau
Contact email	f.rau@uni-koeln.de,ckld-internal@uni-koeln.de
Hosted by	(1) DCH: Data Centre for the Humanities, University of Cologne, Germany (2) ELAR: Endangered Languages Archive (Berlin-Brandenburg Academy of Sciences and Humanities), Berlin, Germany (3) HZSK: Hamburg Centre for Language Corpora, University Hamburg, Germany (4) IfL: Department of Linguistics, University of Cologne, Germany (5) ZAS: Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin, Germany
City of main hub	Cologne
Country of main hub	DE
Date of certification	2024-12-09
Area of competence	The CLARIN Knowledge-Centre for linguistic diversity and language documentation offers expertise on data and data-related methods, technology and background information on language resources and tools to researchers - including students and native speakers. CKLD provides information and assistance relating to fieldwork and data-related methodological aspects and in particular relating to equipment, digital tools, methods, where to find data and information, whom to contact for specialist information on particular regions or language families.
Audiences served	- Linguists - field linguists - typologists - language communities of endangered languages
Types of services	-- Helpdesk - How-to documents - Training
Is portal for language(s)	-
Other languages covered	-- Under-researched languages and language families (linguistic diversity). - Athabascan - Austronesian - Austro-Asiatic - Dravidian - Finno-Ugric - Papuan
Modalities covered	- Audio-visual - Text
Linguistic topics	- language documentation - linguistic typology - linguistic fieldwork
Language processing topics	-
Data Types	- audio-visual collections - typological databases
Resources Families	- Manually annotated corpora - Spoken corpora - Dictionaries - Wordlists
Generic topics	- linguistic fieldwork - endangered languages - language typology
Other keywords	-
Tour de CLARIN	- -
Last update	2025-02-18 12:54:54

CLARIN-ELEXIS	CLARIN-ELEXIS Knowledge Centre for Lexicography
Contact name	Simon Krek
Contact email	simon.krek@ijs.si
Hosted by	(1) Jožef Stefan Institute, Ljubljana, Slovenia
City of main hub	Ljubljana
Country of main hub	SI
Date of certification	2024-10-28
Area of competence	CLARIN-ELEXIS (European Lexicographic Infrastructure) offers support and expertise on lexicography, in particular on open access data, tools and services for lexicographers. These include: (1) data: semantically annotated corpora, extracted data from corpora, and dictionary data available in the CLARIN.SI repository, (2) online services and tools: Lexonomy, Elexifier, NAISC and BabelNet linker, gaming apps Cross the Word and Game of Words, and Dictionary Matrix. CLARIN-ELEXIS maintains these services and provides help about their use.
Audiences served	- lexicographers - language teachers - linguists (working in the field of semantics, lexicology, lexicography, terminology) - computational linguists (working on topics related to semantic technologies) - corpus linguists - digital humanists (working on historical lexicography) - translators - Language Technology and Natural Language Processing companies
Types of services	-- Lexicographic tools - Lexicographic data - Assistance with the use of lexicographic data and tools - Online training courses on lexicography - Best practice documents on lexicography - Consultancy on lexicography - Standardisation related to lexicography
Is portal for language(s)	-
Other languages covered	-- Any
Modalities covered	- Written text - Speech - Audio-visual data - Sign language
Linguistic topics	- Lexicography, including related subtopics: Morphology, Phonology, Phonetics, Syntax, Semantics, Pragmatics, Stylistics etc. - Language teaching and learning - Translation studies - Diachronic language studies - Terminology - Corpus analysis
Language processing topics	- Word Sense Disambiguation - Word Sense Induction - Entity Linking - Information extraction (multiword expressions, collocations, definitions, etc.) - Gamification (crowdsourcing, language learning)
Data Types	- Dictionaries (all types) - Lexical Conceptual Resources (machine-readable) - Lexicons
Resources Families	- Conceptual resources - Dictionaries - Glossaries - Lexica - Wordlists
Generic topics	- Data management (related to lexicography) - Legal issues (related to lexicography)
Other keywords	-
Tour de CLARIN	- -
Last update	2024-10-28 16:25:07

CLARIN-Learn	CLARIN Knowledge Centre for Language Learning Analysis
Contact name	Brian MacWhinney
Contact email	macw@cmu.edu
Hosted by	(1) Carnegie Mellon University, Pittsburgh, United States
City of main hub	Pittsburgh
Country of main hub	US
Date of certification	2024-12-17
Area of competence	Our centre is happy to provide advice on tools, corpora, and methods for the study of first and second language learning, conversational interactions, and a variety of language and developmental disabilities including aphasia, stuttering, TBI, dementia, and ASD.
Audiences served	- linguists - psycholinguists - developmental psychologists - computer scientists - educators - clinicians
Types of services	-- Access to data - Access to tools - Helpdesk - How-to documents - User forum - Training
Is portal for language(s)	-
Other languages covered	-- 35 languages, with a particularly wide coverage for child language
Modalities covered	- Audio: speech - Video: gestures
Linguistic topics	- language development - conversation - language disorders - child language - second language - bilingualism
Language processing topics	- speech analysis - archiving - fluency - lexical access
Data Types	- Corpora, both text and multimodal - clinical data
Resources Families	-
Generic topics	-
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2025-02-18 12:56:06

CLARIN-MULTISENS	CLARIN Knowledge Centre for Multimodal and Sensor-based Data
Contact name	Johan Frid
Contact email	johan.frid@humlab.lu.se
Hosted by	(1) Lund University Humanities Lab at Lund University, Sweden
City of main hub	Lund
Country of main hub	SE
Date of certification	2024-10-28
Area of competence	CLARIN-MULTISENS provides advice on multimodal and sensor-based methods including EEG (Electroencephalography), eye-tracking, articulography, virtual reality, motion capture, and language-related data such as audiovisual recordings and textual representations.
Audiences served	- Linguists - Behavioural and cultural scientists
Types of services	-- Depositing - Helpdesk - Technical support - Training - User assistance
Is portal for language(s)	-
Other languages covered	-- Swedish - English - Austroasiatic languages
Modalities covered	- Audio: speech - Audio-visual - Multimodality - Sensor data - Text - Video: gestures
Linguistic topics	- Comparative and phylogenetic linguistics - Dialect studies - Field linguistics - Language diversity - Language learning - Language production - Neurolinguistics - Phonetics - Psycholinguistics
Language processing topics	- Information extraction - Keystroke logging - Named entity recognition - Text mining - Visual prosody
Data Types	- Dictionaries - Language models - Lexical and typological/morphosyntactic databases
Resources Families	- Historical corpora - L2 learner corpora - Manually annotated corpora - Multimodal corpora - Spoken corpora - Oral history corpora
Generic topics	- Multimodal and sensor-based methods - EEG - Eye-tracking - Articulography - Virtual reality - Motion capture - audio-visual-recording - Working with GIS
Other keywords	- Interdisciplinary research - Methodological know-how - E-science
Tour de CLARIN	Introduction Interview
Last update	2025-02-18 12:38:55

CLARIN-SMS	CLARIN Knowledge Centre for Swedish in a Multilingual Setting
Contact name	Arne Jönsson
Contact email	arne.jonsson@liu.se
Hosted by	(1) Linköping University, Sweden (2) Stockholm University, Sweden (3) Uppsala University, Sweden
City of main hub	Linköping
Country of main hub	SE
Date of certification	2022-03-08
Area of competence	Language technology and resources for Swedish, Swedish Sign Language, and multilingual settings. Expertise in the processing of parallel corpora including alignment and machine translation, pretrained language models, cross-linguistically consistent annotation within the framework of Universal Dependencies, and computation and evaluation of measures of text complexity.
Audiences served	- Linguists - Phoneticians - Psycholinguists - Historians - Literary scientists - Art historians - Digital humanities researchers - Economists - General public
Types of services	-- Helpdesk - Technical support - Training - User assistance
Is portal for language(s)	-
Other languages covered	-- Swedish - Swedish Sign Language - English - Expertise in linguistic diversity and multilingual applications
Modalities covered	- Audio: speech - Text - Video: sign language
Linguistic topics	- Computational linguistics - Language diversity - Field linguistics - Language learning - Neurolinguistics - Phonetics - Phonology - Pragmatics - Psycholinguistics - Semantics
Language processing topics	- Language processing pipelines - Part-of-speech tagging - Named entity recognition - Syntactic parsing - Machine translation - Processing of discourse relations - Pretrained models - Text simplification - Text complexity
Data Types	- Parallel corpora - Treebanks - Language models - Typological databases - Text - Video - Audio
Resources Families	- Newspaper corpora - Parallel corpora - Sign language resources - Wordlists
Generic topics	- Natural language processing - Corpus annotation - Machine learning - Digital humanities
Other keywords	-
Tour de CLARIN	- -
Last update	2024-05-01 14:19:23

CLARIN-SPEECH	CLARIN Knowledge Centre for Speech Analysis
Contact name	Jens Edlund, David House
Contact email	edlund@speech.kth.se,davidh@kth.se
Hosted by	(1) KTH Royal Institute of Technology, Division of Speech Music and Hearing, Stockholm, Sweden
City of main hub	Stockholm
Country of main hub	SE
Date of certification	2024-11-11
Area of competence	Technical advice on speech analysis relating to all aspects of speech technology, including speech science, speech applications, and speech in interaction.
Audiences served	- Archivists - Computational linguists - Computer scientists - Historians - Language teachers - Librarians - Linguists - Phoneticians - Speech Pathologists - Sociologists - Sociolinguists
Types of services	-- Access to Tools - Helpdesk - User assistance - Training
Is portal for language(s)	-
Other languages covered	-- English - Swedish
Modalities covered	- Audio: speech - Audio-visual data - Multimodality - Sensor data - Sensor data: biosignals - Video: gestures
Linguistic topics	- Phonetics - Speech pathology
Language processing topics	- Speech analysis - Speech modelling - Speech processing - Speech recognition - Speech synthesis
Data Types	- Acoustic and language models - Biosignals related to spoken interaction - Dictionaries - Pronunciation data - Vocabularies
Resources Families	- Oral history corpora - Parliamentary corpora
Generic topics	- ASR - Data management - Deep learning - Evaluation - Legal issues - Tools - Visualisation
Other keywords	-
Tour de CLARIN	- -
Last update	2024-10-15 11:46:01

CLASSLA	CLARIN Knowledge Centre for South Slavic languages
Contact name	Nikola Ljubešić, Taja Kuzman
Contact email	nikola.ljubesic@ijs.si,taja.kuzman@ijs.si
Hosted by	(1) Jožef Stefan Institute (CLARIN.SI), Ljubljana, Slovenia (2) Institute of Croatian Language (IHJ), Zagreb, Croatia (3) Institute of Information and Communication Technologies (CLADA-BG), Sofia, Bulgaria
City of main hub	Ljubljana
Country of main hub	SI
Date of certification	2022-03-19
Area of competence	CLASSLA offers expertise on language resources and technologies for South Slavic languages. It provides information on freely available lexicons and corpora, which can be used in research in the social sciences and humanities. The CLASSLA-Stanza pipeline allows researchers to perform language processing of their texts to produce their own corpora, while the CLASSLA web corpora as the largest general corpora for all South Slavic languages enable direct language research. The centre provides guidance in how to use the available resources and technologies in research.
Audiences served	- Computational linguists - Computer scientists - Citizen scientists - Historians - Language teachers - Linguists - Sociolinguists - Sociologists
Types of services	-- FAQ - Helpdesk - Technical support - Training
Is portal for language(s)	- Slovenian - Slovene - Croatian - Bosnian - Serbian - Montenegrin - Macedonian - Bulgarian
Other languages covered	-
Modalities covered	- Audio: speech - Text
Linguistic topics	- Applied linguistics - Dialect studies - Sociolinguistics
Language processing topics	- Basic language processing - Information extraction - Language understanding - Named entity recognition - Processing of morphologically rich languages - Speech recognition
Data Types	- Manually annotated datasets - Corpora - Language models - Treebanks
Resources Families	- Computer-mediated communication corpora (social media) - Historical corpora - Literary corpora - Newspaper corpora - Parliamentary corpora - Corpora of academic texts - Manually annotated corpora - Multimodal corpora - Parallel corpora - Reference corpora - Spoken corpora - Language models - Lexica - Normalization - Named entity recognition - Part-of-speech tagging and lemmatization - Tools for sentiment analysis
Generic topics	- Evaluation of tools and models - Machine learning - Deep learning
Other keywords	- Processing of closely related languages - Language variation - Spatial language variation
Tour de CLARIN	Introduction Interview
Last update	2024-07-22 10:43:42

CORLI-K-centre	CORLI French CLARIN Knowledge Centre for Corpora, Languages and Interaction
Contact name	Céline Poudat, Christophe Parisse
Contact email	celine.poudat@univ-cotedazur.fr,cparisse@parisnanterre.fr
Hosted by	(1) Huma-Num / CNRS UMS 3598, Paris, France
City of main hub	Paris
Country of main hub	FR
Date of certification	2023-07-13
Area of competence	Corpus linguistics with a special focus on the French language and the languages of France
Audiences served	- Linguists - Sociolinguists - Language teachers - Computational linguists - Literature - Language diachrony
Types of services	-- How-to documents
Is portal for language(s)	- French
Other languages covered	-- LSF (French Sign Language) - Old French
Modalities covered	- Audio: speech - Audio-visual - Text
Linguistic topics	- Format and tools for corpus studies - Multilingual, multimodal corpora
Language processing topics	- Corpus format and tools for creating and analysing corpora
Data Types	- Text - TEI - Spoken language - Video
Resources Families	-
Generic topics	- Legal issues
Other keywords	- Continuous education
Tour de CLARIN	Introduction Interview
Last update	2024-01-26 10:40:36

CorpLingCz	Czech CLARIN Knowledge Centre for Corpus Linguistics
Contact name	Michal Kren
Contact email	michal.kren@ff.cuni.cz
Hosted by	(1) Czech National Corpus, Prague, Czech Republic
City of main hub	Prague
Country of main hub	CZ
Date of certification	2022-01-24
Area of competence	Provides information, consulting and technical assistance on all topics related to corpus linguistics. This includes data formats, annotation, corpus querying, corpus linguistics methodology, statistical methods etc. Another specialisation of the centre is empirical research on the Czech language.
Audiences served	- Computational linguists - Computer scientists - Language teachers - Linguists - Psycholinguists - Sociolinguists
Types of services	-- Assistance with the use of data and tools - Consultancy - Data processing on demand - Helpdesk - Technical support - Training courses
Is portal for language(s)	- Czech
Other languages covered	-
Modalities covered	- Speech - Written text
Linguistic topics	- Applied linguistics - Corpus linguistics - Diachronic language studies - Dialect studies - Discourse - Language learning - Lexical studies - Morphology - Syntax - Terminology
Language processing topics	- Basic language processing - Processing of morphologically rich languages
Data Types	-
Resources Families	- Computer-mediated communication corpora (social media) - Historical corpora - L2 learner corpora - Literary corpora - Manually annotated corpora - Newspaper corpora - Parallel corpora - Reference corpora - Spoken corpora
Generic topics	-
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2024-04-26 15:36:09

CROATINA	CLARIN Knowledge Centre for Croatian Language
Contact name	Daša Farkaš
Contact email	dfarkas@ffzg.hr
Hosted by	(1) Institute of Linguistics, Faculty of Humanities and Social Sciences (FFZG), University of Zagreb, Croatia (2) Institute of Croatian Language (IHJ), Zagreb, Croatia
City of main hub	Zagreb
Country of main hub	HR
Date of certification	2023-10-30
Area of competence	The CLARIN Knowledge Centre for Croatian Language (CROATINA) provides knowledge about the Croatian language and promotes the use of language technologies. The CROATINA Helpdesk offers information on a variety of topics related to the Croatian language, Croatian language learning and the use of language resources. CROATINA also offers help in depositing language resources in HR-CLARIN or other repositories. While the existing K-centre CLASSLA is oriented primarily towards language resources for all South Slavic languages, CROATINA will offer complementary language resources specific to Croatian. Additionally, through its FAQ and helpdesk CROATINA will cover relevant linguistic information about the Croatian language: its history, structure at all language levels, typological features, sociolinguistic environment, level of language technology support, and other linguistically and technologically relevant information. Users should be able to choose between the two K-centres according to their research interests and the resources each offers, or they can approach CROATINA for additional guidance.
Audiences served	- Linguists - Computational linguists - Language teachers - Digital humanists - Language technology developers - Students - Researchers in information and communication sciences
Types of services	-- Access to data - Access to tools - Depositing - User assistance - FAQ - Helpdesk - Video lectures - Training courses
Is portal for language(s)	- Croatian language (ISO 639-3: hrv, glottolog: croa1245) in all its varieties; e.g. including - Kajkavian dialect (ISO 639-3: kjv; glottolog: kajk1237), - Chakavian dialect (glottolog: chak1265), - Burgenland Croatian dialect (glottolog: 1244), - Croatian Molisano/Slavomolisano dialect (glottolog: slav1254), etc.
Other languages covered	-
Modalities covered	- Written text primarily
Linguistic topics	- Phonology - Morphology - Syntax - Semantics - Pragmatics - Discourse - Corpus linguistics - Language resources - Language technology - Lexicography
Language processing topics	- Tokenisation - PoS/MSD tagging - Lemmatisation - Named entity recognition - Corpora creation and management - Lexica creation and management - Morphological processing at inflectional and derivational level - Multimodal annotations - Parsing (syntactic and semantic) - Language models - Croatian language processing
Data Types	- Monolingual corpora - Parallel corpora - Wordnets - Treebanks - Language models - Morphology databases - Lexica
Resources Families	- Computer-mediated communication corpora (social media) - Corpora of academic texts - Historical corpora - L2 learner corpora - Legal corpora - Literary corpora - Manually annotated corpora - Multimodal corpora - Newspaper corpora - Oral history corpora - Parallel corpora - Reference corpora - Spoken corpora - Conceptual resources - Dictionaries - Glossaries - Language models - Lexica - Wordlists - Normalisation - Named entity recognition - Part-of-speech tagging and lemmatisation - Tools for sentiment analysis
Generic topics	- Data management - Metadata - Language Technology platforms and frameworks - Grammars
Other keywords	- Expertise in CALL for Croatian as L2 (incl. L2 corpora) - Expertise in language processing chain for Croatian based on LLM(s) (incl. POS-tagging and lemmatisation)
Tour de CLARIN	- -
Last update	2024-04-29 16:32:09

DANSK	CLARIN K-Centre DANSK - DANish helpdeSK
Contact name	Costanza Navarretta
Contact email	costanza@hum.ku.dk
Hosted by	(1) Department of Nordic Studies and Linguistics (DNSL), University of Copenhagen, Denmark (2) Danish Language Council (DSN), Bogense, Denmark
City of main hub	Copenhagen
Country of main hub	DK
Date of certification	2024-12-09
Area of competence	- Danish language and Danish sign language - Danish language resources - language technology tools for Danish - Natural language processing methods
Audiences served	- Linguists - Computational linguists - Sociolinguists - Language and literature researchers - Citizen scientists
Types of services	-- Helpdesk - How-to documents
Is portal for language(s)	- Danish
Other languages covered	-- English - Danish sign language
Modalities covered	- Text - Video: gestures - Video: sign language
Linguistic topics	- Morphology - Syntax - Semantics - Pragmatics - Lexicography - Multimodality
Language processing topics	- Tokenisation - PoS tagging - Lemmatisation - Named entity tagging - Parsing - TEI annotation - Corpus tools - Natural language processing tools
Data Types	- Text corpora - Wordnets - Lexica - Multimodal annotations
Resources Families	- Historical corpora - Literary corpora - Parliamentary corpora - Lexica
Generic topics	- Natural language processing - Basic processing/annotation of corpora - Data management - Standards
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2025-02-18 12:53:53

DiaRes	CLARIN K-centre for Diachronic Language Resources
Contact name	Eva Pettersson
Contact email	eva.pettersson@lingfil.uu.se
Hosted by	(1) Department of Linguistics and Philology, Uppsala University, Sweden (2) Centre for Digital Humanities, University of Gothenburg, Sweden (3) Riksarkivet (Swedish National Archives), Stockholm, Sweden (4) Språkbanken (The Swedish Language Bank), University of Gothenburg, Sweden
City of main hub	Uppsala
Country of main hub	SE
Date of certification	2022-03-08
Area of competence	Diachronic text collections, historical texts, and tools and resources for processing and analysing them
Audiences served	- researchers in the humanities, with an interest in different aspects of historical texts - historians - social scientists - researchers in literature and history of ideas - historical linguists - computational linguists - researchers working in the field of digital humanities
Types of services	-- Access to tools - Technical support - Web-hosting
Is portal for language(s)	- Swedish
Other languages covered	-
Modalities covered	- Text
Linguistic topics	- Diachronic language studies
Language processing topics	- Diachronic language processing
Data Types	-
Resources Families	- Normalisation
Generic topics	- Diachronic corpora - Natural language processing for historical text - spelling normalisation - digital humanities
Other keywords	-
Tour de CLARIN	- -
Last update	2023-11-30 13:38:29

DiPText-KC	CLARIN Knowledge Centre for Digital and Public Textual Scholarship
Contact name	Franz Fischer, Federico Boschetti
Contact email	franz.fischer@unive.it,federico.boschetti@yahoo.com
Hosted by	(1) Venice Centre for Digital and Public Humanities, DSU, Ca’ Foscari University of Venice, Italy (2) Istituto di Linguistica Computazionale “A. Zampolli” (CNR-ILC), Pisa, Italy
City of main hub	Venice
Country of main hub	IT
Date of certification	2024-10-28
Area of competence	The CLARIN Knowledge Centre for Digital and Public Textual Scholarship offers expertise on methods, data, instruments and technologies relevant in the field of Philological and Literary Studies, History, Art History and Cultural Heritage. Its actions aim at 1) sharing information with scholars and students about the state of the art in digital scholarly editing and text annotation through Domain-Specific Languages; 2) supporting scholars and students in the creation and publication of digital scholarly editions and resources; 3) organizing training activities (for example webinars, workshops, and summer schools).
Audiences served	- Digital philologists - Computational philologists - Scholars in literary studies - Art historians - Cultural heritage scholars - Computational Linguists
Types of services	- - FAQ - Helpdesk - Training - User assistance
Is portal for language(s)	-
Other languages covered	-- Italian - Ancient Greek - Latin - Arabic - English - French - German
Modalities covered	- Text - Images
Linguistic topics	- Text encoding - Phonology - Morphology - Syntax - Semantics - Lexicography
Language processing topics	- Automated text encoding - Information extraction - Domain-Specific Language parsing
Data Types	- XML-TEI encoded documents - Treebanks - Wordnets - Dictionaries - Ontologies
Resources Families	- Historical corpora - Literary corpora - Manually annotated corpora - Parallel corpora - Lexica - Dictionaries - Conceptual resources
Generic topics	- Text encoding - Linked Open Data - Domain-Specific Languages
Other keywords	- Data modeling in the domain of philological and literary studies - Software engineering in the domain of philological and literary studies
Tour de CLARIN	- -
Last update	2021-12-14 16:51:05

DR-LIB	CLARIN Knowledge Centre for Digital Resources for the Languages in Ireland and Britain
Contact name	Martin Wynne
Contact email	martin.wynne@ling-phil.ox.ac.uk
Hosted by	(1) Faculty of Linguistics, Philology and Phonetics, University of Oxford, United Kingdom (2) CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes / National Corpus of Contemporary Welsh), Prifysgol Caerdydd / Cardiff University, United Kingdom (3) Fiontar & Scoil na Gaeilge, Dublin City University, Ireland (4) Language Technology Group, University of Edinburgh, United Kingdom (5) UCREL Research Centre, Lancaster University, United Kingdom
City of main hub	Oxford
Country of main hub	UK
Date of certification	2024-09-02
Area of competence	Digital Resources for the Languages in Ireland and Britain provides advice and support to researchers and others who want to find and use software programmes and digital datasets in the native languages of Britain and Ireland, in all their varieties, in contemporary and historic forms, as well as other languages as they are used in this region. The knowledge centre is virtual and distributed, with a central online presence and contact point at https://www.clarin.ac.uk/dr-lib. Information to orient and help users is posted online, and queries will be responded to by a network of experts centred around the CLARIN-UK consortium, plus additional experts in key languages and domains, such as the Irish National Corpus project (see https://www.gaois.ie/en/about/info), and experts across Europe in the CLARIN network. The centre aims to be a source of authoritative answers for questions like "Is there a semantic tagger for Welsh?", "How do I do OCR for Scottish Gaelic?", "Is there a corpus of Irish-language social media posts?", and "Is there an online dictionary for Old English?"
Audiences served	- Computational linguists - Linguists - Historical linguists - Literary scholars - Language teachers - Historians - Librarians - Language activists - Citizen scientists
Types of services	- Advice - Access to datasets (via OTA and VLO) - Access to tools - Depositing - FAQ - Helpdesk - Technical support - User assistance
Is portal for language(s)	- English - Irish - Scottish Gaelic - Welsh - Scots
Other languages covered	- Non-native minority languages in Ireland and Britain - Cornish
Modalities covered	- Speech - Text - Computer-mediated communication
Linguistic topics	- Lexis - Syntax - Usage - Dialectology
Language processing topics	- Corpus building - Linguistic annotation - Named entity recognition - Treebanking - Automatic Speech Recognition (speech to text)
Data Types	- Corpora - Lexical resources - Spoken datasets
Resources Families	- Computer-mediated communication corpora (social media) - Corpora of academic texts - Historical corpora - L2 learner corpora - Legal corpora - Literary corpora - Manually annotated corpora - Multimodal corpora - Newspaper corpora - Oral history corpora - Parallel corpora - Parliamentary corpora - Reference corpora - Spoken corpora - Conceptual resources - Dictionaries - Glossaries - Language models - Lexica - Wordlists - Normalization - Named entity recognition - Part-of-speech tagging and lemmatization - Tools for sentiment analysis - Corpus query tools
Generic topics	-
Other keywords	-
Tour de CLARIN	- -
Last update	2024-12-12 12:34:33

IMPACT-CKC	IMPACT centre of competence - CLARIN K-centre in digitisation
Contact name	Isabel Martínez Sempere, Sally Chambers
Contact email	isabel.martinez@digitisation.eu, sally.chambers@dariah.eu
Hosted by	(1) Universidad de Alicante (UA), Alicante, Spain (2) Instituut voor de Nederlandse Taal (INT), Leiden, The Netherlands (3) Koninklijke Bibliotheek (KB), The Hague, The Netherlands (4) Poznan Supercomputing and Networking Center (PSNC), Poznan, Poland (5) The British Library (BL), London, United Kingdom (6) Bibliothèque nationale de France (BnF), Paris, France (7) Universidad de Guadalajara (UdG), Guadalajara, Mexico
City of main hub	Alicante
Country of main hub	ES
Date of certification	2022-01-24
Area of competence	As a knowledge centre, IMPACT-CKC (IMPACT centre of competence - CLARIN K-centre in digitisation) offers expertise and resources to institutions and researchers looking for advice on digitisation and related fields. The IMPACT-CKC resources include a collection of high-quality images with associated ground truth, historical lexica for 10 languages, training materials, and registries of tools, initiatives, datasets and competitions relevant to digitisation and related fields.
Audiences served	- researchers - librarians - archivists - digital humanists - computer scientists in topics related to digitisation
Types of services	- Access to data - Access to tools - Training - User assistance
Is portal for language(s)	-
Other languages covered	- Spanish - English - Polish - French - Dutch - German - Slovene - Czech - Latin - Bulgarian
Modalities covered	- Images - Multimodality - Text
Linguistic topics	- corpus linguistics - diachronic language resources - language learning
Language processing topics	- basic language processing - information extraction
Data Types	- lexical data - language models - linked open data - ontologies
Resources Families	- Historical corpora - Literary corpora - Manually annotated corpora - Multimodal corpora - Newspaper corpora - Lexica - Glossaries - Normalisation - Named entity recognition - Part-of-speech tagging and lemmatisation
Generic topics	- OCR - digitisation - visualisation - evaluation of tools
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2024-09-30 09:43:52

K-Dutch	CLARIN Knowledge Centre for Dutch
Contact name	Vincent Vandeghinste
Contact email	vincent.vandeghinste@ivdnt.org
Hosted by	(1) Instituut voor de Nederlandse Taal, Leiden, Netherlands
City of main hub	Leiden
Country of main hub	NL
Date of certification	2024-06-10
Area of competence	K-Dutch is the place for researchers who want to find out more about the Dutch language: linguistic properties, language advice, available tools and resources, etymology, dialects and many other things. K-Dutch is hosted by the Instituut voor de Nederlandse Taal (Dutch Language Institute), which is also a CLARIN-B centre and host of many tools, lexica and corpora for Dutch, which are, in general, freely available.
Audiences served	- Computational linguists - Linguists - Language teachers - Historians - Library staff - Sociologists - Students - Citizen scientists
Types of services	- Access to data - Access to documentation - Access to tools - Depositing - FAQ - Helpdesk - Training - User assistance
Is portal for language(s)	- Dutch
Other languages covered	- Frisian - Afrikaans - Flemish sign language - Dutch sign language (Sign language of the Netherlands)
Modalities covered	- Audio: speech - Text - Video: sign language - Audio-visual
Linguistic topics	- Morphology - Syntax - Semantics - Language learning - Translation studies - Diachronic language studies - Phonology - Terminology - Dialectology - Lexicography - Natural Language Processing
Language processing topics	- Basic language processing (spell check, pos-tagging, lemmatisation) - Deep parsing - Information extraction - Machine translation - Processing of historical variants of Dutch - Speech recognition - Speech synthesis - Text mining - Terminology extraction - Corpus querying - Treebank querying
Data Types	- Language models - Dictionaries / Lexica - Treebanks - Wordnets - Linked open data - Ontologies - Termbanks
Resources Families	- Computer-mediated communication corpora (social media) - Corpora of academic texts - Historical corpora - L2 learner corpora - Literary corpora - Manually annotated corpora - Multimodal corpora - Newspaper corpora - Parallel corpora - Parliamentary corpora - Reference corpora - Spoken corpora - Lexica - Dictionaries - Conceptual resources - Wordlists
Generic topics	- Artificial intelligence - Natural language processing - Machine learning - Data mining - Lexicography - Linked data
Other keywords	-
Tour de CLARIN	- -
Last update	2024-06-11 12:21:43

K-Icelandic	CLARIN Knowledge Centre for Icelandic
Contact name	Ágústa Þorbergsdóttir, Starkaður Barkarson
Contact email	agusta.thorbergsdottir@arnastofnun.is, starkadur.barkarson@arnastofnun.is
Hosted by	(1) The Icelandic Language Council (Íslensk málnefnd), Reykjavik, Iceland (2) The Árni Magnússon Institute for Icelandic Studies (AMI), Reykjavik, Iceland
City of main hub	Reykjavik
Country of main hub	IS
Date of certification	2024-04-15
Area of competence	Information service covering a variety of topics related to the Icelandic language. We offer information and advice about the Icelandic language, the use of digital language resources and tools for Icelandic (both text and speech), and language policy and planning.
Audiences served	- Computational linguists - Linguists - Language teachers - Language learners - Sociologists - Citizen scientists
Types of services	- Helpdesk - Access to data - Access to tools - Depositing - User assistance
Is portal for language(s)	- Icelandic
Other languages covered	-
Modalities covered	- Text
Linguistic topics	- Computational linguistics - Corpus linguistics - Dialect studies - Discourse - Language learning - Language resources - Language technology - Lexicography - Morphology - Phonology - Pragmatics - Semantics - Sociolinguistics - Syntax
Language processing topics	- Basic language processing (POS tagging, lemmatization, parsing …) - Named entity recognition - Machine translation - Processing of morphologically rich languages
Data Types	- Language models - Dictionaries - Treebanks
Resources Families	- Legal corpora - Literary corpora - Newspaper corpora - Parallel corpora - Parliamentary corpora - Dictionaries - Language models - Lexica - Wordlists - Named entity recognition - Part-of-speech tagging and lemmatization
Generic topics	- Natural language processing - Corpus annotation
Other keywords	-
Tour de CLARIN	- -
Last update	2024-10-15 11:47:06

NLP:EL	CLARIN K-Centre for Natural Language Processing in Greece
Contact name	Maria Gavrilidou
Contact email	maria@athenarc.gr
Hosted by	(1) ATHENA Research Centre, Institute for Language and Speech Processing (ILSP), Athens, Greece
City of main hub	Athens
Country of main hub	GR
Date of certification	2023-05-16
Area of competence	CLARIN K-Centre NLP:EL is an information service offering expertise and advice on (a) Language Technology for Greek, (b) the digital readiness of Greek (i.e. how ready Greek is for the digital age as regards digital resources and language processing tools), and (c) Greek Sign Language. It provides guidance on the development and annotation of language resources, on the findability, accessibility and use of existing language processing tools and web services, and on the development of new technologies for language processing, as well as information and support in the fields of dynamic sign language synthesis and special communication and interaction interfaces. NLP:EL provides its services through two channels. First, it operates a helpdesk where users can ask questions on the above issues. Second, it provides informative material and documentation on these issues; this material includes (but is not limited to) (i) scientific publications and presentations on Natural Language Processing (NLP) research and applications for Greek and Greek Sign Language, (ii) guides and tutorials on language processing tools and services for Greek and Greek Sign Language, (iii) a direct connection to the specialized Sign Language Technologies website and (iv) a direct connection to the CLARIN:EL infrastructure, where users can find a catalogue of digital language resources and language processing tools, more detailed information, and further training and dissemination material.
Audiences served	- Computational linguists - Language Technology developers - Language teachers - Linguists - Philologists - Sign language community - Political scientists - Social Scientists - Digital humanists
Types of services	- Access to documentation - Helpdesk - How-to documents - Training
Is portal for language(s)	- Greek - Greek sign language
Other languages covered	-
Modalities covered	- Text - Video: gestures - Video: sign language
Linguistic topics	- Lexicography - Morphology - Syntax - Terminology - Corpus linguistics
Language processing topics	- Language processing - Processing of morphologically rich languages - Information extraction - Named entity recognition - Text mining - Machine translation - Sign language technologies - Speech recognition - Speech synthesis
Data Types	- Translation memories - Term banks - Treebanks - Lexical conceptual resources - Dictionaries - Ontologies - Thesauri - Language models
Resources Families	- Historical corpora - Computer-mediated communication corpora (social media) - Newspaper corpora - Parallel corpora - Parliamentary corpora
Generic topics	-
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2023-06-16 12:02:38

PhA-OeAW	Phonogrammarchiv / Austrian Academy of Sciences - CLARIN K-Centre
Contact name	Kerstin Klenke
Contact email	kerstin.klenke@oeaw.ac.at
Hosted by	(1) Phonogrammarchiv, PhA, Austrian Academy of Sciences, Vienna, Austria
City of main hub	Vienna
Country of main hub	AT
Date of certification	2024-10-28
Area of competence	As an audio and audio-visual archive with numerous collections of unique research recordings from all across the world, covering a time-span of 125 years, the Phonogrammarchiv offers various services: besides providing access to its rich data and metadata resources (online, remote & onsite), it advises scholars on audio-visual research methodology in the social sciences & humanities and on technologies of audio and audio-visual documentation, including the loan of recording equipment. In addition, it widely shares its broad expertise on topics such as restoration, digitisation, format obsolescence, cataloguing, metadata, long-term preservation and storage, as well as legal and ethical issues.
Audiences served	- scholars - source communities - linguists - ethnomusicologists - social / cultural anthropologists - historians - archivists - audio-visual conservators - museums - media - artists - teachers
Types of services	- Access to data - Access to documentation - Depositing - Technical support - Training - User assistance
Is portal for language(s)	-
Other languages covered	- languages / dialects worldwide
Modalities covered	- Audio: speech - Audio-visual
Linguistic topics	- field linguistics - dialect studies - corpus linguistics - language documentation - oral history
Language processing topics	-
Data Types	- audio data - audio-visual data
Resources Families	- Spoken corpora
Generic topics	- physical restoration of audio-visual media - digitisation of audio-visual media - format migration of audio-visual media - metadata of audio-visual media - long-term preservation & storage of audio-visual data - audio-visual fieldwork & documentation - legal issues - ethical issues - data management
Other keywords	- ethnomusicology - musicology - linguistics - field linguistics - social / cultural anthropology - history - African studies - conservation - postcolonial studies - sound studies - critical archive studies
Tour de CLARIN	Introduction Interview
Last update	2024-11-18 15:44:26

PolLinguaTec	CLARIN Knowledge Centre for Polish Language Technology
Contact name	Krzysztof Hwaszcz
Contact email	krzysztof.hwaszcz@pwr.edu.pl
Hosted by	(1) CLARIN-PL Language Technology Centre, a part of Department of Computational Intelligence, Faculty of Computer Science and Management, Wroclaw University of Science and Technology, Poland
City of main hub	Wroclaw
Country of main hub	PL
Date of certification	2024-10-28
Area of competence	Provides broad expertise in methods of natural language analysis, with special emphasis on the analysis of the Polish language. Offers support for all types of applications of Language Technology for Polish, both monolingual and multilingual.
Audiences served	- linguists - computational linguists - economists - sociologists - psychologists - media researchers - researchers of communication - literature researchers
Types of services	- Helpdesk - Technical support - Training
Is portal for language(s)	- Polish
Other languages covered	- English - German - Russian - Ukrainian - Bulgarian - Lithuanian - French - Spanish - Hungarian - Hebrew
Modalities covered	- Audio: speech - Text
Linguistic topics	- semantics - morphology - syntax - phonetics - discourse analysis - stylistics - phraseology - lexicography - terminology - translation studies
Language processing topics	- Polish language processing - topic modeling - stylometry - speech recognition - named entity recognition - corpora creation and management - parallel corpora - wordnets - text mining - information extraction - word sense disambiguation
Data Types	- corpora - dictionaries - records of speech - language models - treebanks - wordnets
Resources Families	- Literary corpora - Newspaper corpora - Parliamentary corpora - Parallel corpora - Spoken corpora
Generic topics	- Data management - Legal issues - Machine learning - Metadata - OCR - Standards - Language normalisation - Data acquisition - Support in preparing grant proposals
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2024-10-11 11:26:17

PORTULAN	CLARIN Knowledge Centre for the Science and Technology of the Portuguese Language
Contact name	António Branco
Contact email	antonio.branco@di.fc.ul.pt
Hosted by	(1) Faculdade de Ciências da Universidade de Lisboa, Portugal (2) Faculdade de Letras da Universidade de Lisboa, Portugal (3) Universidade de Évora, Portugal
City of main hub	Lisbon
Country of main hub	PT
Date of certification	2022-07-29
Area of competence	The Science and Technology of the Portuguese Language is the thematic area of this CLARIN Knowledge Centre. Related to the Portuguese language, it covers all topics, from Phonetics to Discourse and Dialogue, considering all language functions, from communicative performance to cultural expression, approached by all disciplines, from Theoretical Linguistics to Language Technology, covering all language variants, from national standard varieties across the world to dialects of professional groups, taking into account all media of representation, from audio to brain imaging recordings.
Audiences served	- Researchers - Innovators - Citizen scientists - Students - Language professionals - Users in general whose activities resort to research results from the Science and Technology of Language
Types of services	- Access to data - Access to tools - Helpdesk - Technical support
Is portal for language(s)	- Portuguese
Other languages covered	- Mirandese
Modalities covered	- Audio: speech - Audio-visual - Multimodality - Sensor data: biosignals - Text - Video: sign language
Linguistic topics	- Anthropological Linguistics - Applied Linguistics - Clinical Linguistics - Cognitive Science - Computational Linguistics - Discipline of Linguistics - Discourse Analysis - Forensic Linguistics - General Linguistics - Genetic Classification - Historical Linguistics - History of Linguistics - Language Acquisition - Language Documentation - Lexicography - Linguistic Theories - Morphology - Neurolinguistics - Philosophy of Language - Phonetics - Phonology - Pragmatics - Psycholinguistics - Semantics - Sociolinguistics - Syntax - Text/Corpus Linguistics - Translation - Typology - Writing Systems
Language processing topics	- Language understanding - Language generation - Speech recognition and transcription - Speech synthesis - Multi-modal processing - Information extraction - Text mining - Conversational interfaces and chatbots - Machine translation - Summarisation - Question answering - Subtitling - Tokenisation - POS tagging - Named entity recognition - Word sense disambiguation - Syntactic analysis - Semantic analysis - Anaphora resolution - Dialogue processing - Speaker detection
Data Types	- Corpora, written, spoken and multi-modal - Word embeddings - Language models - Dictionaries - Ontologies - Term banks - Translation memories - Treebanks - Typological databases - Wordnets - Lexica - Conceptual Resources - Glossaries - Wordlists - Speech databases - Multi-modal databases
Resources Families	- Computer-mediated communication corpora (social media) - Corpora of academic texts - Historical corpora - L2 learner corpora - Literary corpora - Manually annotated corpora - Multimodal corpora - Newspaper corpora - Parallel corpora - Parliamentary corpora - Reference corpora - Spoken corpora
Generic topics	- Data management - Ethical issues - Evaluation of tools - Language use in specific domains (e.g. legal or medical language) - Legal issues - Metadata - Standards - Visualisation
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2023-11-30 13:39:45

RoNLP	CLARIN K-Centre for Romanian Natural Language Processing
Contact name	Ciprian-Octavian Truică, Elena-Simona Apostol
Contact email	ciprian.truica@upb.ro, elena.apostol@upb.ro
Hosted by	(1) National University of Science and Technology Politehnica of Bucharest, Romania
City of main hub	Bucharest
Country of main hub	RO
Date of certification	2024-08-12
Area of competence	Information service offering advice on the use of digital language resources and tools for the Romanian language and Romanian dialects, as well as other parts of the intangible cultural heritage of Romanian in textual data (e.g., news articles, literary works, etc.).
Audiences served	- Computational Linguists - Natural Language Processing practitioners - Natural Language Generation practitioners - Natural Language Understanding practitioners - Natural Language Translation practitioners
Types of services	- Access to data - Access to tools - Data processing models - FAQ - Technical support / user assistance - Hosting researchers for visits to the K-Centre
Is portal for language(s)	- Romanian
Other languages covered	- English
Modalities covered	- Text
Linguistic topics	- Morphology - Syntax - Semantics - Diachronic language studies - Romanian texts annotations
Language processing topics	- Basic language processing - Text preprocessing and cleaning - Information extraction - Text mining - Natural Language Processing - Natural Language Understanding - Natural Language Generation - Sentiment Analysis - Aspect-Based Sentiment Analysis - Topic Modeling - Diachronic, Semantic Change, and Semantic Shift
Data Types	- Language models - Large language models - Wordnets - Linked open data - Ontologies
Resources Families	- Corpora of academic texts - Historical corpora - Legal corpora - Literary corpora - Manually annotated corpora - Newspaper corpora - Parliamentary corpora - Language models - Wordlists - Normalization - Named entity recognition - Part-of-speech tagging and lemmatization - Tools for sentiment analysis
Generic topics	- Textual data management - Machine Learning - Deep Learning - Big Data Analysis - Visualization - Natural Language Processing - Sentiment Analysis
Other keywords	- Parallel and Distributed Processing
Tour de CLARIN	- -
Last update	2024-10-04 17:04:21

SAFMORIL	CLARIN Knowledge Centre for Systems and Frameworks for Morphologically Rich Languages
Contact name	Krister Lindén
Contact email	krister.linden@helsinki.fi
Hosted by	(1) University of Helsinki, Finland (2) University of Tromsø, Norway (3) University of Latvia, Riga, Latvia (4) Vytautas Magnus University, Kaunas, Lithuania
City of main hub	Helsinki
Country of main hub	FI
Date of certification	2022-10-17
Area of competence	SAFMORIL brings together researchers and developers in the area of computational morphology and its natural language processing applications. The focus of SAFMORIL is actual, working systems and frameworks based on linguistic principles providing linguistically motivated analyses and generation outputs. Such systems are relevant in particular for languages with rich morphologies. SAFMORIL offers online courses for developing morphologies, tokenizers and spell-checkers, and a repository for storing morphologies.
Audiences served	- Computational linguists - Computer scientists - Language teachers
Types of services	- Access to data - Access to tools - Helpdesk - Training
Is portal for language(s)	-
Other languages covered	- Nordic languages - Baltic languages - Finnish - Swedish - Norwegian - Latvian - Lithuanian - Sámi languages - Fenno-Ugric languages - Inuit languages - Canadian First Nation languages - Babylonian languages
Modalities covered	- Text
Linguistic topics	- Morphology - Morphosyntax
Language processing topics	- Processing of morphologically rich languages
Data Types	- Lexical resources containing inflectional, derivational and compounding information - morphosyntactic grammars - language models - Morphological Lexicons - Grammars
Resources Families	-
Generic topics	- Finite-State Applications - Statistical Methods - Neural Networks
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2022-10-20 10:24:24

SIKT-K-centre	CLARIN Knowledge Centre for Data Management at Sikt
Contact name	Cecilie Hopland Jentoft
Contact email	cecilie.jentoft@sikt.no
Hosted by	(1) Sikt - Norwegian Agency for Shared Services in Education and Research, Bergen, Norway
City of main hub	Bergen
Country of main hub	NO
Date of certification	2022-03-29
Area of competence	Provides expertise in open data, data management, data sharing, data access, data archiving, and legal and ethical issues related to data protection and GDPR. Sikt has extensive data collections focusing on social sciences, humanities and medical and health research, and offers online services to make it easier to find and share research data.
Audiences served	- Researchers - Students - Social and humanities scientists - Teachers - General public - Citizen scientists - Journalists - Politicians
Types of services	- Data archiving - Data curation - Access to data - Access to documentation - Access to tools - Data deposit - Help/support - Guidelines and online information resources - FAQ - Training
Is portal for language(s)	-
Other languages covered	-
Modalities covered	-
Linguistic topics	-
Language processing topics	-
Data Types	-
Resources Families	-
Generic topics	- Data archiving - Data curation - Data management - Data management planning - Legal and ethical issues - GDPR - Data dissemination - Metadata - Survey data
Other keywords	- Data Protection Service - Training
Tour de CLARIN	- -
Last update	2024-05-06 17:07:38

Spanish-K-centre	Spanish CLARIN Knowledge Centre
Contact name	Mikel Iruskieta
Contact email	mikel.iruskieta@ehu.eus
Hosted by	(1) HiTZ - Basque Center for Language Technology, University of the Basque Country, Donostia, Spain (2) ILG - Instituto da Lingua Galega, Universidade de Santiago de Compostela, Spain (3) UNED - LINHD: Laboratorio de Innovación de Humanidades Digitales, Universidad Nacional de Educación a Distancia, Madrid, Spain (4) SINAI - Sistemas Inteligentes de Acceso a la Información, Universidad de Jaén, Spain (5) IATEXT - Instituto Universitario de Análisis y Aplicaciones Textuales, Universidad de Las Palmas de Gran Canaria, Spain (6) BSC-CNS - Barcelona Supercomputing Center - Centro Nacional de Supercomputación, Barcelona, Spain
City of main hub	Donostia
Country of main hub	ES
Date of certification	2024-10-28
Area of competence	The Spanish CLARIN K-Centre aims to provide knowledge, services, consultancy and specialized web services to the Humanities and Social Science (HSS) research communities. Our web services and consultancy cover how to use, and do research with, basic tools that can handle and exploit textual data, one of the most important sources of information for many HSS disciplines, in at least the four (co-)official languages (Spanish, Catalan, Galician, Basque) and English.
Audiences served	- Researchers from Humanities and Social Sciences (and beyond) interested in Spanish, Basque, Catalan and Galician
Types of services	- Access to data - Access to tools - User assistance - Training
Is portal for language(s)	- Spanish - Basque - Catalan - Galician
Other languages covered	-
Modalities covered	- Text
Linguistic topics	- general linguistics (phonology, morphology, syntax, semantics, pragmatics) - computational linguistics - corpus linguistics - applied linguistics - stylistics
Language processing topics	- morphology - syntax - semantics - discourse
Data Types	- Lexical databases: general, sentiment, NERC - Syntax Tree banks - Discourse Tree banks: coreference, relational - Spoken databases - Semantic annotation: semantic roles, word sense - Error annotation - Image bank (wikimedia) - Conversational QA
Resources Families	-
Generic topics	- Grammars - Finite-State Applications - Statistical Methods - Neural Networks
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2024-10-12 12:07:46

SWELANG	CLARIN Knowledge Centre for The Languages of Sweden
Contact name	Rickard Domeij
Contact email	Rickard.Domeij@isof.se
Hosted by	(1) The Language Council of Sweden, Institute for Language and Folklore (ISOF), Stockholm, Sweden
City of main hub	Stockholm
Country of main hub	SE
Date of certification	2024-10-28
Area of competence	The SWELANG K-centre offers advice on the languages, dialects and other parts of the intangible cultural heritage of Sweden. The focus is on the use of digital language resources and tools for collecting, processing and making language data available for research and development.
Audiences served	- Students, teachers and researchers in social sciences and humanities with an interest in the languages of Sweden - Representatives of cultural heritage institutions - Citizen scientists - More generally those who are interested in the use of NLP tools and services to explore collections of written and spoken material in Swedish.
Types of services	- Access to data - Access to tools - FAQ - Helpdesk - Access to documentation - Data processing on demand - Depositing - Technical support / user assistance
Is portal for language(s)	- Swedish
Other languages covered	- Finnish - Meänkieli - Romani - Yiddish - Swedish sign language - other languages in Sweden
Modalities covered	- Audio: speech - Audio-visual - Multimodality - Text - Video: gestures - Video: sign language
Linguistic topics	- Language policy and planning - Language infrastructure - Language resources - Language technology - Dialect studies - Sociolinguistics - Plain language and language comprehensibility - Terminology - Lexicography
Language processing topics	- Basic language processing - Information extraction - Topic modelling
Data Types	- Dictionaries - Term banks - Translation memories
Resources Families	- Historical corpora - Multimodal corpora - Parallel corpora - Spoken corpora - Lexica - Dictionaries - Conceptual resources - Glossaries - Wordlists
Generic topics	- Language policy and planning - Swedish dialects - Folkloristics - Digital humanities - Metadata - OCR - Machine learning - Legal issues
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2021-10-17 13:36:08

Treebanking	CLARIN Knowledge Centre for Treebanking
Contact name	Juliane Marie-Thérèse Tiemann, Pavel Straňák
Contact email	juliane.tiemann@uib.no,stranak@ufal.mff.cuni.cz
Hosted by	(1) INESS at CLARINO Bergen Centre, University of Bergen, Norway (2) LINDAT/CLARIAH-CZ, Charles University, Prague, Czech Republic
City of main hub	Bergen
Country of main hub	NO
Date of certification	2021-11-29
Area of competence	We can assist in making treebanks accessible and searchable on our two websites. We provide documentation, guidelines and user support for the online exploration of the available treebanks. We can assist in the online construction of LFG treebanks as parsed corpora and in the online editing of Universal Dependency treebanks. We disseminate our knowledge through occasional tutorials and workshops on treebanking.
Audiences served	- Research groups and projects wishing to construct and deposit treebanks - Researchers, developers and students wishing to explore treebanks
Types of services	- Access to data - Access to tools - Helpdesk - How-to documents - User assistance - Training
Is portal for language(s)	-
Other languages covered	-
Modalities covered	- Text - Audio: transcript - Video: sign language
Linguistic topics	- Syntax - Morphology - Semantics - Pragmatics
Language processing topics	- Building treebanks - Editing treebanks - Processing treebanks - Disambiguating treebanks - Parsing corpora - Syntactic annotation - Alignment of treebanks - Integration of treebanks
Data Types	- Treebanks
Resources Families	-
Generic topics	- Formats - Standards
Other keywords	-
Tour de CLARIN	Introduction Interview
Last update	2024-11-18 15:43:24

TRTC	CLARIN K-Centre for Terminology Resources and Translation Corpora
Contact name	Vesna Lusicky
Contact email	vesna.lusicky@univie.ac.at
Hosted by	(1) Centre for Translation Studies at the University of Vienna, Austria
City of main hub	Vienna
Country of main hub	AT
Date of certification	2022-03-08
Area of competence	The K-centre provides information and training to users on the preparation and documentation of translation-related resources, in particular terminology resources and translation corpora. This includes inquiries submitted to the Helpdesk related to tools, methods, data, and guidance in seeking further expert support. The service does not focus on language resources in particular languages, but is language independent.
Audiences served	- Terminologists - Translation studies researchers - Linguists - Computational linguists - Citizen scientists
Types of services	- Helpdesk - How-to documents - Training
Is portal for language(s)	-
Other languages covered	-
Modalities covered	- Text
Linguistic topics	- Terminology - Translation studies
Language processing topics	- Machine translation
Data Types	- Term banks - Translation memories - Translation corpora
Resources Families	-
Generic topics	-
Other keywords	-
Tour de CLARIN	- -
Last update	2022-03-07 17:26:53

UkrNLP-Corpora	CLARIN K-Centre for Ukrainian NLP and Corpora
Contact name	Olha Kanishcheva
Contact email	kanichshevaolga@gmail.com
Hosted by	(1) University of Jena, Germany
City of main hub	Jena
Country of main hub	DE
Date of certification	2023-10-03
Area of competence	Information service providing guidance on digital linguistic resources and tools for the Ukrainian language, Ukrainian dialects, other languages in Ukraine and other aspects of Ukraine's rich intangible cultural heritage in both written and spoken form, alongside language policy and strategy. Our information service is dedicated to aiding individuals in navigating the realm of digital language resources and tools specifically tailored for the Ukrainian language.
Audiences served	- Computational linguists - Linguists - Language teachers - Historians - Library staff - Sociologists - Citizen scientists
Types of services	- Access to data - Access to documentation - Access to tools - Data processing on demand - FAQ - Technical support / user assistance - Training - Webhosting
Is portal for language(s)	- Ukrainian
Other languages covered	- Ukrainian-Russian mixed speech
Modalities covered	- Text - Audio: speech
Linguistic topics	- Corpus linguistics - Dialectology - Diachronic language studies - Language variation - Morphology - Syntax - Semantics - Language learning
Language processing topics	- Basic language processing - Information Extraction - Machine translation - Processing of morphologically rich languages - Language Identification
Data Types	- Language models (LLMs) - Dictionaries - Treebanks - Linked open data
Resources Families	- Corpora of academic texts - Historical corpora - Legal corpora - Literary corpora - Manually annotated corpora - Newspaper corpora - Oral history corpora - Parallel corpora - Parliamentary corpora - Reference corpora - Spoken corpora - Dictionaries - Glossaries - Language models - Lexica - Wordlists - Normalisation - Named entity recognition - Part-of-speech tagging and lemmatisation
Generic topics	-
Other keywords	-
Tour de CLARIN	- -
Last update	2024-04-26 15:34:44