Knowledge Centres

List of all 23 CLARIN K-centres with their organisation details


ACE

CLARIN Knowledge Centre for Atypical Communication Expertise

Contact name: Henk van den Heuvel
Contact email: H.vandenHeuvel@let.ru.nl
Hosted by: (1) CLST - Centre for Language and Speech Technology at the Radboud University, Nijmegen, Netherlands
City of main hub: Nijmegen
Country of main hub: NL
Date of certification: 2019-06-19
Area of competence: Atypical communication encompasses language and speech as encountered during (second) language acquisition and development, and in language disorders, but also more broadly in bilingual language development and in sign language. ACE specialises in this type of research and in the related infrastructural issues of data acquisition, processing and sharing, which typically involve highly sensitive data. For data storage and access the centre collaborates with MPI's TLA (The Language Archive), which is a CLARIN B Centre and also based in Nijmegen.
Audiences served- linguists
- psychologists
- neuroscientists
- computer scientists
- speech and language therapists
- education specialists
Types of services-- Information and guidelines
- consent (forms)
- hosting corpora and datasets containing atypical communication
- where to find corpora and datasets containing atypical communication
- including FAQ
- Helpdesk/consultancy for questions on these topics
- Technical assistance for designing, creating, annotating, formatting and metadating these resources
- Outreach: presentations, workshops contributions, etc
Is portal for language(s)-
Other languages covered-
Modalities covered- Text
- speech
- sign language (video)
Linguistic topics- Language acquisition (L1 and L2)
- language disorders
- Language learning
Language processing topics-
Data Types-
Resources Families- Spoken corpora
- Manually annotated corpora
- Multimodal corpora
Generic topics- Critical Data Management
- Legal and ethical issues
Other keywords- Language acquisition
- sign language
- language pathologies
Tour de CLARIN- Introduction
- Interview
Last update: 2021-01-20 10:48:25

CKLD

CLARIN Knowledge-Centre for linguistic diversity and language documentation

Contact name: Felix Rau
Contact email: f.rau@uni-koeln.de
Hosted by: (1) DCH: Data Centre for the Humanities (University of Cologne), Germany
(2) HZSK: Hamburg Centre for Language Corpora (University Hamburg), Germany
(3) INEL: Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages (Academy of Sciences and Humanities in Hamburg), Germany
(4) ELAR: Endangered Languages Archive (SOAS University of London), United Kingdom
(5) SWLI: SOAS World Languages Institute (SOAS University of London), United Kingdom
(6) IfL: Department of Linguistics (University of Cologne), Germany
(7) ZAS: Leibniz-Zentrum Allgemeine Sprachwissenschaft, Berlin, Germany
City of main hub: Cologne
Country of main hub: DE
Date of certification: 2018-03-28
Area of competence: The CLARIN Knowledge-Centre for linguistic diversity and language documentation offers researchers, including students and native speakers, expertise on data and data-related methods, technology and background information on language resources and tools. CKLD provides information and assistance relating to fieldwork and data-related methodological aspects, in particular relating to equipment, digital tools, methods, where to find data and information, and whom to contact for specialist information on particular regions or language families.
Audiences served- Linguists
- field linguists
- typologists
- language communities of endangered languages
Types of services-- Information materials
- guidelines
- tutorials
- consultancy
Is portal for language(s)-
Other languages covered-- Under-researched languages and language families (linguistic diversity).
- Athabascan
- Austronesian
- Austro-Asiatic
- Dravidian
- Finno-Ugric
- Papuan
Modalities covered- Text
- audio-visual recordings of speech
Linguistic topics- language documentation
- linguistic typology
- linguistic fieldwork
Language processing topics-
Data Types- audio-visual collections
- typological databases
Resources Families- Manually annotated corpora
- Spoken corpora
- Dictionaries
- Wordlists
Generic topics- linguistic fieldwork
- endangered languages
- language typology
Other keywords-
Tour de CLARIN-
Last update: 2021-01-30 12:15:09

CLARIN-HUMLAB

CLARIN Knowledge Centre of Lund University Humanities Lab

Contact name: Johan Frid
Contact email: johan.frid@humlab.lu.se
Hosted by: (1) Lund University Humanities Lab at Lund University, Sweden
City of main hub: Lund
Country of main hub: SE
Date of certification: 2017-02-12
Area of competence: Advice on multimodal and sensor-based methods, including EEG, eye-tracking, articulography, virtual reality, motion capture, audio-visual recording, and natural language processing
Audiences served- Language archivists
- Linguists
- Behavioural and cultural scientists
Types of services-- Language archiving
- Assistance and mentoring with the use of data and tools
- Consultancy
- Training courses
Is portal for language(s)-
Other languages covered-- Swedish
- English
- Austroasiatic languages
Modalities covered- Audio-visual data
- Gestures
- Multi-modality
- Sensory data
- Speech
- Written text
Linguistic topics- Comparative and phylogenetic linguistics
- Dialect studies
- Field linguistics
- Language diversity
- Language learning
- Language production
- Neurolinguistics
- Phonetics
- Psycholinguistics
Language processing topics- Information extraction
- Keystroke logging
- Named entity recognition
- Text mining
- Visual prosody
Data Types- Dictionaries
- Language models
- Lexical and typological/morphosyntactic databases
Resources Families- Historical corpora
- L2 learner corpora
- Manually annotated corpora
- Multimodal corpora
- Spoken corpora
- Oral history collections
Generic topics- Multimodal and sensor-based methods
- EEG
- Eye-tracking
- Articulography
- Virtual reality
- Motion capture
- audio-visual-recording
- Working with GIS
Other keywords- Interdisciplinary research
- Methodological know-how
- E-science
Tour de CLARIN- Introduction
- Interview
Last update: 2021-01-20 10:45:11

CLARIN-Learn

CLARIN Knowledge Centre for Language Learning Analysis

Contact name: Brian MacWhinney
Contact email: macw@cmu.edu
Hosted by: (1) Carnegie Mellon University, Pittsburgh, United States
City of main hub: Pittsburgh
Country of main hub: US
Date of certification: 2016-05-15
Area of competence: Our centre provides advice on tools, corpora, and methods for the study of first and second language learning, conversational interactions, and a variety of language and developmental disabilities including aphasia, stuttering, TBI, dementia, and ASD.
Audiences served- linguists
- psycholinguists
- developmental psychologists
- computer scientists
- educators
- clinicians
Types of services-- corpora
- tools
- web screencasts
- manuals
- GoogleGroup mailing lists
- email support
- workshops
Is portal for language(s)-
Other languages covered-- 35 languages, with a particularly wide coverage for child language
Modalities covered- Speech
- gestures
Linguistic topics- language development
- conversation
- language disorders
- child language
- second language
- bilingualism
- conversations
Language processing topics- speech analysis
- archiving
- fluency
- lexical access
Data Types- Corpora, both text and multimodal
- clinical data
Resources Families-
Generic topics-
Other keywords-
Tour de CLARIN- Introduction
- Interview
Last update: 2021-01-15 14:42:49

CLARIN-SMS

CLARIN Knowledge Centre for Swedish in a Multilingual Setting

Contact name: Mats Wirén
Contact email: mats.wiren@ling.su.se
Hosted by: (1) Linköping University, Sweden
(2) Stockholm University, Sweden
(3) Uppsala University, Sweden
City of main hub: Stockholm
Country of main hub: SE
Date of certification: 2019-01-16
Area of competence: Language technology and resources for Swedish, Swedish Sign Language, and multilingual settings. Expertise in the processing of parallel corpora including alignment and machine translation, pretrained language models, cross-linguistically consistent annotation within the framework of Universal Dependencies, and computation and evaluation of measures of text complexity.
Audiences served- Linguists
- Phoneticians
- Psycholinguists
- Historians
- Literary scientists
- Art historians
- Digital humanities researchers
- Economists
- General public
Types of services-- Assistance with the use of data and tools
- Mentoring
- Consultancy
- Training courses
Is portal for language(s)-
Other languages covered-- Swedish
- Swedish Sign Language
- English
- Expertise in linguistic diversity and multilingual applications
Modalities covered- Written language
- Sign language
- Spoken language
Linguistic topics- Computational linguistics
- Language diversity
- Field linguistics
- Language learning
- Neurolinguistics
- Phonetics
- Phonology
- Pragmatics
- Psycholinguistics
- Semantics
Language processing topics- Language processing pipelines
- Part-of-speech tagging
- Named entity recognition
- Syntactic parsing
- Machine translation
- Processing of discourse relations
- Pretrained models
- Text simplification
Data Types- Parallel corpora
- Treebanks
- Language models
- Typological databases
- Text
- Video
- Audio
Resources Families- L2 learner corpora
Generic topics- NLP
- Corpus annotation
- Machine learning
- Digital humanities
Other keywords-
Tour de CLARIN-
Last update: 2021-01-20 10:43:02

CLARIN-SPEECH

CLARIN Knowledge Centre for Speech Analysis

Contact name: David House
Contact email: davidh@speech.kth.se
Hosted by: (1) KTH Royal Institute of Technology, Division of Speech Music and Hearing, Stockholm, Sweden
City of main hub: Stockholm
Country of main hub: SE
Date of certification: 2015-12-12
Area of competence: Technical advice on speech analysis relating to all aspects of speech technology, including speech science, speech applications, and speech in interaction.
Audiences served- Archivists
- Computational linguists
- Computer scientists
- Historians
- Language teachers
- Librarians
- Linguists
- Phoneticians
- Speech Pathologists
- Sociologists
- Sociolinguists
Types of services-- Awareness
- Consultancy
- Mentoring
- Thesis supervision
- Training courses
- Tools
- Workshops
Is portal for language(s)-
Other languages covered-- English
- Swedish
Modalities covered- Audio-visual data
- Biosignals
- Gestures
- Multi-modality
- Sensor data
- Speech
Linguistic topics- Phonetics
- Speech pathology
Language processing topics- Speech analysis
- Speech modelling
- Speech processing
- Speech recognition
- Speech synthesis
Data Types- Acoustic and language models
- Biosignals related to spoken interaction
- Dictionaries
- Pronunciation data
- Vocabularies
Resources Families- Oral history collections
- Parliamentary corpora
Generic topics- ASR
- Data management
- Deep learning
- Evaluation
- Legal issues
- Tools
- Visualization
Other keywords-
Tour de CLARIN-
Last update: 2021-01-14 15:28:44

CLASSLA

CLARIN Knowledge Centre for South Slavic languages

Contact name: Nikola Ljubešić
Contact email: nikola.ljubesic@ijs.si
Hosted by: (1) Jožef Stefan Institute (CLARIN.SI), Ljubljana, Slovenia
(2) Institute of Information and Communication Technologies (CLADA-BG), Sofia, Bulgaria
City of main hub: Ljubljana
Country of main hub: SI
Date of certification: 2019-03-19
Area of competence: Offers expertise on language resources and technologies for South Slavic languages
Audiences served- Computational linguists
- Computer scientists
- Citizen scientists
- Historians
- Language teachers
- Linguists
- Sociolinguists
- Sociologists
Types of services-- Helpdesk
- Consultancy
- Assistance with the use of data and tools
- Frequently Asked Questions documents
- Training courses
- Internships
Is portal for language(s)- Slovenian
- Slovene
- Croatian
- Bosnian
- Serbian
- Montenegrin
- Macedonian
- Bulgarian
Other languages covered-
Modalities covered- Text
- Speech
Linguistic topics- Applied linguistics
- Dialect studies
- Sociolinguistics
Language processing topics- Basic language processing
- Information extraction
- Language understanding
- Named entity recognition
- Processing of morphologically rich languages
- Speech recognition
Data Types- Manually annotated datasets
- Corpora
- Language models
- Treebanks
Resources Families- Computer-mediated communication corpora (social media)
- Historical corpora
- Literary corpora
- Newspaper corpora
- Web corpora
- Parliamentary corpora
Generic topics- Evaluation of tools
- Machine learning
- Deep learning
Other keywords- Processing of closely related languages
- Language variation
- Spatial language variation
Tour de CLARIN-
Last update: 2021-01-14 15:37:39

CORLI-K-centre

CORLI French CLARIN Knowledge Centre for Corpora, Languages and Interaction

Contact name: Céline Poudat, Christophe Parisse
Contact email: celine.poudat@univ-cotedazur.fr, cparisse@parisnanterre.fr
Hosted by: (1) Huma-Num / CNRS UMS 3598, Paris, France
City of main hub: Paris
Country of main hub: FR
Date of certification: 2020-07-14
Area of competence: Corpus linguistics with a special focus on the French language and the languages of France
Audiences served- Linguists
- Sociolinguists
- Language teachers
- Computational linguists
- Literature
- Language diachrony
Types of services-- Project groups depending on the issues and needs of the linguistic community
- Good practices for corpus creation and maintenance
Is portal for language(s)- French
Other languages covered-- LSF (French Sign Language)
- Old French
Modalities covered- Text corpora
- Spoken language corpora
- Video/multimodal corpora
Linguistic topics- Format and tools for corpus studies
- Multilingual, multimodal corpora
Language processing topics- Corpus formats and tools for creating and analysing corpora
Data Types- Text
- TEI
- Spoken language
- Video
Resources Families-
Generic topics- Legal issues
Other keywords- Continuous education
Tour de CLARIN-
Last update: 2021-01-21 11:53:01

CorpLingCz

Czech CLARIN Knowledge Centre for Corpus Linguistics

Contact name: Michal Kren
Contact email: michal.kren@ff.cuni.cz
Hosted by: (1) Czech National Corpus, Prague, Czech Republic
City of main hub: Prague
Country of main hub: CZ
Date of certification: 2018-12-04
Area of competence: Provides information, consulting and technical assistance on all topics related to corpus linguistics, including data formats, annotation, corpus querying, corpus linguistics methodology and statistical methods. The centre also specialises in empirical research on the Czech language.
Audiences served- Computational linguists
- Computer scientists
- Language teachers
- Linguists
- Psycholinguists
- Sociolinguists
Types of services-- Assistance with the use of data and tools
- Consultancy
- Data processing on demand
- Helpdesk
- Training courses
- Workshops
Is portal for language(s)- Czech
Other languages covered-
Modalities covered- Speech
- Written text
Linguistic topics- Applied linguistics
- Corpus linguistics
- Diachronic language studies
- Dialect studies
- Discourse
- Language learning
Language processing topics- Basic language processing
- Processing of morphologically rich languages
Data Types-
Resources Families- Computer-mediated communication corpora (social media)
- Historical corpora
- L2 learner corpora
- Literary corpora
- Manually annotated corpora
- Newspaper corpora
- Parallel corpora
- Reference corpora
- Spoken corpora
Generic topics-
Other keywords-
Tour de CLARIN- Introduction
- Interview
Last update: 2021-01-14 15:36:04

DANSK

CLARIN K-Centre DANSK - DANish helpdeSK

Contact name: Costanza Navarretta
Contact email: costanza@hum.ku.dk
Hosted by: (1) Department of Nordic Studies and Linguistics (DNSL) (University of Copenhagen), Denmark
(2) Danish Language Council (DSN), Copenhagen, Denmark
City of main hub: Copenhagen
Country of main hub: DK
Date of certification: 2016-03-10
Area of competence- Danish language and Danish sign language
- Danish language resources
- language technology tools for Danish
- NLP methods
Audiences served- Linguists
- Computational linguists
- Sociolinguists
- Language and literature researchers
- Citizen scientists
Types of services-- Best practice documents
- Consultancy
- Helpdesk
Is portal for language(s)- Danish
Other languages covered-- English
- Danish sign language
Modalities covered- Text
- Sign language
- Gestures
Linguistic topics- Morphology
- Syntax
- Semantics
- Pragmatics
- Lexicography
- Multimodality
Language processing topics- Tokenization
- PoS tagging
- Lemmatization
- Named entity tagging
- Parsing
- TEI annotation
- Corpus tools
- NLP tools
Data Types- Text corpora
- Wordnets
- Lexica
- Multimodal annotations
Resources Families- Historical corpora
- Literary corpora
- Parliamentary corpora
- Lexica
Generic topics- NLP
- Basic processing/annotation of corpora
- Data management
- Standards
Other keywords-
Tour de CLARIN-
Last update: 2021-01-15 14:30:17

DiaRes

CLARIN K-centre for Diachronic Language Resources

Contact name: Eva Pettersson
Contact email: eva.pettersson@lingfil.uu.se
Hosted by: (1) Centre for Digital Humanities, University of Gothenburg, Sweden
(2) Department of Linguistics and Philology, Uppsala University, Sweden
(3) Riksarkivet (Swedish National Archives), Stockholm, Sweden
(4) Språkbanken (The Swedish Language Bank), University of Gothenburg, Sweden
City of main hub: Uppsala
Country of main hub: SE
Date of certification: 2019-01-16
Area of competence: Diachronic text collections, historical texts, and tools and resources for processing and analysing them
Audiences served- researchers in the humanities, with an interest in different aspects of historical texts
- historians
- social scientists
- researchers in literature and history of ideas
- historical linguists
- computational linguists
- researchers working in the field of digital humanities
Types of services-- technical assistance
- guiding persons to corpora and tools to be used for analysing diachronic texts
- we host websites for access to diachronic corpora and related tools
- storing and maintaining diachronic corpora
- making such corpora available to the public
- offering tools for processing diachronic (in particular historical) text in different aspects
- offering advice on compilation, annotation and formatting of diachronic corpora
Is portal for language(s)- Swedish
Other languages covered-
Modalities covered- Text
Linguistic topics- Diachronic language studies
Language processing topics- Diachronic language processing
Data Types-
Resources Families- Normalization
Generic topics- Diachronic corpora
- NLP for historical text
- spelling normalization
- digital humanities
Other keywords-
Tour de CLARIN-
Last update: 2021-01-18 12:48:40

IMPACT-CKC

IMPACT centre of competence - CLARIN K-centre in digitisation

Contact name: Isabel Martínez Sempere
Contact email: isabel.martinez@digitisation.eu
Hosted by: (1) Universidad de Alicante (UA), Alicante, Spain
(2) Instituut voor de Nederlandse Taal (INT), Leiden, The Netherlands
(3) Koninklijke Bibliotheek (KB), The Hague, The Netherlands
(4) Poznan Supercomputing and Networking Center (PSNC), Poznan, Poland
(5) The British Library (BL), London, United Kingdom
(6) Bibliothèque nationale de France (BnF), Paris, France
(7) Universidad de Guadalajara Mexico (UgG), Guadalajara, Mexico
City of main hub: Alicante
Country of main hub: ES
Date of certification: 2018-12-04
Area of competence: IMPACT-CKC (IMPACT centre of competence - CLARIN K-centre in digitisation) is a knowledge centre that offers expertise and resources to institutions and researchers looking for advice on digitisation and related fields. The IMPACT-CKC resources include a demonstrator platform for testing tools online, a collection of high-quality images with associated ground truth, historical lexica for 10 languages, as well as training materials and registries of tools, initiatives, datasets and competitions.
Audiences served- researchers
- librarians
- archivists
- digital humanists
- computer scientists in topics related to digitisation
Types of services-- tools
- data
- mentoring
- dissemination
- awareness
- tutorials
- web lectures
Is portal for language(s)-
Other languages covered-- Spanish
- English
- Polish
- French
- Dutch
- German
- Slovene
- Czech
- Latin
- Bulgarian
Modalities covered- Text
- audio-visual data
Linguistic topics- corpus linguistics
- diachronic language resources
- language learning
Language processing topics- basic language processing
- information extraction
Data Types- lexical data
- language models
- linked open data
- ontologies
Resources Families- Historical corpora
- Lexica
- Literary corpora
- Newspaper corpora
Generic topics- OCR
- digitisation
- visualisation
- evaluation of tools
Other keywords-
Tour de CLARIN- Introduction
- Interview
Last update: 2021-01-14 15:35:00

K-BLP

CLARIN Knowledge Centre for Belarusian Text and Speech Processing

Contact name: Yuras Hetsevich
Contact email: yuras.hetsevich@gmail.com
Hosted by: (1) The United Institute of Informatics Problems of the National Academy of Sciences of Belarus
City of main hub: Minsk
Country of main hub: BY
Date of certification: 2020-02-10
Area of competence- Knowledge about text and speech processing of Belarusian and other languages
- Knowledge about Belarusian language learning
- Tools and resources for text and speech processing for Belarusian and other languages
Audiences served- Computational linguists
- Computer scientists
- Linguists
- Historians
- Language teachers
- Library staff
- Sociolinguists
- Sociologists
- Programmers
- Archivists
- Citizen scientists
Types of services-- knowledge about tokenization, morphological analysis, voiced electronic grammatical dictionary, part-of-speech tagging, frequency count, spell checking, text classification, and other approaches used in speech and text processing.
- offers special courses in language processing, data analysis, and collecting research data for the fast entrance of humanities and others into the digital world of Belarusian data processing.
- wide-ranging user support, guidelines, and instructions for each service and material.
Is portal for language(s)- Belarusian
Other languages covered-- Russian
- English
- Morphologically rich languages
Modalities covered- Text and speech processing
- Audio-visual data generation
- Multi-modality
- Speech
- Written text
Linguistic topics- Applied linguistics
- Dialect studies
- Field linguistics
- Language learning
- Lexical studies
- Morphology
- Phonetics
- Phonology
- Syntax
- Pragmatics
- Semantics
- Terminology
- Translation studies
Language processing topics- Speech synthesis
- Speech recognition (word, intonation, emotion, pathology)
- Text generation
- Characters and Words counting
- Information extraction
- Language generation
- Language understanding
- Machine translation
- Processing of morphologically rich languages
- Summarization
- Text mining
- Word sense disambiguation
- Basic language processing
- Dictionary Processing
- Lemmatization
- Part-of-Speech Tagging
- Spell-checking
- Transliteration (Convert from Cyrillic to Latin letters)
- Homograph Identification
Data Types- Dictionaries
- Language models
- Term banks (UDC Codes)
- Typological databases
- Dialectological Maps
Resources Families- Dictionaries
- Part-of-speech tagging and lemmatization
Generic topics- Language use in specific domains (a legal document translation, UDC classification for libraries)
- Working with maps
Other keywords- Online platform
- Natural language processing
Tour de CLARIN-
Last update: 2021-01-22 11:03:19

NLP:EL

CLARIN K-Centre for Natural Language Processing in Greece

Contact name: Maria Gavrilidou
Contact email: maria@athenarc.gr
Hosted by: (1) ATHENA RC / ILSP, Athens, Greece
City of main hub: Athens
Country of main hub: GR
Date of certification: 2020-04-21
Area of competence: CLARIN K-Centre NLP:EL is an information service offering expertise and advice on (a) Language Technology for Greek, (b) the digital readiness of Greek (i.e. how ready Greek is for the digital age as regards digital resources and language processing tools), and (c) Greek Sign Language. It provides guidance on the development and annotation of language resources, on the findability, accessibility and use of existing language processing tools and web services, and on the development of new technologies for language processing, as well as useful information and support regarding the fields of dynamic sign language synthesis and special communication and interaction interfaces.
NLP:EL provides its services through two channels. First, it operates a helpdesk where users can ask questions concerning the above issues. Second, it provides informative material and documentation relevant to these issues; this material includes (but is not limited to) (i) scientific publications and presentations on Natural Language Processing (NLP) research and applications for Greek and Greek sign language, (ii) guides and tutorials on language processing tools and services for Greek and Greek sign language, (iii) a direct connection to the specialized Sign Language Technologies website and (iv) a direct connection to the CLARIN:EL infrastructure, where users can find a catalogue of digital language resources and language processing tools, more detailed information and further training and dissemination material.
Audiences served- Computational linguists
- Language Technology developers
- Language teachers
- Linguists
- Philologists
- Sign language community
- Political scientists
- Social Scientists
- Digital humanists
Types of services-- Helpdesk
- Scientific publications and presentations
- Guides and tutorials
- Links to catalogues of language resources and tools
- Training courses
Is portal for language(s)- Greek
- Greek sign language
Other languages covered-
Modalities covered- Written text
- Sign language
- Gestures
Linguistic topics- Lexicography
- Morphology
- Syntax
- Terminology
- Corpus linguistics
Language processing topics- Language processing
- Processing of morphologically rich languages
- Information extraction
- Named entity recognition
- Text mining
- Machine translation
- Sign language technologies
- Speech recognition
- Speech synthesis
Data Types- Translation memories
- Term banks
- Treebanks
- Lexical conceptual resources
- Dictionaries
- Ontologies
- Thesauri
- Language models
Resources Families- Historical corpora
- Computer-mediated communication corpora (social media)
- Newspaper corpora
- Parallel corpora
- Parliamentary corpora
Generic topics-
Other keywords-
Tour de CLARIN-
Last update: 2021-02-08 15:36:32

NSD-K-centre

CLARIN Knowledge Centre for Data Management at NSD

Contact name: Gry Henriksen
Contact email: Gry.Henriksen@nsd.no
Hosted by: (1) Norwegian Centre for Research Data (NSD), Bergen, Norway
City of main hub: Bergen
Country of main hub: NO
Date of certification: 2018-12-04
Area of competence: Provides expertise in data management, including legal and ethical issues related to privacy and IPR.
Audiences served-
Types of services-- Lectures in Data management, data curation and legal and ethical issues
- Courses in Data management, data curation and legal and ethical issues
Is portal for language(s)-
Other languages covered-
Modalities covered-
Linguistic topics-
Language processing topics-
Data Types-
Resources Families-
Generic topics- Data Management
- Legal and ethical issues
- Data archiving
- Data dissemination
- Data management plan
- Data curation
- Archiving
- Metadata
Other keywords-
Tour de CLARIN-
Last update: 2021-01-23 09:34:51

PhA-OeAW

Phonogrammarchiv / Austrian Academy of Sciences - CLARIN K-Centre

Contact name: Benjamin Fischer
Contact email: benjamin.fischer@oeaw.ac.at
Hosted by: (1) Phonogrammarchiv, PhA, Austrian Academy of Sciences, Vienna, Austria
City of main hub: Vienna
Country of main hub: AT
Date of certification: 2015-06-22
Area of competence: As an audio and audio-visual archive with numerous collections of unique research recordings from all across the world, covering a time-span of 120 years, the Phonogrammarchiv offers various services. Besides providing access to its rich data and metadata resources (remote & onsite), it advises scholars on audio-visual research methodology in the social sciences & humanities and on technologies of audio and audio-visual documentation, supporting them with the necessary recording equipment. In addition, it shares its broad expertise on topics such as restoration, digitisation, format obsolescence, cataloguing, metadata, long-term preservation and storage.
Audiences served- scholars
- source communities
- linguists
- ethnomusicologists
- social / cultural anthropologists
- historians
- archivists
- audio-visual conservators
- museums
- media
- artists
- teachers
Types of services-- access to audio and audio-visual data and metadata
- individual advice & assistance
- group trainings
- workshops
- higher education teaching
- internships
- institutional cooperations
Is portal for language(s)-
Other languages covered-- languages / dialects worldwide
Modalities covered- audio data
- audio-visual data
Linguistic topics- field linguistics
- dialect studies
- corpus linguistics
- language documentation
- oral history
Language processing topics-
Data Types- audio data
- audio-visual data
Resources Families- Spoken corpora
Generic topics- physical restoration of audio-visual media
- digitisation of audio-visual media
- format migration of audio-visual media
- metadata of audio-visual media
- long-term preservation & storage of audio-visual data
- audio-visual fieldwork & documentation
- legal issues
- ethical issues
- data management
Other keywords- ethnomusicology
- musicology
- linguistics
- field linguistics
- social / cultural anthropology
- history
- African studies
- conservation
- postcolonial studies
- sound studies
Tour de CLARIN- Introduction
- Interview
Last update: 2021-01-22 09:24:37

PolLinguaTec

CLARIN Knowledge Centre for Polish Language Technology

Contact name: Jan Wieczorek
Contact email: jan.wieczorek@pwr.edu.pl
Hosted by: (1) CLARIN-PL Language Technology Centre, a part of Department of Computational Intelligence, Faculty of Computer Science and Management, Wroclaw University of Science and Technology, Poland
City of main hub: Wroclaw
Country of main hub: PL
Date of certification: 2017-11-20
Area of competence: Provides wide knowledge of methods of natural language analysis, with a special emphasis on the analysis of the Polish language. Offers support for all types of applications of Language Technology for Polish, both monolingual and multilingual.
Audiences served- linguists
- computational linguists
- economists
- sociologists
- psychologists
- media researchers
- researchers of communication
- literature researchers
Types of services-- Consultancy
- Helpdesk
- Training courses
- Tutorials and training materials
- Assistance with the use of data and tools
Is portal for language(s)- Polish
Other languages covered-- English
- German
- Russian
- Ukrainian
- Bulgarian
- Lithuanian
- French
- Spanish
- Hungarian
- Hebrew
Modalities covered- written text
- speech
Linguistic topics- semantics
- morphology
- syntax
- phonetics
- discourse analysis
- stylistics
- phraseology
- lexicography
- terminology
- translation studies
Language processing topics- Polish language processing
- topic modeling
- stylometry
- speech recognition
- named entity recognition
- corpora creation and management
- parallel corpora
- wordnets
- text mining
- information extraction
- word sense disambiguation
Data Types- corpora
- dictionaries
- records of speech
- language models
- treebanks
- wordnets
Resources Families- Literary corpora
- Newspaper corpora
- Parliamentary corpora
- Parallel corpora
- Spoken corpora
Generic topics- Data management
- Legal issues
- Machine learning
- Metadata
- OCR
- Standards
- Language normalization
- Data acquisition
- Support in preparing grant proposals
Other keywords-
Tour de CLARINIntroduction Interview
Last update2021-01-14 15:32:58

PORTULAN

CLARIN Knowledge Centre for the Science and Technology of the Portuguese Language

Contact nameAntónio Branco
Contact emailantonio.branco@di.fc.ul.pt
Hosted by(1) Faculdade de Ciências da Universidade de Lisboa, Portugal (coordinator)
(2) Faculdade de Letras da Universidade de Lisboa, Portugal
(3) Universidade de Évora, Portugal
City of main hubLisbon
Country of main hubPT
Date of certification2019-06-19
Area of competenceThe Science and Technology of the Portuguese Language is the thematic area of this CLARIN Knowledge Centre. Related to the Portuguese language, it covers all topics, from Phonetics to Discourse and Dialogue, considering all language functions, from communicative performance to cultural expression, approached by all disciplines, from Theoretical Linguistics to Language Technology, covering all language variants, from national standard varieties across the world to dialects of professional groups, taking into account all media of representation, from audio to brain imaging recordings.
Audiences served- Researchers
- Innovators
- Citizen scientists
- Students
- Language professionals
- Users in general whose activities resort to research results from the Science and Technology of Language
Types of services- Consultancy on language science and technology matters for companies, organizations, public services, etc.
- Technical support to use and explore the remaining services for research and education
- Online and offline language processing tools - e.g. conjugators, terminology extractors, concordancers, part-of-speech taggers, deep linguistic processing grammars, etc.
- Online and offline research support applications - e.g. lemma frequency extractors, treebank annotators, etc.
- Online access to data sets - e.g. linguistically interpreted corpora, terminology data bases, EEG records of neurolinguistic experiments, collections of literary texts, etc.
- Online access to data samples - e.g. video recording of sign language, words for concepts in the Organization sub-ontology, etc.
Is portal for language(s)- Portuguese
Other languages covered-
Modalities covered- Written text
- Speech
- Sign language
- Neuro-cognitive records
- Audio-visual data
- Multi-modal data
Linguistic topics- Anthropological Linguistics
- Applied Linguistics
- Clinical Linguistics
- Cognitive Science
- Computational Linguistics
- Discipline of Linguistics
- Discourse Analysis
- Forensic Linguistics
- General Linguistics
- Genetic Classification
- Historical Linguistics
- History of Linguistics
- Language Acquisition
- Language Documentation
- Lexicography
- Linguistic Theories
- Morphology
- Neurolinguistics
- Philosophy of Language
- Phonetics
- Phonology
- Pragmatics
- Psycholinguistics
- Semantics
- Sociolinguistics
- Syntax
- Text/Corpus Linguistics
- Translation
- Typology
- Writing Systems
Language processing topics- Language understanding
- Language generation
- Speech recognition and transcription
- Speech synthesis
- Multi-modal processing
- Information extraction
- Text mining
- Conversational interfaces and chatbots
- Machine translation
- Summarization
- Question answering
- Subtitling
- Tokenization
- POS tagging
- Named entity recognition
- Word sense disambiguation
- Syntactic analysis
- Semantic analysis
- Anaphora resolution
- Dialogue processing
- Speaker detection
Data Types- Corpora, written, spoken and multi-modal
- Word embeddings
- Language models
- Dictionaries
- Ontologies
- Term banks
- Translation memories
- Treebanks
- Typological databases
- Wordnets
- Lexica
- Conceptual Resources
- Glossaries
- Wordlists
- Speech databases
- Multi-modal databases
Resources Families- Computer-mediated communication corpora (social media)
- Corpora of academic texts
- Historical corpora
- L2 learner corpora
- Literary corpora
- Manually annotated corpora
- Multimodal corpora
- Newspaper corpora
- Parallel corpora
- Parliamentary corpora
- Reference corpora
- Spoken corpora
Generic topics- Data management
- Ethical issues
- Evaluation of tools
- Language use in specific domains (e.g. legal or medical language)
- Legal issues
- Metadata
- Standards
- Visualisation
Other keywords-
Tour de CLARIN- -
Last update2021-01-14 15:38:50

SAFMORIL

Systems and Frameworks for Morphologically Rich Languages

Contact nameKrister Lindén
Contact emailkrister.linden@helsinki.fi
Hosted by(1) University of Helsinki, Finland
(2) University of Tromsø, Norway
(3) University of Latvia, Riga, Latvia
(4) Vytautas Magnus University, Kaunas, Lithuania
City of main hubHelsinki
Country of main hubFI
Date of certification2019-10-17
Area of competenceSAFMORIL brings together researchers and developers in the area of computational morphology and its NLP applications. The focus of SAFMORIL is actual, working systems and frameworks based on linguistic principles providing linguistically motivated analyses and generation outputs. Such systems are relevant in particular for languages with rich morphologies. SAFMORIL offers online courses for developing morphologies, tokenizers and spell-checkers, and a repository for storing morphologies.
Audiences served- Computational linguists
- Computer scientists
- Language teachers
Types of services- helpdesk
- data
- tools
- web demos
- web lectures
- tutorials
Is portal for language(s)-
Other languages covered- Nordic languages
- Baltic languages
- Finnish
- Swedish
- Norwegian
- Latvian
- Lithuanian
- Sámi languages
- Fenno-Ugric languages
- Inuit languages
- Canadian First Nation languages
- Babylonian languages
Modalities covered- Text
Linguistic topics- Morphology
- Morphosyntax
Language processing topics- Processing of morphologically rich languages
Data Types- Lexical resources containing inflectional, derivational and compounding information
- morphosyntactic grammars
- language models
- Morphological Lexicons
- Grammars
Resources Families-
Generic topics- Finite-State Applications
- Statistical Methods
- Neural Networks
Other keywords-
Tour de CLARIN- -
Last update2021-01-15 15:05:08

Spanish-K-centre

Spanish CLARIN K-centre

Contact nameNuria Bel and Mikel Iruskieta
Contact emailnuria.bel@upf.edu, mikel.iruskieta@ehu.eus
Hosted by(1) CLARIN Competence Centre IULA-UPF and HDLab@UPF (Universitat Pompeu Fabra, Barcelona), Spain
(2) HiTZ - Basque Center for Language Technology (University of the Basque Country, Donostia), Spain
(3) UVIGO - TALG Group (Universidade de Vigo, Vigo), Spain
(4) UNED - LINHD: Laboratorio de Innovación de Humanidades Digitales (Universidad Nacional de Educación a Distancia, Madrid), Spain
City of main hubBarcelona
Country of main hubES
Date of certification2017-05-27
Area of competenceThe Spanish CLARIN K-Centre aims to provide knowledge, services, consultancy and specialized web services to the Humanities and Social Science research communities. Our web services and consultancy cover how to use, and do research with, basic tools that can handle and exploit textual data, one of the most important sources of information for many Humanities and Social Science disciplines, in at least the four (co-)official languages (Spanish, Catalan, Galician, Basque) and English.
Audiences served- Researchers from Humanities and Social Sciences (and beyond) interested in Spanish, Basque, Catalan and Galician
Types of services- Tools
- Data
- Mentoring
- Dissemination
- Tutorials
Is portal for language(s)- Spanish
- Basque
- Catalan
- Galician
Other languages covered-
Modalities covered- Written text
Linguistic topics- general linguistics (phonology, morphology, syntax, semantics, pragmatics)
- computational linguistics
- corpus linguistics
- applied linguistics
- stylistics
Language processing topics- morphology
- syntax
- semantics
- discourse
Data Types- Lexical databases: general, sentiment, NERC
- Syntax treebanks
- Discourse treebanks: coreference, relational
- Spoken databases
- Semantic annotation: semantic roles, word sense
- Error annotation
- Image bank (wikimedia)
- Conversational QA
Resources Families-
Generic topics- Grammars
- Finite-State Applications
- Statistical Methods
- Neural Networks
Other keywords-
Tour de CLARINIntroduction Interview
Last update2021-01-14 15:26:23

SWELANG

CLARIN Knowledge Centre for The Languages of Sweden

Contact nameRickard Domeij
Contact emailRickard.Domeij@isof.se
Hosted by(1) The Language Council of Sweden, Institute of Language and Folklore (ISOF), Stockholm, Sweden
City of main hubStockholm
Country of main hubSE
Date of certification2016-05-27
Area of competenceInformation service offering advice on the use of digital language resources and tools for the Swedish language, minority languages in Sweden, Swedish Sign Language, Swedish dialects, and other parts of the intangible cultural heritage of Sweden in text and speech, as well as on language policy and planning.
Audiences served- Students, teachers and researchers in social sciences and humanities
- The interested public
Types of services- On-line lexica
- Q&A database
- Map interfaces for folk tales and dialects
- Open data download
- Language consulting (by telephone, email and social media)
Is portal for language(s)- Swedish
Other languages covered- Finnish
- Meänkieli
- Romani
- Yiddish
- Swedish sign language
- other languages in Sweden
Modalities covered- Written text
- Speech
- Gestures
- Audiovisual data
- Multimodality
Linguistic topics- Language policy and planning
- Language infrastructure
- Language resources
- Language technology
- Dialect studies
- Sociolinguistics
- Plain language and language comprehensibility
- Terminology
- Lexicography
Language processing topics- Basic language processing
- Information extraction
- Topic modelling
Data Types- Dictionaries
- Term banks
- Translation memories
Resources Families- Historical corpora
- Multimodal corpora
- Parallel corpora
- Spoken corpora
- Lexica
- Dictionaries
- Conceptual resources
- Glossaries
- Wordlists
Generic topics- Language policy and planning
- Swedish dialects
- Folkloristics
- Digital humanities
- Metadata
- OCR
Other keywords-
Tour de CLARINIntroduction Interview
Last update2021-01-19 12:26:34

Treebanking

CLARIN Knowledge Centre for Treebanking

Contact nameKoenraad De Smedt, Pavel Straňák
Contact emaildesmedt@uib.no, stranak@ufal.mff.cuni.cz
Hosted by(1) INESS at CLARINO Bergen Centre, University of Bergen, Bergen, Norway
(2) LINDAT/CLARIAH-CZ, Charles University, Prague, Czech Republic
City of main hubBergen
Country of main hubNO
Date of certification2015-06-25
Area of competenceWe can assist in making treebanks accessible and searchable on our two websites. We provide documentation, guidelines and user support for the online exploration of the available treebanks. We can assist in the online construction of LFG treebanks as parsed corpora and in the online editing of Universal Dependency treebanks. We disseminate our knowledge through occasional tutorials and workshops on treebanking.
Audiences served- Research groups and projects wishing to construct and deposit treebanks
- Researchers, developers and students wishing to explore treebanks
Types of services- Project assistance
- Documentation
- Help line for user support
- Tutorials
- Workshops
- Tools
- Data
Is portal for language(s)-
Other languages covered-
Modalities covered- text
- transcribed speech or sign language
Linguistic topics- Syntax
- Morphology
- Semantics
- Pragmatics
Language processing topics- Building treebanks
- Editing treebanks
- Processing treebanks
- Disambiguating treebanks
- Parsing corpora
- Syntactic annotation
- Alignment of treebanks
- Integration of treebanks
Data Types- Treebanks
Resources Families-
Generic topics- Formats
- Standards
Other keywords-
Tour de CLARINIntroduction Interview
Last update2021-02-04 11:26:12

TRTC

Terminology Resources and Translation Corpora

Contact nameVesna Lusicky
Contact emailvesna.lusicky@univie.ac.at
Hosted by(1) Centre for Translation Studies at the University of Vienna, Austria
City of main hubVienna
Country of main hubAT
Date of certification2019-01-16
Area of competenceThe K-centre provides information and training to users on the preparation and documentation of translation-related resources, in particular terminology resources and translation corpora. This includes inquiries submitted to the Helpdesk related to tools, methods, data, and guidance in seeking further expert support. The service does not focus on language resources in particular languages, but is language independent.
Audiences served- Terminologists
- Translation studies researchers
- Linguists
- Computational linguists
- Citizen scientists
Types of services- Helpdesk
- Best practice documents
- Consultancy
- Training courses
Is portal for language(s)-
Other languages covered-
Modalities covered- Text
Linguistic topics- Terminology
- Translation studies
Language processing topics- Machine translation
Data Types- Term banks
- Translation memories
- Translation corpora
Resources Families-
Generic topics-
Other keywords-
Tour de CLARIN- -
Last update2021-01-14 15:36:53