The SSH Training Discovery Toolkit provides an inventory of training materials relevant for the Social Sciences and Humanities.

Use the search bar to discover materials or browse through the collections. The filters will help you identify your area of interest.

 

HZSK

The HZSK is a CLARIN centre that accepts corpora and other linguistic resources from research projects and other contexts in order to make them available, mainly to the academic community, for research and teaching purposes. The focus of the HZSK is on spoken, multilingual and multimodal corpora, and on (spoken) corpora in languages other than German, especially lesser-resourced or endangered languages.

Bavarian Archive for Speech Signals

Depositing service for corpora of spoken languages that contain at least one measured signal based on the physical processes of speech production (e.g. acoustic signals, videos, series of measurements, series of pictures).

FIN-CLARIN

Depositing service for language resources related to Finnish, Finland Swedish and the Finno-Ugric languages, as well as other language resources created in Finland.

CLARIN-DK-UCPH

The mission of CLARIN-DK is to provide easy and sustainable access for scholars in the humanities and social sciences to digital language data (in written, spoken, video or multimodal form) and to provide advanced tools for discovering, exploring, exploiting, annotating, and analyzing them. CLARIN-DK also shares knowledge on Danish language technology and resources and is the Danish node in the European CLARIN-ERIC. The objective of the CLARIN Centre at the University of Copenhagen is to fulfill the CLARIN-DK mission. The centre provides data management consultation and support in connection with depositing and reuse of research data.

LINDAT/CLARIAH-CZ

Depositing service for any linguistic and/or NLP data and tools: corpora, treebanks, lexica, but also trained language models, parsers, taggers, machine translation systems, web services, etc.

CLARIN Centre Vienna

ARCHE (A Resource Centre for the HumanitiEs) is a service that offers stable and persistent hosting as well as the dissemination of digital research data and resources for the Austrian humanities community. ARCHE welcomes data from all humanities fields.

Tools for named entity recognition

This is a list of tools for named entity recognition that are available as part of the CLARIN Resource Families initiative.

Named entity recognition (NER) is an information extraction task that identifies mentions of named entities in unstructured text and classifies them into predetermined categories, such as person names, organisations, locations, dates and times, and monetary values. NER tools can, for example, help with the classification of news content, content recommendations and search algorithms.
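To illustrate the task described above, here is a minimal sketch of dictionary-based (gazetteer) entity tagging in Python. It is not how any of the listed CLARIN tools work, and most practical NER systems rely on trained statistical or neural models instead. The `GAZETTEER` entries, the date pattern, the `tag_entities` helper and the sample sentence are all assumptions made for illustration.

```python
import re

# Hypothetical gazetteer mapping known names to entity categories.
GAZETTEER = {
    "Vienna": "LOCATION",
    "Copenhagen": "LOCATION",
    "CLARIN": "ORGANISATION",
}

# Assumed date format for this sketch: ISO-style YYYY-MM-DD only.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def tag_entities(text):
    """Return (mention, category) pairs in order of appearance."""
    found = []
    # Look up every gazetteer name as a whole word in the text.
    for name, label in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((match.start(), match.group(0), label))
    # Dates are recognised by pattern rather than by lookup.
    for match in DATE_PATTERN.finditer(text):
        found.append((match.start(), match.group(0), "DATE"))
    # Sort by position so the output follows the text.
    return [(mention, label) for _, mention, label in sorted(found)]


# Invented sample sentence, used only to exercise the function.
sample = "The workshop in Vienna was announced by CLARIN on 2020-01-15."
print(tag_entities(sample))
```

A lookup of this kind only finds names it already knows, which is one reason dedicated NER tools are needed for open-ended text.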

Tools for normalization

This is a list of tools for text normalization that are available as part of the CLARIN Resource Families initiative.

Text normalization is the process of transforming parts of a text into a single canonical form. It is one of the key stages of linguistic processing for texts in which spelling varies widely or deviates from the contemporary norm, such as historical documents or social media posts. After text normalization, standard tools can be used for all further stages of text processing. Another important advantage of text normalization is improved search: a query for a single, standard variant also retrieves all of its spelling variants, be they historical, dialectal, colloquial or slang.
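The two ideas above, mapping variants to a canonical form and searching on that form, can be sketched in Python as follows. This is a toy lookup-table approach under stated assumptions, not the method of any listed tool; the `VARIANTS` table and the `normalize`, `tokens` and `search` helpers are hypothetical, and real normalizers use much larger lexicons, rules or trained models.

```python
import re

# Hypothetical table of spelling variants and their canonical forms.
VARIANTS = {
    "olde": "old",
    "shoppe": "shop",
    "gr8": "great",
}


def normalize(text):
    """Replace each known spelling variant with its canonical form."""
    def replace(match):
        word = match.group(0)
        # Lookup is case-insensitive; unknown words are left unchanged.
        return VARIANTS.get(word.lower(), word)
    return re.sub(r"\w+", replace, text)


def tokens(text):
    """Lower-cased word tokens of the normalized text."""
    return re.findall(r"\w+", normalize(text).lower())


def search(query, documents):
    """Return documents containing the query word or any known variant of it."""
    target = normalize(query).lower()
    return [doc for doc in documents if target in tokens(doc)]


docs = ["Ye olde shoppe", "A new shop", "This is gr8!"]
print(normalize("Ye olde shoppe"))  # Ye old shop
print(search("shop", docs))         # ['Ye olde shoppe', 'A new shop']
```

Because both the query and the documents are normalized before comparison, a search for the standard spelling also matches the variant spellings. Note that this sketch does not preserve the capitalization of replaced words.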

Wordlists

This is a list of wordlists that are available as part of the CLARIN Resource Families initiative.

Wordlists are lexical resources that provide only alphabetical or frequency-based lexical inventories. In the vast majority of cases, the wordlists can be downloaded directly from CLARIN national repositories or queried through easy-to-use online search environments.
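As a rough illustration of the two kinds of inventory mentioned above, the following Python sketch derives an alphabetical and a frequency-based wordlist from a piece of text. The `build_wordlists` helper and the simple letters-only tokenization are assumptions for this example; published wordlists are typically built from large corpora with more careful tokenization and often lemmatization.

```python
import re
from collections import Counter


def build_wordlists(text):
    """Return (alphabetical list, frequency-ranked list) for a text."""
    # Assumed tokenization: lower-cased runs of letters only.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    alphabetical = sorted(counts)
    # Most frequent first; ties are broken alphabetically.
    by_frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return alphabetical, by_frequency


alphabetical, by_frequency = build_wordlists("The cat saw the other cat. The end.")
print(alphabetical)  # ['cat', 'end', 'other', 'saw', 'the']
print(by_frequency)  # [('the', 3), ('cat', 2), ('end', 1), ('other', 1), ('saw', 1)]
```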

Glossaries

This is a list of glossaries that are available as part of the CLARIN Resource Families initiative.

Glossaries are specialised dictionaries that contain domain-specific terminology and/or expressions. In the vast majority of cases, the glossaries can be downloaded directly from CLARIN national repositories or queried through easy-to-use online search environments.