2.5 : Sharing HTR datasets with standardized metadata: the HTR United initiative

Duration: 00:34:41 - Recorded: 24 June 2022 - Published online: 6 October 2022

by Alix Chagué and Thibault Clérice

Since some scholars adopted Ocropy in the mid-2010s, the production of HTR and OCR ground truth has grown steadily and impressively. However, few projects share their gold datasets, and those that are shared are scattered across many hosting options (GitHub, Zenodo, GitLab, institutional repositories, etc.), which makes them very hard to find. Even when a dataset is "discovered", its description often lacks details that are crucial for reuse. The HTR-United initiative is an answer to this problem: with a standardized metadata schema, a curated catalogue, and tools that help dataset owners through every step, owners can now easily publish their datasets and make them findable.

Recording location
École nationale des chartes
Language:
Yanet Hernandez (Editing), Alix Chagué (Speaker), Thibault Clérice (Speaker)
Terms of use
Standard intellectual property law
Cite this resource:
Alix Chagué, Thibault Clérice. ENC. (2022, 24 June). 2.5 : Sharing HTR datasets with standardized metadata: the HTR United initiative. [Video]. Canal-U. (Accessed 24 March 2023)
