Invited talk by Christian Chiarcos: Corpora and Linguistic Linked Open Data: Motivations, Applications, Limitations


Author(s):
Christian Chiarcos

Canal-U producer:
LIMSI


JEP-TALN-RECITAL 2016 - Tuesday, 5 July 2016

Invited talk

Corpora and Linguistic Linked Open Data: Motivations, Applications, Limitations

Christian Chiarcos

Abstract: Linguistic Linked Open Data (LLOD) is a technology and a movement in several disciplines working with language resources, including Natural Language Processing, general linguistics, computational lexicography and the localization industry. This talk describes the basic principles of Linguistic Linked Open Data and their application to linguistically annotated corpora. It summarizes the current status of the Linguistic Linked Open Data cloud and gives an overview of selected LLOD vocabularies and their uses. A resource constitutes Linguistic Linked Open Data if it is published in accordance with the following principles:

  1. The dataset is relevant for linguistic research or NLP algorithms.
  2. The elements in the dataset should be uniquely identified by means of a URI.
  3. The URI should resolve, so users can access more information using web browsers.
  4. Resolving an LLOD resource should return results using web standards such as Resource Description Framework (RDF).
  5. Links to other resources should be included to help users discover new resources and provide semantics.
  6. Data should be openly licensed using licenses such as the Creative Commons licenses.
Criterion (1) defines linguistic(ally relevant) data, criteria (2) to (5) define linked data, and criterion (6) defines open data; their combination thus yields Linguistic Linked Open Data.

The primary benefits of LLOD have been identified as:

    • Representation: Linked graphs are a more flexible representation format for linguistic data.
    • Interoperability: Common RDF models can easily be integrated.
    • Federation: Data from multiple sources can trivially be combined.
    • Ecosystem: Tools for RDF and linked data are widely available under open source licenses.
    • Expressivity: Existing vocabularies help express linguistic resources.
    • Semantics: Common links express what you mean.
    • Dynamicity: Web data can be continuously improved.
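
    As a rough illustration of criteria (2) to (5) and of the federation benefit listed above, the following minimal sketch models two small datasets as sets of subject-predicate-object triples and combines them with a plain set union. It is not taken from the talk: every example.org URI, the property names (form, lexicalEntry, lemma, partOfSpeech) and the helper functions are hypothetical, and the serializer only approximates the N-Triples syntax of RDF.

```python
# Hypothetical namespaces: every example.org URI below is invented for illustration.
CORPUS = "http://example.org/corpus/doc1#"
LEXICON = "http://example.org/lexicon/"
VOCAB = "http://example.org/vocab#"


def uri(namespace, local_name):
    """Wrap a URI in angle brackets, as N-Triples does."""
    return f"<{namespace}{local_name}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


# Dataset 1: a corpus token, identified by a URI and linked to a lexicon entry.
corpus_triples = {
    (uri(CORPUS, "tok1"), uri(VOCAB, "form"), literal("corpora")),
    (uri(CORPUS, "tok1"), uri(VOCAB, "lexicalEntry"), uri(LEXICON, "corpus-n")),
}

# Dataset 2: a lexical resource published independently of the corpus.
lexicon_triples = {
    (uri(LEXICON, "corpus-n"), uri(VOCAB, "lemma"), literal("corpus")),
    (uri(LEXICON, "corpus-n"), uri(VOCAB, "partOfSpeech"), uri(VOCAB, "Noun")),
}


def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in sorted(triples))


def objects(triples, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def lemma_of_token(triples, token):
    """Follow the token -> lexical entry -> lemma links across datasets."""
    lemmas = set()
    for entry in objects(triples, token, uri(VOCAB, "lexicalEntry")):
        lemmas |= objects(triples, entry, uri(VOCAB, "lemma"))
    return lemmas


# Federation: because both datasets share one data model, merging is a set union.
merged = corpus_triples | lexicon_triples

if __name__ == "__main__":
    print(to_ntriples(merged))
    print(lemma_of_token(merged, uri(CORPUS, "tok1")))
```

    In this sketch the lemma of the token can only be found in the merged graph, since the link from the corpus points into the separately published lexicon; the corpus on its own yields no lemma.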

    I specifically focus on linguistically annotated corpora and discuss the potential of Linked Data in relation to four standing problems in the field:

    1. representing highly interlinked corpora (e.g., multi-layer corpora, annotated parallel corpora),
    2. integrating corpora with lexical resources available from the web of data,
    3. facilitating annotation interoperability using terminology resources available from the web of data, and
    4. streamlining data manipulation processes in a modular and domain-independent fashion.

    These aspects will be discussed in relation to two selected resources from both general linguistics and Natural Language Processing. Finally, the talk will discuss some of the challenges that LLOD is still facing in both areas.
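
    Problems (1) and (3) above can be made more concrete with a small sketch. It assumes, purely for illustration, two corpora annotated with different part-of-speech tagsets, a separate syntactic-role layer, and a shared terminology resource that maps each tag to a common concept URI. All URIs, tagset names, tag strings and function names are hypothetical and do not come from the talk.

```python
# Hypothetical identifiers: none of these URIs or tagset names come from the talk.
DOC_A = "http://example.org/corpusA/doc1#"
DOC_B = "http://example.org/corpusB/doc7#"
TERMS = "http://example.org/terms#"

# Multi-layer annotation: each layer is stored separately and refers to
# tokens only by URI, so layers can be added without touching each other.
pos_layer = {
    DOC_A + "tok1": ("tagsetA", "NNS"),
    DOC_A + "tok2": ("tagsetA", "VBP"),
    DOC_B + "tok1": ("tagsetB", "NOUN"),
}
role_layer = {
    DOC_A + "tok1": "subject",
    DOC_A + "tok2": "root",
    DOC_B + "tok1": "object",
}

# Annotation interoperability: tags from both tagsets are linked to
# concept URIs in one shared (here invented) terminology resource.
TAG_TO_CONCEPT = {
    ("tagsetA", "NNS"): TERMS + "Noun",
    ("tagsetA", "VBP"): TERMS + "Verb",
    ("tagsetB", "NOUN"): TERMS + "Noun",
    ("tagsetB", "VERB"): TERMS + "Verb",
}


def tokens_with_concept(concept):
    """Find tokens in any corpus whose tag maps to the given concept URI."""
    return sorted(
        token for token, tag in pos_layer.items()
        if TAG_TO_CONCEPT.get(tag) == concept
    )


def describe(token):
    """Join the annotation layers for one token via its URI."""
    return {
        "concept": TAG_TO_CONCEPT.get(pos_layer.get(token)),
        "role": role_layer.get(token),
    }


if __name__ == "__main__":
    print(tokens_with_concept(TERMS + "Noun"))
    print(describe(DOC_A + "tok2"))
```

    The query for nouns never mentions a corpus-specific tag: it goes through the shared concept URI, which is the kind of indirection that terminology resources on the web of data are meant to provide.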

    Recording date: 5 July 2016
    Recording location: Inalco, Paris
    Running time: 62 min
    Dewey classification: Computer science applied to linguistics
    Category: Conferences
    Level: General audience / no specific level, doctoral level (LMD), research
    Disciplines: Linguistics, Computer science
    Collections: JEP-TALN-RECITAL 2016, Invited talks
    Language: English
    Keywords: Linked Open Data, Natural Language Processing
 
