About me

After graduating in 2015 from ENSIMAG (École Nationale Supérieure d’Informatique et de Mathématiques Appliquées, an engineering school specializing in computer science and applied mathematics), I obtained a master’s degree in NLP at Université Paris-Sorbonne. I am now a second-year PhD candidate at Université Paris-Sorbonne, in the STIH team, under the supervision of Karën Fort and Claude Montacié.

Curriculum vitae: CV (en), CV (fr)

Interests

My research investigates whether crowdsourcing (and games with a purpose) can be used to gather linguistic resources for so-called "less-resourced" languages.

I have developed two platforms for Alsatian (a French regional language): BISAME, which enables collaborative part-of-speech annotation of an Alsatian corpus, and Recettes de Grammaire, which collects recipes, dialectal and spelling variants, and POS tags. Feel free to share these links, create an account, and contact me if you need further information!

Publications

2018

  • Alice Millour, Karën Fort, Krik: First Steps into Crowdsourcing POS tags for Kréyòl Gwadloupéyen, Proc. of the LREC workshop CCURL 2018, Miyazaki, Japan, May 2018. Pdf

  • Alice Millour, Karën Fort, Toward a Lightweight Solution for Less-resourced Languages: Creating a POS Tagger for Alsatian Using Voluntary Crowdsourcing, Language Resources and Evaluation Conference (LREC), Miyazaki, Japan, May 2018. Pdf

2017

  • Alice Millour, Karën Fort, Delphine Bernhard and Lucie Steiblé. Vers une solution légère de production de données pour le TAL : création d’un tagger de l’alsacien par crowdsourcing bénévole. Proc. of Traitement Automatique des Langues Naturelles (TALN), Orléans, France, June 2017. Oral presentation. Pdf

  • Alice Millour and Karën Fort. Why do we Need Games? Analysis of the Participation on a Crowdsourcing Annotation Platform. Symposium Games4NLP, Valencia, Spain, April 2017. Pdf

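  • April 26: Invited talk at the seminar «Recherches linguistiques et corpus», organized by Franck Neveu at Université Paris-Sorbonne: «Construire un corpus annoté pour une langue peu dotée : annotation collaborative en partie du discours d’un corpus de l’alsacien» (Building an annotated corpus for a less-resourced language: collaborative part-of-speech annotation of an Alsatian corpus).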

2016

  • My master’s thesis is available here: Master's Thesis (French only).

Activities

  • Member of the organizing committee of the DiLiTAL workshop: Diversité Linguistique et TAL (Linguistic Diversity and NLP), held during TALN 2017.

  • Student volunteer at TALN 2016.


Contact

Alice Millour
STIH Université Paris-Sorbonne
Maison de la recherche
28, rue Serpente
75006 Paris
France

Email: aliceI.millourdon't@wantspam! Sopleaseparis-sorleave bonneme.alonef!r