Pictograph Communication Technologies

Doctoral Research Newsletter #1
Year 1: 2015

Leen Sevens, Centrum voor Computerlinguïstiek, KU Leuven
leen@ccl.kuleuven.be
My homepage

Supervisor: Prof. Dr. Frank Van Eynde
Co-supervisor: Dr. Vincent Vandeghinste

Dear reader,

Time flies. The 1st of January marked the beginning of my second year as a PhD student. As promised, I will update you on the progress that I have made so far. If you have any questions or suggestions, please do not hesitate to contact me. I'm always open to new ideas.

Just a quick reminder. My name is Leen Sevens, but people usually refer to me as "that girl who does that thing with those tiny, little images". At the Centre for Computational Linguistics (KU Leuven), we are building high-quality, fully automatic tools that translate Dutch text into pictographs and pictographs into Dutch text. Our goal is to make the Internet more accessible to users with reading and writing disabilities. For this purpose, various subtasks will have to be carried out. I will present these tasks and describe the progress that has already been made on each of them. Finally, I will discuss my dissemination activities.

We are developing and improving the pictograph translation systems for the WAI-NOT communication platform. WAI-NOT is a Flemish non-profit organization that gives people with severe communication disabilities the opportunity to familiarize themselves with computers, the Internet, and social media. Their safe website environment offers an email client that makes use of the pictograph translation solutions.

Additionally, we are developing English and Spanish versions of the tools within the Able-to-Include project. Able-to-Include seeks to improve the lives of people with Intellectual or Developmental Disabilities (ID). In order to be included in today’s society, it is becoming increasingly important to be able to use the currently available technological tools. The number of apps is growing exponentially, but very few are really accessible to people with ID. Able-to-Include is creating a context-aware Accessibility Layer based on three key technologies that can improve the daily tasks of people with ID and help them interact with the Information Society. The integration of this Accessibility Layer with existing ICT tools will be tested in three different pilots in Spain, Belgium and the UK.

The Text-to-Pictograph translation technology translates Dutch text into Beta or Sclera pictographs in order to facilitate the understanding of written text. Before I started my PhD, the Centre for Computational Linguistics had already developed a baseline version of this system. You can check out the demo here. Please bear in mind that the technologies that are discussed below have not yet been implemented in our demo.

While it is very important to encourage people with ID to write their own messages if they are able to do so, their writings may pose several problems. First, even if the receivers of the ill-formed messages are (to some extent) able to read written text, they might not be able to understand these messages when they are confronted with too many spelling errors. Secondly, our Text-to-Pictograph translation tool, which translates the email into pictographs for people who have reading difficulties, may retrieve erroneous pictographs or find no pictographs at all for erroneously written words. And we definitely want to avoid that.

Our old spelling corrector did a bad job at dealing with spelling errors.
This often resulted in very weird pictograph translations.

We built the first version of an automated spelling corrector that is specifically tailored to users with ID. We analyzed 1,000 emails written by people with cognitive disabilities and compared them to tweets. Our results showed that users with ID make many more and different spelling errors than users who do not have a cognitive disability. For instance, they often write words the way in which they are pronounced. Sometimes, we even had to read their messages aloud to understand their meaning. But where there is a will, there is a way.

Our first spelling corrector system consists of a word variant generation and filtering step that is partially based on discovering phonetic similarities, followed by a completely novel approach to context-sensitive spelling correction. Our evaluations show that it significantly improves on the baseline in the Text-to-Pictograph translation tool, but there is still room for improvement. To be continued!
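To give a rough idea of what a variant generation and filtering step could look like, here is a minimal sketch in Python. It uses the misspelling "wiekent" (for "weekend") as input. The rewrite rules, the tiny lexicon, and the function names are all invented for this example. They are not the rules used in our system, and the context-sensitive step is left out entirely.

```python
from itertools import product

# Hypothetical phonetic rewrite rules (an assumption, not the real rule set):
# each key maps to spellings that can sound similar in Dutch.
PHONETIC_RULES = {
    "ie": ["ie", "ee", "i"],
    "t": ["t", "d"],
    "k": ["k", "c"],
}

# Toy lexicon standing in for a real Dutch word list (assumption).
LEXICON = {"weekend", "wieken", "week", "kent", "wie"}


def segment(word):
    """Split a word into chunks, greedily preferring two-letter rule keys."""
    chunks, i = [], 0
    while i < len(word):
        if word[i:i + 2] in PHONETIC_RULES:
            chunks.append(word[i:i + 2])
            i += 2
        else:
            chunks.append(word[i])
            i += 1
    return chunks


def generate_variants(word):
    """Return every spelling reachable by swapping in similar-sounding chunks."""
    options = [PHONETIC_RULES.get(chunk, [chunk]) for chunk in segment(word)]
    return {"".join(combo) for combo in product(*options)}


def filter_variants(variants):
    """Keep only the variants that are real words in the lexicon."""
    return sorted(v for v in variants if v in LEXICON)


candidates = filter_variants(generate_variants("wiekent"))
print(candidates)
```

With these toy rules, "weekend" is the only generated variant that survives the lexicon filter. A real system would still need context to choose between several surviving candidates.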

Our new spelling corrector knows how to tackle these issues.
"Wiekent" is not "wieken" (wings) or "wie kent" (who knows), but "weekend".

Dutch input text in the Text-to-Pictograph translation system undergoes shallow linguistic analysis. This is mostly limited to the word level. We will improve the current system by adding deep linguistic analysis on the sentence level.

Our latest idea is to apply deep linguistic analysis for a very specific purpose. We want to develop a pictograph simplification system, the first of its kind. This is currently a work in progress. We will use semantic and syntactic information from the input sentence to transform the pictograph translation into a simpler, shorter sequence that should be easier for the receiver to understand. We have already made an effort to split complex sentences, relative clauses, and subordinate clauses. And while temporal information would originally be lost, we are now able to generate "past" or "future" pictographs and remove verbs that do not contribute to the meaning of the sentence.

To be honest, the original picto translation makes our brain explode,
but the second one is much easier to read.
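For the curious, the tense handling can be pictured with a small Python sketch. This is only an illustration under strong assumptions: the hand-typed analysis, the tag names, and the rule "drop auxiliaries, put one tense pictograph in front" are all made up here, and the actual system works on real parser output.

```python
# Toy analysed sentence: (lemma, part of speech, tense).
# A real system would get this from a parser; here it is typed in by hand
# (assumption). The sentence is "ik heb gisteren appel gegeten".
ANALYSED = [
    ("ik", "pronoun", None),
    ("hebben", "auxiliary", "present"),
    ("gisteren", "adverb", None),
    ("appel", "noun", None),
    ("eten", "verb", "past_participle"),
]


def simplify(tokens):
    """Drop auxiliaries and put one tense pictograph in front if needed."""
    has_past = any(tense in ("past", "past_participle") for _, _, tense in tokens)
    has_future = any(tense == "future" for _, _, tense in tokens)
    # Auxiliaries carry tense but little meaning of their own, so they go.
    content = [lemma for lemma, pos, _ in tokens if pos != "auxiliary"]
    if has_past:
        return ["PAST"] + content
    if has_future:
        return ["FUTURE"] + content
    return content


pictographs = simplify(ANALYSED)
print(pictographs)
```

In this sketch the auxiliary "hebben" disappears, while the temporal information it carried is kept as a single "PAST" pictograph.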

The Pictograph-to-Text translation technology translates Beta or Sclera pictographs into Dutch text in order to facilitate the construction of written text. For a sneak peek behind the scenes, you can have a look at our demo here.

We built the first version of the Pictograph-to-Text translation tool. It performs Viterbi decoding based on a trigram language model to find the optimal natural language translation. The first evaluations show that this approach is already an improvement over the initial baseline, but there is ample room for improvement in future work. For instance, we will have to check for grammaticality during the sentence generation process.
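The general idea of Viterbi decoding with a trigram language model can be sketched in a few lines of Python. Everything concrete below is an assumption made for the example: the pictograph names, the candidate words per pictograph, the probabilities, and the flat penalty for unseen trigrams. It shows the technique, not our actual implementation.

```python
import math

# Toy candidate words per pictograph (assumption, for illustration only).
CANDIDATES = {
    "PICTO_I": ["ik", "mij"],
    "PICTO_EAT": ["eet", "eten"],
    "PICTO_APPLE": ["appel", "appels"],
}

# Toy trigram log-probabilities; "<s>" marks the start of the sentence.
TRIGRAM_LOGPROB = {
    ("<s>", "<s>", "ik"): math.log(0.6),
    ("<s>", "<s>", "mij"): math.log(0.1),
    ("<s>", "ik", "eet"): math.log(0.5),
    ("<s>", "ik", "eten"): math.log(0.05),
    ("ik", "eet", "appels"): math.log(0.4),
    ("ik", "eet", "appel"): math.log(0.1),
}
# Flat penalty for trigrams that are not in the table (assumption).
UNSEEN_LOGPROB = math.log(1e-4)


def trigram_score(w1, w2, w3):
    return TRIGRAM_LOGPROB.get((w1, w2, w3), UNSEEN_LOGPROB)


def viterbi_decode(pictographs):
    """Return the word sequence with the highest total trigram score."""
    # Each state is the pair (previous word, current word); for every state
    # we keep only the best-scoring path that ends in it.
    best = {("<s>", "<s>"): (0.0, [])}
    for picto in pictographs:
        new_best = {}
        for (w1, w2), (score, words) in best.items():
            for w3 in CANDIDATES[picto]:
                new_score = score + trigram_score(w1, w2, w3)
                state = (w2, w3)
                if state not in new_best or new_score > new_best[state][0]:
                    new_best[state] = (new_score, words + [w3])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


sentence = viterbi_decode(["PICTO_I", "PICTO_EAT", "PICTO_APPLE"])
print(" ".join(sentence))
```

Because a trigram only looks at the two preceding words, keeping one best path per word pair is enough to find the overall best sentence without trying every combination.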

Pictograph-to-Text translation.

The Pictograph-to-Text translation engine relies on pictograph input. By now, we have developed two different input methods. The first approach offers a static hierarchy of pictographs, while the second option scans the user input and dynamically adapts itself in order to suggest appropriate pictographs.

The static hierarchy of pictographs consists of three levels. The structure of the hierarchy is based on topic detection and frequency counts applied to email messages sent by users of the WAI-NOT communication platform. The second method is a dynamic pictograph prediction tool, the first of its kind. You can compare this tool with Google's autocomplete function for words. Two prototypes have been developed, which will eventually be merged into one system. When using social media websites, users can construct pictograph messages using the hierarchy and the predictor. Their messages will be converted to natural language text, which can be posted on the website.
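To make the comparison with autocomplete a bit more concrete, here is a minimal Python sketch of one way such a predictor could work: it counts which pictograph tends to follow which in earlier messages and suggests the most frequent ones. The sample messages, the function names, and the choice of simple pair counts are assumptions for this example. The prototypes themselves may work differently.

```python
from collections import Counter, defaultdict

# Toy pictograph messages (assumption; real counts would come from message logs).
MESSAGES = [
    ["I", "eat", "apple"],
    ["I", "eat", "bread"],
    ["I", "drink", "milk"],
    ["dog", "eat", "bread"],
]


def train(messages):
    """Count how often each pictograph directly follows another."""
    following = defaultdict(Counter)
    for message in messages:
        for previous, current in zip(message, message[1:]):
            following[previous][current] += 1
    return following


def suggest(following, previous, k=2):
    """Return up to k pictographs most often seen after `previous`."""
    counts = following.get(previous, Counter())
    return [picto for picto, _ in counts.most_common(k)]


model = train(MESSAGES)
suggestions = suggest(model, "eat")
print(suggestions)
```

After the user selects "eat", this sketch would offer "bread" first and "apple" second, because "bread" followed "eat" more often in the sample messages.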

WAI-NOT will soon implement this new pictograph input interface on their website. User tests will reveal how the interface can be improved.

This is the "birds" level in the static pictograph hierarchy.
The user can navigate back to the top level (orange) or the "animals" level (blue).
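A three-level hierarchy like this one can be pictured as a nested structure. The Python sketch below is only an illustration: the category names and pictographs are invented, and the real hierarchy is derived from topic detection and frequency counts rather than typed in by hand.

```python
# Toy three-level hierarchy: top level -> category -> subcategory with pictographs.
# All names are invented for this example (assumption).
HIERARCHY = {
    "animals": {
        "birds": ["duck", "owl", "parrot"],
        "pets": ["cat", "dog"],
    },
    "food": {
        "fruit": ["apple", "banana"],
    },
}


def items_at(path):
    """List what the user sees at a position, given as a list of chosen levels."""
    node = HIERARCHY
    for name in path:
        node = node[name]
    return sorted(node) if isinstance(node, dict) else list(node)


def back_targets(path):
    """Return the higher levels the user can jump back to, top level first."""
    return [path[:depth] for depth in range(len(path))]


birds = items_at(["animals", "birds"])
targets = back_targets(["animals", "birds"])
print(birds, targets)
```

From the "birds" level, the two back targets are the empty path (the top level) and the "animals" level, which matches the two navigation buttons described in the caption.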

In vitro evaluations of the tools

Statistical and manual evaluations were done for every (version of every) subcomponent that we have developed so far. This way, we can continue to measure the added value of future improvements.

In vivo evaluations of the users' experiences with the tools

We will assess the technology and its impact on improving the users' abilities and their well-being. We are convinced that this approach is necessary to fine-tune our tools and adapt them to the users' needs. For this purpose, I registered for a credit contract for the course “Pedagogische hulpverlening voor mensen met een handicap” at the Faculty of Educational Studies in 2015-2016. Students who start their Master's programme of Educational Studies in 2016-2017 will get the opportunity to dedicate their 2-year thesis research to an in vivo evaluation of (a specific component of) the pictograph translation tools, in collaboration with Prof. Dr. Ilse Noens, her colleagues, and me.

Publications (in 2015)

Leen Sevens, Vincent Vandeghinste, Ineke Schuurman and Frank Van Eynde (2015). Natural Language Generation from Pictographs. Proceedings of the 15th European Workshop on Natural Language Generation (ENLG 2015), pages 71-75. Brighton, UK. [Download paper]

Leen Sevens, Vincent Vandeghinste, Ineke Schuurman and Frank Van Eynde (2015). Extending a Dutch Text-to-Pictograph Converter to English and Spanish. Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies (SLPAT 2015), pages 1-8. Dresden, Germany. [Download paper]

Vincent Vandeghinste, Ineke Schuurman, Leen Sevens and Frank Van Eynde (2015). Translating Text into Pictographs. Natural Language Engineering, pages 1-28. Cambridge University Press. [Download paper]

Leen Sevens, Vincent Vandeghinste and Frank Van Eynde (2014). Improving the Precision of Synset Links Between Cornetto and Princeton WordNet. Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing. Coling 2014. Dublin, Ireland. [Download paper]

Awards (in 2015)

Best Research Poster prize: June 21, 2015 - LOT Summer School (Leuven, Belgium). Leen Sevens, Vincent Vandeghinste, Ineke Schuurman and Frank Van Eynde - "Text-To-Pictograph Translation for Six Language Pairs".

Presentations (in 2015)

I have been talking quite a lot! You can find my presentations here.

Broader dissemination (in 2015)

December 2015 - Music for Life (radio, Belgium)

WAI-NOT has been very supportive of my project, so I wanted to thank them by hosting a creative fundraiser. Thanks to my family and friends, I managed to raise a total of 800 euros. I made over 30 drawings in less than 5 weeks - that must be a personal record! Studio Brussel even allowed me to talk about my project and the pictograph translation tool on the radio. You can hear the (Dutch) interview below.

Fabled Fox for Life - Live @ Studio Brussel
Today, Fabled Fox for Life got a phone call from Studio Brussel!
Posted by Fabled Fox for Life on Tuesday, 1 December 2015

October 8, 2015 - The Big Draw (Leuven, Belgium)

The Big Draw, the world's biggest drawing festival, finally came to Leuven. My colleague Ineke and I were asked to give a university course about the pictograph translation tools. Want to know the best part? We actually gave this university course to people with a cognitive disability! I prepared several puzzles, games, and brain teasers, and we allowed the students to guess the meaning of various Sclera pictographs. The students clearly had a lot of fun. We couldn't ask for more!

April 2015 - Tripliek (magazine of Downsyndroom Vlaanderen)

I wrote a Dutch article for Tripliek, the official magazine of Downsyndroom Vlaanderen. It concerns the basic design of the Picto tools, our goals and our future plans. You can read the online version here.

Doctoral grant funded by Agentschap Innoveren & Ondernemen