Seminar at NAVER LABS Europe: Modelling the interplay between visual and textual information for computer vision applications

By NAVER LABS Europe

Date and time

Fri, 26 Apr 2019 11:00 - 12:00 CEST

Location

NAVER LABS Europe

6-8 Chemin de Maupertuis 38240 Meylan France

Description

Speaker: Dimosthenis Karatzas, senior lecturer at the UAB, Spain & associate director of the Computer Vision Centre where he leads the Vision and Language research line.

---------------
Title: Modelling the interplay between visual and textual information for computer vision applications
-------------

Abstract:
There is huge value in enabling machines to understand and interpret written information through vision, in unconstrained conditions in the world around us. At the same time, our visual interpretation capacity is jointly acquired with the linguistic structures we use to describe the world, and it would be desirable for machines to be able to learn in a similar way.
My research group at the Computer Vision Centre focuses on the design of computational models at the meeting point between vision and language, that efficiently exploit available textual information to solve any type of computer vision challenge. We investigate new technologies to give machines the capacity to read, as well as methods to enable computer vision models to learn by properly exploiting textual information, in or about images, and to use natural language interfaces to interact with humans.
In this talk I will discuss recent research in the group on modelling the interplay between visual and textual information for computer vision applications. I will focus on our recent work on image captioning and visual question answering, and will also touch upon scene text recognition methods, cross-modal image retrieval, joint visual-textual embeddings, semantic retrieval and self-supervised learning.

-----------

Bio:
Dr. Dimosthenis Karatzas received his PhD in Computer Science from the University of Liverpool, UK in 2003. He is a senior lecturer at the Universitat Autònoma de Barcelona, Spain and an associate director of the Computer Vision Centre where he leads the Vision and Language research line. His main research interests are computer vision and machine learning, and in particular scene text extraction and recognition, joint visual-textual embeddings, self-supervised learning and image captioning systems. He has published over 100 scientific papers and his h-index (Google Scholar) is 24. In 2013, he received the IAPR/ICDAR Young Investigator Award and in 2016, a Google Research Award in the line of Machine Perception. Dr. Karatzas has been the principal investigator of various research projects. He has extensive technology transfer experience, including the creation of two spin-off companies (TruColour, 2007 / AllRead, 2019). Technology transferred includes gas meter reading systems for Naturgy (>8k readings per week), and digital mailroom solutions for CaixaBank (millions of documents per month). He conceived and led the creation of the “Library Living Lab” (L3), converting a public library in Sant Cugat del Vallés, Barcelona, into an open, participatory innovation space. Under his leadership L3 became a member of the European Network of Living Labs in 2015 and was nominated for the city awards of Sant Cugat in 2016. He is the lead organiser of the Robust Reading Competition series, the de facto international benchmark in his research field, serving >9,000 registered researchers from >120 countries. Dr. Karatzas is the chair of the 1,300-member Technical Committee 11 (Reading Systems) of the International Association for Pattern Recognition (IAPR).
He is a member of the IAPR Education Committee, a past member of the IAPR Industrial Liaison Committee and a member of the IEEE, and he has been a founding member and a member of the executive committee of the UK Chapter of the SPIE. By invitation of the Catalan government he participates in the working group defining the Catalan Strategy on Artificial Intelligence.

Organised by

NAVER LABS is the R&D subsidiary of NAVER, Korea’s leading internet company and the part of NAVER responsible for creating future technology. Its world-class researchers in Korea and Europe create new connections between people, machines, spaces and information by advancing technology in AI, robotics, autonomous driving, 3D/HD mapping and AR.

NAVER LABS Europe is the biggest industrial research lab in artificial intelligence in France and a hub of NAVER’s global AI R&D Belt, a network of centres of excellence in Korea, Japan, Vietnam & Europe. The scientists at NAVER LABS Europe conduct fundamental and applied research in the fields of machine learning and optimization, computer vision, natural language processing, and UX and ethnography. The two main application areas of this research are ‘AI for Robotics’ and ‘AI for our Digital World’. The site is located in Grenoble, France.
