[Seminar] Computer Hanabi: Playing Near-Optimally or Learning by Reinforcement?

by Game AI research group of Queen Mary University of London

Date and time

Tue, 17 October 2017

16:00 – 17:00 BST

Location

School of Electronic Engineering and Computer Science, Peter Landin Building

10 Godward Square

Queen Mary University of London

London

E1 4FZ

United Kingdom

Description

This is a seminar by Dr. Bruno Bouzy, Associate Professor, Laboratory of Informatics Paris Descartes (LIPADE), Université Paris Descartes (Paris, France).

Coffee, tea, and biscuits will be served from 3:30pm.

Abstract

Hanabi is a multi-player cooperative card game in which each player sees the cards of the other players but not their own. The team of players aims to maximize a score. After a brief presentation of the rules of the game, this talk will describe two sets of experiments. The first is an exploitation experiment (how to play as well as possible?), and the second explores some pros and cons of the reinforcement learning approach.

The first part will describe computer players corresponding to the state of the art in computer Hanabi. In particular, we will describe players using the hat principle and depth-one search. The hat principle is well known in recreational mathematics and gives amazing results on the game of Hanabi, resulting in scores that are almost perfect.

In contrast, the recent trend toward deep learning led us to perform a second set of experiments, building reinforcement learners that use neural networks (not necessarily deep) as function approximators. Hanabi is an incomplete-information game, and the preliminary results with self-play and shallow neural networks show that it is a hard game to tackle with a learning approach. We will present our results and discuss the features of Hanabi (the number of players, the number of cards per player, the possibility of playing with open cards or not, and the problem of learning a convention) that make this game a good opportunity to test many reinforcement learning techniques: TD learning or Q-learning, the use of a replay memory or not, the number of layers in the network, and tuning considerations for gradient descent.


Bio

Born in Paris (France), Bruno Bouzy has been Assistant Professor of Computer Science in the Department of Mathematics and Computer Science at Paris Descartes University since 1997, and a member of the Laboratory of Informatics of PAris DEscartes (LIPADE) since its creation in 2005. His academic degrees include two engineering school diplomas (Ecole Polytechnique, 1984; Ecole Nationale Supérieure des Techniques Avancées, 1986), a Ph.D. in Computer Science (1995), and a Habilitation for Research Supervision in Computer Science (2004). Between 1986 and 1991, he held a consulting engineer position with GSI, a leading software advisory company.

Bruno Bouzy is the author of the Go playing program Indigo, which won three bronze medals at the Computer Olympiads: two on the 19x19 board (2004, 2006) and one on the 9x9 board (2005). These achievements resulted from using the Monte-Carlo (MC) approach for the first time in a competitive Go playing program: playing out simulations until the end, and computing action values from the outcomes of the simulations. After these promising results, the Computer Go community adopted the MC approach, and the Monte-Carlo Tree Search (MCTS) framework, created in 2006, became the standard approach for many games.

In 2007, with all Go playing programs now MCTS-based, Bruno Bouzy took a step back from Computer Go and moved to other interesting and difficult challenges such as Multi-Agent Learning (2008-2010), the game of Amazons (2004-2010), the Voronoi game (2009-2011), Cooperative Path-Finding (2012-present), the game of Hex (2013), the Rubik's cube (2014), the weak Schur problem (2014-2015), and the Pancake problem (2015-2016). Today, as incomplete-information games remain hard obstacles for Artificial Intelligence, Bruno Bouzy works on the game of Hanabi, a cooperative card game.

In practice, to obtain the best possible results in all these domains, Bruno Bouzy uses various methods such as Game Theory, Heuristic Search, MCTS, Neural Networks, and Reinforcement Learning, as well as domain-dependent tools.


Venue

Mile End campus map

Coffee, tea, and biscuits (3:30pm-3:55pm) and refreshments (5:05-6pm):

3rd floor, Bancroft Road Teaching Rooms (CS), QMUL. This is building number 6, but you will need to enter from Bancroft Road (see the black arrow on the map).

Seminar room (4-5pm):

3.24 Engineering Building (building number 15), QMUL.

