Dublin City University and Partners' participation in the INS and VTT Tracks at TRECVid 2016

Marsden, Mark; Mohedano, Eva; McGuinness, Kevin; Calafell, Andrea; Giro-i-Nieto, Xavier; O'Connor, Noel E.; Zhou, Jiang; Azavedo, Lucas; Daudert, Tobias; Davis, Brian; Hürlimann, Manuela; Afli, Haithem; Du, Jinhua; Ganguly, Debasis; Li, Wei; Way, Andy; Smeaton, Alan F.

Date

2016-11-14

Recommended Citation

Marsden, Mark and Mohedano, Eva and McGuinness, Kevin and Calafell, Andrea and Giró-i-Nieto, Xavier and O'Connor, Noel E. and Zhou, Jiang and Azavedo, Lucas and Daudert, Tobias and Davis, Brian and Hürlimann, Manuela and Afli, Haithem and Du, Jinhua and Ganguly, Debasis and Li, Wei B. and Way, Andy and Smeaton, Alan F. (2016) Dublin City University and Partners’ Participation in the INS and VTT Tracks at TRECVid 2016. In: TRECVid Conference, 14-16 Nov 2016, Gaithersburg, Md., USA.

Abstract

Dublin City University participated with a consortium of colleagues from NUI Galway and Universitat Politècnica de Catalunya in two tasks in TRECVid 2016: Instance Search (INS) and Video to Text (VTT). For the INS task we developed a framework consisting of face detection and representation and place detection and representation, with user annotation of top-ranked videos. For the VTT task we ran 1,000 concept detectors from the VGG-16 deep CNN on 10 keyframes per video and submitted 4 runs for caption re-ranking, based on BM25, Fusion, Word2Vec, and a fusion of baseline BM25 and Word2Vec. For caption generation, with the same pre-processing, we used the open source image-to-caption CNN-RNN toolkit NeuralTalk2 to generate a caption for each keyframe and combine them.
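
As a rough illustration of the BM25-based caption re-ranking mentioned in the abstract, the sketch below scores candidate captions against a query made of detected concept labels. It is not the authors' implementation: the function name `bm25_rank`, the parameter defaults `k1=1.2` and `b=0.75`, the whitespace tokenisation, and the sample concepts and captions are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_rank(query_terms, captions, k1=1.2, b=0.75):
    """Return (caption_index, score) pairs sorted by descending BM25 score.

    Hypothetical helper: query_terms stands in for concept labels detected
    in a video, captions for the candidate sentences to be re-ranked.
    """
    # Naive lowercase whitespace tokenisation (an assumption of this sketch).
    docs = [c.lower().split() for c in captions]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many captions each term occurs.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for i, d in enumerate(docs):
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always non-negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up example data, not taken from the paper.
concepts = ["dog", "ball", "grass"]
captions = [
    "a man rides a bicycle down a street",
    "a dog chases a ball on the grass",
    "a dog sleeps on a sofa",
]
ranking = bm25_rank(concepts, captions)
```

In this toy setting the caption that mentions all three concepts ranks first and the caption that mentions none receives a score of zero. A Word2Vec-based run would replace exact term matching with embedding similarity; that variant is not sketched here.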

URI

http://hdl.handle.net/10379/6242

Collections

Data Science Institute (Workshop Papers)

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland