Dublin City University and Partners’ Participation in the INS and VTT Tracks at TRECVid 2016


Mark Marsden, Eva Mohedano, Kevin McGuinness, Andrea Calafell, Xavier Giro-i-Nieto, Noel O'Connor, Jiang Zhou, Lucas Azevedo, Tobias Daudert, Brian Davis, Manuela Hurlimann, Haithem Afli, Jinhua Du, Debasis Ganguly, Wei Li, Andy Way, Alan Smeaton

Publication Type: 
Refereed Conference Meeting Proceeding
Dublin City University participated with a consortium of colleagues from NUI Galway and Universitat Politècnica de Catalunya in two tasks in TRECVid 2016: Instance Search (INS) and Video to Text (VTT). For the INS task we developed a framework consisting of face detection and representation and place detection and representation, with user annotation of top-ranked videos. For the VTT task we ran 1,000 concept detectors from the VGG-16 deep CNN on 10 keyframes per video and submitted 4 runs for caption re-ranking, based on BM25, Fusion, word2vec and a fusion of baseline BM25 and word2vec. For caption generation, with the same pre-processing, we used the open-source image-to-caption CNN-RNN toolkit NeuralTalk2 to generate a caption for each keyframe and combined them.
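
To illustrate the BM25-based caption re-ranking mentioned in the abstract, here is a minimal sketch. It is not the authors' implementation. It assumes that detected concept labels act as query terms and that candidate captions act as documents. The function name `bm25_rank`, the parameter defaults `k1` and `b`, and the sample captions are all hypothetical.

```python
import math
from collections import Counter


def bm25_rank(query_terms, captions, k1=1.5, b=0.75):
    """Return caption indices ordered from highest to lowest BM25 score.

    query_terms: list of lowercase concept labels (assumed detector output).
    captions:    list of candidate caption strings.
    k1, b:       standard BM25 free parameters (illustrative defaults).
    """
    if not captions:
        return []

    # Naive whitespace tokenisation; a real system would normalise further.
    docs = [caption.lower().split() for caption in captions]
    n_docs = len(docs)
    avg_len = (sum(len(d) for d in docs) / n_docs) or 1.0

    # Document frequency: number of captions containing each term.
    doc_freq = Counter()
    for d in docs:
        doc_freq.update(set(d))

    scores = []
    for d in docs:
        term_freq = Counter(d)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # Non-negative IDF variant of the BM25 formula.
            idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf * (k1 + 1.0) / (tf + length_norm)
        scores.append(score)

    # Python's sort is stable, so ties keep their original order.
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)


# Hypothetical example: concept labels for one video and three candidate captions.
sample_captions = [
    "a dog runs on the beach",
    "a man plays guitar",
    "two cats sleep on a sofa",
]
ranking = bm25_rank(["dog", "beach"], sample_captions)
```

In this sketch, captions that share more (and rarer) terms with the detected concepts rank higher. The word2vec and fusion runs named in the abstract would need a different similarity measure and are not covered here.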
Conference Name: 
Proceedings of TRECVid
Digital Object Identifier (DOI):
Publication Date: 
Conference Location: 
United States of America
Research Group: 
Dublin City University (DCU)
Open access repository: