Learning joint representations of videos and sentences with web image search
Aug 1, 2016 · Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya
Type: Conference paper
Publication: Proc. Workshop on Web-scale Vision and Social Media
Last updated on Aug 1, 2016