Egoshots consists of real-life ego-vision images captioned using state-of-the-art image captioning models. It aims at evaluating the robustness, diversity, and sensitivity of these models, and provides a real-life, in-the-wild life-logging dataset that can aid the evaluation of captioning in real settings. The images were collected by two computer scientists while interning at Philips Research, Netherlands, for one month each.
Images were captured automatically by the Autographer wearable camera whenever it autonomously detected events of interest.
The Egoshots dataset images are available at the Egoshots repo, with corresponding captions (generated via transfer learning from pre-trained models) here.