An egocentric image captioning model, fine-tuned from the pretrained BLIP image caption generation model.
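
The following is a minimal usage sketch, not a confirmed recipe for this checkpoint. It assumes the fine-tuned weights can be loaded with the BLIP classes from the Hugging Face `transformers` library, and that `torch` and `Pillow` are installed. The helper name `caption_image` and its `model_id` argument are hypothetical; this card does not state the checkpoint's identifier, so it must be supplied by the caller.

```python
def caption_image(image_path, model_id, max_new_tokens=30):
    """Generate a caption for one image with a BLIP-style checkpoint.

    `model_id` is an assumption: pass the Hub ID or local path of the
    fine-tuned checkpoint, which this card does not specify.
    """
    # Imported here so the helper can be defined without the heavy packages.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    # Load the processor (image preprocessing + tokenizer) and the model.
    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(model_id)

    # BLIP expects RGB input; convert in case the file is grayscale or RGBA.
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Unconditional captioning: no text prompt is passed to the model.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

A call would look like `caption_image("frame.jpg", model_id)`, where `"frame.jpg"` stands for any first-person-view image and `model_id` is the identifier of this checkpoint.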