This fall semester, I took a class on Vision and Language, taught by Prof. Devi Parikh. Part of the class involved writing roughly one-page reviews/summaries of recent papers on AI tasks that combine visual perception and language processing, generally using deep learning.
I decided to post a few of my reviews for posterity, and in case anyone is looking for what is hopefully a decent summary of each paper’s contributions, strengths, and weaknesses, as I see them.