Blog and info about explorations done in my Ph.D. research
From 2020
Work done for course 11-777 Multi-modal Machine Learning, Prof. Louis-Philippe Morency, Fall 2020. Link to course website here. Link to GitHub repo here.
ADARIBERT: An object-agnostic multimodal BERT transformer to ground design intents in images.

Common cross-modal retrieval models do not work on ADARI, because they are normally trained to detect and describe objects, not attributes. Sample of testing set. Implementation based on this paper. Raw notebook code on GitHub. More information here.
Unimodal distribution: generation of design intents, early experiments. DCGAN on the Furniture section of the ADARI dataset. 17,500 images in the training set.

Previous research: http://www.multi-resolution.com