Aligning And Comparing Vision Representations To Improve Understanding And Performance