Advancing Vision-Language And Language Models In Low-Resource Settings