A Unified Framework With Benchmarks For Human-Like Visual And Relational Reasoning In The Real World