Deep Science: Combining vision and language could be the key to more capable AI

Depending on the theory of intelligence to which you subscribe, achieving “human-level” AI will require a system that can leverage multiple modalities (e.g., sound, vision and text) to reason about the world. For example, when shown an image of a toppled truck and a police cruiser on a snowy freeway, a human-level AI might infer that dangerous road conditions caused an accident. Or, running on a robot and asked to grab a can of soda from the refrigerator, it would navigate around people, furniture and pets to retrieve the can and place it within reach of the requester.
Today’s AI falls short. But new research …
