Meet the AI That Sees How We See

Most of us are familiar with the somewhat imperfect visual skills of AIs: the image recognizers that, trained to spot dogs, start seeing them everywhere, or that struggle to decide what is or isn't a human face (with particularly hilarious results in face-swap apps).

Part of the problem is that when we humans look at a picture, or at the world around us, we process light while our brains use key identifiers and visual cues to assemble a reasonable sense of what we are taking in.

An AI, by comparison, merely looks at thousands upon thousands of pixels and tries to make sense of the relationships between them by finding patterns in the whole, without the mental shortcuts humans rely on to decide which elements of an image matter and what they might represent.

This is why a team at Duke University and MIT Lincoln Laboratory has taken a different approach, teaching its AI the tricks of the trade that we humans take for granted, with a program called Birdwatcher.

Where previous AIs would have been force-fed a diet of whole images of whatever they were looking for (dogs' faces, say), the Duke and MIT Lincoln team taught theirs to look for parts of images, in this case parts of birds, using 11,788 images of 200 different species, and to feed its observations and conclusions back to the team.

By teaching the AI that a woodpecker might be identified by either a red plume or black-and-white breast feathers, they allowed it to use the kind of visual compositing that helps humans quickly work out what an image shows, built on the advanced deep learning techniques that have driven breakthroughs across a variety of AIs.

The team also taught the AI that certain visual features might be common to several types of birds, and gave it the discrimination needed to decide which of them, based on the visual evidence, is the likeliest candidate in the picture before it.
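
In spirit, this part-based approach can be pictured with a toy sketch like the one below, written in Python: compare patches of a new photo against learned "prototype" parts for each species and add up the evidence. Everything here is invented for illustration; the prototype vectors, patch features, species names and function names are assumptions, not the team's actual code or data.

```python
# Illustrative sketch only: a toy version of part-based classification.
# The prototypes, features and species names are invented for this example;
# they are not the Duke / MIT Lincoln team's actual data or code.
import numpy as np

# Pretend each species is described by a few learned "prototype" part vectors,
# e.g. a woodpecker's red plume or black-and-white breast feathers.
PROTOTYPES = {
    "woodpecker": np.random.rand(2, 8),   # 2 prototype parts, 8-dim features each
    "cardinal":   np.random.rand(2, 8),
    "blue_jay":   np.random.rand(2, 8),
}

def similarity(patch, prototype):
    """Cosine similarity between one image patch's features and one prototype."""
    return float(patch @ prototype /
                 (np.linalg.norm(patch) * np.linalg.norm(prototype) + 1e-9))

def classify(patches):
    """Score each species by how well its prototype parts match patches of the
    image, then return the best species together with the patch/prototype pairs
    that drove the decision (the raw material for an explanation)."""
    scores, evidence = {}, {}
    for species, protos in PROTOTYPES.items():
        best_pairs = []
        for j, proto in enumerate(protos):
            # For every prototype part, find the image patch that looks most like it.
            sims = [similarity(p, proto) for p in patches]
            i = int(np.argmax(sims))
            best_pairs.append((i, j, sims[i]))
        scores[species] = sum(s for _, _, s in best_pairs)
        evidence[species] = best_pairs
    winner = max(scores, key=scores.get)
    return winner, scores[winner], evidence[winner]

# Usage: 16 random "patches" standing in for regions of a bird photo.
patches = np.random.rand(16, 8)
species, score, parts = classify(patches)
print(species, round(score, 2), parts)  # which patches matched which prototypes
```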

The machine then displayed the image, highlighted the details it had used to identify the particular bird and explained its reasoning, allowing the human operators to judge for themselves whether it had chosen correctly. This wasn't just a way to measure whether the test was working; it is fundamental to how this AI is meant to work in practice.
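
Continuing the same toy sketch (with the same caveat that the names and numbers are invented, not the team's real output), the explanation step could be as simple as reporting which image region matched which prototype part, so a human can check the match against the photo.

```python
# Illustrative continuation of the sketch above: turn the matched parts into a
# report a human operator can check. All names and values here are made up.
def explain(species, parts, patch_boxes):
    """parts is a list of (patch_index, prototype_index, similarity) tuples;
    patch_boxes maps a patch index to its (x, y, w, h) region of the photo."""
    lines = [f"Predicted species: {species}"]
    for patch_idx, proto_idx, sim in parts:
        x, y, w, h = patch_boxes[patch_idx]
        lines.append(f"  part #{proto_idx} matched region ({x},{y},{w},{h}), "
                     f"similarity {sim:.2f}")
    return "\n".join(lines)

# Usage with made-up evidence for a 4x4 grid of 50-pixel patches.
patch_boxes = {i: ((i % 4) * 50, (i // 4) * 50, 50, 50) for i in range(16)}
parts = [(5, 0, 0.91), (10, 1, 0.87)]
print(explain("woodpecker", parts, patch_boxes))
```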

AIs can process information orders of magnitude faster than any human mind could manage. But trusting them to make decisions alone, particularly in matters of life and death, would be a step too far, not least because human fallibility in programming them can never be ruled out.

These image-recognizing AIs could be useful in medicine, for spotting cancers, for instance, and the ability to show a human expert what they have seen and why they flagged it provides a last line of defense against misdiagnosis.

It also lets the users working with the AI observe its 'thinking' in a way that previous AIs simply haven't offered; with those earlier systems, any flaws in their reasoning were arduous, if not impossible, to identify or investigate.

In tests, Birdwatcher scored an 84% success rate, matching its rivals in the field while besting them with this facility for sharing snapshots that explain how it reached its conclusions.

Recent research has shown that the shortcuts humans use to process visual stimuli mean we can identify images with as little as 13 milliseconds of viewing time.

These shortcuts can lead to hasty misidentifications (ever waved at someone only to realize it wasn't the person you thought?) but also to more serious issues, such as hallucinations, notably in Charles Bonnet syndrome, where blind spots in a person's vision are filled in with vivid hallucinations based on what the brain thinks it should be seeing.
