It's been a while since I started my little adventure of building an image recognition pipeline that would reliably identify animal and plant species from photos. I'm pretty happy with the results so far, and these days I'm even successfully scratching my own itch with it. That itch is how I came up with the idea in the first place: I suck at identifying birds, butterflies, and plants, but I want to learn about them. And knowing an animal's or plant's name is really the gateway to learning more about it.
So here's a painless way of identifying these species. Say you want to identify a bird that you see (and you're lucky enough to have a camera with a big zoom with you). You take a photo of it and then send it to @id_birds on Twitter using Twitter's own image upload. After a minute or two, you'll get an automatic reply, hopefully with the right species name. @id_birds can pick out yours from among the 256 different birds that it knows, and with remarkably high accuracy.
There are two other bots that are also free to use: @id_butterflies knows around 50 butterfly species, and @id_wildflowers around 200 wildflowers. All three bots only know species that occur natively in Europe, at least for now.
"But does it actually work?" I hear you say. And the answer is: it works pretty well, especially if you send good quality photos where the animal or plant is clearly visible.
Seeing is believing though, so to prove that it works well, let's try and identify the ten birds that are in The Wildlife Trust's river bird spotter sheet.
Let's start with the first bird, a grey heron. To the left we have me addressing the Twitter bot and sending the image. To the right is @id_birds' reply. And it's correct! Ardea cinerea is indeed the scientific name for the grey heron:
(If you're not seeing the tweets below, I suggest you view this on my blog.)
Notice how @id_birds will always respond with the two most likely guesses. The second guess here was the demoiselle crane (Anthropoides virgo), which looks sort of similar.
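To make the two-guess reply a bit more concrete, here's a minimal sketch of how picking the two most likely species from a set of per-species scores could look. This is not PhotoID's actual code: the function name `top_k_guesses` and the scores are made up purely for illustration.

```python
def top_k_guesses(scores, k=2):
    """Return the k highest-scoring (species, score) pairs, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical scores for one photo; a real classifier would produce
# one score for each of the species it knows.
scores = {
    "Ardea cinerea": 0.81,
    "Anthropoides virgo": 0.12,
    "Anas platyrhynchos": 0.04,
    "Alcedo atthis": 0.03,
}

guesses = top_k_guesses(scores)
for species, score in guesses:
    print(f"{species}: {score:.2f}")
```

With these made-up numbers, the grey heron comes out first and the demoiselle crane second, mirroring the reply above.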
The next bird on the river bird spotter sheet is the common moorhen (Gallinula chloropus). Let's try to ID it:
And already we have @id_birds' first mis-classification. It thinks that this moorhen is a mallard (Anas platyrhynchos). Only the second guess is correct. Thus so far we're counting one correct, one error.
The next bird we want to identify is incidentally a mallard:
And the response is correct. Notice how the second guess (Coracias garrulus) seems a little weird. I guess it's the similarities in color here. We don't count the second guesses here though, so it's two correct, one error.
What about the kingfisher? That's a bird with a pretty unique appearance among birds native to Europe. So it's probably one of the easier birds to identify:
And sure enough, @id_birds knows what it is. The Latin name for the kingfisher is Alcedo atthis.
The fifth bird on the spotter sheet is the mute swan (Cygnus olor). Everyone probably knows this one, but does @id_birds?
Yes, the answer is correct. Note that the second guess is also a swan, namely the whooper swan, quite similar in appearance. A good job!
So out of the first five birds we got four correct, not bad. But let's see about the rest of the birds from the spotter sheet.
I'll post the responses to the remaining five birds without further comment. But do note that of the remaining ones, only the coot (Fulica atra) is mis-classified; that's the third one below:
So all in all we have two mis-classifications out of ten. That's an accuracy of 8 out of 10, which is pretty remarkable when you consider that the differences between these 256 different bird species can sometimes be minuscule. As an example, compare the grey wagtail (Motacilla cinerea) from above with the yellow wagtail. According to Wikipedia, the grey wagtail "looks similar to the yellow wagtail but has the yellow on its underside restricted to the throat and vent."
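For the record, the tally from this little test can be written out as a quick sanity check. The list below just encodes, in spotter-sheet order, whether the first guess was right; the variable names are mine and only the first guess is counted, as in the running score above.

```python
# First-guess results in spotter-sheet order. The two misses are the
# moorhen (2nd bird) and the coot (8th bird, i.e. the third of the last five).
first_guess_correct = [
    True, False, True, True, True,   # birds 1-5
    True, True, False, True, True,   # birds 6-10
]

correct = sum(first_guess_correct)
total = len(first_guess_correct)
accuracy = correct / total

print(f"{correct} out of {total} correct ({accuracy:.0%})")
```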
Of course ten pictures are a tiny test set, but the score roughly matches what I see with the much larger test set that I use for development, so it's cool. :-)
I'll write about how the computer vision works under the hood another time (but here's a hint: it's based on deep learning). We call the stack that we've developed, and that's running behind @id_birds and the other bots, PhotoID. So far PhotoID has proven to perform very competitively and to be remarkably flexible across different image classification problems.
Let's conclude this post with two other IDs, one from @id_butterflies, the other from @id_wildflowers:
(Plug: Feel free to contact me through my company if you have a challenging image recognition problem that you want us to help solve.)