Identifying birds, butterflies, and wildflowers with a snap

September 13, 2014 | categories: Python, Biology, Programming, Computer Vision, Machine Learning | View Comments

It's been a while since I started my little adventure of building an image recognition pipeline that would reliably identify animal and plant species from photos. I'm pretty happy with what the results are so far, and these days I'm even successfully scratching my own itch with it. Which is how I came up with the idea in the first place: I suck at identifying birds, butterflies, and plants, but I want to learn about them. And knowing an animal or plant's name is really the gateway to learning more about it.

So here's a painless way of identifying these species. Say you want to identify one bird that you see (and you're lucky to have a camera with a big zoom with you). You take a photo of it and then send it to @id_birds on Twitter using Twitter's own image upload. After a minute or two, you'll get the automatic reply, with hopefully the right species name. @id_birds can with a remarkably high accuracy identify yours between the 256 different birds that it knows.

There's two other bots that are also available to use for free: @id_butterflies knows around 50 butterfly species, and @id_wildflowers around 200 wildflowers. All three bots know species occurring natively in Europe only, at least for now.

"But does it actually work?" I hear you say. And the answer is: it works pretty well, especially if you send good quality photos where the animal or plant is clearly visible.

Seeing is believing though, so to prove that it works well, let's try and identify the ten birds that are in The Wildlife Trust's river bird spotter sheet.

http://danielnouri.org/media/river-bird-spotter.jpg

Image credits: Grey heron (c) Amy Lewis / Moorhen and mallard (c) Steve Waterhouse / Kingfisher (c) Margaret Holland / Mute swan (c) Neil Aldridge / Dipper and grey wagtail (c) Tom Marshall / Coot (c) David Longshaw / Great crested grebe (c) Don Sutherland / Tufted duck (c) Bob Coyle

Let's start with the first bird, a grey heron. To the left we have me addressing the Twitter bot and sending the image. To the right is @id_birds' reply. And it's correct! Ardea cinerea is indeed the scientific name for the grey heron:

(If you're not seeing the tweets below, I suggest you view this on my blog.)

Notice how @id_birds will always respond with the two most likely guesses. The second guess here was the demoiselle crane (Anthropoides virgo), which looks sort of similar.

The next bird on the river bird spotter sheet is the common moorhen (Gallinula chloropus). Let's try to ID it:

And already we have @id_birds' first mis-classification. It thinks that this moorhen is a mallard (Anas platyrhynchos). Only the second guess is correct. Thus so far we're counting one correct, one error.

The next bird we want to identify is incidentally a mallard:

And the response is correct. Notice how the second guess (Coracias garrulus) seems a little weird. I guess it's the similarities in color here. We don't count the second guesses here though, so it's two correct, one error.

What about the kingfisher? That's a bird with a pretty unique appearance among birds native to Europe. So it's probably one of the easier birds to identify:

And sure enough, @id_birds knows what it is. The latin name for the kingfisher is Alcedo atthis.

The fifth bird on the spotter sheet is the mute swan (Cygnus olor). Everyone probably knows this one, but does @id_birds?

Yes, the answer is correct. Note that the second guess is also a swan, namely the whooper swan, quite similar in appearance. A good job!

So out of the first five birds we got four correct, not bad. But let's see about the rest of the birds from the spotter sheet.

I'll post the responses to the remaining five birds without further comment. But do note that of the remaining ones, only the coot (Fulica atra) is mis-classified, that's the third one below:

So all in all we have two mis-classifications out of ten. That's 8 out of 10 accuracy, which is pretty remarkable when you consider that the differences between these 256 different birds species can sometimes be minuscule. As an example, compare th grey wagtail (Motacilla cinerea) from above with the yellow wagtail. According to Wikipedia, the grey wagtail "looks similar to the yellow wagtail but has the yellow on its underside restricted to the throat and vent."

Of course ten picture is a tiny test set, but the score roughly matches what I see with the much larger validation set that I use for development, so it's cool. :-)

I'll write about how the computer vision works under the hood another time (but here's a hint). We call the stack that we've developed and that's running behind @id_birds and the other bots PhotoID. So far PhotoID has proven to have very competitive performance and be remarkably flexible to use for different image classification problems.

Let's conclude this post with two other IDs, one from @id_butterflies, the other from @id_wildflowers:

Happy ID'ing!

(Plug: Feel free to contact me through my company if you have a challenging image recognition problem that you want us to help solve.)

Read and Post Comments

Using deep learning to listen for whales

January 10, 2014 | categories: Python, Biology, Programming, Bioacoustics, Machine Learning | View Comments

Since recent breakthroughs in the field of speech recognition and computer vision, neural networks have gotten a lot of attention again. Particularly impressive were Krizhevsky et al.'s seminal results at the ILSVRC 2012 workshop, which showed that neural nets are able to outperform conventional image recognition systems by a large margin; results that shook up the entire field. [1]

Krizhevsky's winning model is a convolutional neural network (convnet), which is a type of neural net that exploits spatial correlations in 2-d input. Convnets can have hundreds of thousands of neurons (activation units) and millions of connections between them, many more than could be learned effectively previously. This is possible because convnets share weights between connections, and thus vastly reduce the number of parameters that need to be learned; they essentially learn a number of layers of convolution matrices that they apply to their input in order to find high-level, discriminative features.

http://danielnouri.org/media/deep-learning-whales-krizhevsky-lsvrc-2012-predictions.jpg

Figure 1: Example predictions of ILSVRC 2012 winner; eight images with their true label and the net's top five predictions below. (source)

Many papers have since followed up on Krizhevsky's work and some were able to improve upon the original results. But while most attention went into the problem of using convnets to do image recognition, in this article I will describe how I was able to successfully apply convnets to a rather different domain, namely that of underwater bioacoustics, where sounds of different animal species are detected and classified.

My work on this topic began with last year's Kaggle Whale Detection Challenge, which asked competitors to classify two-second audio recordings, some of which had a certain call of a specific whale on them, and others didn't. The whale in question was the North Atlantic Right Whale (NARW), which is a whale species that's sadly nearly extinct, with less than 400 individuals estimated to still exist. Believing that this could be a very interesting and meaningful way to test my freshly acquired knowledge around convolutional neural networks, I entered the challenge early, and was able to reach a pretty remarkable Area Under Curve (AUC) score of 97% after only three days into the competition. [2]

The trick to my early success was that I framed the problem of finding the whale sound patterns as an image reconition problem, by turning the two-second sound clips each into spectrograms. Spectrograms are essentially 2-d arrays with amplitude as a function of time and frequency. This allowed me to use standard convnet architectures quite similar to those Krizhevsky had used when working with the CIFAR-10 image dataset. With one of few differences in architecture stemming from the fact that CIFAR-10 uses RGB images as input, while my spectrograms have one real number value per pixel, not unlike gray-scale images.

http://danielnouri.org/media/deep-learning-whales-spectrogram.jpg

Figure 2: Spectrogram containing a right whale up-call.

Spurred by my success, I registered for the International Workshop on Detection, Classification, Localization, and Density Estimation (DCLDE) of Marine Mammals using Passive Acoustics, in St Andrews. A world-wide community of scientists meets every two years at this workshop to discuss the latest developments in using passive acoustics (listening for sounds) to detect and track marine mammals.

That there was such a breadth of research around this topic was entirely new to me, and it was fascinating to learn about it. Another thing that I've since learned is that this breadth is sadly dwarfed by the amounts of massive underwater noise that humans produce today, through shipping, oil exploration, and military sonar. And this noise severely affects the lives of animals for which "listening is as important as seeing is for humans – they communicate, locate food, and navigate using sound."

The talk that I gave at the DCLDE 2013 workshop was well received. In it, I elaborated how my method relied on little to no problem-specific human engineering, and therefore could be easily adapted to detect and classify all sorts of marine mammal sounds, not just right whale up-calls.

At DCLDE, the execution speed of detection algorithms was frequently quoted as being x times faster than real-time, with x often being a fairly low number around 1 to 10. My GPU-powered implementation turned out to be on the faster side here: on my workstation, it detects and classifies sounds 700x faster than real-time, which means it runs detections on one year of audio recordings in roughly twelve hours, using only a single NVIDIA GTX 580 graphics card.

In terms of accuracy, it was somewhat hard to get an idea of which of the algorithms presented really worked better than others. This had two reasons: the inconsistent use of reliable metrics such as AUC and use of cross-validation, and a lack of standard datasets that everyone could test their algorithms against. [3]

However, it should be mentioned that good datasets are a bit tricky to come by in this field. The nature of hydrophone recordings is that the signal you're listening for could be generated a few meters away, or many kilometers, and therefore be very faint. Plus, recordings often contain a lot of ambient noise coming from cargo ships, offshore drilling, hydrophone cable flutter, and the like. With the effect that often it's hard even for a human expert to tell if the particular sound they're listening to is a vocalization of the mammal they're looking for, or just noise. Thus, analysts will often label segments as unsure, and two analysts will sometimes even give conflicting labels to the same sound.

Four NARW up-calls that are easy to detect.
(Here's a much messier example. And some more fascinating recordings of marine mammals.)

This leads to a situation where people tend to ignore noisy sounds altogether, since if you consider them, predictions become difficult to verify manually, and good training examples harder to collect. But more importantly, when you ignore sounds with a bad signal-to-noise ratio (SNR), your algorithms will have an easier time learning the right patterns, too, and they will make fewer mistakes. As it turns out, noise is often more of a problem for algorithms than it is for human specialists.

The approach of ignoring sounds with a bad SNR seems fine until you're in a situation where you've put a lot of effort into collecting recordings, and then they turn out to be unusually noisy, and trying to adjust your model's detection threshold yields either way too many false positives detections, or too many calls are missed.

One of the very nice people I met at DCLDE was Holger Klinck from Oregon State University. He wanted to try out my convnet with one of his lab's "very messy" recordings. Some material that his group at OSU had collected at five sites near Iceland and Greenland in 2007 and 2008 had unusually high levels of noise in them, and their detection algorithms had maybe worked less than optimal there.

http://danielnouri.org/media/deep-learning-whales-osu-iceland-detections.png

Figure 3: "Locations of passive acoustic moorings near Iceland and southern Greenland (black spots), and the number of right whale upcalls detected per day in late 2007 at the five sites." Taken from [4]. Note the very low number of calls detected at the CE and SE sites.

I was rather amazed when a few weeks later I had a hard disk from OSU in my hands containing in total many years of hydrophone recordings from two sites near Iceland and four locations on the Scotian Shelf. I dusted off the model that I had used for the Kaggle Whale Detection Challenge and quite confidently started running detections on the recordings. Which is when I was in for a surprise: the predictions my shiny 97% model made were all really lousy! Very many obvious non-whale noises were detected wrongly. How was it possible?

To solve this puzzle, I had to understand that the Kaggle Whale Detection Challenge's train and test datasets had a strong selection bias in them. The tens of thousands of examples that I had used to train my model for the challenge were unrepresentative of all whale, and particularly, a lot of similar-sounding non-whale sounds out there. That's because the Kaggle challenge's examples were collected by use of a two-stage pipeline, where an automated detector would first pick out likely candidates from the recording, only after which a human analyst would label them with true or false. I realized that what we were building in the Kaggle challenge was a classifier that worked well only if it had a certain detector running in front of it that would take care of the initial pass of detection. My neural net had thus never seen during training anything like the sounds that it mistook for whale calls now.

If I wanted my convnet to be usable by itself, on continuous audio recordings, and independently of this other detector, I would have to train it with a more balanced training set. And so I ditched most of the training examples I had, and started out with only a few hundred, and trained a new model with them. As was expected, training with only few examples left me with a pretty weak model that would overfit and make lots of obvious mistakes. But this allowed me to pick up the worst mistakes, label them correctly, and feed them back into the system as training examples. And then repeat that. (A process that Olivier Grisel later told me amounts to active learning.)

Many (quite enjoyable) hours of listening to underwater sounds later, I had collected some 2000 training examples this way, some of which were already pretty tricky to verify. And luckily, the newly trained model started to make pretty good predictions. When I sent my results back to Holger, he said that, yes, the patterns I'd found were very similar to those that his group had found for the Scotian Shelf sites!

http://danielnouri.org/media/deep-learning-whales-my-scotian-shelf-detections.png

Figure 4: Number of right whale up-call detections per hour at two sites on the Scotian Shelf, detected by the convnet. The numbers and seasonal pattern match with what Mellinger et al. reported in [5].

The OSU team had used a three-stage detection process to produce their numbers. Humans verified in phases two (broadly) and three (in more detail) the detections that the algorithm came up with in phase one. Whereas my detection results came straight out of the algorithm.

A case-by-case comparison still needs to happen, but the similarities of the overall call patterns suggests that the convnet reaches comparable performance, but without the need for human analysts to be part of the detection pipeline, making it potentially much more time-efficient to use in practice.

What's even more exciting is that the neural net was able to find right whale up-calls at the problematic SE site near Iceland, where previously no up-calls could be detected due to high noise levels.

http://danielnouri.org/media/deep-learning-whales-my-iceland-detections.png

Figure 5: NRW up-call detections per day at sites SW and SE near Iceland, detected by the convnet. The patterns at the SW site match roughly with what was reported in [4], while no calls could be identified previously at the SE site (cf. Figure 3).

Another thing we're currently looking into is wheter or not the relatively small but constant number of calls that the convnet detected during Winter season are real, or if they're false positives. Right whales are not known to hang around so high up North during that time of the year, so proving that would constitute significant news for people studying the migration routes of these whales.

(Comments also on Hacker News.)

[1]For a more detailed history and recent developments around neural nets, see this article in Nature: "Computer science: The learning machines".
[2]See the mention of my results in this Wired article: "Wanted: Right Whale Caller ID".
[3]For a comparison of machine learning algorithms in use, see: Mellinger DK, et al. 2007. An overview of fixed passive acoustic observation methods for cetaceans. Oceanography 20:36–45.
[4](1, 2) Mellinger DK, et al. 2011. Confirmation of right whales near a historic whaling ground east of Southern Greenland. Biol Lett 7:411−413
[5]Mellinger DK, et al. 2007. Seasonal occurrence of North Atlantic right whale (Eubalaena glacialis) vocalizations at two sites on the Scotian Shelf. Mar. Mamm. Sci. 23, 856–867.

Read and Post Comments

Kotti Zidanca sprint report

July 31, 2013 | categories: Python, Web, Pyramid, Programming, Kotti | View Comments

Big up to Termitnjak for organizing the amazing Zidanca sprint held last week. The venue for our get-together in south-eastern Slovenia was one of the most beautiful place I've ever sprinted at. Despite that and the gallons of fine wine that were poured, we still managed to get quite a few things done. Here is a summary.

http://www.coactivate.org/projects/zidanca-sprint-2013/project-home/lokve2.JPG

kotti_tinymce: new version

Vanč <ferewuz> worked on upgrading Kotti's WISYWIG editor TinyMCE to version 4.0.2. This new version features a much nicer looking user interface, which fits a lot better into the overall Kotti style. It also works better with smaller displays, as the editor will now scale down along with the browser window's width. Our existing image-upload and document-linking pop-ups were updated to work with the new version. You can test drive on our demo site.

deform_bootstrap: tabbed forms

Natan <nightmarebadger> worked on adding tabbed forms support to deform_bootstrap. With this change, each nested (Mapping)Schema in your Colander schema will be rendered inside its own tab. As an example, consider this Client schema:

import colander

class Person(colander.Schema):
    name = colander.SchemaNode(
        colander.String(),
        title='Name',
        )

class Car(colander.Schema):
    horsepower = colander.SchemaNode(
        colander.Integer(),
        title='Horsepower',
        )

class Client(colander.Schema):
    person = Person(title='Person data')
    car = Car(title='Car stuffs')

A deform_bootstrap form using the Client schema will render with two tabs 'Person data' and 'Car stuffs'.

kotti_multilingual: translated content

kotti_multilingual was started by Andreas <disko> earlier this year. It includes basic support for language roots and switching between languages. The sprinters decided to use this package as the starting point for adding more advanced content translation support.

Before getting down to coding, we first had a round of discussion with Domen <iElectric>, Ramon <bloodbare>, Jure <ibi> and me. In particular, we discussed how plone.multilingual handles translations and 'language independent fields' and compared it to the venerable LinguaPlone. We decided that we'd use the LinguaPlone model, where translations derive from a so-called canonical document. This is the only document that holds the actual data for any of the language independent fields, and where all translated documents refer to when they look up those fields.

This is still a work in progress, but so far we've managed to h̶a̶c̶k̶ convince SQLAlchemy to return the value of a language independent attribute from the canonical document, if it exists.

We've also implemented linking between translations using a separate translations table. This, too, works quite similar to how LinguaPlone does it when you configure it to use one root folder per language (e.g. /en, /de, and so on). For locating the destination folder for new translations, we look up the parent document's translation, and put the new translation in there. Thus, a translation from /en/food/mexican will automagically be created at /de/food/mexican, given that /de/food is already a translation of /en/food.

Versioning support

Andreas pointed us to the SQLAlchmey Versioned Objects example. He had used it in a Pyramid project and it worked very well for him. So Vanč took this up on the last day of the sprint and implemented a prototype for history support for Kotti.

Vanč reports that his prototype saves the old version of a document anytime you edit it, and keeps all the old versions. Next on his list is adding a user interface for viewing older versions (there's merely a list right now), and adding options for deciding which content types to version and how many old versions to keep. Other things to work on are a separate button for when you want to save a new version, so that small changes can be made without creating a new version. And possibly a comment field to describe what the changes in your new version are about.

Finishing the tutorial

Natan took on the important task of finishing the developer tutorial. Previously, the tutorial introduced concepts such as setting up a new project, configuring Kotti, adding Poll and Choice content types, and simple views. But it did not have forms to actually vote on the poll. This is now being added.

During the sprint, we also discussed which are the most important topics that the docs don't cover well enough yet. We decided we should add more documentation for users and security, workflows, and a more advanced forms example.

Read and Post Comments

libblas and liblapack issues and speed, with SciPy and Ubuntu

December 19, 2012 | categories: Python, Programming | View Comments

The problem

You've built from source a package that uses the linear algebra libraries BLAS and LAPACK, but now you're getting:

libatlas.so.3: cannot open shared object file: No such file or directory

In my case, I was trying to install SciPy from source using pip install scipy, but then ran into this issue when I tried to use it. I also ran into this other problem:

ImportError: scipy/linalg/clapack.so: undefined symbol: clapack_sgesv

To fix it

  1. Make sure you've got libatlas3-base installed:

    sudo apt-get install libatlas3-base
    
  2. Make sure you're using the right implementation of libblas.so.3, that is, the one that libatlas3-base provides. This implementation lives in /usr/lib/atlas-base/atlas/libblas.so.3.

    To do this, run the following command to set your system's default Atlas implementation:

    sudo update-alternatives --config libblas.so.3
    

    An example of what this might produce, and what number you might need to enter:

    Selection      Path                                     Priority   Status
    ------------------------------------------------------------
    * 0            /usr/lib/openblas-base/libopenblas.so.0   40        auto mode
      1            /usr/lib/atlas-base/atlas/libblas.so.3    35        manual mode
      2            /usr/lib/libblas/libblas.so.3             10        manual mode
      3            /usr/lib/openblas-base/libopenblas.so.0   40        manual mode
    
     Press enter to keep the current choice[*], or type selection number: 1
    
  3. Use liblapack3.so.3 from /usr/lib/atlas-base/atlas/liblapack.so.3 as your default:

    sudo update-alternatives --config liblapack.so.3
    

That's it. Now you shouldn't be getting the aforementioned errors anymore.

More information

The Debian Wiki has an overview of linear algebra libraries available, and it also describes how to use update-alternatives to switch between BLAS and LAPACK implementations.

Speed it up! Build an optimized Atlas for your architecture.

The README.Debian file of libatlas3-base explains:

Building your own optimized packages of Atlas is straightforward. Just get the sources of the package and its build-dependencies:

# apt-get source atlas
# apt-get build-dep atlas
# apt-get install devscripts dpkg-dev dch

and type the following from the atlas source subdir:

# fakeroot debian/rules custom

it should produce a package called:

../libatlas3-base_*.deb

which is optimized for the architecture Atlas has been built on. Then install the package using dpkg -i.

If you, like me, get this "classical" error when building Atlas:

VARIATION EXCEEDS TOLERENCE, RERUN WITH HIGHER REPS

then you'll need to read this: Your install dies with "unable to get timings in tolerance".

Lastly, you might need to remove any existing libopenblas installation from your system for the building of the Debian package to finish successfully:

sudo aptitude purge libopenblas-{base,dev}

Read and Post Comments

Use apt-get to install Python dependencies for Travis CI

November 23, 2012 | categories: Python, Programming | View Comments

Travis CI is used by more and more open source Python projects to do their continous testing. scikit-learn is the latest project to adopt it.

With Travis, usually you'll use pip or setuptools to get your project's Python dependencies installed before running the test script. Here's the minimal .travis.yml example from the Travis docs for Python, one that will install dependencies using pip before it runs nosetests:

language: python
python:
  - "2.6"
  - "2.7"
  - "3.2"
# command to install dependencies
install: "pip install -r requirements.txt --use-mirrors"
# command to run tests
script: nosetests

There is however some dependencies where this is problematic. Two of those are numpy and scipy, which contain a lot of C code that, with the method just discussed, needs to be compiled every time you run the Travis tests.

Travis allows you to install system packages through apt-get, which is quite cool. And there's already the binary python-numpy and python-scipy packages in Ubuntu, so why not use them?

The problem is that simply installing them via apt-get does not work for the same reason it doesn't work when you do this locally: The default virtualenv that Travis sets up for you to run the tests in is isolated from the system packages, so it won't see those globally installed numpy and scipy packages.

The solution for this is use virtualenv with the --system-site-packages option, which allows you to also import packages from the global site packages directory.

How it works

Add these lines to your Travis configuration to use a virtualenv with --system-site-packages:

virtualenv:
  system_site_packages: true

You can thus install Python packages via apt-get in the before_install section, and use them in your virtualenv:

before_install:
 - sudo apt-get install -qq python-numpy python-scipy

A real-world use of this approach can be found in nolearn.

Read and Post Comments

Next Page »