
MSA Keynote – Where the Bugs Are

Today the Mining Software Archives Symposium started off with a talk by Prem Devanbu. I hope to provide the slides for this talk later on, but so far no one has uploaded their slides to SlideShare.

Prem had some pretty bad news for us today: most likely, all the bug data we use for predictions is biased. The consequences are like those in the 1948 U.S. presidential election, when Truman ran against Dewey. Although Truman won, every available survey had predicted that Dewey would win with a huge lead. The predictions were wrong, but why? Because of something called sample bias. The surveys were conducted over the phone, but back in the day mostly rich people owned a phone, and as tradition in the U.S. has it, most rich people vote Republican. See the problem? If you mostly ask supporters of one candidate, of course the predictions will tell you that this candidate will win.
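To make the effect concrete, here is a small sketch of how a phone-only survey can flip a prediction. All numbers (group sizes, support rates, phone ownership rates) are made up for illustration and are not taken from the talk or from historical data; `support_for_a` is just a hypothetical helper name.

```python
# Hypothetical electorate. Every number here is invented for illustration.
# Each group: (number of voters, fraction backing candidate A, fraction owning a phone)
groups = {
    "wealthy": (200, 0.70, 0.90),
    "other": (800, 0.40, 0.20),
}


def support_for_a(groups, phone_only):
    """Return the share of respondents backing candidate A.

    With phone_only=True, each group only contributes the members
    that a phone survey could actually reach.
    """
    reached = 0.0
    backing_a = 0.0
    for size, frac_a, frac_phone in groups.values():
        n = size * frac_phone if phone_only else size
        reached += n
        backing_a += n * frac_a
    return backing_a / reached


true_share = support_for_a(groups, phone_only=False)
phone_share = support_for_a(groups, phone_only=True)

print(f"whole electorate: {true_share:.1%}")
print(f"phone survey:     {phone_share:.1%}")
```

Under these assumed numbers, candidate A has less than half of the real vote, yet the phone survey shows A comfortably ahead, simply because the people it reaches are not representative of the electorate.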

Sadly, Prem reported on a study, which he conducted together with Christian Bird, Abraham Bernstein and others, that found evidence of such a sample bias in linked bug-fixing changes as well. That means the automatically identifiable bug-fixing changes are a non-representative sample of all bug fixes. On top of that, they found that our best prediction models don't overcome this sample bias.
