There has been a recent upsurge in complaints about the accuracy of data that Google uses in Maps. There were recent (false) reports of hijacking, of very old & outdated listings not being removed and of complete bungling of a medical facility’s listings. The increase in complaints is due in large part to the increased exposure of the data in the Local OneBox and the resulting increase of awareness on the part of business owners.
Bill Slawski and I have written about the issue of data accuracy, as has Greg Sterling. It was (is) my contention that the data will improve in accuracy over time due to the self-interest of the many parties involved. As I noted several months ago, the last step in that process would be getting small businesses directly involved in correcting their own records. That is starting to happen with the increased visibility of the Local OneBox.
There are other accuracy issues that are not addressed by my original post. For example: the problems with Google’s heavy reliance on an algorithmic approach to information, the quality of the data that Google uses to create, verify and ultimately delete records, and the lack of easy end user corrections of obviously erroneous data.
All that being said, I wanted to test a data set against on-the-ground information to see if it was “accurate enough”. To do so I chose the data generated by the query: “Restaurants Olean, NY”. Why? Three reasons: 1) I know most of them by sight, 2) I had a local Chamber of Commerce list of current restaurants and 3) it presented a small enough set that I could manage the information.
Here is what I found:
*Google identified 71 restaurants with the query, the Chamber list identified 50.
*6 of Google’s 71 were in fact closed. Some had been closed for as many as 3 (maybe 4) years.
*4 of Google’s 71 were either duplicates or not really restaurants
*11 of Google’s 71 were pubs and bars. In the Olean area, they don’t really serve food unless you consider Bud one of the basic food types.
*Google missed including 3 coffee shops that the Chamber had as restaurants and to its credit found 3 restaurants that the Chamber did not include.
*Google generally ranked the restaurants reasonably by their local popularity on the Maps listing (with the exception of my favorite that they put at number 10…guess it’s time to stuff the reviews:)).
*The ranking and choices for the Local OneBox were very good. The number 1 and number 2 choices are two of the area’s most popular and busiest restaurants. The choice for number 3, Pizza Hut, is arguable but reasonable.
*In the top 10 Maps listings there was only one closed restaurant.
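As a rough sanity check on the counts above, here is a minimal sketch in Python that turns them into percentages and reconciles Google's list with the Chamber's. It assumes the closed, duplicate/non-restaurant and bar categories do not overlap, which my tallies above do not state explicitly; the variable names are only for illustration.

```python
# Counts taken from the findings listed above.
google_total = 71          # listings returned by Google
chamber_total = 50         # restaurants on the Chamber of Commerce list
closed = 6                 # Google listings for closed restaurants
dupes_or_not_restaurants = 4
bars = 11                  # pubs and bars that do not really serve food
missed_by_google = 3       # coffee shops on the Chamber list only
missed_by_chamber = 3      # restaurants Google found that the Chamber lacked


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


# ASSUMPTION: the three problem categories are mutually exclusive.
problem_listings = closed + dupes_or_not_restaurants + bars
valid_google = google_total - problem_listings

# Restaurants that appear on both lists, computed from each side.
overlap_from_google = valid_google - missed_by_chamber
overlap_from_chamber = chamber_total - missed_by_google

print("closed:", share(closed, google_total), "%")
print("dupes / not restaurants:", share(dupes_or_not_restaurants, google_total), "%")
print("bars:", share(bars, google_total), "%")
print("all problem listings:", share(problem_listings, google_total), "%")
print("valid Google listings:", valid_google)
print("on both lists:", overlap_from_google, overlap_from_chamber)
```

Under that assumption, roughly 3 in 10 of Google's listings have some kind of problem, and the two ways of counting the restaurants common to both lists agree with each other.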
The bulk of Google’s data accuracy problems occur below the top 10 Maps listings. The types of errors range from mildly annoying (dupes & poor categorization of bars as eating establishments) to more serious (the closings). Closings not being removed in a timely fashion are far more problematic from a user’s perspective than a duplicate, and it is an area that definitely needs work. End user input would dramatically improve timeliness on this, and there would be a strong incentive (for competitors) to report the problem. This probably can’t happen soon enough.
For the most part, the data works well enough. Why then is this data “acceptable” but we see other data sets that are not?
One theory I have is that those industries that more aggressively promote themselves through multiple channels are more likely to be kept up to date in Google’s system. Restaurants & hotels are examples. Industries like health care and research hospitals promote themselves through fewer channels and are less likely to have accurate data.
We need to see how deep and wide the inaccuracies are within the Maps product, particularly in the Local OneBox. In my example the quality of the data seems “acceptable”, if by no means perfect. But as I point out elsewhere, that is of no consolation to the poor schmuck who suffers the imperfection.
Other little tidbits of interest:
*The search “Restaurant Olean NY” brings up an authoritative OneBox, which strikes me as overly aggressive.
*Only one restaurant (Domino’s Pizza) had a coupon. This is probably from ValPak.
*It appeared that many but not all of the national chains had a feed, as they had their own category like: Category: Restaurants Subway. Pizza Hut was one that appeared to not have a feed…go figure.
*Reviews are not that common in the rural area in which we live. Of the top 20 listings, 17 had reviews, but none had more than 6 and most had 1 or 2.
If anyone would like to review my source data, feel free to email me. I thought it too copious and boring to post.

How accurate is Google Maps data? by Mike Blumenthal