How accurate is Google Maps data?

There has been a recent upsurge in complaints about the accuracy of data that Google uses in Maps. There were recent (false) reports of hijacking, of very old & outdated listings not being removed and of complete bungling of a medical facility’s listings. The increase in complaints is due in large part to the increased exposure of the data in the Local OneBox and the resulting increase of awareness on the part of business owners.

Bill Slawski and I have written about the issue of data accuracy as has Greg Sterling. It was (is) my contention that the data will improve in accuracy over time due to the self-interest of the many parties involved. As I noted several months ago, the last step in that process would be getting small businesses directly involved in correcting their own record. That is starting to happen with the increased visibility of the Local OneBox.

There are other accuracy issues that are not addressed by my original post. For example: the problems with Google’s heavy reliance on an algorithmic approach to information, the quality of the data that Google uses to create, verify and ultimately delete records, and the lack of easy end user corrections of obviously erroneous data.

That all being said, I wanted to test a data set against on-the-ground information to see if it was “accurate enough”. To do so I chose the data generated by the query: “Restaurants Olean, NY”. Why? Three reasons: 1) I know most of them by sight, 2) I had a local Chamber of Commerce list of current restaurants and 3) it presented a small enough set that I could manage the information.

Here is what I found:

*Google identified 71 restaurants with the query, the Chamber list identified 50.

*6 of Google’s 71 were in fact closed. Some had been closed for as many as 3 (maybe 4) years.

*4 of Google’s 71 were either duplicates or not really restaurants

*11 of Google’s 71 were pubs and bars in Olean. In this area, they don’t really serve food unless you consider Bud one of the basic food types.

*Google missed including 3 coffee shops that the Chamber had as restaurants and to its credit found 3 restaurants that the Chamber did not include.

*Google generally ranked the restaurants reasonably by their local popularity on the Maps listing (with the exception of my favorite that they put at number 10…guess it’s time to stuff the reviews:)).

*The ranking and choices for the Local OneBox were very good. The number 1 and number 2 choices are two of the area’s most popular and busiest restaurants. The choice for number 3, Pizza Hut is arguable but a reasonable choice.

*In the top 10 Maps listings there was only one closed restaurant.

The bulk of Google’s data accuracy problems occur below the top 10 Maps listings. The types of errors range from mildly annoying (dupes & poor categorization of bars as eating establishments) to more serious (the closings). The issue of the closings not being removed in a timely fashion is far more problematic from a user’s perspective than a duplicate. And it is an area that definitely needs work. End user input would dramatically improve timeliness on this and there would be a strong incentive (for competitors) to report the problem. This probably can’t happen soon enough.
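To put those counts in proportion, here is a minimal sketch that turns the figures listed above into percentages of Google's 71 listings. It assumes the three problem categories do not overlap (for example, that a closed bar was counted only once), which the tally above does not state explicitly. The names `issues`, `share` and `rates` are purely illustrative.

```python
# Counts taken from the findings listed above.
total_listings = 71
issues = {
    "closed": 6,
    "duplicate_or_not_restaurant": 4,
    "bar_or_pub": 11,
}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Percentage of all listings affected by each kind of problem.
rates = {name: share(count, total_listings) for name, count in issues.items()}

# Assumption: the categories are disjoint, so the counts can simply be summed.
problem_total = sum(issues.values())
problem_rate = share(problem_total, total_listings)

for name, rate in rates.items():
    print(f"{name}: {rate}%")
print(f"any problem: {problem_total} of {total_listings} ({problem_rate}%)")
```

Under that assumption, a bit under a third of the listings have some kind of problem, while the closings, the most serious kind, are under a tenth.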

For the most part, the data works well enough. Why then is this data “acceptable” but we see other data sets that are not?

One theory I have is that those industries that more aggressively promote themselves through multiple channels are more likely to be kept up to date with Google’s system. The restaurants & hotels are examples. Industries like the health care industry and research hospitals promote themselves through fewer channels and are less likely to have accurate data.

We need to see how deep and wide the inaccuracies are within the Maps product, particularly in the Local OneBox. While in my example the quality of the data seems “acceptable”, if by no means perfect, as I point out elsewhere that is of no consolation to the poor schmuck who suffers the imperfection.

Other little tidbits of interest:

*The search “Restaurant Olean NY” brings up an authoritative onebox which strikes me as overly aggressive.

*Only one restaurant (Domino’s Pizza) had a coupon. This is probably from ValPak.

*It appeared that many but not all of the national chains had a feed as they had their own category like: Category: Restaurants Subway. Pizza Hut was one that appeared to not have a feed…go figure.

*Reviews are not that common in the rural area in which we live. Of the top 20 listings, 17 had reviews, but none had more than 6 and most had 1 or 2.

If anyone would like to review my source data, feel free to email me. I thought it too copious and boring to post.

Please consider leaving a comment as your input will help me (& everyone else) better understand and learn about local.

19 thoughts on “How accurate is Google Maps data?”

  1. Glad you posted that, Mike. I’m going to do some research on the volume of bad, outdated data.

    It’s definitely easier to test in a smaller market than a larger one.

    The problems with mistakes can cover a lot of territory. The Duke Hospital problem was flooding important phone numbers with misdirected calls. Flooding them. Not good.

    In my business case….after 3 years of never receiving a call of this type, in the last week we received 2 calls from potential customers remarking that they had called up to 3 phone numbers without getting answers. Clearly the result of bad data. Not good for users.

  2. Yes, I was able to find a testable data set and verify against reality. Although, in larger markets you just would need to pick smaller industries and see if the error rate is consistent across them or varies as I think it might.

    Bad data is a royal pain and the new highlighting of local data will certainly uncover the bad stuff fairly quickly.

    For me the measure of success for Google and anyone in this game is whether it will improve over time and whether, over that timespan, it will get good enough for most people, most of the time.

    Let me know how your experiments go.

  3. Hello Mike,
    I live in a tiny town with only 4 restaurants. I’ve noticed that when I do a local search for restaurants, my town, my state, local search does first list the 4 restaurants, but after this, they are including random restaurants from the next big town over.

    Would you consider this bad data, in that it might confuse one into thinking the restaurants are located in my town (and not 10 miles away in the next town over)? Why would Google feel the need to supplement the small but correct dataset with outside data? Any theories?

    I really enjoyed this latest article of yours.
    Kind Regards,
    Miriam

  4. It seems to me that just because Google is good with algorithms, it always looks for an algorithmic solution for everything. Perhaps there are other ways of cutting through these data problems.

    For example, I’ve proposed that every web page should have an associated
    LURI or Location Uniform Resource Identifier. If such an approach was adopted, then there is no ambiguity: you know exactly where the place is. It’s simple and there’s every incentive for the website owner to get it right.

  5. Hi Miriam

    Great question. Do I have a theory? That is like asking if I have a nose…I always have a theory…accuracy though is not guaranteed :).

    Bill, Matt and I have had an extensive email exchange on this question and I was going to post on it this coming week…

    Bill Slawski calls it “Location Sensitivity” and has described a patent that deals with this. In Google’s local algorithm if they do not locate the service you are looking for in the city in which you are searching, they will search going out in a radius from the central point of your query…if the answer is within X distance (travel time?) of your query they will give these listings in distance order (more or less).

    If the service does not exist within a certain distance, Google will not display the Onebox at all…so they seem to have a definition of how far out the market exists… Again Bill has mentioned patents that describe that this distance might vary depending on whether the market is urban or rural. I have not explored that specific question yet (and I may be misquoting Bill).

    In rural areas there are many services “in the next town” over and in urban areas many businesses, particularly in industries with low density, are in the burbs…both might generate “next town” over responses.

    This query illustrates the idea of both proximity for relevance and distance for ranking as there are no motorcycle dealers in the city of Salamanca, NY:

    Salamanca NY Motorcycle dealer

    There are many business owners that are located in the next town over someplace in the world and would ask: “I am the leading purveyor of this widget in the next town over, why am I not listed first” so the issue washes both ways on both the accuracy and the quality front. And when you are looking for a plumber do you really care as long as he/she gets there when your pipes are leaking (business location matters little in this market)?

    I would ask a question of you…do you consider these other restaurants close enough to visit on occasion? Are they in your market area? Did Google’s “location sensitivity” algo do a passable job?

    So to (finally) answer your question, I guess I would not consider it “bad” if the listings were in fact accurate and if Google’s definition of your market is reasonable…

    Whew…and you wanted to know if I had a theory…that will teach you :)

    Mike
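The radius behaviour described in the reply above (search outward from the centre of the query, keep what falls within some distance, list it nearest first, and show nothing if nothing qualifies) can be sketched roughly as follows. This is only an illustration and not Google's actual algorithm. It assumes straight-line distance stands in for whatever measure is really used (possibly travel time), and the function names, the `max_km` threshold and the sample listings are all hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def local_results(center, listings, max_km):
    """Return (name, distance) pairs within max_km of center, nearest first.

    An empty list corresponds to the "no OneBox at all" case.
    """
    lat0, lon0 = center
    scored = [(haversine_km(lat0, lon0, lat, lon), name) for name, lat, lon in listings]
    return [(name, round(dist, 1)) for dist, name in sorted(scored) if dist <= max_km]


# Hypothetical listings: (name, latitude, longitude). Not real businesses.
listings = [("Dealer A", 0.0, 0.10), ("Dealer B", 0.0, 0.05), ("Dealer C", 0.0, 0.50)]
nearby = local_results((0.0, 0.0), listings, max_km=20)
print(nearby)
```

A market-dependent threshold (larger in rural areas, smaller in urban ones) would amount to choosing `max_km` per query, which is left out of this sketch.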

  6. Hi Barry-

    Yes, Google’s reliance on algorithms works very well when they are indexing web pages on a topic and are expected to return relevant results, as opposed to the expectation of returning real world, accurate results in the local arena. Big difference there.

    One of the things they are attempting to do is find every business (an ambitious goal)…even those without websites.

    Your LURI would only solve the problem for those businesses with websites, no?

    Mike

  7. Mike, the LURI is a web page, admittedly of a very simple form. For example I have added a LURI to my website to show what a LURI might look like. If this idea took on, I’m sure there would be lots who would offer free hosting to such LURI web pages for those companies that didn’t want a full website. The other advantage of a LURI is that it displays well on a cell phone.

  8. I suppose that is what Google is attempting to do with their Local Business Center pin system…is verify every business in the world..

    Unfortunately, it doesn’t delete them when the business goes under. The closing of the business is only caught when they buy a list from one of the Yellow Page suppliers and the company is no longer there…might that problem not persist even with your suggestion?

    Mike

  9. Barry

    I need to go back and see if the addresses were accurate in the Google list of restaurants.

    I didn’t look closely at that issue although my sense was that they mostly had that aspect correct.

    Mike

  10. Hi Mike
    Interesting questions re businesses that suddenly stop operating. If the LURI is on their own website, then it would expire when their hosting contract expires, or earlier if they close the website. Hopefully that wouldn’t be too long after their demise.

    If the company doesn’t have a website but uses this LURI-hosting service we are imagining, then it would be easy to build in a quarterly verification service. Each quarter an e-mail message is sent to the owner requiring a reply for continuing hosting. If there were no reply within 15 days, then a reminder would be sent. If there were no reply within a further 15 days, then the LURI would be removed from the hosting service. Sounds like a feasible way to do this. :)
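The quarterly check described in the comment above can be sketched as a small decision function. This is only one possible reading of the proposal: the function name, the action labels and the choice to trigger exactly on day 15 and day 30 are assumptions, and no such LURI hosting service is known to exist.

```python
from datetime import date, timedelta

REMINDER_AFTER = timedelta(days=15)
REMOVAL_AFTER = timedelta(days=30)  # the first 15 days plus "a further 15 days"


def verification_action(sent_on, today, replied):
    """Decide what a hypothetical LURI host does for one quarterly check.

    sent_on: date the verification e-mail went out
    today:   date on which the check is evaluated
    replied: True if the owner has answered the e-mail
    """
    if replied:
        return "keep"
    elapsed = today - sent_on
    if elapsed >= REMOVAL_AFTER:
        return "remove"
    if elapsed >= REMINDER_AFTER:
        return "remind"
    return "wait"


# Example dates chosen only for illustration.
sent = date(2007, 1, 1)
print(verification_action(sent, date(2007, 1, 10), replied=False))
print(verification_action(sent, date(2007, 1, 16), replied=False))
print(verification_action(sent, date(2007, 1, 31), replied=False))
```

The function is stateless, so a real service would also need to record that the reminder had already gone out in order to avoid sending it again on every day between day 15 and day 30.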

  11. Well, Mike, you sure DO have a theory.:)

    I have read Bill’s work on this as well (location sensitivity), and yes, I can see how this might apply to an area where the user might be willing to drive a bit farther to get to a restaurant out of town. So, my answer is, yes, the data is relevant, when you look at it this way.

    And, I see just what you are seeing in the case of the motorcycle dealership. There are not any chiropractors in my town…so Google local is, again, showing me next town over results.

    How could this be problematic? Well, perhaps if you were looking to move house, and wanted to find a community that had _____________ in it. ___________ might be a Catholic Church, a physical therapist, a Whole Foods market, a laundromat, a cancer treatment center. Though Google is providing town names in the local search results, the fact that they are bringing up exterior towns in the search could wrongly lead someone to believe those services would be located in the town they are searching for, if they are not looking very carefully at the results set.

    Because of something like this, maybe it would be better if, when you did a search for ‘Chiropractor, My Town’, Google had a response like “There are no Chiropractors listed at present in this town. Try the nearest city?”

    Usually, when you live in an area, you know what is where. But when you are investigating a new place, or on the road, the accuracy of the data you can get from searches is really crucial. Google maps has sent us on wild goose chases in the past because of inaccurate driving directions. Not fun.

    Thanks for your great response, Mike!
    Miriam

  12. Miriam-

    Correct me if I am wrong, it seems that you are describing/asking two things in your response:

    1) How do people actually use Google Maps (your example of relocation research) and thus, what do they expect back from it?

    and
    2)What is the quality of the mapping data and the driving directions and how much can we really rely on it?

    Both are interesting points.

    Your question of how people actually use Google’s local data is an important one… it was always a question that I had about the Yellow Pages.

    Like you, I hardly ever discover new information from local queries. I know where my favorite restaurants are and have the # already in my cell. I only use Google Maps when searching for local stuff, as a phone directory. I live in an area where I would need 7 phone books to call my leads etc. and the Google OneBox solves that problem elegantly. I personally use it much more than I ever did the Yellow Pages. I use 877-520-find and Google SMS in a similar way.

    However I do, like you, use Google local data for researching local businesses for trips that myself or my wife will be taking…I look for the nearest florist to buy my honey flowers when she is on a business trip, I check out where the nearest Kinko’s is so she can make an emergency copy etc. She, even though reasonably tech savvy, hasn’t really done that herself yet.

    Which brings me to my questions (ah..finally you say) that you alluded to:
    -How do real people use the information?
    -How often do they use it?
    -Is that frequency enough to someday generate as many pageviews as Google organic?
    -Is knowledge and familiarity the barrier to use? or
    -Does data quality discourage users?
    -Will the new technology lead to new ways of exploring local information?

    I have seen little solid information on these questions. I would love to know the answers but haven’t really figured out a way to gather the data in a meaningful way.

    If $ investment by the big boys is any indicator (I don’t believe that it is), people are going to be using this information from when they wake up in the morning to when they go to sleep at night.

    The reality is sure to be different than that. But how deep and wide, broad based usage of Local Search will be, depends on technology (like the iPhone) that is not yet even shipping. When will my brother (who is in his late 50’s) use it? When will my wife (in her late 40’s) use it that way? When will my children in their teens use it? I just don’t know. I also don’t know the many potential future uses, although your idea is one of many.

    On the issue of driving directions, the algorithms Google uses, the quality of the underlying Map data, I am less well qualified to answer.

    I sense that Google is less than overjoyed about the issues that their map data providers cause, like the difficulty of correction. I also sense that they are working on improving this data and their purchase of Endoxon might get them into the mapping data business directly.

    There was a funny story in the WSJ the other day about the widespread adoption of GPS in automobiles in Germany and how it was leading to bonehead moves…like one guy turning right when the GPS said to and driving onto the curb instead of turning right on the next street… At some level human behavior will have to account for the quirks in the new technology.

    So while the answers that Google gives may be problematic, I am not sure that they are “bad data”, just not totally useful. End users will have to develop new skills to fully take advantage of the new technology. And Google will continue to work on the interface and results..I guess that has always been the case. :)

    Thanks for your questions! Sorry for the ramble of opinion…your questions help me clarify my thinking. Thanks!

    Mike

  13. Barry-

    You seem to be moving in the right direction…I would ask two more questions about the idea:

    1)What happens with the email follow up when the business owner changes email addresses and forgets to update the LURI (like what happens now with domain names)?

    2)What is to prevent duplicate and/or hijacked information i.e. a nefarious competitor creating a LURI with his phone number for your business?

    Mike

  14. Mike, you’re pushing me beyond what is covered by the original idea of a LURI. The LURI is a very minimal web page that identifies the geographical location for a company or agency and gives a minimal set of communication coordinates, e.g. possibly a telephone number and an e-mail address. How the practical application of LURIs develops will depend on who is involved and how they wish to push the idea. For example you could add metadata that might give a very short description of what the company or agency does.

    Nevertheless your questions are obviously of interest. On the first, as you say the difficulty in communicating with the owner if the e-mail address changes is also present with full websites. If a quarterly e-mail verification goes out and the company doesn’t sink without trace overnight, hopefully the company owner or even the bankruptcy trustee will be aware that this useful geographical locator is about to disappear. If it is important that the world knows where they are, then presumably someone will update the LURI.

    On your second question, I see less of a problem. If someone else tries to ‘hijack’ a given company location with a false telephone number, they are likely to be fielding a number of calls from irate callers looking for some other company entirely. In the case where someone else is attempting identity theft, this could be more of a problem. Presumably if it is felt to be a real concern, some security process could be put in place to minimize the problems.

  15. I like long replies, Mike. Yours are very thought provoking for me.

    My father works for a phonebook company and they set up phone tracking for their clients to see how many people the phonebook ads are bringing in. It is this type of data, I think, that will begin to make us be able to understand the true scope of local search, but it’s going to take some time.

    At this point, no one I speak to (outside of the industry) has ever heard of local search. That is the funny thing about working in a web-related field. What we know today won’t be in common use until ‘tomorrow’. Your example of your wife is pretty perfect…she is married to a fellow who spends all day thinking about this stuff (you), so you’d think that exposure would make her into an avid user, but it doesn’t. My mother finally knows what SEO is. But, like everything else, it has taken time.

    A lot of this depends on Google educating the public about its services.

    Thanks, again, Mike. I’ll stay tuned in to keep reading about your findings!
    Miriam

  16. Google maps api do not geocode and are very inaccurate.
    I have used them before and the coverage is not good and is weak for Ireland.

    So Google map data is not very good and I would not recommend it for big business as privacy is a huge issue ( another hack yesterday) left the mapping api disabled

  17. Matthew

    I was traveling yesterday and missed the API going down. Do you have more details or a link?

  18. Let’s remember, GOOGLE is a BASIC mapping service with no advanced Geocoding, hence for Europe they only can guarantee 4 digit postcode accuracy and the US Zip code accuracy is much worse. Bad co-ordinates.

    Their API does lack customisation and the maps lack coverage. The directions do not cover toll costs etc and are generally very poor in comparison charts compared to someone like Viamichelin who usually come out on top.

    Google have no SLA and cannot guarantee the service and any data going through their server will be stored. (Privacy issues)

    The solution is no good for Asset tracking and can include adverts, they announced today they are putting up a wall of 5 views a day for news ads, so it’s coming!

    If it was not so convenient it would not be so popular. I want a little better quality, better accuracy and nicer looking maps and directions which work and so I use Viamichelin business as they have various platforms to play with including mobile store finders on the iPhone

  19. Google maps are terribly inaccurate.

    Google DO NOT provide :
    Advanced Geocoding
    Option for Large Static Maps

    Service Level Agreement
    Technical Support
    Support portal & usage reporting

    Google maps for business is a basic mapping service for API and is not always free. The customisation is poor, and the coverage is not great, seldom recognising small towns and only to postcode accuracy as NO geocoding !

    If something is good enough, they would not give it away !
