Posts Tagged ‘yahoo!’

Google versus Bing – a competitive intelligence case study

February 2, 2011

Search experts regularly emphasise that to get the best search results it is important to use more than one search engine. The main reason for this is that each search engine uses a different relevancy ranking leading to different search results pages. Using Google will give a results page with the sites that Google thinks are the most relevant for the search query, while using Bing is supposed to give a results page where the top hits are based on a different relevancy ranking. This alternative may give better results for some searches and so a comprehensive search needs to use multiple search engines.

You may have noticed that I highlighted the word supposed when mentioning Bing. This is because it appears that Bing is cheating, and is using some of Google’s results in their search lists. Plagiarising Google’s results may be Bing’s way of saying that Google is better. However it leaves a bad taste as it means that one of the main reasons for using Microsoft’s search engine can be questioned, i.e. that the results are different and that all are generated independently, using different relevancy rankings.

Bing is Microsoft’s third attempt at a market-leading, Google-bashing search engine – replacing Live.com, which in turn had replaced MSN Search. Bing has been successful and is truly a good alternative to Google. It is the default search engine on Facebook (i.e. when doing a search on Facebook, you get Bing results) and is also used to supply results to other search utilities – most notably Yahoo! From a marketing perspective, however, it appears that the adage “differentiate or die” hasn’t been fully understood by Bing. Companies that fail to fully differentiate their product offerings from competitors are likely to fail.

The story that Bing was copying Google’s results dates back to summer 2010, when Google noticed an odd similarity in the results for a highly specialised search on the two search engines. This, in itself, wouldn’t be a problem. You’d expect similar results for very targeted search terms – the main difference will be the sort order. However, in this case, the same top results were being generated when spelling mistakes were used as the search term. Google started to look more closely – and found that this wasn’t just a one-off. However, proving that Bing was stealing Google’s results needed more than just observation. To test the hypothesis, Google set up 100 dummy and nonsense queries that led to web-sites that had no relationship at all to the query. They then gave their testers laptops with a new Windows install – running Microsoft’s Internet Explorer 8 and with the Bing Toolbar installed. The install process included the “Suggested Sites” feature of Internet Explorer and the toolbar’s default options.

Within a few weeks, Bing started returning the fake results for the same Google searches. For example, a search for hiybbprqag gave the seating plan for a Los Angeles theatre, while delhipublicschool40 chdjob returned an Ohio credit union as the top result. This proved that the source for the results was not Bing’s own search algorithm but that the result had been taken from Google.
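The logic of this kind of test can be sketched in a few lines of Python. This is only an illustration of the general idea, not Google’s actual method: the function names, the random generation of nonsense terms and the example URLs are all my own assumptions, and the competitor’s top hits are assumed to have been collected by hand.

```python
import random
import string


def make_honeypot_queries(n, length=10, seed=0):
    """Generate n distinct nonsense query strings (hypothetical helper)."""
    rng = random.Random(seed)
    queries = set()
    while len(queries) < n:
        queries.add("".join(rng.choice(string.ascii_lowercase) for _ in range(length)))
    return sorted(queries)


def plant_results(queries, unrelated_urls):
    """Pair each nonsense query with a page that has nothing to do with it."""
    return {q: unrelated_urls[i % len(unrelated_urls)] for i, q in enumerate(queries)}


def find_copied(planted, competitor_top_hits):
    """Return the honeypot queries whose planted page is the competitor's top hit."""
    return [q for q, url in planted.items() if competitor_top_hits.get(q) == url]


if __name__ == "__main__":
    queries = make_honeypot_queries(100)
    # Placeholder URLs, assumed purely for illustration.
    planted = plant_results(
        queries,
        ["https://example.com/seating-plan", "https://example.org/credit-union"],
    )
    # Hypothetical observations noted down from the competitor some weeks later.
    observed = {queries[0]: planted[queries[0]], queries[1]: "https://example.net/other"}
    copied = find_copied(planted, observed)
    print(len(copied), "of", len(planted), "honeypot queries surfaced on the competitor")
```

Because nobody would type these terms by chance and the planted pages are unrelated to them, even a handful of matches is hard to explain by independent ranking.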

What was happening was that the searches and search results on Google were being passed back to Microsoft – via some feature of Internet Explorer 8, Windows or the Bing Toolbar.

As Google states in their Blog article on the discovery (which is illustrated with screenshots of the findings):

At Google we strongly believe in innovation and are proud of our search quality. We’ve invested thousands of person-years into developing our search algorithms because we want our users to get the right answer every time they search, and that’s not easy. We look forward to competing with genuinely new search algorithms out there—algorithms built on core innovation, and not on recycled search results from a competitor. So to all the users out there looking for the most authentic, relevant search results, we encourage you to come directly to Google. And to those who have asked what we want out of all this, the answer is simple: we’d like for this practice to stop.

Interestingly, Bing doesn’t even try to deny the claim – perhaps because they realise that they were caught red-handed. Instead they have tried to justify using the data on customer computers as a way of improving search experiences – even when the searching was being done via a competitor.  In fact, Harry Shum, a Bing VP, believes that this is actually good practice, stating in Bing’s response to a blog post by Danny Sullivan that exposed the practice:

“We have been very clear. We use the customer data to help improve the search experience…. We all learn from our collective customers, and we all should.”

It is well known that companies collect data on customer usage of their own web-sites – that is one purpose of cookies generated when visiting a site. It is less well known that some companies also collect data on what users do on other sites (which is why Yauba boasts about its privacy credentials). I’m sure that the majority of users of the Bing Toolbar and other Internet Explorer and Windows features that seem to pass back data to Microsoft would be less happy if they knew how much data was collected and where from. Microsoft has been collecting such data for several years, but ethically the practice is highly questionable, even though Microsoft users may have originally agreed to the company collecting data to “help improve the online experience”.

What the story also shows is how much care and pride Google take in their results – and how they have an effective competitive intelligence (and counter-intelligence) programme, actively comparing their results with competitors. Microsoft even recognised this by falsely accusing Google of spying via their sting operation that exposed Microsoft’s practices – with Shum commenting (my italics):

What we saw in today’s story was a spy-novelesque stunt to generate extreme outliers in tail query ranking. It was a creative tactic by a competitor, and we’ll take it as a back-handed compliment. But it doesn’t accurately portray how we use opt-in customer data as one of many inputs to help improve our user experience.

To me, this sounds like sour grapes. How can copying a competitor’s results improve the user experience? If it doesn’t accurately portray how customer data IS used, maybe now would be the time for Microsoft to reassure customers regarding their data privacy. And rather than viewing Google’s exposure of Bing’s practices as a back-handed compliment, I’d see it as a slap in the face with the front of the hand. However, what else could Microsoft and Bing say, other than mea culpa?

Update – Wednesday 2 February 2011:

The war of words between Google and Bing continues. Bing has now denied copying Google’s results, and moreover accused Google of click-fraud:

Google engaged in a “honeypot” attack to trick Bing. In simple terms, Google’s “experiment” was rigged to manipulate Bing search results through a type of attack also known as “click fraud.” That’s right, the same type of attack employed by spammers on the web to trick consumers and produce bogus search results.  What does all this cloak and dagger click fraud prove? Nothing anyone in the industry doesn’t already know. As we have said before and again in this post, we use click stream optionally provided by consumers in an anonymous fashion as one of 1,000 signals to try and determine whether a site might make sense to be in our index.

Bing seems to have ignored the fact that Google’s experiment resulted from their observation that certain genuine searches seemed to be copied by Bing – including misspellings, and also some mistakes in their algorithm that resulted in odd results. The accusation of click fraud is bizarre, as the searches Google used to test for copying were completely artificial. There is no way that a normal searcher would have made such searches, and so the fact that the results bore no resemblance to the actual search terms is completely different to the spam practice where a dummy site appears for certain searches.

Bing can accuse Google of cloak and dagger behaviour. However, counter-intelligence sometimes requires such behaviour to catch miscreants red-handed. It’s a practice carried out by law enforcement globally where a crime is suspected but where there is insufficient evidence to catch the culprit. As an Internet example, one technique used to catch paedophiles is for a police officer to pretend to be a vulnerable child in an Internet chat-room. Is this fraud – when the paedophile subsequently arranges to meet up – and is caught? In some senses it is. However, saying such practices are wrong gives carte blanche to criminals to continue their illegal practices. Bing appears to be putting themselves in the same camp – by saying that using “honeypot” attacks is wrong.

They also have not recognised the points I’ve stressed about the ethical use of data. There is a big difference between using anonymous data tracking user behaviour on your own search engine and tracking that of a competitor. Using your competitor’s data to improve your own product, when the intelligence was gained by technology that effectively hacks into usage made by your competitor’s customers, is espionage. The company guilty of spying is Bing – not Google. Google just used competitive intelligence to identify the problem, and a creative approach to counter-intelligence to prove it.

Delicious humbug and monitoring News stories

December 20, 2010

Effective competitive intelligence monitoring means keeping up with the news, and where news is likely to impact you, drawing up strategies to take into account changes.

The problem with instant news via Twitter, blog posts and various other news feeds is that news updates sometimes happen too quickly, before the snow has even had a chance to settle. That’s fine – just so long as the source for the news is 100% reliable, and the news story itself is also totally accurate. (I’m using snow as a metaphor here – rather than the more normal dust – as outside there is around 15cm of the stuff, with more promised during what looks like being the coldest winter in Europe for over 20 years.)

Unfortunately, more often than not, one of these two aspects fails: the source may not be reliable, or the story may not be true, or may be only half-true. Typically however people pick up on the story and it spreads like wildfire (so not giving that snow a chance to settle before it gets melted all over the web).

An example of this has been taking place this last week – with numerous posts reporting the demise of the web-bookmarking service Delicious.

Delicious (originally located at http://del.icio.us) was founded in 2003 and acquired by Yahoo! in 2005. By 2008 (according to Wikipedia) it had over 5 million users and 180 million bookmarked URLs. This makes it an important source for web-searching as, unlike with a search engine such as Google or Bing, each URL will be human-validated and valued.

Apparently, during a strategy meeting held by Yahoo! looking at its products, Delicious was named as a “sunset” product.

Slide from Yahoo! strategy presentation - on plans for various products

An image of this slide was tweeted – and after Yahoo! failed to deny that Delicious was to be closed, posts quickly appeared denouncing the company for the decision. Nobody really cared that sites like AltaVista and AlltheWeb were going – as they were to all intents and purposes dead anyway. (Their search features have long been submerged into Yahoo!’s own – although I, for one, still miss some of the advanced features these services offered. AlltheWeb allowed searching of Flash content, and AltaVista had a search option that nobody now offers: the ability to specify lower/upper case searches.)

The problem is that many of the sources posting the story are normally extremely accurate and reliable so when they post something, it is reasonable to believe what they say. This then compounds the problem as the news then gets spread even further – and when the story is corrected, the news followers often fail to spot the corrections.

The example of Delicious is not isolated. There are many news stories that develop over time – and when making strategy decisions based on news it is important to take changes into account, but also not to rush in if a news item hasn’t been fully confirmed.

Ideally check the source – and if the source is a press item (or blog post) then look to see if there is a press release or where the original item came from, in case there is a bias, inaccuracy or mis-interpretation. Only when the news has been confirmed (or where there are no contra-indications) should strategy implementation take place (although of course, the planning stage should be considered immediately if the potential impact of the news is high).

In the case of Delicious, their blog gives the real story. The slide leaked to Twitter was correct – Delicious is viewed as a “Sunset” product. However, that doesn’t mean it will be closed down – and Yahoo! states that they plan to sell the service rather than shut it down (although it is noticeable that they don’t promise to keep the service going if they fail to find a buyer).

There is, in fact, another lesson to be learned here, relating to company awareness of the impact of industry blogs and Twitter. It is important not only to monitor what is said about your company, but also to anticipate what could be said. In a world where governments can’t protect secrets being leaked via Wikileaks, it would be surprising if high-impact announcements from companies didn’t also leak out to industry watchers. Some companies constantly face leaks – Apple is notorious in this regard – and part of their strategy involves managing potential leaks before they do harm. In this, Yahoo! failed. As a media company that depends on the web for its business, this is a further example suggesting how Yahoo! seems to have lost its way. This is not negated by evidence, such as the leaked slide mentioning Delicious, showing that they are thinking about their future and product/service portfolio.