Revealed: What goes into Google real-time search

Google - real-time search is 'very difficult'

Google has outlined exactly what elements go into its real-time search, explaining that a judgement on what should show in the search results is made in a matter of seconds.

Speaking to TechRadar, Google Fellow and search expert Amit Singhal explained that bringing real-time search to Google was both thrilling and the toughest thing that he had done in his career.

"Real time has been the one of the most exciting projects I've undertaken in my 20 years of search, and it has been the toughest thing I have undertaken in 20 years of search," said Singhal.

Month to seconds

"When I started out as a grad student and in the early days of Google we would crawl a web page every month," he added.

"We would get an article and we would have 15 days on it in average where that article is on our disc and we can parse that article, we can tokenise the article, we can find all the keywords and we can figure out what this article is saying

"That was a thousand word article or something like that but in real time I get 140 characters and three seconds. That's what we have to judge the relevance of this thing. So it's been incredibly hard."

Two key elements

Singhal says that there are two key elements to applying real-time trends to search.

"What we have done is two distinctive properties of real-time search our number one is integration into the results page - no-one else has done it basically because it's infinitely hard to do it!

"Secondly there is the comprehensiveness [of what we look at]. So we have Twitter updates, we have MySpace status updates we have Facebook page updates, we have all the blogs we get for our blog search and all the news crawls we do and everything shows up on real -ime search integrated into the results page.

"It's been an amazing experience and our jobs are definitely not done.

"We are still working hard to further sharpen the relevance of this product because everyone can improve and so can this product."

The factors involved

Singhal's presentation included a slide showing the wealth of different factors considered by Google's algorithm before it puts the real-time results into the page.

That list is: language model, tweet quality, author quality, probability of relevance, semantics, real-time URL resolution (i.e. checking the bit.ly link actually goes to a relevant page), query registration, query hotness, query volume fluctuation and topicality.

So, to get a tweet on the Google page you need to be talking about a major trending topic at the right time, your link needs to go to a relevant page, and matters such as your wording, how many followers or friends you have and how many people have retweeted or liked you are taken into account.
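To make that concrete, here is a rough sketch of how several such signals could be folded into a single ranking score. It is purely illustrative: Google has not published a formula, and the signal names, the weights, the `Update` class and the `score` and `rank` functions are all assumptions invented for this example.

```python
from dataclasses import dataclass

# Hypothetical weights. The slide named the factors but gave no weights
# or formula, so these numbers are illustrative only.
WEIGHTS = {
    "tweet_quality": 0.2,
    "author_quality": 0.2,
    "relevance": 0.3,
    "url_relevance": 0.1,
    "query_hotness": 0.2,
}


@dataclass
class Update:
    text: str
    signals: dict  # each signal is assumed to be pre-normalised to [0, 1]


def score(update):
    """Weighted sum of the signals; a missing signal counts as 0."""
    return sum(w * update.signals.get(name, 0.0) for name, w in WEIGHTS.items())


def rank(updates, limit=3):
    """Return the highest-scoring updates, best first."""
    return sorted(updates, key=score, reverse=True)[:limit]
```

In this toy version, an update from a well-regarded author that scores low on relevance and has no link to a relevant page would fall below an on-topic update with a working link, which is the behaviour the list of factors implies.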

As Singhal suggests, no mean feat.

Patrick Goss

