Google is trying to read searchers' minds
Monday, 11 June 2007

These days, Google seems to be doing everything, everywhere. It takes pictures of your house from outer space, charms its way onto Madison Avenue, picks fights with Hollywood and tries to undercut Microsoft’s software dominance.
But at its core, Google remains a search engine. And its search pages, blue hyperlinks set against a bland, white background, have made it the most visited, most profitable and arguably the most powerful company on the Internet. Google is the homework helper, navigator and business directory for half a billion users, able to find the most obscure bit of information in just the blink of an eye.
Yet however easy it is to extol the modern-day miracle of Google, the site is also among the world’s biggest teases. Millions of times a day, users click away from Google, disappointed that they couldn’t find the hotel, the recipe or the background of that hot guy. Google often finds what users want, but it doesn’t always.
That’s why Amit Singhal and hundreds of other Google engineers are constantly tweaking the company’s search engine in an elusive quest to close the gap between often and always.
Singhal is the master of what Google calls its “ranking algorithm” - the formulas that decide which Web pages best answer each user’s question. It is a key part of Google’s inner sanctum, a department called “search quality” that the company treats like a state secret.
Google rarely allows outsiders to visit the unit, and it has been cautious about allowing Singhal to speak with the news media about the magical, mathematical brew inside the millions of black boxes that power its search engine.
“The fundamental value created by Google is the ranking,” says John Battelle, the Chief Executive of Federated Media, a blog ad network, and author of The Search, a book about Google.
Online stores, he notes, find that a quarter to a half of their visitors, and most of their new customers, come from search engines. And media sites are discovering that many people are ignoring their home pages - where ad rates are typically highest - and using Google to jump to the specific pages they want. “Google has become the life blood of the Internet,” Battelle says.

Read my mind
Users, of course, don’t see the science and the artistry that makes Google’s black boxes hum, but the search-quality team makes about a half-dozen major and minor changes a week to the vast nest of mathematical formulas that power the search engine.
These formulas have grown better at reading the minds of users to interpret a very short query. Are the users looking for a job, a purchase, or a fact? The formulas can tell that people who type “apples” are likely to be thinking about fruit, while those who type “Apple” are mulling computers or iPods. They can even compensate for vaguely worded queries or outright mistakes.
“Search over the last few years has moved from ‘Give me what I typed’ to ‘Give me what I want’,” says Singhal, a 39-year-old native of India who joined Google in 2000 and is now a Google Fellow, the designation the company reserves for its elite engineers.
As Google constantly fine-tunes its search engine, one challenge it faces is sheer scale. It is now the most popular Web site in the world, offering its services in 112 languages, indexing tens of billions of Web pages and handling hundreds of millions of queries a day. Even more daunting, many of those pages are shams created by hucksters trying to lure Web surfers to sites filled with ads, pornography or financial scams.
At the same time, users have come to expect that Google can sift through all that data and find what they are seeking, with just a few words as clues.
“Expectations are higher now,” said Udi Manber, who oversees Google’s entire search-quality group. “When search first started, if you searched for something and you found it, it was a miracle. Now, if you don’t get exactly what you want in the first three results, something is wrong.”

Team work
Google’s approach to search reflects its unconventional management practices. It has hundreds of engineers, including leading experts in search lured from academia, loosely organised and working on projects that interest them.
But when it comes to the search engine - which has many thousands of interlocking equations - it has to double-check the engineers’ independent work with objective, quantitative rigor to ensure that new formulas don’t do more harm than good. A big white board near Singhal’s desk is scrawled with graphs, queries and bits of multicolored, mathematical algorithms. Complaints from users about searches gone awry are also scrawled on the board.
Any of Google’s 10,000 employees can use its “Buganizer” system to report a search problem, and about 100 times a day they do - listing Singhal as the person responsible for squashing them.
Some complaints involve simple flaws that need to be fixed right away. Recently, a search for “French Revolution” returned too many sites about the recent French presidential election campaign - in which candidates opined on various policy revolutions - rather than the ouster of King Louis XVI. A search-engine tweak gave more weight to pages with phrases like “French Revolution” rather than pages that simply had both words.
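The effect of such a tweak can be illustrated with a toy scoring function. This is only a sketch under stated assumptions, not Google's actual formula: the function names, the one-point-per-word scoring and the `phrase_bonus` value are all invented for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_aware_score(query, page_text, phrase_bonus=2.0):
    """Score a page for a query, rewarding exact-phrase matches.

    One point per query word found anywhere on the page, plus a
    hypothetical bonus when the words appear together, in order.
    """
    query_words = tokenize(query)
    page_words = tokenize(page_text)
    if not query_words or not page_words:
        return 0.0

    page_vocab = set(page_words)
    score = sum(1.0 for word in query_words if word in page_vocab)

    # Slide a window over the page looking for the query as a phrase.
    n = len(query_words)
    has_phrase = any(
        page_words[i:i + n] == query_words
        for i in range(len(page_words) - n + 1)
    )
    if has_phrase:
        score += phrase_bonus
    return score


history_page = "The French Revolution began in 1789 with the ouster of the king."
election_page = "French candidates promised a revolution in tax policy."

print(phrase_aware_score("French Revolution", history_page))
print(phrase_aware_score("French Revolution", election_page))
```

Both pages contain both words, but only the history page contains them as a phrase, so it receives the higher score.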

New problem
Finding local businesses is important to users, but Google often has to rely on only a handful of sites for clues about which businesses are best. Within two months of Brougher’s complaint, Singhal’s group had written a new mathematical formula to handle queries for hometown shops.
But Singhal often doesn’t rush to fix everything he hears about, because each change can affect the ranking of many sites. “You can't just react on the first complaint,” he says. “You let things simmer.”

Ranking pages
As Google compiles its index, it calculates a number it calls PageRank for each page it finds. This was the key invention of Google’s founders, Larry Page and Sergey Brin. PageRank tallies how many times other sites link to a given page. Sites that are more popular, especially with sites that have high PageRanks themselves, are considered likely to be of higher quality.
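The link-counting idea can be sketched with the simplified, textbook form of PageRank. This is an illustration rather than the formula Google runs in production: the `pagerank` function, the damping factor of 0.85, the fixed iteration count and the four-page toy Web are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    Returns a dict mapping each page to a score; the scores sum to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank everywhere.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy Web: pages a, b and d all link to c, and c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
for page, score in sorted(pagerank(toy_web).items()):
    print(page, round(score, 3))
```

In the toy Web, page c collects the most links and ends up with the highest score, while page a outranks b and d because its single incoming link comes from the highly ranked c.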
Singhal has developed a far more elaborate system for ranking pages, which involves more than 200 types of information, or what Google calls “signals.” PageRank is but one signal. Some signals are on Web pages - like words, links, images and so on. Some are drawn from the history of how pages have changed over time. Some signals are data patterns uncovered in the trillions of searches that Google has handled over the years.
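How many signals might be folded into a single ranking can be shown with a weighted sum. Google does not disclose how it combines its signals, so everything here is hypothetical: the three signal names, their weights and the two sample pages are assumptions chosen only to make the idea concrete.

```python
def combined_score(signals, weights):
    """Weighted sum of named signals; a signal a page lacks counts as 0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in weights.items())


def rank_pages(pages, weights):
    """Return page names ordered from highest to lowest combined score."""
    return sorted(
        pages,
        key=lambda name: combined_score(pages[name], weights),
        reverse=True,
    )


# Hypothetical weights and signal values, each scaled to the range 0..1.
weights = {"pagerank": 0.5, "phrase_match": 0.3, "freshness": 0.2}
pages = {
    "popular-but-off-topic": {"pagerank": 0.9, "phrase_match": 0.0, "freshness": 0.1},
    "modest-but-on-topic": {"pagerank": 0.4, "phrase_match": 1.0, "freshness": 0.5},
}

print(rank_pages(pages, weights))
```

With these made-up numbers the on-topic page wins despite its lower PageRank, which is the point of treating PageRank as only one signal among many.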
Once Google corrals its myriad signals, it feeds them into formulas it calls classifiers that try to infer useful information about the type of search, in order to send that user to the most helpful pages.
Classifiers can tell, for example, whether someone is searching for a product to buy, or for information about a place, a company or a person. Google recently developed a new classifier to identify names of people who aren't famous. Another identifies brand names.
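A drastically simplified stand-in for such a classifier is a set of keyword rules. Real classifiers are presumably learned from data and draw on many signals; the word lists, the category labels and the `classify_query` function below are assumptions invented for illustration.

```python
# Hypothetical cue words; a real system would not rely on hand-written lists.
BUY_WORDS = {"buy", "price", "cheap", "deal", "discount"}
PLACE_WORDS = {"hotel", "restaurant", "near", "map", "directions"}


def classify_query(query):
    """Guess the intent behind a query from the words it contains."""
    tokens = set(query.lower().split())
    if tokens & BUY_WORDS:
        return "product"
    if tokens & PLACE_WORDS:
        return "place"
    return "information"


for query in ["buy digital camera", "hotel near Madison Avenue", "French Revolution"]:
    print(query, "->", classify_query(query))
```

The label could then steer the ranking, for instance by favoring shopping pages for a "product" query.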
If all of this weren’t excruciating enough, Google’s engineers must compensate for users who are not only fickle but also vague about what they want; often, they type in ambiguous phrases or misspelled words.
Long ago, Google figured out that users who type “Brittany Speers,” for example, are really searching for “Britney Spears.” To tackle such a problem, it built a system that understands variations of words. So elegant and powerful is that model that it can look for pages when only an abbreviation or synonym is typed in.
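A rough feel for this kind of correction comes from matching a mistyped query against a list of queries already known to be popular. The sketch below uses the string-similarity matcher in Python's standard `difflib` module as a stand-in; it is not Google's model, and the `suggest` function, the sample list of popular queries and the 0.7 similarity cutoff are assumptions.

```python
import difflib


def suggest(query, known_queries, cutoff=0.7):
    """Return the known query most similar to `query`.

    Falls back to the normalized query itself when nothing is
    similar enough to clear the cutoff.
    """
    normalized = " ".join(query.lower().split())
    matches = difflib.get_close_matches(normalized, known_queries, n=1, cutoff=cutoff)
    return matches[0] if matches else normalized


# Hypothetical list of queries that many users have typed before.
popular_queries = ["britney spears", "brittany france", "apple ipod"]

print(suggest("Brittany Speers", popular_queries))
```

A production system would also need to handle abbreviations and synonyms, which plain string similarity cannot capture.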
In the end, it’s hard to gauge exactly how advanced Google's techniques are, because so much of what it and its search rivals do is veiled in secrecy. Judged by the results alone, the differences between the leading search engines are subtle.

Source: New York Times News Service
