SMX West 2008 Day 2 Keynote
Another day, another keynote. Yes, that’s a lame opening line, but I have not had enough coffee yet to write a good one.
This morning’s keynote is titled “Past, Present & Future of Search” and is being given by Louis Monier, VP of Products at Cuill (pronounced “cool”). Monier was at AltaVista in the early days.
Monier is going to talk about people’s need to find things on the web. He is giving his own opinion, not speaking on behalf of Cuill.
The history of search began in the mid-90s. Search went through the same initial phase as any new technology: rejection. The NYT was particularly harsh.
The industry quickly worked through this phase, and words like “query”, “index” and “relevance” became part of everyday language.
In 1995, AltaVista launched with sixteen million pages in its index. It could search these in 0.35 seconds. This was revolutionary. It was also the first engine that let you see who was linking to your site.
AltaVista almost launched under the name “Gotcha”. Monier managed to prevent that.
In 1998, aggressive marketing (i.e. spam) entered the search industry. Very specific queries were still useful, but more general queries returned page after page of irrelevant content. The search engines were very naive in responding to this problem.
At this time, Google was in beta using link analysis to drive relevancy. This addressed the spam concerns of the day. Google stayed focused on search and didn’t follow the call of being a portal.
Today’s world is Google, Yahoo and MSN, and a slew of wannabes.
Queries for company names are easy to address. Queries with huge volumes can be cached and served very cheaply. The rest of the queries have not improved for about ten years.
Why do we get no guidance from the search engines?
Analysis of on-page factors for relevancy calculation has been around since the sixties. Even link/anchor analysis has been around for ten years. Some engines have analyzed actual traffic to impact relevancy (DirectHit).
Does size matter? The search engines regularly change their minds on this subject. One argument: size does not matter, as people get too many results already, and access to a few well-known sites should be enough. On the other hand, any obscure document might matter to someone, so size matters. A lot.
Do search engines cover enough of the web? Are they covering the same corner of the web? There is good content off the beaten path. Search engines have a responsibility to provide access to all the web.
He expects more from the search engines. He wants them to provide insight into what he is looking for.
The Future…
Human-powered search produces access to high quality content, but the coverage is tiny. Not a scalable solution.
Personalization sounds good. If someone searches for “diamond”, do they want a jewel or a place to play baseball? He would happily volunteer information like location, gender, age range and languages spoken to the search engines to help improve his search results.
One’s search history only helps for things you’ve searched for previously. It doesn’t solve the problem.
Social search believes one’s friends already have the answers. He doesn’t think their history will truly help the queries. How many people would it require to do so?
A specialized search engine (vertical search) can provide better results for narrow queries. But no one wants to manage 10,000 bookmarks. We’d need a search engine for bookmarks. People may bookmark a few vertical search engines for topics they really care about, but horizontal engines will suffice for most queries.
Natural language processing is when search engines try to actually understand the content on a page. The problem is, how much good language is there actually on the web? It’s a good goal, but not very helpful.
The semantic web entails webmasters tagging their pages. It would be an immense amount of work.
Artificial intelligence is the promise of going from “The Flintstones” to “The Jetsons”. Not very realistic.
Using queries for spell checking (seeing which spelling of a word returns more results) shows the utility of search beyond navigation.
Another example is putting in an abbreviation. The SERP itself will give you the variant meanings of the abbreviation.
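The spell-checking trick mentioned above can be sketched in a few lines. This is only an illustration, not anything Monier showed: the function name `pick_spelling` and the result counts are made up, and in practice the counts would come from whatever search engine you query.

```python
def pick_spelling(result_counts):
    """Return the candidate spelling that has the most search results.

    result_counts maps each candidate spelling to the number of results
    a search engine reported for it (hypothetical numbers here).
    """
    if not result_counts:
        raise ValueError("need at least one candidate spelling")
    # The spelling with the largest result count is assumed to be correct.
    return max(result_counts, key=result_counts.get)


# Made-up counts for illustration only; real counts would come from a search.
counts = {"guidance": 1_200_000, "guidence": 45_000}
best = pick_spelling(counts)
print(best)
```

The heuristic assumes the correct spelling is the more common one on the web, which can fail when a misspelling is itself a popular name or brand.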
So, what is the future? Disregarding AI, imagine an “answer engine”, as opposed to a search engine. This is a research assistant for your queries. He doesn’t know how soon this will happen, but it is what we need.
Relevancy changes. Before 8/24/2006, there were nine major planets. After 8/24/2006 there are eight major planets.
In conclusion, search is still in its infancy. It is mostly based upon things that are a decade old. However, it’s the only game in town.
Size does matter. If we don’t have the whole web (and tools to make sense of it), who is making those choices for you?
Time for a few questions…
“It seems that clustering of results would be the perfect solution. Why hasn’t that caught on?”
It may be too fancy for us. Or, the quality may not be there.
“What do you think about blended search?”
He thinks it’s fine to search all this info. It’s part of the web. However, we’re not doing any better with non-text data than with text data. It’s a good idea, but not going to transform the landscape.










