Archive for August, 2009

Google’s Rich Snippets Starts the Semantic Snowball Effect

Thursday, August 13th, 2009 | Bing Tips with No Comments »

Almost lost in May’s whirlwind launches of Wolfram|Alpha and Microsoft’s Bing and the unveiling of Google Wave was a quieter announcement that may bring a seismic shift toward the realization of Web 3.0.

While some aspects of the next generation of the Web are already taking shape, there are major physical and cultural challenges to bringing it about. Google’s launch of Rich Snippets may well be a watershed moment in resolving these problems.

Before the term Web 2.0 came into common use, World Wide Web inventor Tim Berners-Lee outlined his vision of the next generation — what he called the Semantic Web. In a 2001 article in Scientific American, Berners-Lee described a global database of linked knowledge, in a markup format that could be understood and manipulated by computers. The World Wide Web Consortium (W3C), the international standards organization headed by Berners-Lee, has a longstanding group that has laid out the tools and protocols for the Semantic Web.

Web 3.0 is here (somewhat)
There are no hard borders between one generation and another, and parts of what is being described as Web 3.0 are already here.

Personalized home pages have been available for years. iGoogle, for example, steps into Web 3.0 territory by allowing users to create a home page with multiple tabs, built by inserting news headline feeds, weather forecasts, Twitter and Facebook feeds and hundreds of other content modules via widgets, and integrating e-mail, calendars and documents into mobile versions. Mobile “lifestream” features, which keep track of personal connections and activities, are widely used through Twitter and similar tools.

Google’s new Wave promises a watershed in collaboration, marrying e-mail, instant messaging, chats and media-sharing in a new communication model that has left reviewers grasping for words.

And some things that were seen as being enabled by the Semantic Web in 2001 are already here without it. For many Americans, persistent mobile connection is a reality: e-mail and SMS-capable phones are ubiquitous, and Web-enabled phones are common. But the full power of machine-understood data, linked across the entire body of information in one global Web, with “agents” focused on personal service to humans, is only in its infancy. The Semantic Web vision is the other part of Web 3.0, which vertically integrates data from a diverse set of sources, according to the W3C’s Semantic Web group.

The challenges to the Semantic Web
The Web, as of July 2008, included one trillion distinct URLs, by Google’s count. The search giant is estimated to actually index less than 5 percent of those, still a matter of tens of billions of Web pages. The overwhelming majority of these pages are meant to be read and understood by humans. The content of the pages isn’t meant to be understood by computers. Search engines can index keywords, but without context.

Semantic Web experts have collected the toolkit of languages and metadata markup systems that will allow machines to understand key words and the relationships between them. Such metadata is already being used in many places. A microformat called hResume, for example, allows LinkedIn.com to tag appropriate resume fields of its public profiles so that the resume data can be understood and reused elsewhere.
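
To make the idea concrete, here is a minimal sketch of how a program might pull resume fields out of microformat-style markup. The sample HTML is hypothetical and is not LinkedIn's actual markup; it only borrows a few class names associated with hResume (`hresume`, `fn`, `summary`, `skill`). The `MicroformatExtractor` class and `extract` function are illustrative names, and the sketch assumes well-nested HTML.

```python
from html.parser import HTMLParser

# Hypothetical sample markup, loosely modeled on hResume class names.
SAMPLE_HTML = """
<div class="hresume">
  <div class="contact vcard"><span class="fn">Jane Doe</span></div>
  <p class="summary">Editor with ten years of newsroom experience.</p>
  <ul>
    <li><a class="skill" rel="tag" href="/tags/editing">Editing</a></li>
    <li><a class="skill" rel="tag" href="/tags/html">HTML</a></li>
  </ul>
</div>
"""

FIELDS = {"fn", "summary", "skill"}            # class names we treat as fields
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # tags with no end tag


class MicroformatExtractor(HTMLParser):
    """Collects the text inside elements whose class names one of FIELDS."""

    def __init__(self):
        super().__init__()
        self.found = {}    # field name -> list of text values
        self._stack = []   # one entry per open tag: its field name, or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # never closed, so keep it off the stack
        classes = (dict(attrs).get("class") or "").split()
        field = next((c for c in classes if c in FIELDS), None)
        self._stack.append(field)
        if field:
            self.found.setdefault(field, []).append("")

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Credit the text to the innermost open element that carries a field.
        for field in reversed(self._stack):
            if field:
                self.found[field][-1] += data
                break


def extract(html):
    parser = MicroformatExtractor()
    parser.feed(html)
    parser.close()
    return {name: [v.strip() for v in values]
            for name, values in parser.found.items()}


resume = extract(SAMPLE_HTML)
print(resume)
```

The point is that the program never has to guess which words are the name and which are the skills: the page author has already labeled them, so any site or search engine that knows the class names can reuse the data.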

The value of such machine-usable data is obvious. Since the infancy of the Web, finding valuable information amid the growing clutter has been a major challenge. Directories such as Yahoo! made their mark by pointing users to useful, hand-selected websites. This manual work could barely keep up with the scope of the Web of the mid-’90s. It also faced growing credibility issues because links were chosen, or excluded, by human editors. Full-text search engines, such as WebCrawler and AltaVista, gained popularity, but search results included large amounts of garbage. Today’s top search engines have worked to improve the signal-to-noise ratio and increase the value of results by using sophisticated algorithms. Microsoft’s Bing, for example, promises to give more relevant results and aid in decision-making.

The Wolfram|Alpha “computational knowledge engine” is being hailed as a prototype of what a global database in the Semantic Web could do to deliver high-value information, easily accessed in plain language. And Wolfram|Alpha itself appears to be claiming the turf of the global database. With more than 10 trillion pieces of information, and plans to expand significantly, the site says:

“Wolfram|Alpha’s long-term goal is to make all systematic knowledge immediately computable and accessible to everyone. We aim to collect and curate all objective data; implement every known model, method and algorithm; and make it possible to compute whatever can be computed about anything. Our goal is to build on the achievements of science and other systematizations of knowledge to provide a single source that can be relied on by everyone for definitive answers to factual queries.”

This may resonate with some in the Semantic Web community; a number have seen the task of retrofitting the current Web into machine-friendly markup as so daunting that the global database might need to be built from scratch. But on face value, Wolfram|Alpha violates one of the cardinal precepts of the Semantic Web: that the proprietary hoarding of databases behind walls must end, and that data must flow freely from and to all sources.

And the vision of W3C’s Semantic Web isn’t to replace the current Web, but to enhance it. The question is how to get the work done. There was no organized plan to build the Web. To be sure, there were plans to create the technology and the infrastructure. But most of those tens of billions of indexed Web pages were built by corporations, small businesses, non-profits and individuals, each for their own reasons. Persuading websites to recode Web pages to Semantic Web specifications — or even to do so going forward — will take a powerful motivator.

Google breaks the ice
Google may have provided such a motivator with its May 12 announcement of Rich Snippets. “Snippet” is the name Google uses for the short block of text appearing below a search result, giving more information about the Web page. Google announced in its Webmaster Central Blog (a bookmark for anyone interested in making his or her website more visible to the leading search engine) that it is now applying Google’s algorithms to “highlight structured data embedded in web pages.” Translation: content marked up for the Semantic Web. The “rich snippets” will be based on the structured data.

This is a major event for a couple of reasons. First, Google is the poster child for machine learning, which in Web terms means teaching machines to scan plain-language Web pages and cull meaning from them. This is the other end of the spectrum from the Semantic Web vision of coding pages in a special way so they have meaning to machines. Google’s announcement, which explicitly discussed plans to extend support for structured data in new ways as well as to recognize metadata coding developed elsewhere on the Web, puts the company on a course for a synergy between machine learning and Semantic Web practices.

Google isn’t the first major search company to focus on structured data. Yahoo’s SearchMonkey platform for Web developers supports a robust package of metadata formats, and urges developers to have at it. But the reality is that Google is the one people are paying attention to where it counts.

This brings us to the second reason this is a major step: self-interest. It’s important to harness the force that created those tens of billions of indexed Web pages in the first place. And Google’s announcement means money.

In the current Web economy, search engine status is a prime motivation. And Google ranking is the Holy Grail. What Google is offering (while explicitly not promising) is the chance for websites to attract the eye of the search engine’s algorithms, and even some measure of control over that vital couple of lines of text that tells a user “click me.” In an environment where every keystroke in a Web page’s metatags is dictated by a Search Engine Optimization guru, along with every word of its headline and keyword-packed top paragraphs, Web producers across the Net are learning metaformats, or are about to be.

And that just may be the sound of a Semantic Web snowball starting down the hill.

Bing to define Microsoft’s capability to innovate

Monday, August 10th, 2009 | Bing News with No Comments »

Opinion – Microsoft CEO Steve Ballmer yesterday gave a casual speech in front of more than 1500 members and guests of the Executives Club of Chicago, explaining the importance of innovation during a “reset” of the economy. The pitch for Microsoft products was careful, and, not surprisingly, focused on the next Xbox and Bing. What about Windows 7?

It seemed that Steve Ballmer was delivering pretty much on the expectations of Chicagoans when he talked about the opportunities for innovation and increased productivity during “challenging” economic times. It was what you would have expected, a speech whose optimistic tone we have heard many times. The 15-minute or so talk avoided negativity and focused on explaining to the audience that debt will not be the growth driver of the U.S. economy anymore. Instead, the power and patience to innovate and a chance to increase productivity will be key to a rebound, Ballmer said. He cautioned executives that the economic environment will not be what it once was and that we are not in a “recession”. He considers it to be an “unprecedented reset” that will become the “new normal”.

“Yesterday was the exception,” he said.

In terms of innovation, he believes “there has never been a more exciting time” and, according to Ballmer, the next ten years of innovation will be at least as good as or better than the past ten years. For Microsoft, the innovation opportunity would be to create a virtual world that closely resembles the physical world. He noted that “in ten years, I don’t want to be [physically in front of this audience] anymore”; the audience would then be seeing and listening to a virtual Steve Ballmer. Other examples included opportunities in making communication much more digital than today, and the change in the way media companies publish and sell content. But he also addressed the need to create a “better workforce,” which would result from much more focus on education, digital literacy and science. “We need to keep this the place for the best and brightest,” he said. “Just think. The future is about innovation,” Ballmer concluded his speech.

In his very own way, Ballmer delivered a very inspiring and entertaining speech, but there was virtually no pitch for Microsoft, which I personally found a bit strange, especially since Microsoft handed out copies of Windows 7 RC to every attendee. It certainly was a unique opportunity to talk to those executives about the innovation in Windows 7 and about upgrading from Vista or XP to Windows 7. That pitch never came, and it seemed that Ballmer avoided mentioning Windows 7 at all costs, even after being asked about current software trends.

Instead, the most important software trend at this time is, according to Ballmer, a natural user interface integrated in PCs that recognize your touch, your voice and your look. He was referring to a computer that is much more personal, a computer that will reach many more people through its human-like interface and behavior. He said that this technology is “really close” to being released and the first example may be the next Xbox, which, according to Ballmer, will be released in 2010 and feature a camera to recognize and track a player’s actions.

However, in the same way Ballmer was avoiding Windows 7, he was focusing on Bing, which became the symbol of Microsoft’s capability to innovate during his speech. He joked about it, saying that Microsoft “is working hard on its 8% market share” and asking the audience to click on search advertising to help Microsoft “make money”. But there was an overall serious tone about Bing.

He admitted that if he had the chance for one “do-over” in his career, he “would start sooner on search,” which was a remarkable statement. He described Microsoft as being the underdog in this business, as “the little engine that could”. But the company will be throwing lots of money at its search business over the next five to ten years in an effort to catch up with the “market leader”, Ballmer’s term for Google. “We’ve got our mojo now going,” he said.

Ballmer sees Microsoft’s opportunity in the fact that “the market leader” cannot “experiment as much” as someone who is coming from behind, and we should expect Microsoft to take big and unusual steps from time to time.

Microsoft’s current effort in search is reminiscent of the company’s effort to replace Netscape’s Navigator as the dominant web browser in the mid to late 1990s. However, it appears that Microsoft is even more serious about search, especially since search represents a direct revenue opportunity – an opportunity Microsoft recognized very late. “We always had great search technology. We always believed in search. But we did not see that business change coming and that is why we have been slow to move,” he explained.

If Ballmer’s speech was any indication, Bing is positioned to define the perception of an innovative Microsoft in the future, as much as Windows did in the past. And, from a consumer’s perspective, we should look forward to a historic battle between Microsoft and Google for the lead in search and Internet advertising.

Lawsuit Confirms That Microsoft is Serious About This Bing Business

Sunday, August 9th, 2009 | Bing News with No Comments »

If the $100 million advertising campaign wasn’t enough to convince you that Microsoft is serious about its new search engine (and the New York Post says Google’s convinced), maybe the fact that Microsoft is suing people for click fraud will be. (Click fraud is when you get people or programs to repeatedly click online ads to make more money for yourself or exhaust a competitor’s advertising budget, so their ads don’t appear or lose top placement.) Now that it’s got Bing, Microsoft will not have people messing with search engine results!

The Lam/Suen family (brothers Eric and Gordon Lam and their mother, Melanie Suen) was allegedly using click fraud to make more money for their site WoWMine.com, mainly by clicking competitors’ ads until those competitors’ ad budgets were exhausted. Microsoft is suing them for $750,000. The site sells virtual gold for the popular role-playing game World of Warcraft (the game’s makers don’t support this illicit gold dealing) and apparently also sells info to auto insurance advertisers, which explains why the Lam/Suen family was allegedly click-frauding not just WoW gold ads but also auto insurance ones.

A quick check of court records confirms that this is among the nerdiest scams ever.
