2011-01-16

Shopping starts at Google

I don't know where the Web ends. It may have multiple ends, or none. But I know where the Web starts. It starts at Google.

A few years back, it was reported that 6 % of all internet traffic starts at Google. Also, plenty of people have Google set as their homepage. I think many of us would agree that our brain is only a thin layer on top of Google.

One reason for using Google is that people don't remember URIs. On the Web the address of a thing is a URI. In the human brain the address of a thing is a set of associations that locate it in a neural network. That's why we need a way to translate these associations to a URI. Google does it fairly well. You pass it a bunch of keywords related to the thing you are looking for and it produces a nice, ordered list of URIs that might point to the thing you have in mind.

People don't use URIs to describe the things they are thinking of, machines do. I can't remember URIs, especially those of RDF vocabularies, which tend to be quite long. That's why I use prefix.cc, which lets me find the URI I'm looking for by passing it something I can remember: the vocabulary's prefix. The service remembers the vocabulary's URIs for me.
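The idea of trading a short, memorable prefix for a long URI can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PREFIXES table is a hypothetical in-memory stand-in for what a lookup service such as prefix.cc provides, and the expand function is a name made up for this example.

```python
# Hypothetical in-memory stand-in for a prefix lookup service.
# The namespace URIs are the published ones for these vocabularies.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "gr": "http://purl.org/goodrelations/v1#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(curie: str) -> str:
    """Turn a short name like 'gr:Offering' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local
```

With this table, expand("gr:Offering") yields the full GoodRelations URI, so the only thing a person has to remember is "gr".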

As it turns out, people don't remember the URIs of the things they want to buy either. So these days, a lot of shopping starts at Google. When you are looking to buy something you often start by describing that something to Google.

In commerce, things are addressed by brand. The problem is that people don't search for brands and they don't search for product names; they search for concepts. People don't search for an Olympus E-450, they search for a camera. Brands and product names are not in their vocabularies, but concepts described by keywords are. People don't use brand names to describe the things they are thinking of, commerce does.

To bridge this gap you need to translate the keywords that people use to describe stuff to the brands that commerce uses to describe stuff. Enter search engine optimization (SEO). One of the things that SEO does is create synonym rings. A synonym ring is a set of synonyms, words that people use to describe a thing, such as the words mentioned in this tweet:

Can you all please stop retweeting those SEO jokes, gags, cracks, funnies, LOLs, humour, ROFLs, chuckles, rib-ticklers, one-liners, puns?

This SEO task consists of collecting the keywords people might use when searching for a thing, so that they find your thing™, which you have described with these keywords.
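A synonym ring can be sketched as a plain set of interchangeable keywords. The following is a minimal illustration, not a description of how any search engine actually works: the ring contents, SYNONYM_RINGS, expand_query and matches are all assumptions made up for this example.

```python
# Illustrative synonym rings; the contents are assumptions for this sketch.
SYNONYM_RINGS = [
    {"joke", "gag", "crack", "funny", "pun", "one-liner"},
    {"camera", "digital camera", "dslr"},
]


def expand_query(keyword):
    """Return every keyword in the same ring, or just the keyword itself."""
    kw = keyword.strip().lower()
    for ring in SYNONYM_RINGS:
        if kw in ring:
            return set(ring)
    return {kw}


def matches(query, page_keywords):
    """True if the page is described by any synonym of the query."""
    return bool(expand_query(query) & {k.lower() for k in page_keywords})
```

A page described only with "pun" is then found by someone searching for "gag", because both words sit in the same ring.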

It would be better if you could say that your thing™ (e.g., Olympus E-450) is a kind of thing people search for (e.g., a camera). Then, when people search for a thing, they might find that your thing™ is such a thing. This is one of the promises of the semantic web vision. But, just like its Wikipedia article, the semantic web still has a lot of issues.

Nevertheless, the semantic web vision has created some interesting by-products in the last few years. One of them is the Linked Open Data initiative, which strives to build a common, open data infrastructure for the semantic web that is coming (for sure). Another by-product of this vision is the so-called semantic SEO.

Both the semantic web and semantic SEO are misnomers. There is nothing exceptionally semantic in them. I would rather call the latter data SEO, but it seems the current name will stick. Semantic SEO is the practice of adding a little bit of structured data (preferably in RDF) to websites instead of adding a bunch of keywords. For instance, you can use the GoodRelations RDF vocabulary to mark up your web page describing the product you're offering; even Google says you can. In semantic SEO a little bit of semantics is good enough; it can still go a long way.

Having your thing™ described with structured data makes it machine-readable. A search engine, such as Google, is a kind of machine. Therefore making your data machine-readable makes it readable for search engines. You can see for yourself how Google reads your data.

By adding a bit of data into the mark-up of your web page (preferably via RDFa) you can optimize the way it will be displayed in Google's search results. Instead of a boring, text-only rendering you can get a display that contains useful information, such as an image of your thing™, its rating, reviews and the like. See the example at the GoodRelations website to compare the difference.
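To give a rough idea of what such mark-up can look like, here is a small Python sketch that prints an RDFa snippet using GoodRelations terms. It is a simplified illustration rather than a complete or validated offer description: the offering_rdfa function is made up for this example, and the description, price and currency values are hypothetical.

```python
from html import escape

GR = "http://purl.org/goodrelations/v1#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def offering_rdfa(name, description, price, currency):
    """Build a minimal RDFa snippet describing one offer (a sketch only)."""
    return (
        f'<div xmlns:gr="{GR}" xmlns:xsd="{XSD}" typeof="gr:Offering">\n'
        f'  <span property="gr:name">{escape(name)}</span>\n'
        f'  <span property="gr:description">{escape(description)}</span>\n'
        f'  <div rel="gr:hasPriceSpecification">\n'
        f'    <div typeof="gr:UnitPriceSpecification">\n'
        f'      <span property="gr:hasCurrency">{escape(currency)}</span>\n'
        f'      <span property="gr:hasCurrencyValue" datatype="xsd:float">'
        f'{price:.2f}</span>\n'
        f'    </div>\n'
        f'  </div>\n'
        f'</div>'
    )


# The price and currency below are made-up values for illustration.
print(offering_rdfa("Olympus E-450", "A digital SLR camera", 399.0, "EUR"))
```

The point of the sketch is that the human-readable text stays where it is; the attributes merely tell a machine which piece of text is the name, the description and the price.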

People are more likely to click on a search result with a nice image in it, a result that is enriched with all kinds of useful information. This may lead to an increase in your click-through rate. For example, RDFa adoption at BestBuy resulted in a 30 % increase in search traffic. Pursuing the semantic web vision has been a largely academic undertaking, so it's good to see that its by-product, semantic SEO, has some real financial benefits.

The practice of semantic SEO is definitely not an academic endeavour. Quite the opposite: a lot of high-profile companies and institutions are adopting it (e.g., BestBuy, O'Reilly, or Tesco). The share of webpages that contain structured data in RDFa is growing. In October 2010, RDFa was in 3.5 % of webpages, whereas the year before the share was 0.5 %.

E-commerce was one of the key factors that contributed to the growth of the Web in the 1990s. The same may become true for the Web of Data, a.k.a. linked data: the e-commerce applications of semantic web technologies, such as semantic SEO, may become a crucial driver of its growth and accelerate the adoption of the linked data principles.