2011-04-10

Data-driven e-commerce with GoodRelations

On April 6th at the University of Economics, Prague, Martin Hepp gave a talk entitled "Advertising with Linked Data in Web Content: From Semantic SEO to E-Commerce on the Web". Martin presented his view of the current state of e-commerce and how structured data can improve it, illustrating his points with GoodRelations, the ontology he created.

GoodRelations

GoodRelations is an ontology describing the domain of electronic commerce. For instance, it can be used to express an offering of a product, specify a price, or describe a business. The author and active maintainer of GoodRelations is Martin Hepp. As he shared in his talk, quite a few features set it apart from other ontologies.
  1. It's the only ontology that someone has been paid to work with. Overstock.com hired an expert to consult on the use of GoodRelations.
  2. It's not only a research project. It's been accepted by the e-commerce industry and it's used by companies such as BestBuy or O'Reilly Media.
  3. Its design is driven mainly by practice and real use cases, not only by research objectives. For instance, it was amended when Google requested minor changes. Google even stopped recommending the vocabulary it had created for the e-commerce domain in favour of GoodRelations. It's the piece of the semantic web Google has chosen. Nonetheless, it's still an OWL-compliant ontology.
  4. It comes with a healthy ecosystem around it. The ontology has thorough documentation with lots of examples and recipes that you can adopt and fine-tune to your specific use case. Validators are available for the ontology, and there are plenty of e-shop extensions and tools built for GoodRelations.
  5. Finally, it's not only a product of necessity. As Martin Hepp said, he actually quite enjoys doing it.
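
To make the "offering, price, business" idea more concrete, here is a minimal sketch in Python of how a shop might describe one offer with GoodRelations terms. It is not taken from the talk: the shop URIs, names and the describe_offer helper are made up for illustration, and literals are kept as plain Python values rather than typed RDF literals. Only the gr: class and property names come from the GoodRelations vocabulary.

```python
# Namespace of the GoodRelations vocabulary and the rdf:type property.
GR = "http://purl.org/goodrelations/v1#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def describe_offer(shop_uri, shop_name, offer_uri, offer_name, price, currency):
    """Return (subject, predicate, object) triples for a single offering.

    Hypothetical helper: a real site would usually emit RDFa or another
    RDF serialization instead of Python tuples.
    """
    price_uri = offer_uri + "-price"
    return [
        # The business that makes the offer.
        (shop_uri, RDF_TYPE, GR + "BusinessEntity"),
        (shop_uri, GR + "legalName", shop_name),
        (shop_uri, GR + "offers", offer_uri),
        # The offer itself.
        (offer_uri, RDF_TYPE, GR + "Offering"),
        (offer_uri, GR + "name", offer_name),
        (offer_uri, GR + "hasPriceSpecification", price_uri),
        # The price attached to the offer.
        (price_uri, RDF_TYPE, GR + "UnitPriceSpecification"),
        (price_uri, GR + "hasCurrency", currency),
        (price_uri, GR + "hasCurrencyValue", price),
    ]


# Example data (invented for this sketch).
triples = describe_offer(
    "http://shop.example/#company", "Example Shop Ltd.",
    "http://shop.example/#offer-1", "Hiking boots, size 42",
    89.0, "EUR",
)
for s, p, o in triples:
    print(s, p, o)
```

The point of the sketch is only that the business, the offer and the price each become separate, addressable things with explicit links between them, instead of a sentence on a page.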

Product Ontology

The other project showcased by Martin Hepp is the Product Ontology. It's a dataset describing products that is derived from Wikipedia pages. It contains several hundred thousand precise OWL DL class definitions of products. These class definitions are tightly coupled with Wikipedia: edits in Wikipedia are reflected in the Product Ontology. For instance, if the Product Ontology doesn't list the type of product you sell, you can create a page for it in Wikipedia and, provided the page isn't deleted, the product type will appear in the Product Ontology within 24 hours. This is similar to the way the BBC uses Wikipedia. An added benefit is that it can also serve as a dictionary containing up to a hundred labels in different languages for a product, because Wikipedia links together the pages that describe the same thing in different languages.
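
The coupling with Wikipedia can be sketched as a simple mapping from an article URL to a class URI. The snippet below assumes that a product class is identified by the Product Ontology base URI followed by the title of the English Wikipedia article; that URI pattern and the product_class_uri helper are my assumptions, not something stated in the talk, so check the Product Ontology documentation before relying on them.

```python
from urllib.parse import urlsplit

# Assumed base URI for Product Ontology classes (not taken from the talk).
PTO_BASE = "http://www.productontology.org/id/"


def product_class_uri(wikipedia_url):
    """Map a Wikipedia article URL to the assumed Product Ontology class URI."""
    path = urlsplit(wikipedia_url).path
    prefix = "/wiki/"
    if not path.startswith(prefix):
        raise ValueError("not a Wikipedia article URL: " + wikipedia_url)
    # Reuse the article title (the part after /wiki/) as the class identifier.
    return PTO_BASE + path[len(prefix):]


print(product_class_uri("http://en.wikipedia.org/wiki/Hiking_boot"))
```

Under that assumption, creating a new Wikipedia article is all it takes to mint a new product class, which is why the ontology can keep up with whatever people actually sell.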

Semantic SEO

The primary benefit of GoodRelations is in how it improves search. We spend more time searching than we ever used to. Martin Hepp said that the time we spend searching has increased by an order of magnitude. It takes us a long time to find the thing we're interested in because current web search is a blunt instrument.
The World Wide Web acts as a giant information shredder. In databases, data is stored in a structured format. However, that structure is lost during transmission to web clients. The data isn't sent as structured data but presented in a web page that a human customer can read, while machines can treat it as little more than a black box. The message stored in the database doesn't stay intact as it passes through the web infrastructure: only the presentation of the content is delivered. This means that an agent accessing the data via the Web often needs to reconstruct and infer the original structure of the data.
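
The shredding can be illustrated with a toy example. In this sketch (the record, the HTML template and both function names are invented for illustration), a structured record is rendered into a sentence for humans, and a client then has to guess the price back out of the text with a regular expression.

```python
import re

# What the shop has in its database: explicit fields with explicit meaning.
record = {"name": "Hiking boots", "price": 89.0, "currency": "EUR", "size": 42}


def render_html(rec):
    """Render the record for human readers; the field names are thrown away."""
    return "<p><b>{name}</b>, size {size}, only {price:.2f} {currency}!</p>".format(**rec)


def scrape_price(html):
    """Guess the price by looking for a number followed by a 3-letter code.

    This is a heuristic: it breaks as soon as the wording or layout changes.
    """
    match = re.search(r"(\d+(?:\.\d+)?)\s+([A-Z]{3})", html)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


page = render_html(record)
print(page)
print(scrape_price(page))
```

The scraper only works because it happens to know how this particular page phrases its price. Sending the structure along with the page would make that guesswork unnecessary.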
Web search operates on a vast amount of data that is for the most part unstructured, and as such it doesn't offer the affordances needed to do anything clever. Plain HTML doesn't allow you to articulate your value proposition well. Products and services are often reduced to a price tag. Enter semantic SEO.
Semantic SEO can be defined as using data to articulate your value proposition on the Web. It strives to preserve the specificity and richness of your value proposition when you need to send it over the Web. Ontologies such as GoodRelations allow you to describe your products and services with a high degree of precision.

Specificity

We need cleverer and more powerful search engines because of the tremendous growth in specificity. Wealth fosters the differentiation of products, and this in turn leads to increased specificity. This means there is a plethora of types of goods and services available on the shelves of markets and shops. The size of the type system we use has grown (in RDF-speak, this would be the number of different rdf:types). We're overloaded with the number of different product types we're able to choose from. It's the paradox of choice: faced with a larger number of goods, our ability to choose one of them goes down.
GoodRelations provides a way to annotate products and services on the Web so that search engines can deliver a better search experience to their users. It allows for deep search: a search that accepts very specific queries and gives very precise answers. With GoodRelations you can retain the specificity of your offering and harness it in search. This makes it possible to target niche markets and reach customers with highly specific needs in the long tail.
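
As a rough sketch of what deep search means in practice, the snippet below filters a handful of structured offers by several exact criteria at once. The offers, the field names and the deep_search function are all invented for illustration; a real search engine would query RDF data rather than Python dictionaries.

```python
# A few structured offers (hypothetical data).
offers = [
    {"type": "HikingBoot", "size": 42, "waterproof": True, "price": 89.0},
    {"type": "HikingBoot", "size": 42, "waterproof": False, "price": 59.0},
    {"type": "HikingBoot", "size": 44, "waterproof": True, "price": 95.0},
    {"type": "Sandal", "size": 42, "waterproof": False, "price": 25.0},
]


def deep_search(items, max_price=None, **required):
    """Return the items matching every required field and the price limit."""
    results = []
    for item in items:
        # Every requested property must match exactly.
        if any(item.get(key) != value for key, value in required.items()):
            continue
        # Optional upper bound on the price.
        if max_price is not None and item["price"] > max_price:
            continue
        results.append(item)
    return results


hits = deep_search(offers, type="HikingBoot", size=42, waterproof=True, max_price=100)
print(hits)
```

A keyword search for "waterproof hiking boots 42" would have to hope those words appear on the page. With structured data, each criterion is checked against an explicit property, so a very specific question gets a short, precise answer.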
We need better search engines built on structured data on the Web to alleviate the analysis paralysis that results from being overwhelmed by the number of things to choose from. The growing amount of GoodRelations-annotated data is a step towards a situation in which you'll be able to pose a specific question to a search engine and get a list of only the highly relevant results.
E-commerce applications and ontologies such as GoodRelations or the Product Ontology show a pragmatic approach to the use of semantic web technologies. Martin Hepp also mentioned his pragmatic view of linked data. In his opinion, the most important links are the ones that create the most business advantage. It was interesting to see parts of the semantic web that work. It seems we're headed to a future of data-driven e-commerce.
