Let’s get to the crux of the matter. It never fails to make me smile when people misuse the term ‘semantic search’, so really… what is semantic web search?
A simple definition of semantic search
We often talk of semantic search as if it’s something new, but the idea goes back years, even centuries. Here is a simple definition of semantic search. Let’s take a village. Mark and Claire have a seven-year-old son, James, and a six-month-old baby, Olivia, who has asthma. Mark and Claire want to go to a dinner party next Thursday, alone. They need a babysitter. Not just any babysitter, but one who will keep James away from video games after 7pm and who has experience minding newborn children and managing asthma.
If Mark and Claire were to go to a search engine they might type something like:
And we have not even got to the Thursday thing or even included location… The engine fails to return a list of babysitters who are strict and know how to work video games (say James puts a video game on and the babysitter needs to take it out or, in the most extreme case, disconnect the console) and who also have experience, or a certification, in infant childminding that covers asthma management. The day an engine returns a list of babysitters that meets all of Mark and Claire’s requirements is the day we have semantic search. We are a long way off from this.
But what about Knowledge Graphs and bases, are these semantic?
Google’s Knowledge Graph, and the Knowledge Bases of Bing, Yahoo and Baidu, have all been working on presenting media objects to the user through an additional SERP snippet or two. These graphs and bases do touch on semantic search, but engines first need to understand the query. This is where Google’s Hummingbird, now helped by Artificial Intelligence, is starting to lead the way. Just look at translation services: ask a native speaker and you will see that search engines cannot yet properly translate queries into full, local dialect. And keep in mind that Hummingbird was released over two and a half years ago!
What does Google Hummingbird really do to queries?
Google Hummingbird attempts to examine queries, usually those more than two keywords long, and first filters out which keywords are required and which are optional. There must always be one required keyword, which is also the subject keyword. Subject keywords are searched for semantically, and today this often just means synonyms, a bit like an online thesaurus. For true semantic search, the engine must deconstruct the whole query, reformulate it with variations, match those variations with semantics, and construct sub-queries for each combination. To do this properly engines need to add a segmented, semantic tab to their index.
Media objects, such as webpages, images, audio clips and social media profiles, have always been connected within the current web. This is what Knowledge Graph and the Knowledge Bases use. Not semantics. Knowledge Graphs and Bases are often called semantic search, and media objects are often called entities. Semantic search goes further by also connecting media objects to the objects themselves (e.g. people, places, organisations and events).
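
To make the query handling described above a little more concrete, here is a minimal sketch in Python. It is not how Hummingbird is actually implemented (that is not public). It only illustrates the idea of splitting a query into required and optional keywords, expanding each with synonyms, and building one sub-query per combination. The `SYNONYMS` table and the `build_subqueries` function are hypothetical names invented for this illustration.

```python
from itertools import product

# Hypothetical thesaurus: illustrative entries only, not real engine data.
SYNONYMS = {
    "babysitter": ["babysitter", "childminder", "nanny"],
    "strict": ["strict", "firm"],
    "asthma": ["asthma"],
}


def build_subqueries(required, optional, synonyms=SYNONYMS):
    """Return one sub-query (a tuple of terms) per combination of variations.

    Required keywords always appear, in one of their synonym forms.
    Optional keywords appear in one of their forms or are left out (None).
    """
    required_slots = [synonyms.get(word, [word]) for word in required]
    optional_slots = [synonyms.get(word, [word]) + [None] for word in optional]

    subqueries = []
    for combo in product(*required_slots, *optional_slots):
        # Drop the None placeholders that mark a skipped optional keyword.
        subqueries.append(tuple(term for term in combo if term is not None))
    return subqueries


queries = build_subqueries(required=["babysitter"], optional=["strict", "asthma"])
for q in queries:
    print(" ".join(q))
```

Even this toy version shows why synonyms alone are not semantics: the number of sub-queries grows quickly with every extra keyword, and none of them captures what Mark and Claire actually mean by "strict".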
Modes of collecting semantic information on the web
- Site owners voluntarily tell the engine through Schema markup
- The engine discovers it for itself by using HTML scrapers
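
The two modes above can be sketched with nothing more than Python's standard library. This is a rough illustration, assuming that "Schema" here means schema.org vocabulary embedded as JSON-LD. The `PageReader` class and the sample `PAGE` are hypothetical and invented for this example, and a real engine's scraper is far more involved.

```python
import json
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Read a page both ways: declared markup and plain HTML scraping."""

    def __init__(self):
        super().__init__()
        self.declared = []  # mode 1: what the site owner tells the engine
        self.title = ""     # mode 2: what a scraper picks up from the HTML
        self._in_jsonld = False
        self._in_title = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []
        elif tag == "title":
            self._in_title = True

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)
        elif self._in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.declared.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed markup rather than fail the whole page
        elif tag == "title":
            self._in_title = False


# Hypothetical page, invented for illustration.
PAGE = """
<html><head>
<title>Village Babysitting</title>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Person",
 "name": "Example Sitter", "jobTitle": "Babysitter"}
</script>
</head><body><h1>Village Babysitting</h1></body></html>
"""

reader = PageReader()
reader.feed(PAGE)
print("Scraped title:", reader.title.strip())
for item in reader.declared:
    print("Declared:", item["@type"], "-", item["name"])
```

The declared data arrives already typed (a `Person` with a `jobTitle`), while the scraped title is just a string the engine still has to interpret. That difference is the gap between the two modes.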