Semantics for Search

When we talk about HTML, we usually highlight its structural function. This time, however, we are going to discuss semantics, and especially the reasons why it doesn’t give us all the advantages it promised. The truth is that we don’t spend so much time and effort on writing structurally and semantically valuable HTML for screen reader users only.

Google – The Best?

As the internet grows, it gets harder to find valuable sources. Have you noticed that when you look for something on the Web, you can wade through irrelevant and outdated sites for a long while before you find what you are looking for? Google is the leader in the search engine market. However, it lacks some important things that would make our browsing easier. First of all, there is the date factor. With enough time to build a strong link base, older articles have a better chance of ranking higher than more relevant but more recent ones. Thus, we can end up with outdated information.

Lack of recognition of content types is another drawback of search engines that deserves attention. People looking for film reviews want a list of reviews, not pages that merely contain the word. This is where semantics can be quite useful.

As we know, some attempts have been made to extend the semantic power of HTML code. Today there are two methodologies in use: Microformats and HTML5 microdata. There is also RDF, but we will discuss it later.

Microformats extend HTML semantics using standardized (not necessarily semantic) class names. The most popular Microformat is hCard, which holds the data of a person or company (name, address, contact data, etc.). Other formats have been defined too, but as a rule they are ignored by the web.
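To make the class-name idea concrete, here is a minimal sketch of how a crawler might read an hCard. The sample markup, the person and company in it, and the `HCardParser` helper are all assumptions made up for illustration; only the class names (`vcard`, `fn`, `org`, `email`) come from the hCard format itself. It is not a full Microformats parser.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up as an hCard.
HCARD_HTML = """
<div class="vcard">
  <span class="fn">Jane Doe</span>
  <span class="org">Example Corp</span>
  <a class="email" href="mailto:jane@example.com">jane@example.com</a>
</div>
"""


class HCardParser(HTMLParser):
    """Collect the text of elements whose class is a known hCard property.

    Simplifications: it does not check that a property sits inside a
    "vcard" root, and it assumes every start tag has an end tag
    (no void elements such as <br>).
    """

    PROPERTIES = {"fn", "org", "email", "tel", "adr"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._stack = []  # one entry per open element: property name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in self.PROPERTIES), None)
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        prop = self._stack[-1] if self._stack else None
        text = data.strip()
        if prop and text:
            self.card[prop] = text


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    return parser.card


card = parse_hcard(HCARD_HTML)
print(card)
```

Notice that nothing in the markup is new HTML: the extra meaning lives entirely in agreed class names, which is both the charm and the weakness of the approach.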

The other option is HTML5 microdata. Using attributes such as itemscope, itemtype, itemid and itemprop, you can add extra semantics to your HTML. However, there are some problems, and all of them make things much more complex than they should be. For example, most values for itemprop seem to correspond with the class names you’d normally put on there, which you still need for styling.
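The duplication between itemprop values and class names is easy to see in a small sketch. The markup below, the film in it, the schema.org type URL and the `MicrodataParser` helper are assumptions chosen for illustration; the sketch handles a single flat item only and is not a complete microdata implementation.

```python
from html.parser import HTMLParser

# Hypothetical fragment: the same words appear as class and as itemprop.
MICRODATA_HTML = """
<div itemscope itemtype="https://schema.org/Movie">
  <h1 class="name" itemprop="name">Example Film</h1>
  <span class="director" itemprop="director">A. Director</span>
</div>
"""


class MicrodataParser(HTMLParser):
    """Read one flat microdata item and note class/itemprop duplicates."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.props = {}
        self.duplicated = []  # itemprop values that repeat a class name
        self._stack = []      # one entry per open element: itemprop or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:  # boolean attribute, its value is None
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        classes = (attrs.get("class") or "").split()
        if prop and prop in classes:
            self.duplicated.append(prop)
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        prop = self._stack[-1] if self._stack else None
        text = data.strip()
        if prop and text:
            self.props[prop] = text


item = MicrodataParser()
item.feed(MICRODATA_HTML)
print(item.item_type, item.props, item.duplicated)
```

In this sample every itemprop simply repeats a class name that is already there for styling, so the author ends up writing each label twice.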

Perhaps we are just overreaching here. We are trying to process content types on the web automatically and fully, and that is great, but all that users really need is to find what they searched for.

By trying to provide a full, standardized description of a content type, we make Microformats and microdata too complicated. The other thing to consider is that most people just want the raw data itself. When we decide to buy another TV, we need to be taken to valid product pages. That’s all. But we try to process everything at once. Maybe basic recognition would be enough, instead of complete definitions of content types.

What we are trying to say is that instead of defining a complex model for content types, we can start by defining standardized, semantic base identifiers. Let them be simple enough. For example, use “product” for products, “holiday” for holidays, “review” for reviews. You can prefix them, but that’s all. Let that work first, then process the data within. The Microformat ideology works, but sometimes it is too complicated and makes search results irrelevant, which disappoints some users.
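As a rough sketch of how little a search engine would need for this kind of basic recognition, the snippet below tags pages by a single prefixed class name. The `ct-` prefix, the two sample pages and the helper names are hypothetical; no such standard exists, and this only illustrates the idea.

```python
from html.parser import HTMLParser

KNOWN_TYPES = {"product", "holiday", "review"}
PREFIX = "ct-"  # hypothetical prefix, not part of any standard


class ContentTypeParser(HTMLParser):
    """Collect the base content types declared through class names."""

    def __init__(self):
        super().__init__()
        self.types = set()

    def handle_starttag(self, tag, attrs):
        for name in (dict(attrs).get("class") or "").split():
            if name.startswith(PREFIX) and name[len(PREFIX):] in KNOWN_TYPES:
                self.types.add(name[len(PREFIX):])


def content_types(html):
    parser = ContentTypeParser()
    parser.feed(html)
    return parser.types


# Two made-up pages: both contain the word "review", only one is a review.
PAGES = {
    "review.html": '<article class="ct-review"><h1>Example Film</h1>'
                   "<p>A fine film.</p></article>",
    "blog.html": "<article><p>I read a review of this film yesterday."
                 "</p></article>",
}


def pages_of_type(pages, wanted):
    """Return the pages that declare the wanted content type."""
    return sorted(url for url, html in pages.items()
                  if wanted in content_types(html))


print(pages_of_type(PAGES, "review"))
```

A plain keyword match would return both pages here; the single identifier is enough to separate the actual review from a page that only mentions the word.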

