HTML and Semantics – Conjoined Twins of the Future Web!


Around 4-5 years ago, my daily knowledge-building and socializing activities went like this: read books, attend some sessions or training, visit a friend’s place, engage in a constructive debate, take a walk in the nearby park … and so on. Years later, the scenario is slowly changing: I hardly find time for even a few of the items listed above, and many additional items have crept into the list!

Technology has taken over, and the collective wisdom of humanity is slowly being harnessed onto the most powerful medium of today: the WEB! Kids are more concerned about making friends on their favorite social-networking site than at school or college, and we talk through comments and write on walls (I would’ve been happier if we burned that many calories writing on an actual wall, but that is another matter :) ). The internet is slowly becoming a part of daily life. Imagine a day when all the data you need could be gathered from the millions of servers on the web and custom-tailored to fit your global profile page, without your having to dig deep into the complex web traffic. Yes, that is where the web is heading, and that day will be called ‘The day of the Semantic Web’. The most important players in that era will be none other than our old kid ‘HTML’ and the relatively sophisticated kid ‘Semantics’.

I plan to throw light on the relationship between these two from a web developer’s point of view.

What is ‘SEMANTICS’?

Wikipedia says: ‘Semantics (from Greek semantikos) is the study of meaning, usually in language’. Relating this to the internet, semantics is the study of the meaning of data, i.e., of what lets a machine understand the type of data it is dealing with (as far as the web is concerned, the ‘machine’ can be the browser or any program that deals with the data).

Example: A page may contain a group of related data, like the steps to be taken to accomplish a task. There may be tabular data that comprises the details of a product, its configurations and its price. There may be a date on which a particular document was created … and the list goes on and on. To make such content meaningful to machines, we need to follow some global guidelines while creating it, which mark each piece of data as being of one type or another. Streamlining the data like this makes it much easier to make it ‘machine readable’, and this is the first step towards efficient data segregation and relation.

For instance, how would it feel if you could aggregate all the data available about a new gadget that you are planning to buy (the pricing, reviews, configuration and features), all displayed on a single page, fetched from the millions of machine-readable pieces of data available over the web!

This exactly is the future of the web, and the search god ‘Google’ is already relying on such search features, which would make it the ‘uber-god’ of the search engine world!

How is HTML related to ‘Semantics’?

A browser is the primary interface that connects the user to the internet (computer screens, mobile screens, TVs and projectors are all different media, but ultimately all of them need a browser to download the page locally and display its contents), and there is only one way in which content can be displayed in a browser: the ‘HTML’ way. So HTML holds the major stake in determining how data is displayed on a page, and the ‘tags’ used to render the output as HTML determine the meaningfulness of the data they contain. This is important because the same piece of content can be rendered using more than one HTML tag.

The page or content is said to be meaningfully rendered, or ‘semantically correct’, if it is rendered using tags that represent the correct meaning of the data: for example, a date, a quote or a label, each rendered in the tag meant for that kind of content.
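To make the idea concrete, here is a minimal sketch of the same three pieces of content marked up twice. The choice of tags (time, blockquote, label) and all the sample text are my own assumptions for illustration; time in particular is an HTML5 element.

```html
<!-- Less meaningful: generic containers; only the class names hint at the content -->
<span class="date">12 March 2010</span>
<div class="quote">Sample quotation text.</div>
<span class="label">Email</span> <input type="text" name="email">

<!-- More meaningful: each piece of content sits in a tag that states its type -->
<time datetime="2010-03-12">12 March 2010</time>
<blockquote><p>Sample quotation text.</p></blockquote>
<label for="email">Email</label> <input type="email" id="email" name="email">
```

Both versions can be styled to look identical in a browser; only the second tells a machine that it is looking at a date, a quotation and a form label.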

Find below the ‘semantics’ of a few HTML tags

(courtesy: htmlplayground)

  • h1: Normally the main header of the page. An ideal page should have a single h1 tag in it, though there is some constructive debate on this point. Headers from h1 to h6 should be used hierarchically to denote the importance of each header in its context.
  • abbr: Indicates an abbreviated form, like “Inc.” or “etc.”. By marking up abbreviations you can give useful information to browsers, spell checkers, translation systems and search-engine indexers.
  • address: Defines the start of an address. You should use it to define addresses, signatures, or authorships of documents. The address usually renders in italic. Most browsers will add a line break before and after the address element, but line breaks inside the text you have to insert yourself.
  • sup: Defines superscript text.
  • div: Defines a division/section in a document. Browsers usually place a line break before and after the div element. Use the <div> tag to group block elements to format them with styles.
  • kbd: Defines keyboard text. Phrase element. It is possible to achieve a much richer effect using style sheets.
  • legend: Defines a caption for a fieldset.
  • big: Renders as bigger text. Not deprecated in HTML 4 (though HTML5 makes it obsolete), but it is better to achieve richer effects using style sheets.
  • p: Defines a paragraph.
  • button: Defines a push button. Inside a button element you can put content, like text or images. This is the difference between this element and buttons created with the input element.
  • ol: Defines the start of an ordered list.
  • cite: Defines a citation. Phrase element. It is possible to achieve a much richer effect using style sheets.
  • em: Renders as emphasized text. Phrase element. It is possible to achieve a much richer effect using style sheets.
  • pre: Defines preformatted text. The text enclosed in the pre element usually preserves spaces and line breaks, and renders in a fixed-pitch font. It is worth noting that while <xmp> is deprecated, the <pre> tag does not perform all of the functions of <xmp>: <pre><b>Hello</b></pre> displays Hello in bold, whereas <xmp><b>Hello</b></xmp> displays the literal text <b>Hello</b>.
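As a rough illustration of how several of these tags sit together, here is a small page sketch. The page topic, the company name, the book title and the contact details are all made up for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Installing the widget</title>
</head>
<body>
  <!-- h1: the single main header of the page -->
  <h1>Installing the widget</h1>

  <!-- p, abbr, cite, em: a paragraph with phrase-level meaning -->
  <p>
    The widget is made by Example <abbr title="Incorporated">Inc.</abbr>
    Please read <cite>The Widget Handbook</cite> <em>before</em> you begin.
  </p>

  <!-- ol: the order of the steps matters -->
  <ol>
    <li>Open a terminal.</li>
    <li>Press <kbd>Ctrl</kbd> + <kbd>C</kbd> to stop any running job.</li>
    <li>Run the command shown below.</li>
  </ol>

  <!-- pre: spaces and line breaks are preserved as typed -->
  <pre>install widget
  --verbose</pre>

  <!-- address: contact details for the author of the page -->
  <address>
    Written by Jane Doe<br>
    jane@example.com
  </address>
</body>
</html>
```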

Why Semantics?

The actual advantages of sticking to semantically correct HTML are numerous. Here are a few of them …

Search Engine Optimization

With the advent of the web, information is money, and getting the right information at the right time matters all the more, since nobody wants to waste time searching for a piece of information for too long. Here comes the era of search engines, and the key is to make the content on your piece of land on the web appear in a decent position on the SERPs (Search Engine Result Pages). Using semantically correct code means that your page is easy for machines to read, and ‘search engine robots’ are nothing but the machines / programs of search engines :). So semantically well-coded pages stand a better chance of appearing at the top of the SERPs.

Scalability / Flexibility

Updating pages that are semantically well coded is a pleasure! Since each section is coded using the relevant tag, a change in the style of an element can be made by updating the CSS selector for that particular tag, and markup-level changes can also be made seamlessly.
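A small sketch of what this looks like in practice, assuming a page that marks up its quotations and keyboard hints with blockquote and kbd (the colours and sample text are arbitrary):

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Styling by tag</title>
  <style>
    /* One rule restyles every quotation on the page */
    blockquote {
      border-left: 4px solid #999;
      font-style: italic;
      padding-left: 1em;
    }
    /* One rule restyles every keyboard hint */
    kbd {
      background: #eee;
      border: 1px solid #ccc;
      padding: 0 0.3em;
    }
  </style>
</head>
<body>
  <blockquote><p>First sample quotation.</p></blockquote>
  <p>Press <kbd>Enter</kbd> to continue.</p>
  <blockquote><p>Second sample quotation.</p></blockquote>
</body>
</html>
```

Changing the look of every quotation later means editing the single blockquote rule, with no change to the markup.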


Accessibility

There are visually impaired users on the internet, and they may use a screen reader to read the contents aloud. Screen readers are programs (machines, in our terminology :) ) that read the content of the page by trying to understand the meaning of the data represented.

There is no better way to make it easy for these programs to understand the content of a page than to code it semantically well.

Cleaner and shorter code

A semantically correct page means cleaner and crisper code, which also means lightweight stuff for the browser to load!

‘The pursuit of Semantic Wellness’ – some practical insights

The pursuit of ‘semantic wellness’ never ends! Every time you code a page and assume it to be the most semantically perfect markup ever, along comes a better, more correct way to represent the same thing. It takes practice, experience and a thorough knowledge of the HTML specifications to make a coder compliant with semantics. Here are some insights that I have gained over time as a UX evangelist and front-end engineer, which should help you succeed in your pursuit of ‘semantic wellness’ :)

Avoid ‘DIV syndrome’

Separating content and presentation using a DIV-based layout with CSS doesn’t mean that DIV is the only tag you should be using. Avoid ‘DIVITIS’ by carefully choosing the tags for coding your page. This is the first step towards ‘semantic wellness’. A great resource to start with is ‘htmlplayground’.
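Here is a small sketch of what ‘DIVITIS’ tends to look like, next to one possible cure. The class names and the content are invented for the example.

```html
<!-- DIVITIS: everything is a div, so the structure lives only in class names -->
<div class="post">
  <div class="post-title">Weekly update</div>
  <div class="post-text">Three things shipped this week:</div>
  <div class="post-items">
    <div class="post-item">New login page</div>
    <div class="post-item">Faster search</div>
    <div class="post-item">Bug fixes</div>
  </div>
</div>

<!-- Same content: div is kept only as a grouping wrapper for styling -->
<div class="post">
  <h2>Weekly update</h2>
  <p>Three things shipped this week:</p>
  <ul>
    <li>New login page</li>
    <li>Faster search</li>
    <li>Bug fixes</li>
  </ul>
</div>
```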

It really is a pain, but learn to embrace it

Fast-tracking your HTML coding can save you time, and you may be able to earn more by billing the client more. But ‘better things come at a cost’! If you want to be a semantically correct coder, be ready to bear the brunt of the extra hours and the frustrating cross-browser issues and fixes. Remember, though, that all those hardships are worth it, and the more you learn to embrace them, the more valuable you will be as a web standards evangelist.

Know the structure of the content/information

To know which tags to use where, it is extremely important to understand the type of content at an Information Architecture level.

For example:

  • The main navigation of a page is a list of items that have something in common, i.e., they all belong to the logical set called main navigation. Here the ideal HTML tag candidate is the unordered list (ul).
  • A step wizard, where a number of steps are required to accomplish a task, has the chronological order of the steps as a very important factor. Here the ideal HTML tag candidate is the ordered list (ol).
  • A list of posts displayed on a page is nothing but an unordered list (ul).
  • A glossary with alphabetical segregation is a set of couplets, each a term and its definition. Here the ideal HTML tag candidate is the definition list (dl).
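The cases above could be marked up roughly as follows. The ids, link targets and text are placeholders of my own.

```html
<!-- Main navigation: the items belong to one set, and their order carries no meaning -->
<ul id="main-nav">
  <li><a href="/">Home</a></li>
  <li><a href="/articles">Articles</a></li>
  <li><a href="/contact">Contact</a></li>
</ul>

<!-- Step wizard: the sequence is part of the meaning -->
<ol id="signup-steps">
  <li>Choose a plan</li>
  <li>Enter your details</li>
  <li>Confirm your order</li>
</ol>

<!-- List of posts: again an unordered set of items -->
<ul id="recent-posts">
  <li><a href="/posts/first-post">First post</a></li>
  <li><a href="/posts/second-post">Second post</a></li>
</ul>

<!-- Glossary: each entry is a term (dt) paired with its definition (dd) -->
<dl id="glossary">
  <dt>HTML</dt>
  <dd>HyperText Markup Language, the markup used for web pages.</dd>
  <dt>SERP</dt>
  <dd>Search Engine Result Page.</dd>
</dl>
```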

Watch your class names and ID names

Semantic wellness is not only a matter of well-written HTML tags; it spills over into the CSS as well. I can easily spot a semantics-compliant coder by scanning the naming conventions of the classes and IDs in the CSS file!

Always avoid presentational names for classes or IDs; the names you pick tell the world about your attitude towards semantics. Read this for more insights on structural naming convention.
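A quick sketch of the difference, with names I have made up for the purpose:

```html
<!-- Presentational names: they describe the look, and turn misleading when the design changes -->
<div id="left-column" class="blue-box">
  <p class="red-bold-text">Your session has expired.</p>
</div>

<!-- Structural names: they describe the role of the content, whatever it looks like -->
<div id="sidebar" class="account-status">
  <p class="error-message">Your session has expired.</p>
</div>
```

If the sidebar later moves to the right or the error turns orange, the second set of names still makes sense.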

HTML5 – The big leap towards ‘Semantic wellness’

As I said in the introduction of this article, the coming era of the web will be that of the Semantic Web, and the giants of the web have already started embracing it in a very big way. This latest addition to the HTML family is going to be the web’s biggest leap towards a semantically correct place. With a lot of groundbreaking new tags introduced, and the dependency on various third-party tools being dropped, this version of HTML is truly going to lead the bandwagon towards the semantic web.

Here is a brief account of the new tags introduced in HTML5:

  • header: The header of a 'section', typically a headline or grouping of headlines, but it may also contain supplemental information about the section.
  • footer: Logically represents the footer section, with the copyright and other footer information, including the footer links.
  • nav: Represents the navigation of the page, normally a list of links that lead to the respective contents or pages. This tag should be at the same level as the header, footer and main section tags used in a particular page.
  • article: Represents content that is logically equivalent to a blog entry, an article or some other content from an external source.
  • aside: Indicates content that is tangentially related to the content around it.
  • audio: Defines sound, like music or audio streams.
  • video: Defines video, like a video clip or streaming video.

The following is a visual representation of the architecture of HTML tags in a typical web page.
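In markup form, a page built from these HTML5 tags might look roughly like the sketch below. The file names, link targets and text are placeholders of my own and are not taken from the diagram.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Sample blog</title>
</head>
<body>
  <!-- header: headline and supplemental information for the page -->
  <header>
    <h1>Sample blog</h1>
  </header>

  <!-- nav: the main navigation, at the same level as header and footer -->
  <nav>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/archive">Archive</a></li>
    </ul>
  </nav>

  <!-- article: a self-contained entry, such as a blog post -->
  <article>
    <header>
      <h2>A sample post</h2>
    </header>
    <p>Body text of the post.</p>
    <audio src="interview.mp3" controls>Your browser does not support the audio element.</audio>
    <video src="demo.mp4" controls width="320">Your browser does not support the video element.</video>
  </article>

  <!-- aside: content only tangentially related to the article -->
  <aside>
    <p>Related reading and other tangential content.</p>
  </aside>

  <!-- footer: copyright and footer links -->
  <footer>
    <p>Copyright notice and footer links.</p>
  </footer>
</body>
</html>
```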

Resources to help you out in ‘The Pursuit of Semantic Wellness’

Semantics is a vast ocean by itself … and what I have covered here is its relation to HTML. This article intends to throw light on the aspects of semantics involved in coding an HTML page, and is aimed at aspiring front-end engineers and web standards evangelists, to help them bring out their best. I am sharing a few of the resources that I have gathered over the years, which may help you become a semantic rockstar :)

Don’t forget to share your input with us via the comments section. Thanks for reading.



  1. Mathew Burrell

    If we are to have a semantic web, html has to be at the forefront. HTML should be the one language for the web creating a universal framework for applications to communicate over the web. Thanks for a great article.

  2. David Ball

    A nice article explaining an introduction to semantics, but many of the HTML tags you mentioned have now been superseded or changed a bit in the latest HTML5 specification. For example, there can now quite legitimately be many h1’s on a page, one for each section.

  3. Bruno Fonseca

    Amazing article, congratulations, this made many of my doubts dissapear. thanks!

  4. Husien Adel

    thanks for very useful article about the semantic web future in it’s html language :D

  5. Adie

    Amazing post. I particularly appreciate the little piece on HTML5 stuff.

    I found a site just the other day that i had coded without thinking semantically and i have to say it looked a complete mess looking back on it now.

  6. Very nice article, I’d like to add one little tip. Whenever possible try to stick to standard id and class names like microformats. For example use hCard for addresses, hCalendar for events, etc.

    To be honest, I am not really sure if outside of microformats and similar your class and id names really matter for giving meaning to the data they contain. Although in principal you are right, html should not contain presentational markup.

    And let’s not stop at id’s and classes, there are a lot of other attributes that are very useful in describing your content: rel, cite, lang etc.

    • Ranjith

      Thanks Joost !
      Its a gud idea … i prefer short names for ID / Classnames, like ftr for footer, cntnr for container, etc. microformat is a gud comparison :)

  7. Nice article but the the link on “Read this” in the sentence “Read this for more insights on structural naming convention” shows your missing a semantic trick!

  8. Dan Sunderland

    Worth mentioning the “section” tag as well. I’ve been using this on some recent designs to break up functional areas of the site below the header.

    Good article, thanks.

  9. JASIL

    Hello Ranjith ,

    It was realy amazing article, past 2 months am following your buzz articles and links, it’s very helpful for my team.. thanks again….


  10. Pramod Nair

    Good Article showcasing the importance of something we usually ignore. In recent times, I have realized the importance of semantic coding more and more, especially in terms of isolating the presentation layer, accessibility and dev support. HTML provides numerours options for semantic design but the more important ones are hardly used in everyday development. A favourite of mine is the tags which is the best way to highlight key value pairs.

  11. Balachandar

    I stumbled upon this from delicious home page. THats a well written informative stuff. Good job. Thanks

  12. Shankar

    Excellent piece of information for source code doer. Well, you’ve given enough things to develop an efficient like Google. Great info Perfect UX enthusiast !!

  13. Ranjith

    Hi Kadikoy, We are just talking about the html coding of the page and not design :) … there are all the creative freedom to do a stunning Visual design, for the designer :)