W3C compliant code

This post belongs to a series about search engine optimization (SEO) with Magnolia CMS. Today we look at creating W3C compliant code.

In order to crawl and index a website, a search engine has to be able to read and interpret its code. A page coded in compliance with standards developed by the World Wide Web Consortium (W3C) is simpler for search engines to parse.

The standards page at W3C lists more than 100 specifications, ranging from accessibility to XSLT. While this can seem overwhelming, four of the standards are key:
  • Valid code (HTML, XML, XHTML)
  • Semantically correct code
  • CSS
  • DOM and ECMAScript


Valid code

The Magnolia Templating Kit is a best-practice framework that produces valid XHTML. The templates can be used to build compelling websites quickly and efficiently without the need for technical expertise.

The use of XHTML markup is declared with a DOCTYPE declaration, as you can see in the source code of any page created using a Magnolia template.


The page validates against the Document Type Definition (DTD) named in that DOCTYPE declaration. The structure of a Magnolia page conforms to the document type by having an html element with the XHTML namespace declared, a head element that includes a title element, and a body element. All element and attribute names are written in lower case, and attribute values are enclosed in quotes. All non-empty elements (e.g. p, li) are properly terminated with a closing tag. All empty elements (e.g. br, hr, img) are properly terminated with a trailing slash (<br />).
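The structural rules above can be spot-checked with a few lines of Python. This is a minimal sketch, not a replacement for the W3C validator: it only tests well-formedness and the basic html/head/title/body skeleton, and it does not validate against the DTD. The sample page, its XHTML 1.0 Strict DOCTYPE and the function name check_xhtml_structure are assumptions made for illustration; the DOCTYPE that a real Magnolia template emits may differ.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample page; the DOCTYPE shown here is an assumption.
SAMPLE_PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example page</title></head>
  <body>
    <p>First line<br />second line</p>
  </body>
</html>"""


def check_xhtml_structure(source):
    """Return a list of problems; an empty list means every check passed."""
    try:
        root = ET.fromstring(source)
    except ET.ParseError as err:
        # Unclosed tags, unquoted attribute values and similar errors end up here.
        return ["not well-formed: %s" % err]

    problems = []
    if root.tag != "{%s}html" % XHTML_NS:
        problems.append("root element is not html in the XHTML namespace")
    head = root.find("{%s}head" % XHTML_NS)
    if head is None or head.find("{%s}title" % XHTML_NS) is None:
        problems.append("head with a title element is missing")
    if root.find("{%s}body" % XHTML_NS) is None:
        problems.append("body element is missing")
    # Element and attribute names must be written in lower case.
    for element in root.iter():
        for name in [element.tag] + list(element.attrib):
            local = name.rsplit("}", 1)[-1]  # drop the "{namespace}" prefix
            if local != local.lower():
                problems.append("name is not lower case: %s" % local)
    return problems


print(check_xhtml_structure(SAMPLE_PAGE))
```

Because XHTML is XML, an ordinary XML parser already rejects unclosed elements and unquoted attribute values, which is why the sketch needs no HTML-specific library.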

Semantically correct code

Semantic markup describes what the content of a page means, rather than how it should look.

In Magnolia templates, heading elements (h1, h2, h3 etc.) are used in the correct order without skipping levels, lists are used to list items, and tables are only used to present tabular data, not to create page layouts. Try removing CSS, JavaScript and images from your code. Is the code still understandable? Does it make sense to a reader? If yes, then it is semantically good code.
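The heading rule is easy to check automatically. The following is a rough sketch, assuming that "without skipping levels" means a heading may be at most one level deeper than the heading before it; the class and function names are hypothetical and not part of Magnolia.

```python
import re
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the levels of h1..h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        match = re.fullmatch(r"h([1-6])", tag)
        if match:
            self.levels.append(int(match.group(1)))


def skipped_heading_levels(markup):
    """Return (previous, current) pairs where a heading skips a level."""
    collector = HeadingCollector()
    collector.feed(markup)
    collector.close()
    skips = []
    previous = 0  # start of document counts as level 0, so the first heading should be h1
    for level in collector.levels:
        if level > previous + 1:
            skips.append((previous, level))
        previous = level
    return skips


print(skipped_heading_levels("<h1>A</h1><h2>B</h2><h4>C</h4>"))  # [(2, 4)]
```

Moving back up the hierarchy, for example from h3 to h2, is not reported, since that is how a new section normally begins.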

CSS

Cascading Style Sheets (CSS) change the appearance of page elements by assigning styles to them. A style sheet gives a consistent appearance to an entire site. Semantic XHTML and CSS can improve your rankings because they lead to better crawling, faster website response, and better accessibility and usability. These in turn can raise conversions and make it more likely that other sites link to yours.

The look and feel of a Magnolia-based website is controlled by a single CSS file. Since the XHTML structure is designed to be styled with CSS, appearances can be changed without touching templates. Designers working with CSS don't necessarily need to know anything about Magnolia in order to create a compatible site design. They can start with the static HTML prototype that ships with the system, pick an HTML structure that is a close match to their original design, and then modify it further.

Magnolia CSS includes some browser-specific properties such as -moz-border-radius that are not part of the official CSS specification. Such properties are not used in areas critical to usability and fall back to standard properties gracefully.
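One way to confirm that a vendor-prefixed property has a standard fallback is to look for its unprefixed counterpart in the same rule. The sketch below is an illustration only: it uses a naive regular expression that handles flat rules but not comments or @media blocks, and the selectors .teaser and .box are made up rather than taken from Magnolia's style sheet.

```python
import re

VENDOR_PREFIX = re.compile(r"^-(?:moz|webkit|ms|o)-")


def vendor_properties_without_fallback(css):
    """Return (selector, property) pairs for prefixed properties that have
    no unprefixed counterpart in the same rule."""
    missing = []
    # Naive rule matcher: "selector { declarations }" with no nesting.
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        properties = [
            declaration.split(":", 1)[0].strip().lower()
            for declaration in body.split(";")
            if ":" in declaration
        ]
        for name in properties:
            standard = VENDOR_PREFIX.sub("", name)
            if standard != name and standard not in properties:
                missing.append((selector.strip(), name))
    return missing


# Hypothetical style sheet fragment.
SAMPLE_CSS = """
.teaser { -moz-border-radius: 4px; border-radius: 4px; }
.box { -webkit-box-shadow: 0 0 2px #000; }
"""

print(vendor_properties_without_fallback(SAMPLE_CSS))
```

Here only the .box rule is reported, because its prefixed property has no standard equivalent next to it.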

DOM and ECMAScript

The Document Object Model (DOM) gives a scripting language such as ECMAScript, the standardized version of JavaScript, easy access to the structure, content and presentation of a document. The DOM is language-neutral, so any scripting language can interact with the document. As long as your site has useful content, JavaScript code plays virtually no role in SEO. The only thing that matters is where the code resides.

Magnolia uses jQuery, the most popular JavaScript library. jQuery is a cross-browser library designed to simplify client-side scripting of HTML. It supports DOM element selection, traversal and modification. The entire library is stored in a single external JavaScript file rather than embedded in the page HTML. An external file saves bandwidth by reducing page length, yielding faster downloads.
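To see where the script code on a page resides, you can count external and inline script elements. This is a small sketch under stated assumptions: the names ScriptCollector and summarize_scripts are hypothetical, and the path /js/jquery.js is a placeholder, not the location Magnolia actually uses.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Separate external scripts (with a src attribute) from inline ones."""

    def __init__(self):
        super().__init__()
        self.external = []     # src values of external scripts
        self.inline_count = 0  # number of script elements without src

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            self.external.append(src)
        else:
            self.inline_count += 1


def summarize_scripts(markup):
    """Return (list of external script URLs, number of inline scripts)."""
    collector = ScriptCollector()
    collector.feed(markup)
    collector.close()
    return collector.external, collector.inline_count


# Hypothetical page head with one external and one inline script.
SAMPLE_HEAD = (
    '<head>'
    '<script type="text/javascript" src="/js/jquery.js"></script>'
    '<script type="text/javascript">var x = 1;</script>'
    '</head>'
)

print(summarize_scripts(SAMPLE_HEAD))  # (['/js/jquery.js'], 1)
```

A page that keeps its scripts external shows a short list of URLs and an inline count at or near zero.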