Posted on 1 Comment

SEO Key Metrics and Factors for implementing an SEO quick check tool

If you are looking for a way to implement a simple but useful SEO quick check tool, here are some tips on the key metrics and factors you need to get started. The tool itself is written in PHP as a proof of concept that can be easily adapted and extended. First, let’s have a quick review of the SEO key metrics and factors.

SEO Key Metrics and Factors

Most SEO work happens at the HTML code level. Thus, inspecting the HTML code via its DOM tree plays an important part when conducting and evaluating SEO checks. A few of the key metrics and factors relevant to SEO are:

  1. Title
  2. Meta-description
  3. Meta-keywords (not really anymore but for the sake of completeness)
  4. OpenGraph Meta-tags (as alternative or addition to traditional meta-tags)
  5. Additional general Meta-tags (locale, Google webmaster tools verification, etc.)
  6. Headers <h*> and their ordering
  7. Alternate text attributes for images
  8. Microdata
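
As a rough sketch of item 6 above, heading order can be checked with PHP's built-in DOM extension. The helper names heading_levels() and heading_order_ok() are made up for this illustration, and the rule applied here (the first heading must be an h1, and no level may be skipped on the way down) is an assumption rather than an official SEO rule:

```php
<?php
// Hypothetical helper: returns the heading levels (1-6) in document order.
function heading_levels(string $html): array {
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // real-world markup is rarely clean
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $query = '//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6]';
    $levels = array();
    foreach ($xpath->query($query) as $node) {
        $levels[] = (int) substr($node->nodeName, 1); // "h2" => 2
    }
    return $levels;
}

// Hypothetical helper: true if no level is skipped when going deeper.
function heading_order_ok(array $levels): bool {
    $previous = 0;
    foreach ($levels as $level) {
        if ($level > $previous + 1) {
            return false; // e.g. an h4 directly after an h2
        }
        $previous = $level;
    }
    return true;
}
```

Going back up (an h2 after an h3) is allowed by this check; only jumps downwards count as a problem.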

Based on the underlying HTML code, the following metrics can be calculated:

  1. Length and “quality” of data provided
  2. Data volume
  3. Text to HTML ratio
  4. Loading time

Apart from these core metrics make sure that the general syntax is correct and matches the W3C standards:

  1. W3C validation

You should even go one step further and validate against the Web Content Accessibility Guidelines (WCAG):

  1. WCAG validation (level A-AAA)

In addition to the generated HTML, make sure to give search engines enough information about which pages should be indexed and which should be left aside, e.g. by providing an XML sitemap and a robots.txt file:

  1. XML sitemap
  2. robots.txt

The XML sitemap can either be a sitemap index consisting of multiple sitemaps, each referring for instance to a specific content type (posts vs. pages), or a simple list of URLs. Link metrics, in turn, can be divided into site-internal and external links:

  1. internal links
  2. external links
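
A minimal sketch of how links could be split into these two groups, using PHP's built-in DOMDocument and parse_url(). The function name classify_links() is hypothetical, and the rule used here is a simplification: a link counts as internal if it is relative or its host equals the host of the page, so subdomains are treated as external.

```php
<?php
// Hypothetical helper: groups the href values of all <a> tags.
function classify_links(string $html, string $pageUrl): array {
    $pageHost = strtolower((string) parse_url($pageUrl, PHP_URL_HOST));

    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $links = array('internal' => array(), 'external' => array());
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        // Skip empty links, in-page anchors and non-HTTP schemes.
        if ($href === '' || $href[0] === '#' || preg_match('#^(mailto|tel|javascript):#i', $href)) {
            continue;
        }
        $host = parse_url($href, PHP_URL_HOST); // null for relative URLs
        $isInternal = !is_string($host) || strtolower($host) === $pageHost;
        $links[$isInternal ? 'internal' : 'external'][] = $href;
    }
    return $links;
}
```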

When it comes to linking and SEO, acquiring link juice is the ultimate goal. Backlinks from preferably established websites pass link juice to your site, thus strengthening it. This list is not complete, and there are loads of details you need to keep in mind when dealing with SEO. Nevertheless, this post is about implementing an SEO quick check tool, right?

Implementing an SEO quick check tool

The following presents a proof of concept for implementing an SEO quick check tool written in PHP. Feel free to use it as a foundation. First of all, let’s assemble our toolset to save us a lot of trouble parsing and evaluating the DOM tree.

cURL

PHP ships with a cURL extension. Make sure that it is activated in your php.ini. We will be using cURL for getting various remote assets, starting with the website HTML code itself:

 
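function curl_get($url) {
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_TIMEOUT, 30);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  // certificate checks are switched off here; fine for a quick check, not for sensitive data
  curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

  // microtime(true) returns a float; without the argument it returns a "msec sec" string
  $start = microtime(true);
  $result = curl_exec($ch);
  $end = microtime(true);

  if ($result === false) {
    $error = curl_error($ch); // read the error before closing the handle
    curl_close($ch);
    return array('error' => $error);
  }

  curl_close($ch);

  return array('data' => $result, 'duration' => round($end - $start, 2));
}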

This function will be used throughout the SEO quick check tool and returns an array containing the fields

  1. data: the response body received from $url
  2. duration: the loading time in seconds, for benchmarking
  3. error: the cURL error message, only set if something went wrong

In case you need additional headers, etc. feel free to adjust this function to your needs.

Simple HTML DOM

Once we have the HTML code, we need to parse it into a DOM tree that we can evaluate. For PHP there exists a handy tool called Simple HTML DOM Parser that does a nice job parsing HTML code to a DOM:

$htmlDOM = str_get_html($html);

Yes, that’s all you need to parse the HTML code into a DOM object, which we will use to evaluate various tags. Please refer to the Simple HTML DOM Parser Manual for more information on how to use this tool.
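
To give an idea of what evaluating tags can look like, here is a small sketch. Since Simple HTML DOM is an external library, this example swaps in PHP's built-in DOMDocument and DOMXPath to stay self-contained; it is not the Simple HTML DOM API. The function name extract_seo_tags() and the returned array keys are made up for this illustration.

```php
<?php
// Hypothetical helper: pulls a few SEO-relevant values out of the HTML.
function extract_seo_tags(string $html): array {
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // do not emit warnings for sloppy markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($doc);

    $title = trim($xpath->evaluate('string(//title)'));
    $description = trim($xpath->evaluate('string(//meta[@name="description"]/@content)'));

    return array(
        'title'            => $title,
        'titleLength'      => strlen($title), // bytes; use mb_strlen() for multibyte text
        'description'      => $description,
        // An empty alt="" is valid for decorative images, so only a missing attribute counts.
        'imagesWithoutAlt' => $xpath->query('//img[not(@alt)]')->length,
    );
}
```

Missing tags simply show up as empty strings, which makes it easy to flag them in a report.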

SimpleXML

When dealing with XML in PHP, SimpleXML is a convenient choice. We will be using SimpleXML for parsing XML sitemaps. First, we need to check if an XML sitemap is present by inspecting the robots.txt, and then we will use the cURL function defined above to retrieve the sitemap for further inspection.

Check robots.txt

 
$robotsTxtResponse = curl_get($robotsUrl); //$url . "/robots.txt", use e.g. parse_url() to assemble URL correctly
$robotsTxt = $robotsTxtResponse['data']; //make sure to check if 'error' is not set in the response

So, let’s assume that robots.txt exists and the content is available through $robotsTxt.
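
The comment in the snippet suggests parse_url() for assembling the robots.txt URL. One possible way to do that is sketched below; robots_txt_url() is a hypothetical helper name, and the sketch assumes that robots.txt always lives at the root of the host, regardless of the path of the page being checked.

```php
<?php
// Hypothetical helper: builds the robots.txt URL for any page URL.
function robots_txt_url(string $url): ?string {
    $parts = parse_url($url);
    if ($parts === false || empty($parts['scheme']) || empty($parts['host'])) {
        return null; // not an absolute URL
    }
    $port = isset($parts['port']) ? ':' . $parts['port'] : '';

    // Path, query and fragment of the original URL are dropped on purpose.
    return $parts['scheme'] . '://' . $parts['host'] . $port . '/robots.txt';
}
```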

Load XML Sitemap

Based on the contents of robots.txt we can check if an XML sitemap is present:

$siteMapUrl = null;
$siteMapMatches = array();

// "m" lets ^ and $ match per line, "i" also accepts "sitemap:" in lower case
if (preg_match('#^Sitemap:\s*(\S+)\s*$#mi', $robotsTxt, $siteMapMatches)) {
  // we got ourselves a sitemap URL in $siteMapMatches[1]
  $siteMapUrl = $siteMapMatches[1];
}

Assuming the sitemap URL determined above is available in $siteMapUrl, our next step is to check if it’s a plain sitemap or a sitemap index, i.e. a list of sitemaps for various content types such as pages, posts, categories, etc.

 
// load sitemap
$siteMapResponse = curl_get($siteMapUrl);
$siteMapData = $siteMapResponse['data']; //make sure to check if 'error' is not set in the response

$isSitemapIndex = false;
$sitemaps = array();
$sitemapUrls = array();

if (preg_match('/<urlset/', $siteMapData)) { // plain sitemap
  $sitemapUrlIndex = new SimpleXMLElement($siteMapData);

  if (isset($sitemapUrlIndex->url)) {
    foreach ($sitemapUrlIndex->url as $v) {
      $sitemapUrls[] = (string) $v->loc; // cast the SimpleXMLElement to a plain string
    }
  }
} else if (preg_match('/<sitemapindex/', $siteMapData)) { // sitemap index
  $sitemapIndex = new SimpleXMLElement($siteMapData);

  if (isset($sitemapIndex->sitemap)) {
    $isSitemapIndex = true;
    foreach ($sitemapIndex->sitemap as $v) {
      $sitemaps[] = (string) $v->loc;
    }
  }
}

Depending on the contents of the original sitemap, this snippet collects either the page URLs of a plain sitemap or the URLs of the nested sitemaps listed in a sitemap index.
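
The same logic can be wrapped into a function that is easy to try out without any network access. This is a sketch under a few assumptions: parse_sitemap() and its return format are hypothetical, and the sitemap type is detected from the name of the root element instead of a regular expression.

```php
<?php
// Hypothetical helper: returns the sitemap type and all <loc> values.
function parse_sitemap(string $xmlString): array {
    $previous = libxml_use_internal_errors(true); // no warnings on broken XML
    $xml = simplexml_load_string($xmlString);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        return array('type' => 'invalid', 'locations' => array());
    }

    $type = $xml->getName(); // "urlset" or "sitemapindex"
    $locations = array();
    if ($type === 'urlset') {
        foreach ($xml->url as $entry) {
            $locations[] = trim((string) $entry->loc);
        }
    } elseif ($type === 'sitemapindex') {
        foreach ($xml->sitemap as $entry) {
            $locations[] = trim((string) $entry->loc);
        }
    } else {
        $type = 'unknown';
    }

    return array('type' => $type, 'locations' => $locations);
}
```

For a sitemap index, each returned location can be fetched with curl_get() and passed through the same function again.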

W3C Validator

In order to validate the HTML of a URL for W3C conformance you can use the handy w3c-validator by micheh. The code required to run this validator is pretty simple:

 
$validator = new \W3C\HtmlValidator();
$result = $validator->validateInput($html); // $html from above

if ($result->isValid()) {
  // Hurray! no errors found :)
} else {
  // Hmm... check failed
  //$result->getErrorCount()
  //$result->getWarningCount()
}

Again, please refer to the w3c-validator documentation for more information.

Google Web Search API

Although technically speaking deprecated, the Google Web Search API still is handy to quickly generate the search preview:

 
// use user's IP to reduce server-to-server requests
$googleWebSearchApiUrl = "https://ajax.googleapis.com/ajax/services/search/web?v=1.0&"
 . "q=site:" . urlencode($url) . "&userip=" . urlencode($_SERVER['REMOTE_ADDR']);

$googleWebSearchApiResponse = curl_get($googleWebSearchApiUrl);
// curl_get() returns an array, the JSON body is in 'data'; do some checks here if request succeeded
$googleWebSearchApiResponseArray = json_decode($googleWebSearchApiResponse['data'], true);

// access data from response
$searchResults = $googleWebSearchApiResponseArray['responseData']['cursor']['resultCount'];
$searchResultAdditionalData = $googleWebSearchApiResponseArray['responseData']['results'];

Conclusion

As you can see, implementing a basic SEO quick check tool can be achieved with a basic set of tools and frameworks. Furthermore, the key metrics it determines let you quickly identify potential SEO problems.

Live Demo

Enough of the theoretical information? Ok! Head over to the Both Interact SEO Quick Check Tool for a live demonstration of this SEO Quick Check Tool. In case you like it feel free to drop a comment.

Posted on 1 Comment

WCAG 2.0 – Web Content Accessibility Guidelines

When designing web sites and portals make sure to also address general accessibility issues governed for instance by the WCAG 2.0 – Web Content Accessibility Guidelines.

WCAG 2.0 – Web Content Accessibility Guidelines

WCAG 2.0 defines three conformance levels:

  1. Level A (beginner)
  2. Level AA (intermediate)
  3. Level AAA (advanced)

Each level adds requirements on top of the previous one, organized under the four principles

  1. perceivable
  2. operable
  3. understandable
  4. robust

Perceivable (guidelines 1.1 Text Alternatives through 1.4 Distinguishable) requires that

Information and user interface components must be presentable to users in ways they can perceive.

Operable (guidelines 2.1 Keyboard Accessible through 2.4 Navigable) requires that

User Interface components and navigation must be operable.

Understandable (guidelines 3.1 Readable through 3.3 Input Assistance) requires that

Information and the operation of user interface must be understandable.

Robust (guideline 4.1 Compatible) finally requires that

Content must be robust enough that it can be interpreted reliably by a wide variety of user agents, including assistive technologies.

WCAG 2.0 – Checklists

Below you find checklists for each WCAG level, published by Luke McGrath, that outline the guidelines to make websites WCAG conformant. Following the checklist for each of the levels you find online tools that enable you to check websites for conformance.

WCAG 2.0 – Checklist Level A (Beginner)

Guideline: Summary
1.1.1 – Non-text Content: Provide text alternatives for non-text content
1.2.1 – Audio-only and Video-only (Pre-recorded): Provide an alternative to video-only and audio-only content
1.2.2 – Captions (Pre-recorded): Provide captions for videos with audio
1.2.3 – Audio Description or Media Alternative (Pre-recorded): Video with audio has a second alternative
1.3.1 – Info and Relationships: Logical structure
1.3.2 – Meaningful Sequence: Present content in a meaningful order
1.3.3 – Sensory Characteristics: Use more than one sense for instructions
1.4.1 – Use of Colour: Don’t use presentation that relies solely on colour
1.4.2 – Audio Control: Don’t play audio automatically
2.1.1 – Keyboard: Accessible by keyboard only
2.1.2 – No Keyboard Trap: Don’t trap keyboard users
2.2.1 – Timing Adjustable: Time limits have user controls
2.2.2 – Pause, Stop, Hide: Provide user controls for moving content
2.3.1 – Three Flashes or Below: No content flashes more than three times per second
2.4.1 – Bypass Blocks: Provide a ‘Skip to Content’ link
2.4.2 – Page Titled: Use helpful and clear page titles
2.4.3 – Focus Order: Logical order
2.4.4 – Link Purpose (In Context): Every link’s purpose is clear from its context
3.1.1 – Language of Page: Page has a language assigned
3.2.1 – On Focus: Elements do not change when they receive focus
3.2.2 – On Input: Elements do not change when they receive input
3.3.1 – Error Identification: Clearly identify input errors
3.3.2 – Labels or Instructions: Label elements and give instructions
4.1.1 – Parsing: No major code errors
4.1.2 – Name, Role, Value: Build all elements for accessibility

WCAG 2.0 Level A basically makes sure that the content is accessible based on the four main principles perceivable, operable, understandable and robust. As stated above levels AA and AAA add additional requirements to these principles which are outlined below.

WCAG 2.0 – Checklist Level AA (Intermediate)

Guideline: Summary
1.2.4 – Captions (Live): Live videos have captions
1.2.5 – Audio Description (Pre-recorded): Users have access to audio description for video content
1.4.3 – Contrast (Minimum): Contrast ratio between text and background is at least 4.5:1
1.4.4 – Resize Text: Text can be resized to 200% without loss of content or function
1.4.5 – Images of Text: Don’t use images of text
2.4.5 – Multiple Ways: Offer several ways to find pages
2.4.6 – Headings and Labels: Use clear headings and labels
2.4.7 – Focus Visible: Ensure keyboard focus is visible and clear
3.1.2 – Language of Parts: Tell users when the language on a page changes
3.2.3 – Consistent Navigation: Use menus consistently
3.2.4 – Consistent Identification: Use icons and buttons consistently
3.3.3 – Error Suggestion: Suggest fixes when users make errors
3.3.4 – Error Prevention (Legal, Financial, Data): Reduce the risk of input errors for sensitive data
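
The contrast requirement in 1.4.3 is one of the few criteria that can be checked purely by calculation. The following sketch applies the contrast ratio formula from the WCAG 2.0 definitions (relative luminance of the lighter colour plus 0.05, divided by that of the darker colour plus 0.05). The function names are hypothetical, and the code assumes colours are given as six-digit hex values such as #767676:

```php
<?php
// Hypothetical helper: WCAG 2.0 relative luminance of a "#rrggbb" colour.
function relative_luminance(string $hex): float {
    $channels = array();
    foreach (str_split(ltrim($hex, '#'), 2) as $pair) {
        $c = hexdec($pair) / 255; // scale the channel to 0..1
        // linearize the sRGB value as defined by WCAG 2.0
        $channels[] = $c <= 0.03928 ? $c / 12.92 : pow(($c + 0.055) / 1.055, 2.4);
    }
    return 0.2126 * $channels[0] + 0.7152 * $channels[1] + 0.0722 * $channels[2];
}

// Hypothetical helper: contrast ratio between 1 (none) and 21 (black on white).
function contrast_ratio(string $foreground, string $background): float {
    $l1 = relative_luminance($foreground);
    $l2 = relative_luminance($background);
    return (max($l1, $l2) + 0.05) / (min($l1, $l2) + 0.05);
}
```

With these helpers, contrast_ratio('#767676', '#ffffff') comes out just above 4.5 and passes level AA for normal text, while the slightly lighter #777777 falls just below the threshold.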

WCAG 2.0 – Checklist Level AAA (Advanced)

Guideline: Summary
1.2.6 – Sign Language (Pre-recorded): Provide sign language translations for videos
1.2.7 – Extended Audio description (Pre-recorded): Provide extended audio description for videos
1.2.8 – Media Alternative (Pre-recorded): Provide a text alternative to videos
1.2.9 – Audio Only (Live): Provide alternatives for live audio
1.4.6 – Contrast (Enhanced): Contrast ratio between text and background is at least 7:1
1.4.7 – Low or No Background Audio: Audio is clear for listeners to hear
1.4.8 – Visual Presentation: Offer users a range of presentation options
1.4.9 – Images of Text (No Exception): Don’t use images of text
2.1.3 – Keyboard (No Exception): Accessible by keyboard only, without exception
2.2.3 – No Timing: No time limits
2.2.4 – Interruptions: Don’t interrupt users
2.2.5 – Re-authenticating: Save user data when re-authenticating
2.3.2 – Three Flashes: No content flashes more than three times per second
2.4.8 – Location: Let users know where they are
2.4.9 – Link Purpose (Link Only): Every link’s purpose is clear from its text
2.4.10 – Section Headings: Break up content with headings
3.1.3 – Unusual words: Explain any strange words
3.1.4 – Abbreviations: Explain any abbreviations
3.1.5 – Reading Level: Users with nine years of school can read your content
3.1.6 – Pronunciation: Explain any words that are hard to pronounce
3.2.5 – Change on Request: Don’t change elements on your website until users ask
3.3.5 – Help: Provide detailed help and instructions
3.3.6 – Error Prevention (All): Reduce the risk of all input errors

 

Posted on Leave a comment

Good Bye HTML5 – Welcome HTML5.1!

As you might have already heard, the W3C has finally officially published HTML5 as a Recommendation, some 15 years after its predecessor HTML 4.01.

In fact, HTML 5.0 has (co-)existed for quite some time now. Numerous frameworks implemented HTML5 features long before the official recommendation (as usual).

And be honest, who hasn’t used at least some of the features HTML 5 has to offer?

So, what about HTML5.1? Well, it will surely bring new features. One of the core aspects is the built-in support for DRM, which should end the era of additional browser plugins forced on users, especially by streaming providers.

But it will definitely take some time before we get to officially use HTML5.1 😉

Good bye HTML5 – and Welcome HTML5.1!