If you are looking for a way to implement a simple but useful SEO quick check tool, here are some tips on the key metrics and factors you need to get started. The tool itself is written in PHP and serves as a proof of concept that can easily be adapted and extended. First, let’s quickly review the key SEO metrics and factors.
SEO Key Metrics and Factors
Most SEO work happens at the level of the HTML code. Inspecting the HTML via its DOM tree therefore plays an important part when conducting and evaluating SEO checks. A few of the key metrics and factors relevant to SEO are:
- Title
- Meta-description
- Meta-keywords (not really anymore but for the sake of completeness)
- OpenGraph Meta-tags (as alternative or addition to traditional meta-tags)
- Additional general Meta-tags (locale, Google webmaster tools verification, etc.)
- Headers <h*> and their ordering
- Alternate text attributes for images
- Microdata
Based on the underlying HTML code, the following metrics can be calculated:
- Length and “quality” of data provided
- Data volume
- Text to HTML ratio
- Loading time
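Data volume and the text to HTML ratio can be derived from the raw HTML string alone. The following is a minimal sketch, not a definitive implementation: the function name page_metrics() and the array keys are made up for this example, and "text" is assumed to mean whatever remains after removing script blocks, style blocks, and tags.

```php
<?php
// Hypothetical helper: computes data volume and text to HTML ratio for an HTML string.
function page_metrics(string $html): array
{
    // Drop script and style blocks first, since strip_tags() would keep their contents.
    $withoutCode = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);
    // Remove the remaining tags, decode entities, and collapse whitespace.
    $text = html_entity_decode(strip_tags($withoutCode), ENT_QUOTES, 'UTF-8');
    $text = trim(preg_replace('/\s+/', ' ', $text));

    $htmlBytes = strlen($html);
    $textBytes = strlen($text);

    return array(
        'html_bytes' => $htmlBytes,
        'text_bytes' => $textBytes,
        // Ratio in percent; 0 for an empty document to avoid a division by zero.
        'text_to_html_ratio' => $htmlBytes > 0 ? round($textBytes / $htmlBytes * 100, 2) : 0.0,
    );
}

// Usage: pass the 'data' field of a fetched page.
$metrics = page_metrics('<html><body><p>Hello world</p></body></html>');
print_r($metrics);
```

Loading time is not covered here because it has to be measured while the page is being fetched.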
Apart from these core metrics make sure that the general syntax is correct and matches the W3C standards:
- W3C validation
You should even go one step further and validate against the Web Content Accessibility Guidelines (WCAG):
- WCAG validation (level A-AAA)
In addition to the generated HTML, make sure to give search engines enough information about which pages should be indexed and which should be left aside, i.e. by providing an XML sitemap and a robots.txt file:
- XML sitemap
- robots.txt
The XML sitemap can either be a sitemap index consisting of multiple sitemaps, where each one refers, for instance, to a specific page type (posts vs. pages), or a simple list of URLs. Link metrics, in turn, can be divided into site-internal and external links:
- internal links
- external links
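Telling internal from external links boils down to comparing hosts. Here is a rough sketch under a few assumptions: classify_links() is a hypothetical name, the link targets have already been collected from the page, and a different subdomain (www.example.com vs. example.com) counts as external, which you may want to relax.

```php
<?php
// Hypothetical helper: splits link targets into internal and external ones
// by comparing their host with the host of the page being checked.
function classify_links(array $hrefs, string $pageUrl): array
{
    $pageHost = strtolower((string) parse_url($pageUrl, PHP_URL_HOST));
    $result = array('internal' => array(), 'external' => array());

    foreach ($hrefs as $href) {
        $href = trim($href);
        // Skip empty links, in-page anchors, and non-web schemes.
        if ($href === '' || $href[0] === '#' || preg_match('#^(mailto|tel|javascript):#i', $href)) {
            continue;
        }
        $host = parse_url($href, PHP_URL_HOST);
        // Relative URLs have no host and therefore point to the same site.
        if ($host === null || $host === false || strtolower($host) === $pageHost) {
            $result['internal'][] = $href;
        } else {
            $result['external'][] = $href;
        }
    }

    return $result;
}

// Usage with a few sample link targets.
$links = classify_links(array('/about', 'https://other.org/'), 'https://example.com/');
print_r($links);
```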
When it comes to linking and SEO, acquiring link juice is the ultimate goal. Backlinks, preferably from established websites, transfer link juice to your site and thus strengthen it. This list is not complete, and there are loads of details to keep in mind when dealing with SEO. Nevertheless, this post is about implementing an SEO quick check tool, right?
Implementing an SEO quick check tool
The following presents a proof of concept for an SEO quick check tool written in PHP. Feel free to use it as a foundation. First of all, let’s assemble our toolset to save us a lot of trouble parsing and evaluating the DOM tree.
cURL
PHP comes with a cURL extension. Make sure that it is activated in your php.ini. We will be using cURL to fetch various remote assets, starting with the website’s HTML code itself:
function curl_get($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // disables certificate verification; fine for a quick check, not for production use
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

    // microtime(true) returns a float; without the argument it returns a string
    $start = microtime(true);
    $result = curl_exec($ch);
    $end = microtime(true);

    if ($result === false) {
        $error = curl_error($ch);
        curl_close($ch);
        return array('error' => $error);
    }
    curl_close($ch);

    return array('data' => $result, 'duration' => round($end - $start, 2));
}
This function will be used throughout the SEO quick check tool and returns an array containing the following fields:
- data: data received from $url
- duration: loading duration in seconds, for benchmarking
- error: the cURL error message in case something went wrong, otherwise not set
In case you need additional headers, etc., feel free to adjust this function to your needs.
Simple HTML DOM
Once we have the HTML code we need to parse it into a DOM tree that we can evaluate. For PHP there exists a handy tool called Simple HTML DOM Parser that does a nice job parsing HTML code to a DOM:
$htmlDOM = str_get_html($html);
Yes, that’s all you need to parse the HTML code into a DOM object, which we will use to evaluate various tags. Please refer to the Simple HTML DOM Parser Manual for more information on how to use this tool.
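To give an idea of what evaluating the tags can look like, here is a sketch that collects the title, meta description, OpenGraph tags, headers, and images without alternate text. Note the swap: to keep the example free of external dependencies it uses PHP’s built-in DOMDocument and DOMXPath rather than Simple HTML DOM. The function name extract_seo_tags() and the array keys are assumptions made for this example.

```php
<?php
// Hypothetical helper: collects a few on-page SEO tags from a non-empty HTML string
// using PHP's built-in DOM extension (not Simple HTML DOM).
function extract_seo_tags(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $titleNode = $xpath->query('//title')->item(0);
    $descNode = $xpath->query('//meta[@name="description"]/@content')->item(0);

    // OpenGraph tags use the "property" attribute with an "og:" prefix.
    $openGraph = array();
    foreach ($xpath->query('//meta[starts-with(@property, "og:")]') as $meta) {
        $openGraph[$meta->getAttribute('property')] = $meta->getAttribute('content');
    }

    // Headers in document order, so their nesting can be checked afterwards.
    $headings = array();
    foreach ($xpath->query('//h1|//h2|//h3|//h4|//h5|//h6') as $h) {
        $headings[] = array('tag' => $h->nodeName, 'text' => trim($h->textContent));
    }

    // Count images with a missing or empty alt attribute.
    $imagesWithoutAlt = 0;
    foreach ($xpath->query('//img') as $img) {
        if (trim($img->getAttribute('alt')) === '') {
            $imagesWithoutAlt++;
        }
    }

    return array(
        'title' => $titleNode ? trim($titleNode->textContent) : null,
        'description' => $descNode ? trim($descNode->nodeValue) : null,
        'open_graph' => $openGraph,
        'headings' => $headings,
        'images_without_alt' => $imagesWithoutAlt,
    );
}
```

The same checks can be written with Simple HTML DOM’s find() method; the structure of the result stays the same.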
SimpleXML
When dealing with XML in PHP, SimpleXML is definitely the way to go. We will be using SimpleXML for parsing XML sitemaps. First, we need to check if an XML sitemap is present by inspecting the robots.txt, and then we will use the cURL function defined above to retrieve the sitemap for further inspection.
Check robots.txt
// $robotsUrl is the site root plus "/robots.txt"; use e.g. parse_url() to assemble the URL correctly
$robotsTxtResponse = curl_get($robotsUrl);
// make sure to check that 'error' is not set in the response
$robotsTxt = $robotsTxtResponse['data'];
So, let’s assume that robots.txt exists and the content is available through $robotsTxt.
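Assembling the robots.txt URL with parse_url() could look like the sketch below. The function name robots_txt_url() is made up for this example, and it assumes that $url is an absolute URL including scheme and host.

```php
<?php
// Hypothetical helper: derives the robots.txt URL from any page URL of a site.
function robots_txt_url(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['scheme']) || empty($parts['host'])) {
        return null; // not an absolute URL
    }
    // Keep a non-default port if the URL specifies one.
    $port = isset($parts['port']) ? ':' . $parts['port'] : '';

    return $parts['scheme'] . '://' . $parts['host'] . $port . '/robots.txt';
}

// Usage: path and query string of the page URL are ignored.
$robotsUrl = robots_txt_url('https://example.com/blog/post?x=1');
```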
Load XML Sitemap
Based on the contents of robots.txt we can check if an XML sitemap is present:
$siteMapUrl = null;
$siteMapMatches = array();
// "m" lets ^ and $ match per line, "i" ignores the case of "Sitemap:"
if (preg_match('#^Sitemap:\s*(\S+)\s*$#mi', $robotsTxt, $siteMapMatches)) {
    // we got ourselves a sitemap URL in $siteMapMatches[1]
    $siteMapUrl = $siteMapMatches[1];
}
Let’s assume we have determined a sitemap URL in $siteMapUrl. Our next step is to check if it’s a plain sitemap or a sitemap index, i.e. a list of sitemaps for various content types such as pages, posts, categories, etc.
// load sitemap
$siteMapResponse = curl_get($siteMapUrl);
// make sure to check that 'error' is not set in the response
$siteMapData = $siteMapResponse['data'];
$isSitemapIndex = false;
$sitemaps = array();
$sitemapUrls = array();
if (preg_match('/<urlset/', $siteMapData)) {
    // plain sitemap
    $sitemapUrlIndex = $xml = new SimpleXMLElement($siteMapData);
    if (isset($sitemapUrlIndex->url)) {
        foreach ($sitemapUrlIndex->url as $v) {
            // cast to string, otherwise SimpleXMLElement objects are stored
            $sitemapUrls[] = (string) $v->loc;
        }
    }
} else if (preg_match('/<sitemapindex/', $siteMapData)) {
    // sitemap index
    $sitemapIndex = $xml = new SimpleXMLElement($siteMapData);
    if (isset($sitemapIndex->sitemap)) {
        $isSitemapIndex = true;
        foreach ($sitemapIndex->sitemap as $v) {
            $sitemaps[] = (string) $v->loc;
        }
    }
}
Depending on the root element of the sitemap, this snippet collects either the page URLs of a plain sitemap or the nested sitemap URLs of a sitemap index.
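If you prefer not to rely on regular expressions for telling the two sitemap types apart, the root element name can be used instead. The following sketch wraps this in a function that works on an XML string, so it can be tried without any network access. The name parse_sitemap() and the returned array layout are assumptions made for this example.

```php
<?php
// Hypothetical helper: parses a sitemap XML string and reports its type and entries.
function parse_sitemap(string $xmlString): array
{
    // Collect XML parser errors instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($xmlString);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        return array('type' => 'invalid', 'locations' => array());
    }

    // The root element decides whether this is a plain sitemap or a sitemap index.
    $root = $xml->getName();
    if ($root === 'urlset') {
        $type = 'sitemap';
        $entries = $xml->url;
    } elseif ($root === 'sitemapindex') {
        $type = 'index';
        $entries = $xml->sitemap;
    } else {
        return array('type' => 'unknown', 'locations' => array());
    }

    $locations = array();
    foreach ($entries as $entry) {
        // Cast to string to store plain URLs rather than SimpleXMLElement objects.
        $locations[] = trim((string) $entry->loc);
    }

    return array('type' => $type, 'locations' => $locations);
}
```

For a sitemap index, each returned location can be fetched and passed through the same function again to get the actual page URLs.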
W3C Validator
To check a page for W3C conformity you can use the handy w3c-validator by micheh. The code required to run this validator is pretty simple:
$validator = new \W3C\HtmlValidator();
$result = $validator->validateInput($html); // $html from above
if ($result->isValid()) {
    // Hurray! no errors found :)
} else {
    // Hmm... check failed
    //$result->getErrorCount()
    //$result->getWarningCount()
}
Again, please refer to the w3c-validator documentation for more information.
Google Web Search API
Although it is technically deprecated, and may therefore stop responding at any time, the Google Web Search API is still handy for quickly generating the search preview:
// use user's IP to reduce server-to-server requests
$googleWebSearchApiUrl = "https://ajax.googleapis.com/ajax/services/search/web?v=1.0&"
    . "q=site:" . urlencode($url)
    . "&userip=" . urlencode($_SERVER['REMOTE_ADDR']);
$googleWebSearchApiResponse = curl_get($googleWebSearchApiUrl);
// do some checks here if request succeeded ('error' not set, 'data' contains valid JSON)
$searchResultData = json_decode($googleWebSearchApiResponse['data'], true);
// access data from response
$searchResults = $searchResultData['responseData']['cursor']['resultCount'];
$searchResultAdditionalData = $searchResultData['responseData']['results'];
Conclusion
As you can see, a basic SEO quick check tool can be implemented with a small set of tools and libraries. Furthermore, the key metrics it determines let you quickly identify potential SEO problems.
Live Demo
Enough theory? OK! Head over to the Both Interact SEO Quick Check Tool for a live demonstration of this SEO quick check tool. In case you like it, feel free to drop a comment.