  • Sample SEO Magento robots.txt file

    Since I get a lot of requests for a robots.txt file designed for Magento SEO, here is a sample to get you started. This Magento robots.txt makes the following assumptions:

    • We don’t differentiate between search engines, hence User-agent: *
    • We allow assets to be crawled
      • i.e. images, CSS and JavaScript files
    • We only allow the SEF (search-engine-friendly) URLs set in Magento
      • e.g. no direct access to the front controller index.php, no viewing of categories and products by ID, etc.
    • We don’t allow filter URLs
      • Please note: The list provided is not complete. If you have custom extensions that use filtering, make sure to add their filter URLs and parameters to the filter URLs section.
    • We don’t allow session-related URL segments
      • e.g. product comparison, customer, etc.
    • We don’t allow specific files to be crawled
      • e.g. READMEs, cron-related files, etc.
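
    To make these assumptions concrete, here is a minimal sketch that checks a few paths against a handful of plain prefix rules of the kind used in this file. It relies on Python's standard urllib.robotparser, which only understands prefix rules (no * or $ wildcards) and, as far as I know, treats an empty line as the end of a rule group, so the excerpt is written without blank lines. The sample paths, the ExampleBot user agent and the is_allowed helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Excerpt of prefix-only rules. No blank lines, so every Disallow stays
# attached to the "User-agent: *" group in this particular parser.
RULES = """\
User-agent: *
Disallow: /app/
Disallow: /index.php/
Disallow: /catalog/product/view/
Disallow: /customer/
Disallow: /cron.php
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(path: str) -> bool:
    """Return True if a generic crawler may fetch the given path."""
    # "ExampleBot" is a hypothetical crawler name; it falls back to the "*" group.
    return parser.can_fetch("ExampleBot", path)


if __name__ == "__main__":
    samples = [
        "/shoes/red-sneaker.html",        # SEF product URL
        "/media/catalog/product/a.jpg",   # asset
        "/catalog/product/view/id/42",    # product by ID
        "/customer/account/login/",       # session-related segment
        "/cron.php",                      # specific file
    ]
    for path in samples:
        print(path, "allowed" if is_allowed(path) else "blocked")
```

    The first two sample paths should come out as allowed and the last three as blocked, which mirrors the SEF-only, no-session and no-specific-files assumptions above.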

    Magento robots.txt

    Enough of the talking, here comes your SEO Magento robots.txt:

    # Crawlers Setup
    User-agent: *

    # Directories
    Disallow: /app/
    Disallow: /cgi-bin/
    Disallow: /downloader/
    Disallow: /includes/
    Disallow: /lib/
    Disallow: /pkginfo/
    Disallow: /report/
    Disallow: /shell/
    Disallow: /var/

    # Paths (clean URLs)
    Disallow: /index.php/
    Disallow: /catalog/product_compare/
    Disallow: /catalog/category/view/
    Disallow: /catalog/product/view/
    Disallow: /catalogsearch/
    #Disallow: /checkout/
    Disallow: /control/
    Disallow: /contacts/
    Disallow: /customer/
    Disallow: /customize/
    Disallow: /newsletter/
    Disallow: /poll/
    Disallow: /review/
    Disallow: /sendfriend/
    Disallow: /tag/
    Disallow: /wishlist/
    Disallow: /catalog/product/gallery/

    # Misc. files you don’t want search engines to crawl
    Disallow: /cron.php
    Disallow: /cron.sh
    Disallow: /composer.json
    Disallow: /LICENSE.html
    Disallow: /LICENSE.txt
    Disallow: /LICENSE_AFL.txt
    Disallow: /STATUS.txt
    Disallow: /mage
    #Disallow: /modman
    #Disallow: /n98-magerun.phar
    Disallow: /scheduler_cron.sh
    Disallow: /*.php$

    # Disallow filter URLs
    Disallow: /*?min*
    Disallow: /*?max*
    Disallow: /*?q*
    Disallow: /*?cat*
    Disallow: /*?manufacturer_list*
    Disallow: /*?tx_indexedsearch
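
    If you want to check which URLs the wildcard rules above would catch, the following is a rough sketch of the usual * and $ semantics (* matches any run of characters, a trailing $ anchors the end of the URL). It is my own simplified matcher, not any search engine's implementation; rule_to_regex, is_blocked and the sample URLs are hypothetical names chosen for illustration.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' becomes '.*'; a trailing '$' anchors the end of the URL;
    everything else is matched literally, as a prefix.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


# Wildcard rules copied from the sample file above.
WILDCARD_RULES = [
    "/*.php$",
    "/*?min*",
    "/*?max*",
    "/*?q*",
    "/*?cat*",
    "/*?manufacturer_list*",
]
COMPILED = [rule_to_regex(rule) for rule in WILDCARD_RULES]


def is_blocked(url_path: str) -> bool:
    """Return True if any wildcard rule matches the path plus query string."""
    return any(pattern.match(url_path) for pattern in COMPILED)


if __name__ == "__main__":
    for url in [
        "/shoes.html",
        "/shoes.html?min=10",
        "/shoes.html?color=red&min=10",
        "/cron.php",
    ]:
        print(url, "blocked" if is_blocked(url) else "allowed")
```

    One thing this makes visible: a pattern like /*?min* only matches when the filter parameter comes first in the query string, so /shoes.html?color=red&min=10 is not caught. If that matters for your shop, a companion rule such as Disallow: /*&min would be one way to cover it.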

    Feel free to leave comments below for additional remarks and suggestions for improvement.