Since I get a lot of requests for a robots.txt file designed for Magento SEO here is a sample to get you started. This Magento robots.txt makes the following assumptions:
- We don’t differentiate between search engines, hence User-agent: *
- We allow assets to be crawled
- i.e. images, CSS and JavaScript files
- We only allow SEF URLs set in Magento
- e.g. no direct access to the front controller index.php, view categories and products by ID, etc.
- We don’t allow filter URLs
- Please note: The list provided is not complete. In case you have custom extension that use filtering make sure to include these filter URLs and parameters in the filter URLs section.
- We don’t allow session related URL segments
- e.g. product comparison, customer, etc.
- We don’t allow specific files to be crawled
- e.g. READMEs, cron related files, etc.
Magento robots.txt
Enough of the talking, here comes your SEO Magento robots.txt:
# Crawlers Setup
User-agent: *# Directories
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /includes/
Disallow: /lib/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /shell/
Disallow: /var/# Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalogsearch/
#Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /catalog/product/gallery/# Misc. files you don’t want search engines to crawl
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /composer.json
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt
Disallow: /mage
#Disallow: /modman
#Disallow: /n98-magerun.phar
Disallow: /scheduler_cron.sh
Disallow: /*.php$# Disallow filter urls
Disallow: /*?min*
Disallow: /*?max*
Disallow: /*?q*
Disallow: /*?cat*
Disallow: /*?manufacturer_list*
Disallow: /*?tx_indexedsearch
Feel free to leave comments below for additional remarks and suggestions for improvement.