May 25, 2020, 7:28 p.m.

4 Steps to create a robots.txt file in Django

robots.txt is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform web robots which areas of the website should not be processed or scanned. The pages or URL patterns disallowed in the robots.txt file will not be indexed by search engines.

When a site owner wishes to give instructions to web robots, they place a text file called robots.txt in the root of the website hierarchy (e.g. https://www.example.com/robots.txt). This text file contains the instructions in a specific format (see the example below). Robots that choose to follow the instructions try to fetch this file and read the instructions before fetching or scanning any other file from the website. If this file doesn't exist, web robots assume that the website owner does not wish to place any limitations on crawling the site.

# example
User-agent: *
Allow: /blog/
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
Creating a robots.txt file in Django is a simple task. Just follow the steps below to make a robots.txt file in Django.

Step 1 We will use the django-robots package to generate the robots.txt file for our website.

Install the package using the following command:
pip install django-robots

Step 2 Include the "robots" app in the INSTALLED_APPS list:
# in settings.py
INSTALLED_APPS = [
    ...
    'robots',
]

Now run the migrate command to create the robots app's database tables, which you will manage through the Django admin app:
python manage.py migrate
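
django-robots relies on Django's sites framework, so make sure it is enabled as well. A minimal settings sketch, assuming the default site with SITE_ID = 1:

# in settings.py (sketch; assumes the sites framework is not yet enabled)
INSTALLED_APPS = [
    ...
    'django.contrib.sites',  # required by django-robots
    'robots',
]

SITE_ID = 1  # id of the Site object this project serves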


Step 3 Add a URL pattern for the robots.txt file in the urls.py file.
# in urls.py
from django.conf.urls import url, include

urlpatterns = [
    ...
    url(r'^robots\.txt$', include('robots.urls')),
]
The robots.txt file will now be served at example.com/robots.txt.
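
Note that url() was removed in Django 4.0. On newer versions, an equivalent sketch using re_path looks like this:

# in urls.py (sketch for Django 4.0+, where url() is no longer available)
from django.urls import include, re_path

urlpatterns = [
    ...
    re_path(r'^robots\.txt$', include('robots.urls')),
]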


Step 4 Now, go to the Django admin app and add the URL patterns you want to allow or disallow in the Urls table. Then set rules for those URLs in the Rules table, where you can select the URLs manually and mark them as allowed or disallowed for each user agent.
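
If you prefer not to click through the admin, the same rules can be created from the Django shell. This is a rough sketch assuming django-robots exposes Rule and Url models with robot, allowed, disallowed and pattern fields:

# python manage.py shell  (sketch; model fields assumed as described above)
from django.contrib.sites.models import Site
from robots.models import Rule, Url

site = Site.objects.get_current()

# URL patterns that will appear in robots.txt
blog = Url.objects.create(pattern='/blog/')
tmp = Url.objects.create(pattern='/tmp/')

# One rule for every user agent: allow /blog/, disallow /tmp/
rule = Rule.objects.create(robot='*')
rule.sites.add(site)
rule.allowed.add(blog)
rule.disallowed.add(tmp)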


Tips

You can also create a txt file in the templates directory and render it with a URL mapping, as we do for other templates, but that makes the process manual: it becomes hard to mark URLs as allowed or disallowed when there are a huge number of them, i.e. the management of URLs becomes hard. So the method explained above is the more efficient one. A rough sketch of the manual approach is shown below.
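
A minimal sketch of the manual alternative, assuming a plain-text robots.txt template stored in the project's templates directory:

# in urls.py (manual alternative; assumes templates/robots.txt exists)
from django.conf.urls import url
from django.views.generic import TemplateView

urlpatterns = [
    ...
    url(r'^robots\.txt$',
        TemplateView.as_view(template_name='robots.txt',
                             content_type='text/plain')),
]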

robots.txt and sitemap.xml are the most important files for a website. Read How to create a sitemap in django to learn how you can create a sitemap for your website.





