Robots and Spiders

Robots/spiders list: http://info.webcrawler.com/mak/projects/robots/robots.html

Robot/spider meta tag commands:

The Robots META tag is a simple mechanism for telling visiting Web robots whether a page should be indexed and whether the links on the page should be followed.

Note: Currently only a few robots support this tag!

Where to put the Robots META tag

Like any META tag, it should be placed in the HEAD section of an HTML page:
<html>
<head>
<meta name="robots" content="noindex, nofollow">
<meta name="description" content="This page ....">
<title>...</title>
</head>
<body>
...
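
To check where a page actually carries the tag, a robot author could scan the HEAD section for it. The following is a minimal sketch using Python's standard html.parser module. The class name RobotsMetaFinder and the function find_robots_meta are made up for this illustration, and the sketch assumes the page has an explicit <head> tag.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the content of every <meta name="robots"> tag inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.contents = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attr = dict(attrs)
            # The "robots" name is compared case-insensitively.
            if (attr.get("name") or "").lower() == "robots":
                self.contents.append(attr.get("content") or "")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_robots_meta(html):
    """Return the content strings of all Robots META tags found in <head>."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.contents
```

Calling find_robots_meta on the page above would return a list holding the single string "noindex, nofollow". A Robots META tag placed in the BODY is ignored by this sketch.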

What to put into the Robots META tag

The content of the Robots META tag contains directives separated by commas.
The currently defined directives are [NO]INDEX and [NO]FOLLOW.
The INDEX directive specifies whether an indexing robot should index the page.
The FOLLOW directive specifies whether a robot is to follow links on the page.
The defaults are INDEX and FOLLOW.
The values ALL and NONE set all directives on or off:
ALL = INDEX, FOLLOW and NONE = NOINDEX, NOFOLLOW.
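
The rules above can be sketched as a small function that turns a content string into two yes/no answers. This is only an illustration: the name interpret_robots is hypothetical, and two behaviours are assumptions rather than part of the description above, namely that a later directive overrides an earlier one and that unknown words are ignored.

```python
def interpret_robots(content):
    """Return (index, follow) booleans for a Robots META content string."""
    index, follow = True, True  # the defaults are INDEX and FOLLOW
    for token in content.split(","):
        token = token.strip().upper()  # directives are case-insensitive
        if token == "ALL":
            index, follow = True, True
        elif token == "NONE":
            index, follow = False, False
        elif token == "INDEX":
            index = True
        elif token == "NOINDEX":
            index = False
        elif token == "FOLLOW":
            follow = True
        elif token == "NOFOLLOW":
            follow = False
        # Assumption: anything else is silently ignored in this sketch.
    return index, follow
```

For example, interpret_robots("noindex, follow") gives (False, True), and an empty string falls back to the defaults, (True, True).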

Some examples:

<meta name="robots" content="index,follow">
<meta name="robots" content="noindex,follow">
<meta name="robots" content="index,nofollow">
<meta name="robots" content="noindex,nofollow">

Note that the "robots" name of the tag and its content are case-insensitive.

You obviously should not specify conflicting or repeating directives such as:

<meta name="robots" content="INDEX,NOINDEX,NOFOLLOW,FOLLOW,FOLLOW">
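
A page author could catch such mistakes before publishing. The sketch below is one possible check, not an established tool; the function name find_directive_problems and the wording of its messages are invented for this example. It looks only at the INDEX and FOLLOW pairs and does not examine ALL or NONE.

```python
def find_directive_problems(content):
    """Return a list of problems found in a Robots META content string."""
    tokens = [t.strip().upper() for t in content.split(",") if t.strip()]
    problems = []
    seen = set()
    # Report each directive that appears more than once, once.
    for token in tokens:
        if token in seen:
            message = "repeated: " + token
            if message not in problems:
                problems.append(message)
        seen.add(token)
    # Report a directive that appears together with its NO... opposite.
    for positive in ("INDEX", "FOLLOW"):
        if positive in seen and "NO" + positive in seen:
            problems.append("conflict: %s with NO%s" % (positive, positive))
    return problems
```

For the tag shown above, the function reports FOLLOW as repeated and flags both the INDEX/NOINDEX and FOLLOW/NOFOLLOW conflicts. A clean value such as "noindex,follow" yields an empty list.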

A formal syntax for the Robots META tag content is:

content    = all | none | directives
all        = "ALL"
none       = "NONE"
directives = directive ["," directives]
directive  = index | follow
index      = "INDEX" | "NOINDEX"
follow     = "FOLLOW" | "NOFOLLOW"
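
The syntax above can be approximated with a regular expression. The following sketch is one way to do it; the name matches_grammar is hypothetical. It departs from the grammar in two labelled ways: it allows spaces around the commas (as in the "noindex, nofollow" example earlier), and it ignores case.

```python
import re

# directive = index | follow, i.e. an optional "NO" before INDEX or FOLLOW
_DIRECTIVE = r"(?:NO)?(?:INDEX|FOLLOW)"

# content = all | none | directives
# Assumption: optional whitespace is tolerated around commas and at the ends.
_CONTENT_RE = re.compile(
    r"\s*(?:ALL|NONE|%s(?:\s*,\s*%s)*)\s*" % (_DIRECTIVE, _DIRECTIVE),
    re.IGNORECASE,
)


def matches_grammar(content):
    """Return True if the whole content string fits the syntax."""
    return _CONTENT_RE.fullmatch(content) is not None
```

Note that the syntax only describes the form of the content. A value with conflicting or repeated directives still matches it, so a separate check is needed for those.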

spiderhunter.com/spiderlist/ is a list of 358 known spiders. At spiderhunter.com you can find the spider names and IP addresses used by Altavista, inktomi.com, hotbot, google, northernlight, lycos, excite and infoseek.

http://www.tardis.ed.ac.uk/~sxw/robots/botwatch.html BotWatch is a short Perl script that analyses log files (in either the Common or the NCSA Extended log file format) and produces an HTML page reporting on the robots seen.
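
To give a rough idea of what this kind of log analysis involves, here is a small Python sketch. It is not BotWatch and does not reproduce its logic. The function name count_robot_hits and the list of robot markers passed to it are hypothetical. It assumes log lines in the NCSA Extended (combined) layout, where the user agent is the last quoted field, and it does not handle escaped quotes inside fields.

```python
import re
from collections import Counter

# Assumed line layout:
# host ident user [date] "request" status bytes "referer" "user agent"
_LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \S+ \S+ "[^"]*" "([^"]*)"\s*$'
)


def count_robot_hits(lines, robot_markers):
    """Count requests per user agent for agents containing a robot marker."""
    counts = Counter()
    for line in lines:
        match = _LINE_RE.match(line)
        if not match:
            # For example a Common format line, which has no user agent.
            continue
        agent = match.group(2)
        lowered = agent.lower()
        if any(marker.lower() in lowered for marker in robot_markers):
            counts[agent] += 1
    return counts
```

The markers would normally come from a list of known spiders. Lines that do not fit the assumed layout are skipped rather than reported.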

 
 

