
What Meta tags can I use to allow/disallow bots indexing my pages?

Index this page, don't follow the links

<META name="robots" content="index, nofollow">

Don't index this page, follow the links

<META name="robots" content="noindex, follow">

Don't index this page or follow the links

<META name="robots" content="noindex, nofollow">

Index this page and follow the links

<META name="robots" content="index, follow">
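
For context, here is a minimal sketch of where such an element goes. The robots meta element belongs in the head of the document; the title and body text below are made up for illustration, and the doctype simply mirrors the one used later in this answer.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html>
<head>
<title>Example page (hypothetical)</title>
<!-- The robots meta element goes in the head, not the body. -->
<!-- Here: keep this page out of the index, but let robots follow its links. -->
<meta name="robots" content="noindex, follow">
</head>
<body>
<p>Page content goes here.</p>
</body>
</html>
```

Note that these values are only requests: well-behaved robots honor them, but nothing forces a robot to do so.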

You can also use a robots.txt file in the root directory of your site.

In theory, the robots.txt file could be used, but if the particular _document_ is outside your control, I strongly suspect that robots.txt (which is part of the site-wide server configuration, controlled by the Web server administration) is even further outside your control. Besides, it's usually not a good idea to use robots.txt on a _per-document_ basis; it is normally used to specify that some _directories_ should be excluded from robots.
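
To illustrate the directory-level use mentioned above, a robots.txt might look roughly like the following sketch. The directory names are hypothetical; substitute whatever parts of the site should be kept away from robots.

```
# robots.txt, served from the root of the site (i.e. as /robots.txt)
# Applies to all robots:
User-agent: *
# Directory names below are made up for illustration.
Disallow: /drafts/
Disallow: /cgi-bin/
```

As with the meta element, this is a request that cooperative robots honor, not an access control mechanism.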

If there's a good reason for excluding the document from indexing robots, then I suppose you could tell that reason to the document owner. After all, preventing (to the extent possible) robots from finding it through a particular link on a particular page of yours wouldn't in any way prevent robots from finding it in other ways. (Even if you think there are no other links to it, it could still find its way to indexing robots.)

If this doesn't apply for some reason, and you still wish to try to prevent robots from finding the document via a particular link, you could play a little with indirection, or redirection. Instead of linking to /somepath/foo.html, link to bar.html, which you set up so that browsers get redirected to /somepath/foo.html but robots (hopefully) do not. In a sense, you would then do redirection the _wrong_ way (the common way), precisely because that wrong way so often fights against search engines!

The content of bar.html could be

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<title>Redirection</title>
<meta http-equiv="Refresh" content="7; URL=/somepath/foo.html">
<meta name="robots" content="noindex,nofollow">
<p>You are being redirected to the
<a href="/somepath/foo.html">Foo Bar</a> page.</p>

Of course, this is an awkward hack with no guarantee of working, but it might do the job in most cases. There is also no good choice for the parameter that specifies the redirection delay. I've used 7 (seconds), which is probably enough for a user who has finished reading foo.html and hit the Back button to hit it again before being redirected back to foo.html. But it is quite a long wait for a user who follows the link to bar.html for the first time. You might therefore add something worth reading to bar.html, such as a short note on what's coming, content-wise.
