HostMySite.Com is sponsoring this tutorial, please visit their site today!

Search Engine Bot Notifier
by: Jim Summer

This tutorial shows how to add a system to your web site that notifies you when a search engine robot is crawling it. First, create a variable called "stringToSearch" that holds the visitor's (or robot's) browser identification string, better known as the user agent and available in ColdFusion as cgi.HTTP_USER_AGENT.

<cfset stringToSearch = cgi.HTTP_USER_AGENT>

Next, we check whether it is a real browser or not. If none of the common browser names appears in the user agent string, we treat the visitor as a robot:

<cfif (findnocase("MSIE", stringToSearch) EQ 0) AND
      (findnocase("Gecko", stringToSearch) EQ 0) AND
      (findnocase("Opera", stringToSearch) EQ 0) AND
      (findnocase("Konqueror", stringToSearch) EQ 0) AND
      (findnocase("Safari", stringToSearch) EQ 0) AND
      (findnocase("Netscape", stringToSearch) EQ 0)>

     <!--- THE VISITOR IS NOT USING A BROWSER, SEND AN EMAIL ALERTING ME ITS A ROBOT CRAWLING MY SITE --->

     <cfmail to="bots@yourdomain.com"
             from="bot#cgi.REMOTE_ADDR#@yourdomain.com"
             subject="Spider Bot Alert"
             type="HTML"
             server="mail.yourdomain.com">

             <font face="verdana" size="2">
               <p><a href="http://ws.arin.net/cgi-bin/whois.pl?queryinput=#cgi.REMOTE_ADDR#">#cgi.REMOTE_ADDR#</a></p>
               <p>#cgi.HTTP_USER_AGENT#</p>
               <p><a href="#cgi.HTTP_REFERER#">#cgi.HTTP_REFERER#</a></p>
             </font>
     </cfmail>
</cfif>
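
Robots often send no referrer at all, in which case the referrer link in the email above comes out empty. The following is a minimal sketch, not part of the original tutorial, of one way to print a placeholder instead. The variable names referrer, safeReferrer and referrerHtml are made up for illustration, and the snippet is assumed to run just before the cfmail tag.

```cfml
<!--- Sketch only: referrer, safeReferrer and referrerHtml are illustrative names --->
<cfset referrer = trim(cgi.HTTP_REFERER)>

<cfif len(referrer) GT 0>
    <!--- The referrer is supplied by the client, so escape it before using it in HTML --->
    <cfset safeReferrer = htmlEditFormat(referrer)>
    <cfset referrerHtml = '<a href="#safeReferrer#">#safeReferrer#</a>'>
<cfelse>
    <!--- No referrer header was sent --->
    <cfset referrerHtml = "(no referrer sent)">
</cfif>
```

Inside the cfmail body, the referrer paragraph would then read <p>#referrerHtml#</p> instead of building the link directly from cgi.HTTP_REFERER.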


Date added: Wed. April 14, 2004
Posted by: Jim Summer | Views: 9695 | Tested Platforms: CF5 | Difficulty: Beginner
Categories Listed: Charting


This author's other tutorials:
Replacing "enter" key with "<br>" tag
This little piece of code will transform those pesky "enter" keys in a textarea into "<br>" tags so your users' input is printed out properly in your HTML page. - Date added: Fri. December 13, 2002
Navigation as an include file
Create an include file (custom tag) for your navigation to help make maintenance easier! This include file lights up the button depending on where the user is at in your website, and is XHTML 1.1 validated! If you need to edit your navigation, just do it in 1 place, the include file! Javascript included :) - Date added: Thu. December 12, 2002
Aliasing Your SQL Statements
Many developers I have talked to are not aware of the ability to create an "alias" in an SQL statement, concatenate, and even add strings into your SQL statements. A little work up front in the SQL can cut out a lot of tedious coding in the <cfoutput> tag. - Date added: Tue. December 10, 2002
Database Dates (between ranges)
This deals with database dates: (1) inserting a properly formatted date into the database, and then (2) pulling a query from the database between a date range defined by 2 text boxes. - Date added: Mon. December 9, 2002
Recordset Paging in Cold Fusion
This will pull a predefined number of records from a database, allow the user to change the number of records to be shown, and write the "NEXT" or "BACK" (or both) buttons at the bottom of the page. Thus allowing the user to "surf" through the database. See it in action at http://freecfm.com/t/tentonhead/ and click on the "CTCS" link when you get there. The HTML for this page has changed a little since I first did this (so it validates as XHTML1.1) but the CFML remains as it is in this snippet. - Date added: Mon. December 9, 2002
Comments on this tutorial
Good with SEO?
Hey folks, this stuff is very important for keeping track of your visibility!

Try a search at (any) search engine for the phrases:

Jacksonville Web Design
or
Resume Cold Fusion
or
Resume ASP

just to see for yourself... For "Resume ASP" I am like #1 out of over 5 million results at Google!!! Whoa...

Thanks a lot, and once again thanks to Pablo for this great site! I work with him now (thanks for the tip); he is quite a character in addition to being a great Cold Fusion programmer!

Jim Summer
http://tentonweb.com/
Posted by: Jim S.
Posted on: 11/10/2004 12:31 AM
error using http_referer
HTTP_REFERER generates an error when the site is visited by a bot or a search engine. How do I get rid of that error?
Posted by: johnny
Posted on: 02/19/2006 11:17 PM
error what error? :)
I just checked my code and all instances referring to this are written as such: #cgi.HTTP_REFERER# - I get emails from this every day so I am sure it is working properly. I will post another way to do this that I am using on another site - that blocks bothersome bots as I encounter them... that post is next.
Thank you,
Jim S.
http://tentonweb.com/
Jacksonville, Florida USA
Posted by: Jim S
Posted on: 02/20/2006 01:12 PM
another way to do this
This blocks bad bots by name and IP as you find them, and sends me an email if the visitor is not a bad bot. It is used as an include file on whatever pages you want, and the email tells you which page was hit. At some point I will store all of these bad bot IPs in a db and loop through that, but... not yet :)
############

<cfset agent=cgi.HTTP_USER_AGENT>
<cfset ip=cgi.REMOTE_ADDR>
<cfset strRef=cgi.HTTP_REFERER>
<cfset remote_host=cgi.REMOTE_HOST>
<cfset thispage=cgi.SCRIPT_NAME>

<!--- if bothersome bot or IP stop page processing totally - show bad msg and blank white screen --->
<cfif findnocase("internetseer",agent) GT 0>
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif findnocase("canadiancontent",agent) GT 0><!--- google wrecking referrer --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif findnocase("Wget",agent) GT 0>
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif findnocase("Java",agent) GT 0>
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif findnocase("Twiceler",agent) GT 0><!--- experimental bot --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif findnocase("nicebot",agent) GT 0><!--- spam or game bot --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "71.131.17.120"><!--- bothersome sfo IP --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "218.247.52.8"><!--- bothersome china IP --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "202.108.23.81"><!--- bothersome beijing IP --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "80.77.90.229"><!--- uk spam phisher http://www.spamfo.co.uk/component/option,com_content/task,view/id,164/Itemid,2/ --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "71.131.54.252"><!--- one of many bothersome sbc IP's --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "71.131.61.84"><!--- another sbc --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "71.131.30.31"><!--- another sbc --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "83.149.204.195"><!--- ripe - experts.pereslavl.ru - russian spammer IP--->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "81.29.70.32"><!--- ripe --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "84.180.241.181"><!--- ripe asia pacific bothersome IP --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "71.131.19.89"><!--- sbc private customer --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
<cfelseif remote_host IS "59.182.53.121"><!--- some cheezy persistent apnic bot --->
<cfoutput>Your IP has been added - #ip#</cfoutput>
<cfabort>
</cfif>

<!--- send mail alert if ok--->
<cfmail to="me@mydomain.com"
from="#ip#@mydomain.com"
subject="Alert"
type="HTML">
<font face="verdana" size="2">
<p><b>IP:</b> <a href="http://ws.arin.net/cgi-bin/whois.pl?queryinput=#ip#">#ip#</a><br>
<b>PAGE:</b> #thispage#<br>
<b>REFERRER:</b> <a href="#strRef#">#strRef#</a><br>
<b>USER AGENT:</b> #agent#</p>
</font>
</cfmail>

############
Hope this helps...

Thank you,
Jim S.
http://tentonweb.com/
Jacksonville, Florida USA
Posted by: Jim S
Posted on: 02/20/2006 01:44 PM
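
The idea mentioned above of keeping the bad bot IPs in a database could look roughly like the sketch below. It is only an illustration under stated assumptions: the datasource name "botDsn" and the table blocked_ips with a single ip column are hypothetical, and nothing in the comment says how the author would actually store them. Letting the database do the lookup with a WHERE clause avoids looping over the rows in CFML.

```cfml
<!--- Sketch only: datasource "botDsn" and table blocked_ips(ip) are hypothetical --->
<cfquery name="qBlocked" datasource="botDsn">
    SELECT ip
    FROM blocked_ips
    WHERE ip = <cfqueryparam value="#cgi.REMOTE_ADDR#" cfsqltype="CF_SQL_VARCHAR">
</cfquery>

<!--- A matching row means the visitor's IP is on the block list --->
<cfif qBlocked.recordCount GT 0>
    <cfoutput>Your IP has been added - #cgi.REMOTE_ADDR#</cfoutput>
    <cfabort>
</cfif>
```

Note that this compares against cgi.REMOTE_ADDR, which is always the numeric address, whereas cgi.REMOTE_HOST may contain a host name on servers that do reverse DNS lookups.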
z
Why do people submit bad code here without thinking for just one minute how best to write it?

<cfif NOT refind("msie|gecko|opera|konqueror|safari|netscape", lcase(cgi.http_user_agent))>

(Same concept applies to the comment with 50 abort tags.)
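
For illustration, the same idea applied to the long cfelseif chain might look like the sketch below. The variable names blockedAgents and blockedIps are made up here, only a few of the IPs from the earlier comment are listed, and the IP test uses cgi.REMOTE_ADDR rather than REMOTE_HOST, which is an assumption about what was intended.

```cfml
<cfset agent = cgi.HTTP_USER_AGENT>
<cfset ip = cgi.REMOTE_ADDR>

<!--- Sketch only: illustrative block lists with entries taken from the earlier comment --->
<!--- Agent names are joined with | so one regular expression covers all of them --->
<cfset blockedAgents = "internetseer|canadiancontent|Wget|Java|Twiceler|nicebot">
<!--- Comma-delimited list of exact IP addresses --->
<cfset blockedIps = "71.131.17.120,218.247.52.8,202.108.23.81">

<!--- One test replaces the whole chain: block on a name match or an exact IP match --->
<cfif (reFindNoCase(blockedAgents, agent) GT 0) OR (listFind(blockedIps, ip) GT 0)>
    <cfoutput>Your IP has been added - #ip#</cfoutput>
    <cfabort>
</cfif>
```

Adding a new bot or IP then means editing one of the two lists instead of adding another cfelseif and cfabort.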

Why waste processing time generating pretty html emails for something so simple that can go in text?

Anyway, careful where you put such code. When a bot grabs dozens of pages on your site in a few minutes, do you really want dozens of emails???
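
One possible way to limit the flood, sketched here under assumptions, is to remember when each IP last triggered an alert and send at most one email per IP per hour. This assumes a cfapplication tag is already in place so the application scope is available; the struct name application.botAlertTimes, the sendAlert flag, and the 60 minute window are all invented for the example.

```cfml
<!--- Sketch only: assumes <cfapplication> is already set up in Application.cfm --->
<cfset ip = cgi.REMOTE_ADDR>
<cfset sendAlert = "yes">

<cflock scope="application" type="exclusive" timeout="5">
    <!--- Create the lookup struct on first use --->
    <cfif NOT structKeyExists(application, "botAlertTimes")>
        <cfset application.botAlertTimes = structNew()>
    </cfif>

    <!--- Suppress the alert if this IP already triggered one in the last 60 minutes --->
    <cfif structKeyExists(application.botAlertTimes, ip)>
        <cfif dateDiff("n", application.botAlertTimes[ip], now()) LT 60>
            <cfset sendAlert = "no">
        </cfif>
    </cfif>

    <!--- Record the time of the alert that is about to be sent --->
    <cfif sendAlert>
        <cfset application.botAlertTimes[ip] = now()>
    </cfif>
</cflock>

<cfif sendAlert>
    <!--- The cfmail alert would go here --->
</cfif>
```

The struct grows by one entry per distinct IP until the application scope is reset, so a busy site would probably want to prune old entries now and then.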

Posted by: z
Posted on: 10/07/2006 11:08 PM
Yes I want dozens of emails
Sorry for the bad code Z. Yes I want dozens of emails... that is the reason for the cfabort. Once I identify bothersome bots they are in essence blocked by the identification of the IP and the cfabort.
Thank you,
Jim S.
http://tentonweb.com/
Jacksonville, Florida USA
Posted by: Jim Summer
Posted on: 01/11/2008 12:14 PM
Copyright © 2002 EasyCFM.Com, LLC. (Easy ColdFusion Tutorials) All Rights Reserved
All other trademarks and copyrights are the property of their respective holders.