Web Crawler Protection
Wednesday, 13 July 2011 08:06
This document explains how to protect your websites from web crawlers that mirror a site in order to dissect its content and the technology used to build it. That information can be used to penetrate your web applications and directly affect your website.







Web crawler applications (other than those run by search engines) can pose a security risk to your organization's websites. Attackers use web crawlers to mirror a site and then search the copy for hidden fields, comments, e-mail addresses, phone numbers, hidden form values, links to additional servers, and more. This information can then be extracted for use during an attack.

Attackers who begin a web crawl normally look for three basic things: contacts, hidden form values, and web comments. The first two can be used for social engineering, spamming, and singling out specific targets associated with the site. The last, web comments, can reveal further information such as user names and passwords, along with a general idea of how the code was written (a code road map, if you will). This in turn can lead to the source code being used against the web server to reach areas a normal user should not have access to, and to penetrate further into the server.
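Because these three items are visible to anyone who downloads a page, it can help to audit your own pages for them before publishing. The following is a minimal sketch of such a check, not a complete scanner: the function name `auditHtml` and its regular expressions are illustrative assumptions, and regex matching of HTML will miss unusual markup.

```javascript
// Hypothetical helper: scans an HTML string for the three kinds of
// information described above (comments, hidden form values, contacts).
function auditHtml(html) {
  // HTML comments, which may leak notes, credentials, or code structure.
  const comments = [...html.matchAll(/<!--([\s\S]*?)-->/g)].map((m) =>
    m[1].trim()
  );

  // <input> tags whose type attribute is "hidden".
  const hiddenInputs = [
    ...html.matchAll(/<input\b[^>]*\btype\s*=\s*["']?hidden["']?[^>]*>/gi),
  ].map((m) => m[0]);

  // E-mail addresses (simple pattern), with duplicates removed.
  const emails = [
    ...new Set(html.match(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g) || []),
  ];

  return { comments, hiddenInputs, emails };
}

// Example: audit a small, made-up page fragment.
const samplePage =
  '<!-- TODO: remove test login -->' +
  '<form><input type="hidden" name="price" value="10">' +
  '<input type="text" name="q"></form>' +
  '<a href="mailto:ops@example.com">ops@example.com</a>';

console.log(auditHtml(samplePage));
```

Anything this reports is also available to a crawler, so comments and hidden values that reveal credentials or internal structure are worth removing from the published pages.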

The point of web crawling is to assemble the information locally on an IIS or Apache web server so that the attacker can test the site's security before deciding to break into the target. Doing so limits the number of alarms that can be set off during the actual attack, giving the attacker a one-time shot at gaining access to the site. This is where crawler protections come into play. Note that, if they are implemented incorrectly, search engines may fail to crawl and index your site, because this document allows ONLY specific browsers to access your site: Netscape, Mozilla, IE, Safari, and Konqueror.

.htaccess Protection:



If you use .htaccess files to protect your website, you can implement the following (provided that RewriteEngine is enabled):

Demonstrating how to block a web browser by userAgent in .htaccess file.



JavaScript Denial:



Some web developers who want to block user agents do so with JavaScript code. However, JavaScript cannot provide as much security as .htaccess files can. The methods employed here can be circumvented by using an application such as wget to fetch the information, modify it, and replay it to the server, or by downloading the information with wget and viewing it offline. The JavaScript code that lets web developers choose which user agents to allow is as follows. (This is a basic example for demonstration purposes; a simpler version exists.)

Demonstrating how to block a web browser by userAgent.
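A minimal sketch of such a client-side check, assuming an allowlist built from the browsers named earlier. The function name `isAllowedUserAgent`, the token list, and the "Access denied." response are illustrative assumptions rather than the original listing.

```javascript
// Hypothetical allowlist: substrings looked for in the User-Agent string.
// The tokens are assumptions based on the browsers named in this document.
const ALLOWED_TOKENS = ["Mozilla", "MSIE", "Safari", "Konqueror", "Netscape"];

// Returns true when the user agent string contains at least one allowed token.
function isAllowedUserAgent(userAgent) {
  if (typeof userAgent !== "string" || userAgent.length === 0) {
    return false; // a missing user agent is treated as not allowed
  }
  return ALLOWED_TOKENS.some((token) => userAgent.includes(token));
}

// Browser-only wiring; skipped when the script runs outside a browser.
if (typeof window !== "undefined" && typeof navigator !== "undefined") {
  window.addEventListener("load", () => {
    if (!isAllowedUserAgent(navigator.userAgent)) {
      // Hypothetical response: replace the page content with a notice.
      document.body.textContent = "Access denied.";
    }
  });
}
```

With this list, `isAllowedUserAgent("Wget/1.21")` returns false while a typical browser string beginning with `Mozilla/5.0` returns true. A crawler that never executes JavaScript, or that sends a browser-like user agent string, is not stopped by this check, which is why it only complements server-side filtering.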



To improve the success rate of blocking a browser by user agent, you should also use the .htaccess method described above. Be aware that you will have to include this code in each page you create. A cheaper way to do so is to move it into an external JS file that each page references, rather than embedding it in each page's onLoad event handler.

PHP Denial:

Demonstrating how to block a web browser by userAgent in PHP code.



Last Updated on Friday, 26 August 2011 14:01
 


© 2007 Network Defense Solutions, Inc.
All Rights Reserved