The latest version is 0.30, released January 13, 2002 12:03am CDT: ChangeLog
Download it: in tar.gz format or rpm format.
Required Perl packages can be found on CPAN, as rpms in the /powertools/CPAN directory of your favorite Red Hat mirror, or as debs using apt-get (see the INSTALL file).
Check the README
Mail the Author: Bob McElrath <bob+filterproxy@mcelrath.org>.
SourceForge project page |
FilterProxy is a generic HTTP proxy that can modify proxied content on the fly. Its modular filter system lets many filters be applied in succession to a web page, and keeps configuration easy and flexible. FilterProxy can proxy any data served over HTTP (i.e. anything off the web) and filter any recognizable MIME type. All configuration is done through web-based forms or by editing a configuration file. It was created to fix some of the annoyances of poor web design by rewriting the offending pages. It can also improve the web for you in both speed (Compress) and quality (Rewrite/XSLT). After ads (and their graphics) are stripped out and HTML is compressed, surfing over a modem is much faster. Compare it to Muffin (a similar project in Java) and WebCleaner (a similar project in Python) in purpose and functionality. FilterProxy is written in Perl, and is quite fast.
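To illustrate the idea of applying filters in succession by MIME type, here is a minimal sketch in Python. It is not FilterProxy's actual code or API (FilterProxy is written in Perl); the names `strip_blink`, `compress`, `FILTERS`, and `apply_filters` are hypothetical and exist only for this example.

```python
import re
import zlib


def strip_blink(body: bytes) -> bytes:
    """Hypothetical rewrite filter: remove <blink> tags but keep their text."""
    return re.sub(rb"</?blink\b[^>]*>", b"", body, flags=re.IGNORECASE)


def compress(body: bytes) -> bytes:
    """Hypothetical compression filter: zlib-compress the body."""
    return zlib.compress(body)


# Assumed layout: each MIME type maps to an ordered list of filters.
FILTERS = {
    "text/html": [strip_blink, compress],
}


def apply_filters(mime_type: str, body: bytes, filters=FILTERS) -> bytes:
    """Run every filter registered for mime_type, in order, over the body."""
    for filter_func in filters.get(mime_type, []):
        body = filter_func(body)
    return body
```

Content whose MIME type has no registered filters passes through unchanged, which is one simple way to let the proxy carry any data while only rewriting the types it recognizes.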
(NEW!) Also check out my list of ways to fix web/Netscape annoyances that don't involve filtering (currently small fonts, blink, and JavaScript popup windows).
For instance, I might bookmark the homepage for xmms (http://www.xmms.org/), which I would then classify by adding the keywords (mp3, linux, audio, music, eyecandy, earcandy, X11, software). Then when I search for "software" using this module's interface, I get all items which have that keyword, including xmms. If I search for "linux software", I get all items with both keywords, and so on. You get the idea. (You could make a Yahoo-like index or a filesystem-like path by joining keywords, e.g. "/linux/software/mp3"; note that this is the same as "/software/linux/mp3".) (Does anyone besides me have thousands of bookmarks, occasionally think "I saw a piece of software that does X", and then spend 2 hours manually searching through them?)
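The keyword search described above can be sketched as follows. This is only an illustration in Python under assumptions of my own: the `BookmarkIndex` class, its methods, and the second (example.org) bookmark are hypothetical, and no such module exists in FilterProxy.

```python
from collections import defaultdict


class BookmarkIndex:
    """Hypothetical in-memory index mapping keywords to bookmarked URLs."""

    def __init__(self):
        self._by_keyword = defaultdict(set)

    def add(self, url, keywords):
        """Bookmark a URL and classify it under each keyword."""
        for keyword in keywords:
            self._by_keyword[keyword.lower()].add(url)

    def search(self, query):
        """Return the URLs tagged with every keyword in the query.

        Accepts "linux software" or a path such as "/linux/software/mp3".
        Keyword order is ignored.
        """
        terms = [t.lower() for t in query.replace("/", " ").split()]
        if not terms:
            return set()
        result = set(self._by_keyword.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._by_keyword.get(term, set())
        return result


index = BookmarkIndex()
index.add("http://www.xmms.org/",
          ["mp3", "linux", "audio", "music", "eyecandy", "earcandy",
           "X11", "software"])
index.add("http://www.example.org/", ["software", "windows"])  # made-up entry

print(index.search("software"))
print(index.search("linux software"))
print(index.search("/linux/software/mp3") == index.search("/software/linux/mp3"))
```

Because a query is treated as a set of keywords to intersect, the path form is just another spelling of the same search, which is why "/linux/software/mp3" and "/software/linux/mp3" return the same items.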
For bonus points, add a web spider that will search documents linked from the bookmarked page, and add them to the search engine's database. (This way a search could turn up information you've never seen but that is closely related to something you've bookmarked.)
For bonus bonus points, add the capability for the spider to use Netscape's "What's Related" (or similar) interface to find things similar to the page bookmarked, and index them too.
For bonus bonus bonus points, make sure this doesn't get exploited by advertisers.
This could be an entire thesis project on software agents. Any takers?
Well, first download it. It requires Perl and several modules from CPAN (see the INSTALL file).
After getting it running, tell your browser to use the proxy. Under Netscape, select the menu item Edit->Preferences. Then, in the preferences dialog box, select Advanced->Proxies. (You may have to click the little arrow next to Advanced to get Netscape to expand the menu.) Then select "Manual proxy configuration", and in the "HTTP Proxy" field enter the host and port on which you ran FilterProxy. If you haven't edited FilterProxy.pl, these should be 'localhost' and '8888'.
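The same host and port can be used by a script as well as a browser. As a rough sketch, the Python snippet below builds a `urllib` opener that sends HTTP requests through a proxy at the default 'localhost' and '8888' given above. The helper names `make_proxy_opener` and `fetch_via_proxy` are hypothetical, and the code assumes FilterProxy is already running on that address.

```python
import urllib.request

# Defaults taken from the text above; change them if you edited FilterProxy.pl.
PROXY_HOST = "localhost"
PROXY_PORT = 8888


def make_proxy_opener(host=PROXY_HOST, port=PROXY_PORT):
    """Build an opener that routes plain HTTP requests through the proxy."""
    proxy_url = f"http://{host}:{port}"
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, timeout=10):
    """Fetch a URL through the proxy and return the response body as bytes."""
    opener = make_proxy_opener()
    with opener.open(url, timeout=timeout) as response:
        return response.read()


# Example (requires the proxy to be running):
# page = fetch_via_proxy("http://www.xmms.org/")
```

If the proxy is not running, the fetch fails with a connection error rather than silently going direct, which makes this a quick way to check that FilterProxy is listening where you expect.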