Advanced mod_rewrite Expert Tricks

Are you an advanced mod_rewrite expert or guru? This article is for YOU too!

The following undocumented techniques and methods will allow you to utilize mod_rewrite at an “expert level” by showing you how to unlock its secrets.

Most, if not all, web developers and server administrators struggle with Apache mod_rewrite. It’s very tough and only gets a little easier with practice. Until now! Get ready to explode your learning curve.

Decoding Mod_Rewrite Variables

When I realized my problem was that I didn’t know the value of the variable being tested by the RewriteCond, I set out to discover how to view those variables. Keep in mind you can also use RewriteLog, but it’s only available to users who can edit httpd.conf (usually root), and this technique works from .htaccess.

Setting Environment Variables with RewriteRule

I discovered a multitude of methods to set and view Apache environment variables, using various modules and some core tricks, but the method that lets me view the most environment variables is RewriteRule. I wanted to use SetEnvIf more, but it’s just not as powerful as mod_rewrite, because it can only test a limited set of request attributes.

This code sets the variable INFO_REQUEST_URI to have the value of REQUEST_URI.

RewriteEngine On
RewriteBase /
RewriteRule .* - [E=INFO_REQUEST_URI:%{REQUEST_URI},NE]

Saving the Apache Variable Values

Now the trick is how to view that environment variable. The method I came up with is to send the environment variable value in an HTTP header; because headers undergo little data manipulation or validation, you get an accurate look at the actual value. At first I tried adding the variable value to a redirect using the query string, but an HTTP_USER_AGENT value doesn’t play well as a query string.

Using RequestHeader in .htaccess

This code takes advantage of the incredible mod_headers Apache module to actually ADD a whole new header to YOUR request. Seriously one of the coolest tricks I’ve found yet. It’s almost the same as being able to spoof POST requests, since headers can carry protected data, especially the HTTP_COOKIE header.

RequestHeader set INFO_REQUEST_URI "%{INFO_REQUEST_URI}e"

Viewing the Variable Values

Now you can use any kind of server-run interpreter, like Perl, PHP, Ruby, etc., to view all the variable values. All CGI script handlers like those are able to view request headers.

PHP Code to access Apache Variables

Works even in safe mode; any interpreter can view HTTP headers! Note that each of these variables is added as an HTTP header to the request for the script, which can be confusing: each variable sent as a header is prefixed with HTTP_ to denote that it arrived as a header.

<?php
header("Content-Type: text/plain");
$INFO=$MISS=array();
foreach($_SERVER as $v=>$r)
{
  if(substr($v,0,9)=='HTTP_INFO')
  {
    if(!empty($r))$INFO[substr($v,10)]=$r;
    else $MISS[substr($v,10)]=$r;
  }
}

/* thanks Mike! */
ksort($INFO);
ksort($MISS);
ksort($_SERVER);

echo "Received These Variables:\n";
print_r($INFO);

echo "Missed These Variables:\n";
print_r($MISS);

echo "ALL Variables:\n";
print_r($_SERVER);
?>
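
For readers working in another interpreter, the same filtering can be sketched in Python. This is only an illustration and not part of the original article: the function name split_info_vars and the sample dictionary are made up for the example, and it assumes the script receives the usual CGI-style environment in which request headers appear with an HTTP_ prefix.

```python
def split_info_vars(environ):
    """Split HTTP_INFO_* entries into (received, missed) dicts.

    Keys are returned without the HTTP_INFO_ prefix, in sorted order.
    Unlike PHP's empty(), the string "0" counts as a received value here.
    """
    prefix = "HTTP_INFO_"
    received, missed = {}, {}
    for key in sorted(environ):
        if not key.startswith(prefix):
            continue
        name = key[len(prefix):]
        value = environ[key]
        if value:
            received[name] = value
        else:
            missed[name] = value
    return received, missed


if __name__ == "__main__":
    # Hypothetical sample; a real CGI script would pass os.environ instead.
    sample = {
        "HTTP_INFO_REQUEST_URI": "/test/index.php",
        "HTTP_INFO_TZ": "",
        "HTTP_HOST": "www.example.com",
    }
    received, missed = split_info_vars(sample)
    print("Received:", received)
    print("Missed:", missed)
```

In a real CGI script you would call split_info_vars(os.environ) after importing os.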

Time to Get Crazy

Just create the above php file on your site as /test/index.php or whatever, then create /test/.htaccess which should contain the below .htaccess file snippet. Now just request /test/index.php and be amazed!

Ok, so I’ve prepared the .htaccess code you can use to view the values of all these variables. Just add it to a .htaccess file and make a request. For this test I created an index.php file that printed out all the $_SERVER variables, and made requests to it.

RewriteEngine On
RewriteBase /
RewriteRule .* - [E=INFO_API_VERSION:%{API_VERSION},NE]
RewriteRule .* - [E=INFO_AUTH_TYPE:%{AUTH_TYPE},NE]
RewriteRule .* - [E=INFO_CONTENT_LENGTH:%{CONTENT_LENGTH},NE]
RewriteRule .* - [E=INFO_CONTENT_TYPE:%{CONTENT_TYPE},NE]
RewriteRule .* - [E=INFO_DOCUMENT_ROOT:%{DOCUMENT_ROOT},NE]
RewriteRule .* - [E=INFO_GATEWAY_INTERFACE:%{GATEWAY_INTERFACE},NE]
RewriteRule .* - [E=INFO_HTTPS:%{HTTPS},NE]
RewriteRule .* - [E=INFO_HTTP_ACCEPT:%{HTTP_ACCEPT},NE]
RewriteRule .* - [E=INFO_HTTP_ACCEPT_CHARSET:%{HTTP_ACCEPT_CHARSET},NE]
RewriteRule .* - [E=INFO_HTTP_ACCEPT_ENCODING:%{HTTP_ACCEPT_ENCODING},NE]
RewriteRule .* - [E=INFO_HTTP_ACCEPT_LANGUAGE:%{HTTP_ACCEPT_LANGUAGE},NE]
RewriteRule .* - [E=INFO_HTTP_CACHE_CONTROL:%{HTTP_CACHE_CONTROL},NE]
RewriteRule .* - [E=INFO_HTTP_CONNECTION:%{HTTP_CONNECTION},NE]
RewriteRule .* - [E=INFO_HTTP_COOKIE:%{HTTP_COOKIE},NE]
RewriteRule .* - [E=INFO_HTTP_FORWARDED:%{HTTP_FORWARDED},NE]
RewriteRule .* - [E=INFO_HTTP_HOST:%{HTTP_HOST},NE]
RewriteRule .* - [E=INFO_HTTP_KEEP_ALIVE:%{HTTP_KEEP_ALIVE},NE]
RewriteRule .* - [E=INFO_HTTP_MOD_SECURITY_MESSAGE:%{HTTP_MOD_SECURITY_MESSAGE},NE]
RewriteRule .* - [E=INFO_HTTP_PROXY_CONNECTION:%{HTTP_PROXY_CONNECTION},NE]
RewriteRule .* - [E=INFO_HTTP_REFERER:%{HTTP_REFERER},NE]
RewriteRule .* - [E=INFO_HTTP_USER_AGENT:%{HTTP_USER_AGENT},NE]
RewriteRule .* - [E=INFO_IS_SUBREQ:%{IS_SUBREQ},NE]
RewriteRule .* - [E=INFO_ORIG_PATH_INFO:%{ORIG_PATH_INFO},NE]
RewriteRule .* - [E=INFO_ORIG_PATH_TRANSLATED:%{ORIG_PATH_TRANSLATED},NE]
RewriteRule .* - [E=INFO_ORIG_SCRIPT_FILENAME:%{ORIG_SCRIPT_FILENAME},NE]
RewriteRule .* - [E=INFO_ORIG_SCRIPT_NAME:%{ORIG_SCRIPT_NAME},NE]
RewriteRule .* - [E=INFO_PATH:%{PATH},NE]
RewriteRule .* - [E=INFO_PATH_INFO:%{PATH_INFO},NE]
RewriteRule .* - [E=INFO_PHP_SELF:%{PHP_SELF},NE]
RewriteRule .* - [E=INFO_QUERY_STRING:%{QUERY_STRING},NE]
RewriteRule .* - [E=INFO_REDIRECT_QUERY_STRING:%{REDIRECT_QUERY_STRING},NE]
RewriteRule .* - [E=INFO_REDIRECT_REMOTE_USER:%{REDIRECT_REMOTE_USER},NE]
RewriteRule .* - [E=INFO_REDIRECT_STATUS:%{REDIRECT_STATUS},NE]
RewriteRule .* - [E=INFO_REDIRECT_URL:%{REDIRECT_URL},NE]
RewriteRule .* - [E=INFO_REMOTE_ADDR:%{REMOTE_ADDR},NE]
RewriteRule .* - [E=INFO_REMOTE_HOST:%{REMOTE_HOST},NE]
RewriteRule .* - [E=INFO_REMOTE_IDENT:%{REMOTE_IDENT},NE]
RewriteRule .* - [E=INFO_REMOTE_PORT:%{REMOTE_PORT},NE]
RewriteRule .* - [E=INFO_REMOTE_USER:%{REMOTE_USER},NE]
RewriteRule .* - [E=INFO_REQUEST_FILENAME:%{REQUEST_FILENAME},NE]
RewriteRule .* - [E=INFO_REQUEST_METHOD:%{REQUEST_METHOD},NE]
RewriteRule .* - [E=INFO_REQUEST_TIME:%{REQUEST_TIME},NE]
RewriteRule .* - [E=INFO_REQUEST_URI:%{REQUEST_URI},NE]
RewriteRule .* - [E=INFO_SCRIPT_FILENAME:%{SCRIPT_FILENAME},NE]
RewriteRule .* - [E=INFO_SCRIPT_GROUP:%{SCRIPT_GROUP},NE]
RewriteRule .* - [E=INFO_SCRIPT_NAME:%{SCRIPT_NAME},NE]
RewriteRule .* - [E=INFO_SCRIPT_URI:%{SCRIPT_URI},NE]
RewriteRule .* - [E=INFO_SCRIPT_URL:%{SCRIPT_URL},NE]
RewriteRule .* - [E=INFO_SCRIPT_USER:%{SCRIPT_USER},NE]
RewriteRule .* - [E=INFO_SERVER_ADDR:%{SERVER_ADDR},NE]
RewriteRule .* - [E=INFO_SERVER_ADMIN:%{SERVER_ADMIN},NE]
RewriteRule .* - [E=INFO_SERVER_NAME:%{SERVER_NAME},NE]
RewriteRule .* - [E=INFO_SERVER_PORT:%{SERVER_PORT},NE]
RewriteRule .* - [E=INFO_SERVER_PROTOCOL:%{SERVER_PROTOCOL},NE]
RewriteRule .* - [E=INFO_SERVER_SIGNATURE:%{SERVER_SIGNATURE},NE]
RewriteRule .* - [E=INFO_SERVER_SOFTWARE:%{SERVER_SOFTWARE},NE]
RewriteRule .* - [E=INFO_THE_REQUEST:%{THE_REQUEST},NE]
RewriteRule .* - [E=INFO_TIME:%{TIME},NE]
RewriteRule .* - [E=INFO_TIME_DAY:%{TIME_DAY},NE]
RewriteRule .* - [E=INFO_TIME_HOUR:%{TIME_HOUR},NE]
RewriteRule .* - [E=INFO_TIME_MIN:%{TIME_MIN},NE]
RewriteRule .* - [E=INFO_TIME_MON:%{TIME_MON},NE]
RewriteRule .* - [E=INFO_TIME_SEC:%{TIME_SEC},NE]
RewriteRule .* - [E=INFO_TIME_WDAY:%{TIME_WDAY},NE]
RewriteRule .* - [E=INFO_TIME_YEAR:%{TIME_YEAR},NE]
RewriteRule .* - [E=INFO_TZ:%{TZ},NE]
RewriteRule .* - [E=INFO_UNIQUE_ID:%{UNIQUE_ID},NE]

RequestHeader set INFO_API_VERSION "%{INFO_API_VERSION}e"
RequestHeader set INFO_AUTH_TYPE "%{INFO_AUTH_TYPE}e"
RequestHeader set INFO_CONTENT_LENGTH "%{INFO_CONTENT_LENGTH}e"
RequestHeader set INFO_CONTENT_TYPE "%{INFO_CONTENT_TYPE}e"
RequestHeader set INFO_DOCUMENT_ROOT "%{INFO_DOCUMENT_ROOT}e"
RequestHeader set INFO_GATEWAY_INTERFACE "%{INFO_GATEWAY_INTERFACE}e"
RequestHeader set INFO_HTTPS "%{INFO_HTTPS}e"
RequestHeader set INFO_HTTP_ACCEPT "%{INFO_HTTP_ACCEPT}e"
RequestHeader set INFO_HTTP_ACCEPT_CHARSET "%{INFO_HTTP_ACCEPT_CHARSET}e"
RequestHeader set INFO_HTTP_ACCEPT_ENCODING "%{INFO_HTTP_ACCEPT_ENCODING}e"
RequestHeader set INFO_HTTP_ACCEPT_LANGUAGE "%{INFO_HTTP_ACCEPT_LANGUAGE}e"
RequestHeader set INFO_HTTP_CACHE_CONTROL "%{INFO_HTTP_CACHE_CONTROL}e"
RequestHeader set INFO_HTTP_CONNECTION "%{INFO_HTTP_CONNECTION}e"
RequestHeader set INFO_HTTP_COOKIE "%{INFO_HTTP_COOKIE}e"
RequestHeader set INFO_HTTP_FORWARDED "%{INFO_HTTP_FORWARDED}e"
RequestHeader set INFO_HTTP_HOST "%{INFO_HTTP_HOST}e"
RequestHeader set INFO_HTTP_KEEP_ALIVE "%{INFO_HTTP_KEEP_ALIVE}e"
RequestHeader set INFO_HTTP_MOD_SECURITY_MESSAGE "%{INFO_HTTP_MOD_SECURITY_MESSAGE}e"
RequestHeader set INFO_HTTP_PROXY_CONNECTION "%{INFO_HTTP_PROXY_CONNECTION}e"
RequestHeader set INFO_HTTP_REFERER "%{INFO_HTTP_REFERER}e"
RequestHeader set INFO_HTTP_USER_AGENT "%{INFO_HTTP_USER_AGENT}e"
RequestHeader set INFO_IS_SUBREQ "%{INFO_IS_SUBREQ}e"
RequestHeader set INFO_ORIG_PATH_INFO "%{INFO_ORIG_PATH_INFO}e"
RequestHeader set INFO_ORIG_PATH_TRANSLATED "%{INFO_ORIG_PATH_TRANSLATED}e"
RequestHeader set INFO_ORIG_SCRIPT_FILENAME "%{INFO_ORIG_SCRIPT_FILENAME}e"
RequestHeader set INFO_ORIG_SCRIPT_NAME "%{INFO_ORIG_SCRIPT_NAME}e"
RequestHeader set INFO_PATH "%{INFO_PATH}e"
RequestHeader set INFO_PATH_INFO "%{INFO_PATH_INFO}e"
RequestHeader set INFO_PHP_SELF "%{INFO_PHP_SELF}e"
RequestHeader set INFO_QUERY_STRING "%{INFO_QUERY_STRING}e"
RequestHeader set INFO_REDIRECT_QUERY_STRING "%{INFO_REDIRECT_QUERY_STRING}e"
RequestHeader set INFO_REDIRECT_REMOTE_USER "%{INFO_REDIRECT_REMOTE_USER}e"
RequestHeader set INFO_REDIRECT_STATUS "%{INFO_REDIRECT_STATUS}e"
RequestHeader set INFO_REDIRECT_URL "%{INFO_REDIRECT_URL}e"
RequestHeader set INFO_REMOTE_ADDR "%{INFO_REMOTE_ADDR}e"
RequestHeader set INFO_REMOTE_HOST "%{INFO_REMOTE_HOST}e"
RequestHeader set INFO_REMOTE_IDENT "%{INFO_REMOTE_IDENT}e"
RequestHeader set INFO_REMOTE_PORT "%{INFO_REMOTE_PORT}e"
RequestHeader set INFO_REMOTE_USER "%{INFO_REMOTE_USER}e"
RequestHeader set INFO_REQUEST_FILENAME "%{INFO_REQUEST_FILENAME}e"
RequestHeader set INFO_REQUEST_METHOD "%{INFO_REQUEST_METHOD}e"
RequestHeader set INFO_REQUEST_TIME "%{INFO_REQUEST_TIME}e"
RequestHeader set INFO_REQUEST_URI "%{INFO_REQUEST_URI}e"
RequestHeader set INFO_SCRIPT_FILENAME "%{INFO_SCRIPT_FILENAME}e"
RequestHeader set INFO_SCRIPT_GROUP "%{INFO_SCRIPT_GROUP}e"
RequestHeader set INFO_SCRIPT_NAME "%{INFO_SCRIPT_NAME}e"
RequestHeader set INFO_SCRIPT_URI "%{INFO_SCRIPT_URI}e"
RequestHeader set INFO_SCRIPT_URL "%{INFO_SCRIPT_URL}e"
RequestHeader set INFO_SCRIPT_USER "%{INFO_SCRIPT_USER}e"
RequestHeader set INFO_SERVER_ADDR "%{INFO_SERVER_ADDR}e"
RequestHeader set INFO_SERVER_ADMIN "%{INFO_SERVER_ADMIN}e"
RequestHeader set INFO_SERVER_NAME "%{INFO_SERVER_NAME}e"
RequestHeader set INFO_SERVER_PORT "%{INFO_SERVER_PORT}e"
RequestHeader set INFO_SERVER_PROTOCOL "%{INFO_SERVER_PROTOCOL}e"
RequestHeader set INFO_SERVER_SIGNATURE "%{INFO_SERVER_SIGNATURE}e"
RequestHeader set INFO_SERVER_SOFTWARE "%{INFO_SERVER_SOFTWARE}e"
RequestHeader set INFO_THE_REQUEST "%{INFO_THE_REQUEST}e"
RequestHeader set INFO_TIME "%{INFO_TIME}e"
RequestHeader set INFO_TIME_DAY "%{INFO_TIME_DAY}e"
RequestHeader set INFO_TIME_HOUR "%{INFO_TIME_HOUR}e"
RequestHeader set INFO_TIME_MIN "%{INFO_TIME_MIN}e"
RequestHeader set INFO_TIME_MON "%{INFO_TIME_MON}e"
RequestHeader set INFO_TIME_SEC "%{INFO_TIME_SEC}e"
RequestHeader set INFO_TIME_WDAY "%{INFO_TIME_WDAY}e"
RequestHeader set INFO_TIME_YEAR "%{INFO_TIME_YEAR}e"
RequestHeader set INFO_TZ "%{INFO_TZ}e"
RequestHeader set INFO_UNIQUE_ID "%{INFO_UNIQUE_ID}e"
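
Since the listing above is just the same two directives repeated for every variable, it can be generated rather than typed. The following Python sketch shows one way to do that; the function name build_htaccess and the short list of names in the demo are assumptions for illustration, and the output follows the exact directive shapes used above.

```python
def build_htaccess(names):
    """Return .htaccess text that copies each rewrite variable into a request header."""
    lines = ["RewriteEngine On", "RewriteBase /"]
    for name in names:
        # Copy the rewrite variable into an environment variable named INFO_<name>.
        lines.append("RewriteRule .* - [E=INFO_" + name + ":%{" + name + "},NE]")
    lines.append("")
    for name in names:
        # Expose the environment variable to the script as a request header.
        lines.append('RequestHeader set INFO_' + name + ' "%{INFO_' + name + '}e"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_htaccess(["REQUEST_URI", "THE_REQUEST", "HTTPS"]), end="")
```

Redirect the output into /test/.htaccess, or extend the list with whichever variable names you want to inspect.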

Mod_Rewrite Variables Decoded!

[API_VERSION] => 20020903:12
[AUTH_TYPE] => Digest
[DOCUMENT_ROOT] => /home/user/www_root/askapache.com
[HTTPS] => off
[HTTP_ACCEPT] => text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
[HTTP_COOKIE] => PHPSESSID=752ee6d56e15f305233e30045987e5ce568c034; __qca=1176541225-59967328-5223185;
[HTTP_HOST] => www.askapache.com
[HTTP_REFERER] => http://www.askapache.com/protest/index.php?askapache=awesomeness&you=rock
[HTTP_USER_AGENT] => Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.16) Gecko/20080702 Firefox/2.0.0.16
[IS_SUBREQ] => false
[QUERY_STRING] => e=404
[REMOTE_ADDR] => 22.162.144.211
[REMOTE_HOST] => 22.162.144.211
[REMOTE_PORT] => 4511
[REMOTE_USER] => administrator
[REQUEST_FILENAME] => /home/user/www_root/askapache.com/protest/index.php
[REQUEST_METHOD] => GET
[REQUEST_URI] => /protest/index.php
[SCRIPT_FILENAME] => /home/user/www_root/askapache.com/protest/index.php
[SCRIPT_GROUP] => daemonu
[SCRIPT_USER] => askapache
[SERVER_ADDR] => 208.113.134.190
[SERVER_ADMIN] => webmaster@askapache.com
[SERVER_NAME] => www.askapache.com
[SERVER_PORT] => 80
[SERVER_PROTOCOL] => HTTP/1.1
[SERVER_SOFTWARE] => Apache/2.0.61 (Unix) PHP/4.4.7 mod_ssl/2.0.61 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2
[THE_REQUEST] => GET /protest/adf HTTP/1.1
[TIME] => 20080820014309
[TIME_DAY] => 20
[TIME_HOUR] => 01
[TIME_MIN] => 43
[TIME_MON] => 08
[TIME_SEC] => 09
[TIME_WDAY] => 3
[TIME_YEAR] => 2008

Request using HTTPS

[API_VERSION] => 20020903:12
[AUTH_TYPE] => Digest
[DOCUMENT_ROOT] => /home/user/www_root/askapache.com
[HTTPS] => on
[HTTP_ACCEPT] => text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
[HTTP_COOKIE] => PHPSESSID=752ee6d56e15f305233e30045987e5ce568c034; __qca=1176541225-59967328-5223185;
[HTTP_HOST] => www.askapache.com
[HTTP_REFERER] => http://www.askapache.com/protest/index.php?askapache=awesomeness&you=rock
[HTTP_USER_AGENT] => Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.16) Gecko/20080702 Firefox/2.0.0.16
[IS_SUBREQ] => false
[QUERY_STRING] => hi=you&whats=&amp;you
[REMOTE_ADDR] => 22.162.144.211
[REMOTE_HOST] => 22.162.144.211
[REMOTE_PORT] => 4605
[REMOTE_USER] => administrator
[REQUEST_FILENAME] => /home/user/www_root/askapache.com/protest/index.php
[REQUEST_METHOD] => GET
[REQUEST_URI] => /protest/index.php
[SCRIPT_FILENAME] => /home/user/www_root/askapache.com/protest/index.php
[SCRIPT_GROUP] => daemonu
[SCRIPT_USER] => askapache
[SERVER_ADDR] => 208.113.134.190
[SERVER_ADMIN] => webmaster@askapache.com
[SERVER_NAME] => www.askapache.com
[SERVER_PORT] => 443
[SERVER_PROTOCOL] => HTTP/1.1
[SERVER_SOFTWARE] => Apache/2.0.61 (Unix) PHP/4.4.7 mod_ssl/2.0.61 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2
[THE_REQUEST] => GET /protest/index.php?hi=you&whats=&amp;you HTTP/1.1
[TIME] => 20080820015016
[TIME_DAY] => 20
[TIME_HOUR] => 01
[TIME_MIN] => 50
[TIME_MON] => 08
[TIME_SEC] => 16
[TIME_WDAY] => 3
[TIME_YEAR] => 2008

Emulating ErrorDocuments with Mod_Rewrite

The ErrorDocument directive is helpful because an ErrorDocument is called differently than a normal file, and it contains special variables to help an admin debug.

I’ve wanted to use a RewriteCond plus a RewriteRule to cause an Apache ErrorDocument to be displayed for a long time, and I finally figured it out. Simply use the HTTP STATUS CODE trick in combination with a simple RewriteRule to trigger an Apache ErrorDocument.

This code emulates the internal 404 process Apache goes through. If the file is not found, it internally requests /test/trigger-error/404, which triggers the 404 ErrorDocument.

source: Crazy Advanced Mod_Rewrite

Running CherryPy behind Apache using mod_rewrite

Here are some myths about running CherryPy behind mod_rewrite:

Myth 1: using mod_rewrite will make my site slower

If you’re talking about raw HTTP speed then yes, using mod_rewrite does add a little bit of overhead. On my current laptop, a benchmark of CherryPy exposed gave 460 requests/second (2.2ms/req), and a benchmark of CherryPy running behind Apache with mod_rewrite gave 320 requests/second (3.1ms/req). This means that mod_rewrite adds 0.9ms per request… But for a typical web app, a page will take at least several tens or hundreds of milliseconds to build. So you can see that these extra 0.9ms won’t really matter much!
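
The arithmetic behind those figures can be checked with a few lines of Python. This is a small sketch using the two benchmark numbers quoted above; the 100 ms page build time is an assumed value chosen only to illustrate the proportion. With unrounded figures the overhead comes out at about 0.95 ms (the 0.9 ms above is the difference of the rounded 2.2 and 3.1).

```python
def ms_per_request(requests_per_second):
    """Convert a throughput figure into average milliseconds per request."""
    return 1000.0 / requests_per_second


def proxy_overhead_ms(direct_rps, proxied_rps):
    """Extra milliseconds per request added by the proxy hop."""
    return ms_per_request(proxied_rps) - ms_per_request(direct_rps)


if __name__ == "__main__":
    overhead = proxy_overhead_ms(460, 320)
    print("overhead per request: %.2f ms" % overhead)
    # Share of total time for a hypothetical page that takes 100 ms to build.
    page_ms = 100.0
    share = 100.0 * overhead / (page_ms + overhead)
    print("share of a %.0f ms page: %.1f%%" % (page_ms, share))
```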

Also, keep in mind that Apache will serve static files directly, which will be faster than serving them from CherryPy.

Myth 2: I will lose some data about the client if I use mod_rewrite

When using mod_rewrite, requests to CherryPy will look like they’re coming from the local Apache server (the “Host” header will be “localhost:port” and the client IP address will be 127.0.0.1). However, if you use Apache2, it will pass to you the original “Host” header in the “X-Forwarded-Host” header. Also, it will pass to you the IP address of the remote client in the “X-Forwarded-For” header. So you still have access to all the data about the original request.
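
As a rough sketch of how an application might recover those values, the Python function below reads the two headers from a WSGI-style environ dictionary. The function name and the fallback behaviour are assumptions made for this example and are not part of CherryPy’s API. Keep in mind that a client that can reach the application directly could send these headers itself.

```python
def original_request_info(environ):
    """Return (client_ip, host) for a request that arrived through a proxy.

    Falls back to the direct connection values when the forwarding headers
    are absent, for example when the application is accessed without Apache.
    """
    forwarded_for = environ.get("HTTP_X_FORWARDED_FOR", "")
    # With several proxies in the chain the header is a comma-separated list;
    # the first entry is the address reported for the original client.
    client_ip = forwarded_for.split(",")[0].strip() or environ.get("REMOTE_ADDR", "")
    forwarded_host = environ.get("HTTP_X_FORWARDED_HOST", "")
    host = forwarded_host.split(",")[0].strip() or environ.get("HTTP_HOST", "")
    return client_ip, host


if __name__ == "__main__":
    # Hypothetical environ as seen by an application behind the proxy.
    proxied = {
        "REMOTE_ADDR": "127.0.0.1",
        "HTTP_HOST": "localhost:8000",
        "HTTP_X_FORWARDED_FOR": "203.0.113.7",
        "HTTP_X_FORWARDED_HOST": "www.example.info",
    }
    print(original_request_info(proxied))
```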

Configuring Apache

Let’s assume that the CherryPy (CP) application is listening on port 8000. I added the following lines to Apache’s config file (usually /etc/apache/httpd.conf or /etc/httpd/conf/httpd.conf); mod_rewrite works with .htaccess as well if you cannot edit your httpd.conf:

RewriteEngine on
RewriteRule ^(.*) http://127.0.0.1:8000$1 [proxy]

Put these lines in the proper VirtualHost directive. Be careful with Directory directives, because Apache strips the directory prefix before pattern matching and does not add it back. The above configuration would then result in Apache trying to proxy http://127.0.0.1:8000page instead of http://127.0.0.1:8000/page. You can remedy this by adding a ‘/’ or whatever prefix you need to the rewrite rule. For example:

RewriteEngine on
RewriteRule ^(.*) http://127.0.0.1:8000/$1 [proxy]
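
The effect of the stripped prefix is easy to reproduce outside Apache. The Python sketch below mimics only the substitution step: it matches a path against the rule pattern and drops the captured group into the target. The helper name substitute is made up for this example, and it handles just the single $1 backreference used here, so it is not a model of mod_rewrite as a whole.

```python
import re


def substitute(rule_pattern, template, matched_path):
    """Apply a RewriteRule-style pattern and $1 template to a path.

    Only the single $1 backreference used in the examples is supported.
    """
    match = re.match(rule_pattern, matched_path)
    if match is None:
        return None
    return template.replace("$1", match.group(1))


if __name__ == "__main__":
    # In a VirtualHost the pattern sees the full path, leading slash included.
    print(substitute(r"^(.*)", "http://127.0.0.1:8000$1", "/page"))
    # In a Directory block the prefix is stripped, so the pattern sees "page".
    print(substitute(r"^(.*)", "http://127.0.0.1:8000$1", "page"))
    # Adding the slash back in the target repairs the URL.
    print(substitute(r"^(.*)", "http://127.0.0.1:8000/$1", "page"))
```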

If you want to configure Apache to serve all your static files directly (and thus free CherryPy from this task), use a configuration like this:

RewriteEngine on
RewriteRule ^/static/(.*) /home/user/files/static/$1 [last]
RewriteRule ^(.*) http://127.0.0.1:8000$1 [proxy]

If you don’t want to (or cannot) use Apache’s Virtual Hosts, just add one line after RewriteEngine. For example, if you want to map requests for the http://www.example.info host to your CherryPy application, you get:

RewriteEngine on
RewriteCond %{HTTP_HOST} www\.example\.info
RewriteRule ^(.*) http://127.0.0.1:8000$1 [proxy]

If your application is not running and a user tries to access it, Apache will return a 502 Proxy Error. There is an easy way to start the application in that case: add an ErrorDocument directive that runs a CGI script which starts your application and redirects to it. You will also need to disable mod_rewrite for that script (otherwise Apache would try to get the CGI script from your CP application, and get another 502 error). So, I added 2 more lines to my configuration, and it now looks like this:

RewriteEngine on
RewriteCond  %{SCRIPT_FILENAME} !autostart\.cgi$
RewriteCond %{HTTP_HOST} www\.example\.info
RewriteRule ^(.*) http://127.0.0.1:8000/$1 [proxy]
ErrorDocument 502 /cgi-bin/autostart.cgi

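The autostart.cgi file is a short Python script:

#!/usr/local/bin/python
import os

# The trailing \r\n plus print's own newline ends the CGI headers.
print("Content-type: text/html\r\n")
print("""Restarting site ...""")
# Move this process into its own process group (see the note below).
os.setpgid(os.getpid(), 0)
# Launch the CherryPy server in the background.
os.system('/usr/local/bin/python2.4 webserver.py &')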

If you get "Forbidden - You don't have permission to access / on this server" errors, try enabling the proxy module.

Note: The os.setpgid(os.getpid(), 0) line seems to prevent Apache from killing the CP process after a period of inactivity (many thanks to Matt Lewis for this trick).
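
On newer Python versions a related effect can be sketched with the subprocess module. Note that this is a different mechanism from the os.setpgid call above: start_new_session makes the child call setsid(), which places only the child in a new session and process group. Whether that is enough to keep Apache from cleaning up the process on a given setup is an assumption you would need to verify; the helper name and demo command are made up for illustration.

```python
import subprocess
import sys


def start_detached(argv):
    """Start argv in a new session, detached from the caller's process group."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: the child calls setsid()
    )


if __name__ == "__main__":
    # Hypothetical command; the script above launches webserver.py instead.
    child = start_detached([sys.executable, "-c", "print('started')"])
    print("launched pid", child.pid)
```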

Beware the encoding bug

URLs that are requested via HTTP must be escaped (%xx-encoded) before they are sent, but Apache2’s mod_rewrite unescapes path information, which may generate invalid HTTP requests. In particular, spaces (which should be escaped as “%20”) are not re-escaped. If CherryPy receives a request with a raw space character in the URL, it chokes, because spaces are used to delimit the three parts of a request line (like “GET /path/to%20my/page HTTP/1.1”). A workaround is to add the following to your Apache configuration:

# this cannot be on .htaccess (only on httpd.conf)
RewriteMap escape int:escape 

#and when writing RewriteRule:
RewriteRule ^(.*)$ http://localhost:6674/${escape:$1} [proxy]
#(i.e., use ${escape:$1} instead of $1)
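
To see why the raw space matters, the following Python sketch builds a request line with and without percent-encoding and counts the space-separated parts. It uses urllib.parse.quote as a stand-in for the escaping step; the exact set of characters that Apache’s int:escape map encodes may differ, so treat this as an illustration of the problem and not as a reimplementation of the map.

```python
from urllib.parse import quote


def request_line(method, path, escape=True):
    """Build an HTTP/1.1 request line, optionally percent-encoding the path."""
    target = quote(path) if escape else path
    return "%s %s HTTP/1.1" % (method, target)


if __name__ == "__main__":
    raw = request_line("GET", "/path/to my/page", escape=False)
    escaped = request_line("GET", "/path/to my/page")
    # A valid request line splits into exactly three space-separated parts.
    print(len(raw.split(" ")), raw)
    print(len(escaped.split(" ")), escaped)
```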

AFAIK, this is a bug in mod_rewrite/Apache: I’ve researched the HTTP/1.1 and URI RFCs, and they all state that there must be only 2 spaces on the HTTP request line, i.e., CherryPy is parsing the request line correctly and Apache is sending invalid HTTP requests. Either way, I think this workaround will help people using CherryPy under Apache’s mod_rewrite. I’ve only tested this on Apache2; I don’t know if RewriteMap int:escape exists on older versions of mod_rewrite. But the Apache people seem to be aware of this bug.

Learn More at CherryPy

The Camping Server for Apache + FastCGI

  1. Install Apache 2.
  2. Install mod_fastcgi.
  3. Add to Apache’s httpd.conf:
     AddHandler fastcgi-script rb
     ScriptAlias / /usr/local/www/data/dispatch.rb/
  4. In dispatch.rb:
     #!/usr/local/bin/ruby
     require 'rubygems'
     require 'camping/fastcgi'
     Camping::Models::Base.establish_connection :adapter => 'sqlite3',
       :database => "/tmp/camping.db"
     Camping::FastCGI.serve("/usr/local/data/examples/")

Serving One File

The above setup will serve a whole directory, just like TheCampingServer. If you only want to serve one app (at the root) change the last line in dispatch.rb to point to a single file.

 Camping::FastCGI.serve("/usr/local/data/examples/blog.rb")

Mounting at a Subdirectory

You can certainly use ScriptAlias to attach the Camping app to a subdirectory, rather than root. If you are using URL() and R() in your code, the paths will change accordingly.

 ScriptAlias /myapp /usr/local/www/data/dispatch.rb/

FastCGI .htaccess

This is a basic FastCGI .htaccess file. The last line is the most important.

AddHandler fastcgi-script .fcgi 

Options +FollowSymLinks +ExecCGI  

RewriteEngine On  
RewriteRule ^$ index.html [QSA] 
RewriteRule ^([^.]+)$ $1.html [QSA] 
RewriteCond %{REQUEST_FILENAME} !-f 
RewriteRule ^(.*)$ dispatch.fcgi/$1 [QSA,L]

dispatch.fcgi

  • Make sure your dispatch.fcgi is marked as executable! Run “chmod 755 dispatch.fcgi” if you’re not sure.
  • The second part of GEM_PATH should be your host’s installed gems location; the example below is taken from Dreamhost.

#!/usr/bin/ruby

ENV['GEM_PATH'] = '/path/to/my/gems:/usr/lib/ruby/gems/1.8'
ENV['GEM_HOME'] = '/path/to/my/gems'

Dir.chdir '/path/to/my_app'

require 'my_app'
MyApp.create

class ApacheFixer
  def initialize(app); @app = app; end

  def call(env)
    env['SCRIPT_NAME'] = '/'
    env['PATH_INFO'] = env['REQUEST_URI'][0..(env["REQUEST_URI"].index("?")||0)-1]
    @app.call(env)
  end
end

Rack::Handler::FastCGI.run ApacheFixer.new(MyApp)

Using CGI

If you’re having issues with FastCGI, try to get it working with CGI first. To do this, change the examples above:

  • In dispatch.fcgi, change “Rack::Handler::FastCGI” to “Rack::Handler::CGI”.
  • Rename dispatch.fcgi to dispatch.cgi.
  • Update the last line of .htaccess to point to dispatch.cgi instead of dispatch.fcgi.

Notes for Dreamhost

  • If you’re having trouble with timeouts, try getting this to work for CGI first. If CGI works, then FastCGI should work, and Dreamhost is just being stupid. Change it back to use FastCGI, and come back later. This worked for me a couple times, and I place the blame on Dreamhost.
  • Set up your own gem path that you can install to and edit manually.
  • If you followed the Dreamhost guide to making your own gem path, your gem path would be /home/username/.gems.
  • If you’re trying to install gems remotely, Dreamhost will probably kill the process before it finishes. For me, using the ‘nice’ command didn’t help. Get the gem files, scp them to your server, and install them locally (i.e. “gem install activesupport-2.1.0.gem”). This means installing dependencies in turn (activesupport, markaby, and metaid before camping).

Custom .htaccess rewrite rules in WordPress

Dan Marvelo
I’m working on a WordPress plugin to store content as posts in WordPress, but display the content in a unique manner, outside of the post / page frame.

this included wanting a unique URL pattern to access that content. after some digging in WordPress code, I got some simple URL customization working.

there is some info on the codex page titled WP Rewrite API, but it seems to be in development, and is short on specific information.

the code

the main plugin file should have something similar to this:


function plugin_add_custom_urls() {
  add_rewrite_rule('(calendar)/[/]?([0-9]*)[/]?([0-9]*)$',
  'index.php?pagename=$matches[1]&var1=$matches[2]&var2=$matches[3]');

  add_rewrite_tag('%var1%', '[0-9]+');
  add_rewrite_tag('%var2%', '[0-9]+');
}

// runs the function in the init hook
add_action('init', 'plugin_add_custom_urls');

some explanation

just about every request to WordPress is translated to a request of the index.php with query string values. if you have the fancy permalinks enabled, Apache will internally rewrite all the traffic to index.php, then at some point WordPress does the translation from the fancy URL to the query string.

using the add_rewrite_rule function adds your own translation rule to the default set that WordPress already has.

matching regular expression

the first parameter is the regular expression to match against the incoming request. if the request is “http://yoursite.com/calendar/12/2”, the expression attempts to match the portion after the domain name, or “calendar/12/2”.

this isn’t a tutorial on regular expressions, and I’m not the one to do it anyways. just note that the parentheses are important as these indicate portions of the URL that will be matched, and end up in the resulting matches array.

translation to index.php

the second parameter is the translated request to index.php with the query string values. the translation code runs the regular expression against the request, and comes up with the matches in the $matches array. it then evaluates the second parameter to this function call, so $matches[1] in the string turns into the value referenced in the array. in this example, that will be “calendar”.

in the above example, the “calendar/12/2” request turns into “index.php?pagename=calendar&var1=12&var2=2” because $matches[1] = “calendar”, $matches[2] = “12” and $matches[3] = “2”. $matches[0] = “calendar/12/2”, by the way.
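
the capture groups can be checked outside WordPress with a short Python sketch. WordPress itself runs the pattern through PHP; Python’s re module is used here only as a convenient stand-in for this simple expression, and the function name translate is made up for the example.

```python
import re

# Same pattern as the add_rewrite_rule call above.
CALENDAR_RULE = r"(calendar)/[/]?([0-9]*)[/]?([0-9]*)$"


def translate(request_path):
    """Return the index.php query for request_path, or None if the rule does not match."""
    match = re.match(CALENDAR_RULE, request_path)
    if match is None:
        return None
    # groups() holds the three parenthesised captures, in order.
    return "index.php?pagename=%s&var1=%s&var2=%s" % match.groups()


if __name__ == "__main__":
    match = re.match(CALENDAR_RULE, "calendar/12/2")
    print(match.group(0), match.groups())
    print(translate("calendar/12/2"))
```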

recognising query string variables

the call to the add_rewrite_tag function introduces WordPress to the query string variables used in the second parameter of add_rewrite_rule, making them available in the $wp_query object and the get function.

a snippet from a function meant for a template:

function plugin_get_calendar() {
  global $wp_query;

  $var1 = $wp_query->get('var1');
  $var2 = $wp_query->get('var2');

}

some caveats

any rules added with the function are at the end of the list, and WordPress stops after finding the first match. if the URL is matched by one of the default WordPress rules, it won’t reach your custom rule.

the list of rules is stored as an option in the options table. when a request comes in to WordPress, it first checks for the row in the options table, and uses that list if it finds it. this means the rule list needs to be flushed after any changes to the add_rewrite_rule function call. this flush occurs when the Permalinks page is loaded in the WordPress admin, which calls the flush_rules function.
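
the first caveat, that an earlier rule can shadow yours, can be illustrated with a small Python sketch of first-match resolution. both rules in the demo are hypothetical: the broad pattern only stands in for a default WordPress rule and is not copied from WordPress, and the function name first_match is made up.

```python
import re


def first_match(rules, request_path):
    """Return the target of the first rule whose pattern matches, or None."""
    for pattern, target in rules:
        if re.match(pattern, request_path):
            return target
    return None


if __name__ == "__main__":
    # Both rules are hypothetical; the first stands in for a broad default rule.
    rules = [
        (r"(.+?)/?$", "default page handler"),
        (r"(calendar)/[/]?([0-9]*)[/]?([0-9]*)$", "custom calendar handler"),
    ]
    # With the custom rule last, the broad rule wins.
    print(first_match(rules, "calendar/12/2"))
    # With the custom rule first, it is reached.
    print(first_match(list(reversed(rules)), "calendar/12/2"))
```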

Reposted from: custom rewrite rules in WordPress (add_rewrite_rule and add_rewrite_tag)

Setup Zope behind Apache with SSL

Accessing CGI environment variables created by mod_ssl from within Plone

This way you will get HTTP_SSL_CLIENT_VERIFY, HTTP_SSL_CLIENT_S_DN_CN and HTTP_SSL_CLIENT_S_DN_Email environment variables in the request object.

Posted by mustapha

Problem:

You need to set up Zope behind Apache with SSL, and you need to access some or all of the CGI environment variables set by mod_ssl from within Plone. How do you do it?

Setting up Zope behind Apache with SSL is not the hard part, but I’ll give an example of an Apache virtual host with SSL anyway.

Apache doesn’t forward the mod_ssl CGI environment variables to Zope. Why? Because Zope doesn’t support SSL yet.

When you setup apache with SSL as proxy for your Plone site, it (apache) receives HTTPS-requests from the outside but it sends HTTP-requests to Zope. That’s why you don’t get the SSL headers through to the proxied Plone site.

Certificates:

How to generate your certificate authority, the server certificate and a client certificate to test the setup is outside the scope of this post. Here are 2 links where you can get help for that. Just copy and paste the commands if you don’t understand them. You will finish with getting all certificates:

Apache VirtualHost:

Here is an example of setting a VirtualHost with SSL:


<VirtualHost *:443>
  ServerName my.server.com
  <LocationMatch "^[^/]">
      Deny from all
  </LocationMatch>

  SSLEngine on
  SSLCipherSuite HIGH:MEDIUM
  SSLProtocol all -SSLv2
  SSLCertificateFile       /etc/apache2/conf.d/server.cert
  SSLCertificateKeyFile    /etc/apache2/conf.d/server.key
  SSLCertificateChainFile  /etc/apache2/conf.d/authority.crt
  SSLCACertificateFile     /etc/apache2/conf.d/authority.crt

  SSLVerifyClient optional
  SSLVerifyDepth 1
  SSLOptions +stdEnvVars

  SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown 

  RewriteEngine on
  RewriteRule ^/(.*) http://127.0.0.1:8080/VirtualHostBase/https/my.server.com:443/site1/VirtualHostRoot/$1 [P,L]
</VirtualHost>

The most important line for our problem is the SSLOptions +stdEnvVars line. This mod_ssl directive creates the standard set of SSL-related CGI/SSI environment variables. The remaining question is how to forward these variables over HTTP to Zope.

Forwarding the SSL variables:

1. The mod_headers way:

The easiest way, though neither flexible nor secure, is to use mod_headers directives.
Be sure that mod_headers is installed and you have something like this line in your httpd.conf file:

LoadModule headers_module /usr/lib/apache2/modules/mod_headers.so

Now, just forward all the variables you need:

<VirtualHost *:443>
  ServerName my.server.com
 <LocationMatch "^[^/]">
       Deny from all
  </LocationMatch>

  SSLEngine on
  SSLCipherSuite HIGH:MEDIUM
  SSLProtocol all -SSLv2
  SSLCertificateFile       /etc/apache2/conf.d/server.cert
  SSLCertificateKeyFile    /etc/apache2/conf.d/server.key
  SSLCertificateChainFile  /etc/apache2/conf.d/authority.crt
  SSLCACertificateFile     /etc/apache2/conf.d/authority.crt
  SSLVerifyClient optional
  SSLVerifyDepth 1
  SSLOptions +stdEnvVars

  SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown 
  RequestHeader set SSL_CLIENT_VERIFY %{SSL_CLIENT_VERIFY}e
  RequestHeader set SSL_CLIENT_S_DN_CN %{SSL_CLIENT_S_DN_CN}e
  RequestHeader set SSL_CLIENT_S_DN_Email %{SSL_CLIENT_S_DN_Email}e

  RewriteEngine on
  RewriteRule ^/(.*) http://127.0.0.1:8080/VirtualHostBase/https/my.server.com:443/site1/VirtualHostRoot/$1 [P,L]

</VirtualHost>
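
On the receiving side, the forwarded values show up as ordinary request headers. The Python sketch below shows one way an application could read them from a CGI/WSGI-style environ dictionary. The function name is made up, the lookup tries both the mixed-case and upper-case key because the exact key casing depends on the server, and the handling of “(null)” rests on the assumption that mod_headers sends that literal text when the environment variable is unset. Since this approach is not secure, treat the values as hints unless only Apache can reach Zope.

```python
def client_cert_info(environ):
    """Collect the forwarded mod_ssl values from a CGI/WSGI-style environ dict."""
    def forwarded(name):
        # Try the key as written, then the upper-cased CGI form.
        value = environ.get("HTTP_" + name, environ.get("HTTP_" + name.upper(), ""))
        # Assumption: an unset variable arrives as the literal text "(null)".
        return "" if value == "(null)" else value

    return {
        "verified": forwarded("SSL_CLIENT_VERIFY") == "SUCCESS",
        "common_name": forwarded("SSL_CLIENT_S_DN_CN"),
        "email": forwarded("SSL_CLIENT_S_DN_Email"),
    }


if __name__ == "__main__":
    # Hypothetical request environment for a client with a verified certificate.
    sample = {
        "HTTP_SSL_CLIENT_VERIFY": "SUCCESS",
        "HTTP_SSL_CLIENT_S_DN_CN": "Alice Example",
        "HTTP_SSL_CLIENT_S_DN_Email": "alice@example.com",
    }
    print(client_cert_info(sample))
```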

Generate and Use Your Own SSL Key in Apache

Do It Yourself SSL Guide

By Stephen Philbin

There are many people who want or need to have the connection between the browser and the Web server encrypted, but haven’t been able to set it up. This guide is intended to help people with the typical Apache on Linux setup to make encrypted connections available with a minimum of fuss, and if the encrypted connection isn’t for a commercial purpose, to do so without spending a penny.

Limitations

Sometimes hosting providers block the user from setting it up because the user needs to upgrade (pay more money for) the hosting account. Another possibility is that the hosting provider doesn’t want users to have any hands-on control, regardless of which hosting package you have with them. If you have a package that allows full root access or something similar, you’re unlikely to have any problems. However, it’s not always necessary to have full access as root to be able to set it up. This article includes alternatives to the hands-on approach you would use when logged in as root, but the best I can offer are general pointers, because most hosting providers offer some sort of control panel for administrative tasks and this access can vary widely from one hosting provider to another.

Key Generation

As some of you might already know, a certificate is needed to enable an encrypted connection. The connection can be encrypted using the Secure Sockets Layer (SSL) or the Transport Layer Security (TLS) mechanism, but you don’t need to worry about which will be used because that will be agreed upon by the browser and Apache. Before we can obtain a certificate file, we must first generate a key file.

To generate a key file whilst logged in as root via Secure Shell (SSH), you need to enter the following command (or a variant of it that’s tailored to your preferences.):

openssl genrsa -out keyfilename.pem 2048

I’ll explain each part of the command so that, if you’re forced to use a control panel provided by your hosting provider, you’ll have a decent idea of what to look for and what to do. Some control panels hide this step from the user and combine key generation with the certificate generation step, so if you can’t find any option for generating a key, but have an option for generating certificates, don’t panic. If you logged in as root, issued this command over SSH and got a message back saying something like: bash: openssl: command not found, then the OpenSSL program isn’t installed and you either need to install it yourself or have your hosting provider install it for you.

The first part of the command is the name of the program we’re using: OpenSSL. This could be something else to look for on a control panel if you can’t find keys or certificates. Either OpenSSL or perhaps just SSL. TLS might be another possibility, but an unlikely one.

Next up is the key type option. The two most popular types of keys are DSA and RSA. DSA keys are used for digital signatures and aren’t used for encryption; RSA keys can be used for both digital signatures and encryption. Here, we need to generate an RSA key and you should look out for this option if you’re using a control panel to generate the key.

The next two parts are actually a single instruction to OpenSSL. The -out parameter simply indicates that the text following it is the location and name of the file to be created. When issuing this command via SSH I recommend using an absolute pathname such as: /usr/local/apache2/apache_key.pem so you know exactly where the file is once it’s been created. If you’re generating the key through a control panel, look for an option for specifying where it should be placed, or look for it telling you where it will be placed. Regardless of which method you use, make sure that the key isn’t placed in a directory from which Apache serves Web pages.

The last option of the command is the size of the key in bits. I use 2048 because it’s the recommended size based on current technology. You can increase the number to make the key more secure if you prefer, but you might take a performance hit when using SSL. A Certificate Authority (CA) might also require a size that it specifies, but you don’t need to worry about CAs unless you intend to use SSL for commercial (or similar) purposes.

Another noteworthy option that’s not used in the command given above is the -des3 option. It’s used to encrypt the key with a protecting password. This might sound like a good thing, but for the purposes of SSL in Apache, it’s usually not. If you were to use this option, then someone would have to somehow input this password every time Apache is started or restarted, which gets in the way of unattended restarts. If you see an option for this in a control panel section for making keys, don’t use it.
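To tie the pieces of the key generation command together, here is a minimal sketch of how I might run it from a shell. The temporary directory and the apache_key.pem file name are placeholders of my own, not something your host requires; on a real server you would use an absolute path outside any directory Apache serves. The final check uses the rsa subcommand to confirm that the generated key is internally consistent.

```shell
set -eu

# Hypothetical working directory for this demonstration only.
KEY_DIR="$(mktemp -d)"
KEY_FILE="$KEY_DIR/apache_key.pem"

# Make sure new files are readable by the owner only.
umask 077

# Generate a 2048-bit RSA key without a passphrase (no -des3).
openssl genrsa -out "$KEY_FILE" 2048 2>/dev/null

# Ask OpenSSL to check the key's internal consistency.
openssl rsa -in "$KEY_FILE" -check -noout

echo "Key written to $KEY_FILE"
```

If the check fails, or the file is empty, something went wrong during generation and the key shouldn’t be used.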

Obtaining The Certificate

Depending on what you want to use SSL for, and whether or not you’re going to pay for your certificate, you’ll use one of two different methods to obtain your certificate. If you don’t want to pay for your certificate and you’re not bothered about a user’s browser presenting them with a warning that the certificate is untrustworthy until they tell their browser otherwise, you can create your own self-signed certificate. Such a certificate isn’t of much use for online transactions because your customers won’t have any confidence about your security, but it’s perfectly fine for personal use. If you want a certificate that your customers can use without warnings, you need to have a widely trusted CA sign your certificate for you. After you issue one of the commands to begin the process of obtaining either type of certificate, you also need to provide the information that will be contained in the certificate. I’ll explain each question that might be asked later on, but for right now, you’ll learn about the commands first.

Obtaining A Normal Certificate

Obtaining a certificate similar to those seen on most commercial sites (where they are automatically trusted by browsers) requires two steps. The first step will be performed by you, but you’re not able to perform the second step. The CA (such as Verisign or GoDaddy) will perform the last step. The first step is to create what’s called a Certificate Signing Request (CSR). A CSR is a file that, once signed by a CA, will become your certificate. Here’s the command to create it:

openssl req -new -key keyfilename.pem -out certfilename.csr

Again, I’ll give you some information on each part of the command so you can translate this into actions in your control panel or just modify it if you want to change something, but it’s unlikely you’ll want to change anything other than the file names.

The first new command is req. It indicates that we intend to use CSR management. If you’re using a control panel and you can’t find anything about a certificate (request) option, try looking for something like CSR (management) instead. The -new option is an obvious one: it simply means that we’re creating a new CSR rather than doing something to an existing one.

The -key option specifies the location of the key file used with the certificate. You must alter this option to point to the key file that you generated earlier in this guide. If you’re using a control panel, you might be given a form field for the location of the key; if so, fill it in. However, some popular control panels ask you to copy and paste the key into a text area. How you get the key into your clipboard will depend on what you can or cannot do with your host. In my experience the method most likely to be available is to copy the key to your computer, then open it in a text editor. You should use the most secure transfer method available to you, but if you’re having to do things this way your options are probably quite limited.

Trying to open your key file on a Windows PC will almost certainly cause it to tell you that it doesn’t know what to do with the file. Instead, open the file with Notepad. When I opened my test key in Notepad on Windows XP, the key was presented in text form, but spread over just two lines. Depending on your control panel, it might be OK to paste in the key as Notepad presents it, but you might have to make some changes after you paste it in to make it display correctly. The following is a demonstration of how a 2048-bit key is often represented in text form:

-----BEGIN RSA PRIVATE KEY-----
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblahblahblah
blahblahblahblahblahblahblahblahblahblahblahblahblahblah
-----END RSA PRIVATE KEY-----

As you can see, it’s a block of text with start and end markers on lines of their own at the beginning and end of the key. The main key text appears as 25 lines that, with the exception of the last line, are 64 characters long. A key that has fewer bits will have fewer lines and a key with more bits will have more lines, but the line length stays the same.
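If you have shell access, you can check this layout on a key of your own instead of counting by eye. The following is a rough sketch that generates a throwaway key (the demo_key.pem name and temporary directory are placeholders of mine), prints the two marker lines, and reports how many body lines there are and how long the longest one is. Depending on your OpenSSL version, the markers may read BEGIN PRIVATE KEY rather than BEGIN RSA PRIVATE KEY, and the line count may differ by a line or two from the example above.

```shell
set -eu

# Throwaway key in a temporary directory, for inspection only.
WORK_DIR="$(mktemp -d)"
DEMO_KEY="$WORK_DIR/demo_key.pem"
openssl genrsa -out "$DEMO_KEY" 2048 2>/dev/null

# The first and last lines are the start and end markers.
head -n 1 "$DEMO_KEY"
tail -n 1 "$DEMO_KEY"

# Count the body lines, i.e. everything except the marker lines.
BODY_LINES="$(grep -vc '^-----' "$DEMO_KEY")"

# Find the length of the longest body line.
LONGEST="$(grep -v '^-----' "$DEMO_KEY" | awk '{ if (length($0) > max) max = length($0) } END { print max }')"

echo "body lines: $BODY_LINES, longest line: $LONGEST characters"
```

This is also a handy way to confirm that a key pasted back out of a control panel still has its line breaks in the right places.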

The final part of the command, -out, serves the same purpose here as it did with the command we used to generate the key. Follow the same guideline of giving an absolute pathname (if possible) so you know exactly where it’s going to be placed.

Once you have your CSR file you need to find a CA to hand it over to for the second step: signing it. After they’ve done whatever checks they deem necessary, they’ll then sign your CSR and give you your new certificate.
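Before handing the CSR over, it can be worth checking that it contains what you expect. Here is a minimal sketch of the whole first step, assuming a shell with OpenSSL available. The -subj option supplies the certificate details on the command line instead of answering the questions interactively; the country, organisation and host name shown are made-up placeholders, as are the temporary directory and file locations.

```shell
set -eu

# Placeholder locations; on a real server use absolute paths of your own.
WORK_DIR="$(mktemp -d)"
KEY_FILE="$WORK_DIR/keyfilename.pem"
CSR_FILE="$WORK_DIR/certfilename.csr"

# Generate the key first, as in the Key Generation section.
openssl genrsa -out "$KEY_FILE" 2048 2>/dev/null

# Create the CSR. The -subj values are illustrative placeholders.
openssl req -new -key "$KEY_FILE" -out "$CSR_FILE" \
  -subj "/C=GB/O=Example Org/CN=www.example.com"

# Verify the CSR's own signature and print the subject it carries.
openssl req -in "$CSR_FILE" -noout -verify -subject
```

The last command should report that verification succeeded and show the subject you supplied. If the subject is wrong, it’s much easier to regenerate the CSR now than after a CA has signed it.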

Obtaining A Free Certificate

If you want to create a certificate of your own without having to involve a CA, you can perform both steps by yourself. This means that the user’s browser will present them with a huge: This certificate is self-signed! warning, but if this doesn’t concern you, then it doesn’t matter. Self-signed certificates can be a cheap alternative to CA signed certificates when you’re testing things out and experimenting, or if you’re the only person that needs a secure connection to your host. They can also be good for allowing regular users to use secured connections if they know they can trust you and you warn them about the certificate warnings in advance.

Here, the process of creating the CSR and having it signed are merged into one so you don’t create the CSR file. Instead, you just generate the certificate file directly. The following is a command to generate a self-signed certificate:

openssl req -new -x509 -key keyfilename.pem -out certfilename.pem -days 365

As you can see, it’s similar to the other command for creating a CSR that you would have signed by a CA, but it has two more options than the previous one. The first of the extra options is the -x509 option. This is the option that tells OpenSSL to output a self-signed certificate instead of a CSR. If you’re using a control panel to create a self-signed certificate be sure to look for, and use, an x509 option. The second of the extra options is the -days option. This option simply specifies how long (in days) the certificate is valid. Once the number of days has passed, you should generate a new certificate file and dispose of the old one.
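As a rough end-to-end sketch of the self-signed route, the following generates a key and a certificate, prints who the certificate is for and when it expires, and then confirms that the certificate really belongs to the key by comparing the RSA modulus of each. The file locations and the www.example.com host name are placeholders I’ve assumed for the demonstration; -subj is used only to avoid the interactive questions.

```shell
set -eu

# Placeholder locations for this demonstration only.
WORK_DIR="$(mktemp -d)"
KEY_FILE="$WORK_DIR/keyfilename.pem"
CERT_FILE="$WORK_DIR/certfilename.pem"

openssl genrsa -out "$KEY_FILE" 2048 2>/dev/null

# Create a self-signed certificate valid for 365 days.
# The host name in -subj is an illustrative placeholder.
openssl req -new -x509 -key "$KEY_FILE" -out "$CERT_FILE" -days 365 \
  -subj "/CN=www.example.com"

# Show the subject, the issuer and the validity window.
openssl x509 -in "$CERT_FILE" -noout -subject -issuer -dates

# A certificate belongs to a key when both carry the same RSA modulus.
KEY_MOD="$(openssl rsa -in "$KEY_FILE" -noout -modulus)"
CERT_MOD="$(openssl x509 -in "$CERT_FILE" -noout -modulus)"

if [ "$KEY_MOD" = "$CERT_MOD" ]; then
  echo "key and certificate match"
else
  echo "key and certificate do NOT match" >&2
  exit 1
fi
```

Because the certificate is self-signed, the subject and issuer lines name the same entity. The notAfter date is the one to put in your calendar for generating a replacement.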

W3C Talks and Presentations
