DRAFT


The presenting problem

An Apache project has an existing website that includes hundreds or even thousands of static .html pages of documentation and other materials. The project wants to publish the website using Apache infrastructure without having to check into Git the many megabytes of content that currently live in Subversion.

The first plan was to put AliasMatch directives in the project's .htaccess configuration file, so that the Apache web server would map the current locations of the static .html pages. Unfortunately, AliasMatch is not allowed in .htaccess; it is valid only in the server configuration and virtual host contexts.

The solution

The second, successful plan is to use mod_rewrite.

The fix consists of two parts:

  1. Infra added an Alias/Rewrite directive to the global server configuration file, so all project servers can use it. It maps the URI path "/__root" to the file system path where the existing project web pages are hosted.
  2. The project changed its .htaccess to use directives that rewrite the URLs of the existing pages to "/__root/....old-svn-website.../". This makes the URL paths independent of the real file system paths. Below the URI path "/__root", the project can reach all project folders that are checked out on the web server.
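
The two parts above can be sketched roughly as follows. This is a minimal illustration, not the configuration Infra or the project actually deployed: the file system path "/path/to/checkouts", the old URL prefix "olddocs/", and the folder name "old-svn-website" are hypothetical placeholders, and the access-control block is an assumption about what the aliased directory would need.

```apache
# --- Part 1: global server configuration (maintained by Infra) ---
# Hypothetical file system path; the real location is not stated here.
Alias "/__root" "/path/to/checkouts"
<Directory "/path/to/checkouts">
    # Assumption: the aliased directory must be readable by clients.
    Require all granted
</Directory>

# --- Part 2: project .htaccess at the root of the project site ---
RewriteEngine On
# Hypothetical rule: requests for olddocs/... are rewritten internally
# to the /__root URI path. In .htaccess the pattern is matched without
# a leading slash. [PT] hands the result back to URL mapping so that
# the Alias above translates it to a file system path.
RewriteRule "^olddocs/(.*)$" "/__root/old-svn-website/olddocs/$1" [PT]
```

Because the rewrite is internal, the browser keeps showing the original URL; only the server-side lookup goes through "/__root".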

One downside is that, in theory, anyone could reach the content of every Apache website through any project's hostname by hand-crafting a URL like:

https://lucene.apache.org/__root/someotherproject/somehtml.

Apache HTTPD has no concept of marking a URI path as "internal", so it cannot prevent such access.

The back story

The details of the situation the Lucene project faced, and how the project and Infra crafted a solution, are in the comments on INFRA-19439: "Add a way to publish static HTML content of huge size (outside of pelican CMS) without checking it into Git."
