Many OpenOffice.org pages are published across multiple subdomains. See http://wiki.services.openoffice.org/wiki/Infrastructure_Overview for details. That overview is probably still accurate, except that the main site is now hosted on Kenai.
Active address list:
Project name | URL | Hosted at |
---|---|---|
About | | Kenai |
Bugzilla | | Kenai |
Development | | Kenai |
Distribution | | Kenai |
Documentation | | Kenai |
Download | | Kenai |
Main page | | Kenai |
Marketing | | Kenai |
Native pages list | | Kenai |
Projects list and individual addresses (146 projects) | | Kenai |
Support | | Kenai |
Extensions | | OSUOSL |
Forums | | Oracle |
Templates | | OSUOSL |
Wiki | | Oracle |
See also: OpenOffice Domains
A sitemap of the web pages hosted on kenai.com has been added above. Some NLC projects are missing because of technical issues (e.g. es.oo.o).
Archive creation
Possible approaches:
- Web content checkout via SVN URL.
The AOOo project has a script and a web project list at https://svn.apache.org/viewvc/incubator/ooo/trunk/tools/dev/ that automate checkout and update.
Look for fetch-all-web.sh and web-list.txt. The text file needs to be edited before use. To save time, the script runs svn update on project directories that already exist.
Here is how to do it for an individual project.
Syntax:
svn co https://svn.openoffice.org/svn/<$projectname>~webcontent your_local_dir
Example:
svn co https://svn.openoffice.org/svn/download~webcontent download
This gets all website content from the download project. Do the same for the other projects.
- Wiki: database dump (Clayton Cornell is able to help with this).
- Bugzilla: I hope Oracle will provide a database dump. If not, we can use the XML export; Bugzilla can import these XML files.
- Forums: As far as I know, we have admins of the OOo user forums in our group. They can make a dump of the database via the phpBB admin interface.
- Extensions and Templates: We really need to back these up. AFAIK the servers for these services are not hosted by Oracle; they are hosted at OSUOSL.
- Use wget
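For the wget option, a minimal sketch of a site mirror could look like the following. The function name `mirror_site`, the `WGET_CMD` override, and the default target directory are assumptions made for illustration; no site URL is hard-coded because the addresses have to come from the list above.

```shell
#!/bin/sh
# Sketch: mirror one site with wget for archiving.
# Assumption: the site URL is passed as an argument; none is hard-coded here.

WGET_CMD="${WGET_CMD:-wget}"   # overridable, e.g. WGET_CMD=echo for a dry run

# Mirror a site into a target directory (hypothetical helper).
#   --mirror            recursive download with timestamping
#   --page-requisites   also fetch images, CSS, and scripts needed by the pages
#   --convert-links     rewrite links so the local copy can be browsed offline
#   --no-parent         never ascend above the starting directory
#   --wait=1            pause one second between requests to spare the server
#   --directory-prefix  store everything below the given directory
mirror_site() {
    url=$1
    target=${2:-mirror}
    "$WGET_CMD" --mirror --page-requisites --convert-links --no-parent \
        --wait=1 --directory-prefix="$target" "$url"
}
```

Calling `mirror_site <url> archive` would store the copy under `archive/`. Setting `WGET_CMD=echo` first prints the command instead of running it, which is a cheap way to check the options before starting a long download.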
Note: I (rbircher) already have a script that checks out all projects in series; the only thing I need is a .txt file that lists all project names (one per line).
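A serial checkout driven by such a project list might look like the sketch below. It is neither rbircher's script nor fetch-all-web.sh; it only combines the URL pattern from the syntax line above with the update-if-present idea. The function names, the `SVN_CMD` override, and the list file name `projects.txt` are assumptions.

```shell
#!/bin/sh
# Sketch: check out (or update) the web content of every project in a list.
# Assumptions: the list file holds one project name per line, and the URL
# pattern is https://svn.openoffice.org/svn/<project>~webcontent.

SVN_BASE="https://svn.openoffice.org/svn"
SVN_CMD="${SVN_CMD:-svn}"   # overridable, e.g. SVN_CMD=echo for a dry run

# Build the web-content URL for one project name.
webcontent_url() {
    printf '%s/%s~webcontent\n' "$SVN_BASE" "$1"
}

# Check out a project, or update it if a working copy already exists.
fetch_project() {
    project=$1
    if [ -d "$project/.svn" ]; then
        "$SVN_CMD" update "$project"
    else
        "$SVN_CMD" co "$(webcontent_url "$project")" "$project"
    fi
}

# Process a list file, skipping blank lines and lines starting with '#'.
fetch_all() {
    while IFS= read -r project || [ -n "$project" ]; do
        case $project in
            ''|'#'*) continue ;;
        esac
        # Redirect stdin so svn cannot consume the rest of the list.
        fetch_project "$project" </dev/null
    done < "$1"
}
```

Running `fetch_all projects.txt` in an empty directory would create one working copy per listed project; running it again would only update them. With `SVN_CMD=echo` the commands are printed rather than executed.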
Todo plan
- Create full subdomain list (substantial progress)
- Create archive (can be done in a people.apache.org account; development, documentation, download, projects, and www together take 2.7 GB)
- Select needed content
- Move content to new page