I want the documents that I write in OpenOffice (or any other HTML content creator) to work with my site when I publish them to a specific directory on the site. To do that I need to change the links in the exported content so that they resolve correctly once the files are in the directory that I select. Typically there would be a separate image directory, but OpenOffice exports the HTML and all of its files into a single directory. Using a simple PHP script I can open the file of interest and output a version of it in which the links are fixed so that they work on the server.

The first way to do this would be to open the file in a text editor and replace the links by hand. Essentially we need to find every reference to an externally linked file and make it link correctly. For example, if a line in the file created by OpenOffice looks like this:

<IMG SRC="10_24_2004_html_m10e4466.png"

I need it to look like this:

<IMG SRC="notes/10_24_2004/10_24_2004_html_m10e4466.png"

This is easy to do with a PHP script and the following code will do this:

// The directory the exported files were published to, and the page inside it.
$dir = 'notes/10_24_2004/';

$path = $dir . '10_24_2004.html';

// Read the whole exported page into a string.
$fh = fopen($path, 'r') or die("Cannot open $path");

$file_text = fread($fh, filesize($path));

fclose($fh);

// Prefix every image source with the published directory.
$patterns = '{<IMG SRC="}';

$replacements = '<IMG SRC="' . $dir;

$diff_page = preg_replace($patterns, $replacements, $file_text, -1);

print($diff_page);

At the end I print the page, and that printed text is what the server sends back for the page request.
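The replacement above only touches IMG SRC attributes written in upper case. A slightly more general version might look like the sketch below. It is not what my script does today. The function name prefix_relative_links is made up for illustration, and it assumes that every attribute value is wrapped in double quotes, as in the OpenOffice export shown earlier. It prefixes relative SRC and HREF values and leaves alone anything that starts with a scheme such as http:, a slash, or a # anchor.

```php
<?php
// Hypothetical helper (not part of the original script): prefix every
// relative SRC or HREF value in $html with $dir.
function prefix_relative_links(string $html, string $dir): string
{
    // Match src="..." or href="..." in any letter case. The negative
    // lookahead skips values that start with a scheme (http:, mailto:),
    // with "/" or with "#", because those are not relative file names.
    $pattern = '{\b(src|href)="(?![a-z][a-z0-9+.-]*:|/|#)([^"]+)"}i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($dir) {
            // $m[1] is the attribute name as written, $m[2] the old value.
            return $m[1] . '="' . $dir . $m[2] . '"';
        },
        $html
    );
}

// Example with the image line from the notes and one absolute link.
$sample = '<IMG SRC="10_24_2004_html_m10e4466.png"> '
        . '<A HREF="http://example.com/">x</A>';

print(prefix_relative_links($sample, 'notes/10_24_2004/'));
```

Like the original replacement, this is not safe to run twice over the same text, because the second pass would add the prefix again.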

This means that the whole process runs every time the page is served. The result is always the same unless the original page has changed, so a saner way is to create the fixed page once and then just read it on every request. For a high-volume page this optimization would matter. The script could recompile the page itself whenever the original has changed.

That would, of course, mean modifying my little script.

We start to see why pages are cached. As long as a page stays the same, caching it is a very large advantage. For the scripts running our site we need to understand that there are tradeoffs. There are a lot of ways to do this, but the best solution to me is a little script that runs once for each page that is posted. This script could be called either from within a PHP page served live or at the point where the author publishes the page. Pseudocode for this follows:


Assuming that this function exists as part of the serving script and the author just copies the files to the directory of interest:

If the modified page does not exist, create it.

If the modified version (stored under a name that shows it has been modified) is older than the page of interest, recreate the modified page.

Serve up the page.
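The three steps above could be sketched in PHP roughly as follows. This is only an outline under some assumptions. The function name fixed_page_path and the .fixed.html suffix are invented for illustration, and the fix-up itself is passed in as a callable, so any of the link replacements discussed earlier could be plugged in.

```php
<?php
// Hypothetical helper: return the path of an up-to-date fixed copy of
// $source, rebuilding it with $fix when needed. $fix is any callable that
// takes the HTML text and returns the fixed HTML text.
function fixed_page_path(string $source, callable $fix): string
{
    // The name shows that the file has been modified: page.fixed.html
    $fixed = preg_replace('{\.html?$}i', '', $source) . '.fixed.html';

    // Rebuild when the fixed copy is missing or older than the original.
    if (!file_exists($fixed) || filemtime($fixed) < filemtime($source)) {
        $html = file_get_contents($source);
        file_put_contents($fixed, $fix($html));
    }

    return $fixed;
}

// In the serving script, something like this would serve up the page:
//
// $path = fixed_page_path('notes/10_24_2004/10_24_2004.html', function ($html) {
//     return str_replace('<IMG SRC="', '<IMG SRC="notes/10_24_2004/', $html);
// });
// readfile($path);
```

Comparing modification times keeps the check cheap, since the original file is only read when the fixed copy is stale.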


Since the pages and images must be copied over to the server anyway, that might be the best time to do all of the 'fixing up' of the HTML. In this case the publishing features of the website would do the fixing up of the page.

A page without links will not need to use this feature.


Another annoying feature of the HTML export was that the quotation characters didn't turn up the way that I expected. I thought that I had fixed this, but what I needed to do was go back to the original OpenOffice document and change, with a global replace, the quote characters that were not correct. This was not a lot of work, but it still seems like a waste of time. Now I will know to check for this, although I am not sure why the word processor behaves like this. Here are the three characters:


", “ and ”


U+0022, U+201C, and U+201D


In the OpenOffice document the three characters display fine. Only after I convert the document can I see how they turn up in the web page.
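Instead of the global replace in the word processor, the same script that fixes the links could also normalize the quotes. The sketch below is only one possibility and rests on two assumptions: that the exported file is UTF-8 text, and that the curly quotes (U+201C and U+201D) are the ones to be replaced by the plain U+0022. The function name straighten_quotes is made up for illustration.

```php
<?php
// Hypothetical helper: replace the typographic double quotes U+201C and
// U+201D with the plain double quote U+0022. Assumes UTF-8 input.
function straighten_quotes(string $html): string
{
    // "\u{...}" in a double-quoted PHP string is the UTF-8 encoding of
    // that code point, so strtr swaps the multi-byte sequences as units.
    return strtr($html, [
        "\u{201C}" => '"',
        "\u{201D}" => '"',
    ]);
}

print(straighten_quotes("He said \u{201C}hello\u{201D}.") . "\n");
```

If the export uses a different character set, or writes the quotes as HTML entities, the keys of the array would have to change to match.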


I currently am adding the pages with separate scripts for each page. This allows me to have the ability to do different things for each page as needed. However, it might be better to have a single script that will take post data and then spawn off a local page.
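A single script along those lines might start from something like the sketch below. It is only a guess at a workable shape, not what the site does now. The function name resolve_page, the POST field name page, and the directory layout (taken from the notes/10_24_2004/10_24_2004.html example) are all assumptions. The name check is there because a page name that comes from post data should never be used as a path without being validated first.

```php
<?php
// Hypothetical helper: map a requested page name to a file under $root,
// or return null when the name is not acceptable.
function resolve_page(string $name, string $root = 'notes'): ?string
{
    // Accept only letters, digits and underscores, so that a request can
    // never contain "/" or ".." and reach outside of $root.
    if (!preg_match('{^[A-Za-z0-9_]+\z}', $name)) {
        return null;
    }

    // Layout assumed from the example: notes/10_24_2004/10_24_2004.html
    return "$root/$name/$name.html";
}

// In the single serving script, something like:
//
// $path = resolve_page($_POST['page'] ?? '');
// if ($path !== null && is_file($path)) {
//     readfile($path);
// }
```

Pages that need special handling could still get their own branch in the script, keyed on the page name.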


I have investigated the master document features of OpenOffice. Publishing a master document doesn't produce a single page that links to all of the other pages; it produces a single page that links to one other page. This doesn't seem that useful to me.


Here are some features that I want to add:


1. Link each page as the same script, with a different parameter list taken from the POST data.

2. Add the ability to log, with entries coded by time and date.

Password

Login

Session

Switching between logs.

3. Add a link by uploading the HTML file.

4. Edit the layout of the page from the page itself.



How do I upload the files?

How do I have people 'join' my site?

How do I charge them for items that they buy such as pictures?

How do I link to the disclosures?

How do I set up for easy transfer of gallery pictures?

How do I create new galleries?

What is the structure of the site?

Should I make a site map?

What would that map look like?

There are a lot of sites already out there.


Should every page always link back? Seems like they should. Or if they do not then they should be wrapped by something that does. The wrapper is the main site navigation.

A site can have different wrappers for different parts of the site. That gives the site a little more interesting look and feel. The site should also allow the assigning of the pictures for the site.


The user should be able to add new pictures and have the site convert them to a form that is useful on the web. If a picture is large, then thumbnails should be added automatically.
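The thumbnail step could look something like the sketch below. It is an illustration, not a finished feature. Both function names and the 160 pixel limit are made up, and make_thumbnail assumes that PHP's GD extension is installed on the server. The size calculation is kept separate so that it can be checked without any image files.

```php
<?php
// Hypothetical helper: scale $width x $height to fit inside a $max x $max
// box while keeping the aspect ratio. Small pictures come back unchanged.
function thumbnail_size(int $width, int $height, int $max = 160): array
{
    $longest = max($width, $height);
    if ($longest <= $max) {
        return [$width, $height];
    }

    $scale = $max / $longest;

    return [
        max(1, (int) round($width * $scale)),
        max(1, (int) round($height * $scale)),
    ];
}

// Hypothetical helper: write a JPEG thumbnail of $source to $target.
// Requires the GD extension. Returns false when the picture cannot be read.
function make_thumbnail(string $source, string $target, int $max = 160): bool
{
    $size = getimagesize($source);
    if ($size === false) {
        return false;
    }
    [$w, $h] = $size;
    [$tw, $th] = thumbnail_size($w, $h, $max);

    $image = imagecreatefromstring(file_get_contents($source));
    if ($image === false) {
        return false;
    }

    // Resample the full picture into the smaller canvas.
    $thumb = imagecreatetruecolor($tw, $th);
    imagecopyresampled($thumb, $image, 0, 0, 0, 0, $tw, $th, $w, $h);

    return imagejpeg($thumb, $target, 85);
}
```

Whether a picture counts as large is then just a question of whether thumbnail_size returns something smaller than the original.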



The form interface does not offer a rich set of functionality yet. There are rudimentary parts that don't connect into anything useful. This is an important effort that I need to make.


I need to work on the links to the pieces that need to be there. I am at a point where a picture would be useful. I should draw out the object model for the design. These are tedious efforts that can be automated. Perhaps I can get a class parser that will draw these for me.