Amillia Publishing Company Advertisement  ©
header_image copyright APC 2010

Coders Edge

Modern Application Development

Questions and answers on compiling Open Source Code in a Linux environment, but it will work for any Open Source

Other source code

sed for doing major file modifications

I want to use sed to globally replace something across a large number of files.

Here is a sample sed command that modifies grub.conf (from elsewhere in my notes; not my own line of code, but from somewhere on the Internet):

sed -i '/root=/s|$| nouveau.modeset=0|' /boot/grub/grub.conf
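
To see what that command does without touching a real boot configuration, here is a small sketch that runs the same expression against a throwaway file. The sample kernel lines and the temporary file are made up for illustration; only the sed expression comes from the note above. The address /root=/ selects lines containing root=, and s|$| nouveau.modeset=0| appends text at the end of each selected line.

```shell
#!/bin/sh
# Build a throwaway file that stands in for grub.conf (contents are hypothetical).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
title Fedora
    kernel /vmlinuz ro root=/dev/sda1 quiet
    initrd /initramfs.img
EOF

# Same expression as above, but without -i so the result goes to stdout
# and the file itself is left alone.
result=$(sed '/root=/s|$| nouveau.modeset=0|' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Only the kernel line should gain the extra option; the title and initrd lines do not contain root= and pass through unchanged.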

I want to replace '<a' with '<div class="a_box"><a', and also replace </a> with </a></div>, for the whole file.

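I figure two substitutions like the one above will do it, with the g flag so that every match on a line is replaced. Note that the bare pattern <a also matches other tags that start with those characters, such as <abbr>.

sed -iback -e 's/<a/<div class="a_box"><a/g' -e 's/<\/a>/<\/a><\/div>/g' myfi.xml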
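
A minimal sketch of the anchor wrapping described above, written as a filter so it can be tried on a single line before being pointed at a real file. The function name wrap_anchors is hypothetical, and the pattern assumes every anchor has at least one attribute (it matches "<a " with a trailing space so that tags such as <abbr> are left alone).

```shell
#!/bin/sh
# wrap_anchors: hypothetical helper; reads markup on stdin and wraps each
# anchor in <div class="a_box">...</div>.
# Assumption: anchors are written as "<a " followed by attributes.
wrap_anchors() {
    sed -e 's|<a |<div class="a_box"><a |g' \
        -e 's|</a>|</a></div>|g'
}

printf '%s\n' '<p><a href="x.html">x</a> and <a href="y.html">y</a></p>' | wrap_anchors
```

Using | as the delimiter avoids having to escape the slash in </a>.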

 

Export as HTML

When I export documents from OpenOffice, the code doesn't come out the way I'd like it to. Instead, the paragraphs carry class names that don't correspond with the ones I want to use. I could change the names of my styles, or I can write a small script that modifies the output to have the class definitions that I use and not the ones provided by the OpenOffice export feature.

#change_class.sh

#replace $2 with $3 in file $1

sed -iback 's/class="$2"/class="$3"/' $1

 

 

OK, and then I want to apply that to the exported version of this file. What you find when you do that is that the replacement also rewrites the very example you were trying to show, if the example contains the text that you are replacing.

 

  So, I want to replace class="code_df_frag" with class="code_frag".

I then save this. This file is saved as XML, so when I run the replacement, the script will also replace what I have typed here if it matches the text to be replaced. In this rare case one would need to do some hand coding. If not, my little script would leave the line above looking like:

So, I want to replace class="code_frag" with class="code_frag".
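
The effect is easy to reproduce on one line. This sketch uses a made-up paragraph in which the class appears once as real markup and once as literal text; a plain global substitution cannot tell the two apart.

```shell
#!/bin/sh
# A one-line stand-in for the exported page (hypothetical content): the class
# name appears in the opening tag and again inside the paragraph text.
line='<p class="code_df_frag">replace class="code_df_frag" with class="code_frag"</p>'

# The global substitution rewrites both occurrences, markup and text alike.
changed=$(printf '%s\n' "$line" | sed 's/class="code_df_frag"/class="code_frag"/g')
printf '%s\n' "$changed"
```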

The problem with automated global replace is that in rare cases it will not do what you want. I suppose I could automate with a DOM parser to make sure that I only modify markup and never content. And if I start using inline XML examples here, that might be confusing for a DOM processor. There is probably a good way to put some kind of <span> on the code so that the processor knows this is example markup and treats it as text, not as part of the object model. These are rare cases. Unfortunately, it is the people who write about these things who might actually run into these problems. What we have is a rare but real problem: the text is confused with the markup.

For what I am doing it is easier to use a sed command. The script above doesn't work as written, because the single quotes stop the shell from expanding $2 and $3, so I just wrote a command-line one-liner. This worked fine. And the problem that I mention will hit this file if I run my little command on the exported XHTML version of it.
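
One way to make the earlier change_class.sh idea work is to put the sed expression in double quotes. This is a sketch rather than the one-liner actually used: the function form and the demo file are assumptions, and it only behaves for plain class names that contain no characters special to sed (such as /, &, ., * or backslashes).

```shell
#!/bin/sh
# change_class FILE FROM TO
# Replace class="FROM" with class="TO" throughout FILE, keeping a backup
# copy in FILE.back. Double quotes let the shell expand $2 and $3.
change_class() {
    sed -i.back "s/class=\"$2\"/class=\"$3\"/g" "$1"
}

# Try it on a throwaway file (hypothetical content).
demo=$(mktemp)
printf '%s\n' '<p class="code_df_frag">x</p>' > "$demo"
change_class "$demo" code_df_frag code_frag
cat "$demo"
rm -f "$demo" "$demo.back"
```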

 

xsl

The export functionality uses XSL to output the XHTML file. It would be better for me to modify those stylesheets so that the export produces the output I want, and then I wouldn't have to write any sed scripts at all. Sed, as I pointed out above, does not use the DOM, so it is not easy to tell sed to change one type of token and not another. XSL allows modifications to class definitions without any chance of modifying the same text within the body of the document. That is, if I am writing about the markup, I don't want my conversion scripts changing the body of my text. The DOM and an XSL transform will ensure that this is the case.

I suppose I could transform what I get from the first transform. Maybe it would be best to write a small C program and use the DOM. Or even php.

 

Xerces is for DOM parsing. I have downloaded sample code from an article here:

http://onlamp.com/pub/a/onlamp/2005/09/08/xerces_dom.html?page=4

I needed to install some devel packages to make everything build.

 

OK, that code was written with an older version of Xerces. One of the classes that the example used has been deprecated. I needed to study the code and replace that class with another, and so I learned a little bit about DOMLSSerializer. Changes in idiom are always a little bit confusing. DOM is a strange idiom in the first place. Once you get the hang of it . . . it all makes sense. Fortunately I've used the DOM before, and Xerces as well, so I am able to do this work without too much difficulty. Now I need to modify the example so that it finds the class properties that I want to change and then changes them. Sounds easy. I will copy the example and modify it.

 

Alright, it's a while later and I have crafted a preliminary program in C++ (based upon a demo that I found online) that allows me to change the class name of an element to a new class name. Currently this program is hand coded to look for class names only on <p> elements. However, it will not be hard to modify it so that the element name is also passed in to the program. Or the program could take the element, the attribute, the current value, and the new value, like this:

 

in filename $1 look for $2 elements and find an attribute with name $3 and if it exists and it is $4 then change it to $5
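
For comparison, the same five-argument interface can be roughed out with sed alone. This is only a text-based approximation, not the DOM program: the function name change_attr is hypothetical, and it assumes double-quoted attribute values, an opening tag that fits on one line, and arguments free of regular-expression metacharacters. Anchoring the match to an opening tag of the named element keeps it from touching the same text where it appears outside such a tag.

```shell
#!/bin/sh
# change_attr FILE ELEMENT ATTR FROM TO   (hypothetical helper)
# Rewrites ATTR="FROM" to ATTR="TO", but only inside an opening <ELEMENT ...> tag.
# The group after <ELEMENT optionally skips over earlier attributes in the tag.
change_attr() {
    sed -E -i.back \
        "s/(<$2([[:space:]][^<>]*)?[[:space:]]$3=\")$4\"/\1$5\"/g" "$1"
}

# Try it on a throwaway file (hypothetical content).
demo=$(mktemp)
printf '%s\n' '<p id="n" class="a">text class="a"</p><pre class="a">z</pre>' > "$demo"
change_attr "$demo" p class a b
cat "$demo"
rm -f "$demo" "$demo.back"
```

In this sample only the class on the <p> tag should change; the literal text and the <pre> element are left alone because neither sits inside an opening <p ...> tag.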

 

Well, I may just make one that only works on class to make the code a little bit easier. I want this tool in my arsenal so that I can rocket-launch some new pages. These kinds of tools are useful, but they don't get run without assistance, so it is worth making the code solid and exception safe. Still, the worst thing that can happen when I use this code is that I might wreck a file that is temporary in the first place. I need to make the code that I get from OpenOffice match the code that I use on my website, so I am not making C++ for wide distribution. It is useful to use the DOM for the reasons that I state above in my discussion about sed versus DOM. However, using Xerces-C++ was a review. I might have been done with this much quicker if I had used PHP instead; in PHP I wouldn't have had to mess around with all the string and char * stuff. Basically, xercesc::XMLString is an important part of using the Xerces DOM, and there are three different creatures that we are dealing with:

1. The C++ string class, std::string. We get this kind of string from the command line.

2. C strings, which are char *. Useful for std::cout.

3. XMLCh*, which is the Xerces way of storing strings.

 

In order to make things work, and to be able to debug from the command line, a lot of jockeying of strings was necessary in the code. Basically, I could find how to create an XMLCh* from a char *. I could create a char * from an XMLCh* (which I then need to release when I am done). But I could find no direct transformation from XMLCh* to std::string. So there was a lot of transformation going on in the code.

 

Here is the function that I created:

// This function added by Bill Perilli.
// The rest of this file is from someone listed at the top.
// This is from the Internet and I modded it to do what I need to do.
   void changeClass( const std::string& element_tag, const std::string& class_from, const std::string& class_to )
   {
       // Find all elements named element_tag and put them in a node list.
       xercesc_3_0::DOMNodeList *div_nodes;
       // const XMLCh div="div";
       // const std::string * div ="div";

       // Here is how to make a C string from a std::string:
       //   char * cstr_element_tag = new char [element_tag.size()+1];
       //   strcpy (cstr_element_tag, element_tag.c_str());
       //   delete[] cstr_element_tag; // it must be managed

       // XMLCh* element_tag_str = xercesc_3_0::XMLString::transcode("p");
       XMLCh* element_tag_str = xercesc_3_0::XMLString::transcode(element_tag.c_str());

       // std::cout does not print XMLCh*, so it needs conversion to something
       // that std::cout likes. To make this work we don't really need the
       // C string at all, but we like to have it so we can debug from the
       // command line.
       char * tc_ets = xercesc_3_0::XMLString::transcode(element_tag_str);
       std::cout << tc_ets << " is the element_tag_str." << std::endl;
       // If we could do this and have it work:
       //   std::cout << element_tag_str << std::endl;
       // we would not need to convert to a C string in the first place.
       // xercesc_3_0::XMLString::release must be called or we have a memory leak.
       xercesc_3_0::XMLString::release(&tc_ets);

       div_nodes = finder_.xmlDoc_->getElementsByTagName(element_tag_str);
       xercesc_3_0::XMLString::release(&element_tag_str);
       std::cout << div_nodes->getLength() << std::endl;

       XMLCh* class_str = xercesc_3_0::XMLString::transcode("class");
       XMLCh* class_from_str = xercesc_3_0::XMLString::transcode(class_from.c_str());
       XMLCh* class_to_str = xercesc_3_0::XMLString::transcode(class_to.c_str());

       // getLength() returns XMLSize_t, so the counters use the same unsigned type.
       for( XMLSize_t iElem = 0; iElem < div_nodes->getLength(); iElem++ )
       {
           xercesc_3_0::DOMNamedNodeMap * element_attrs = div_nodes->item(iElem)->getAttributes();
           // Listing these reveals that the nodes are put into alphabetical order.
           // Not what I would have expected.
           if (element_attrs != NULL)  // guard against a null attribute map
           {
//#define VERBOSE_NODE_TEXT
#ifdef VERBOSE_NODE_TEXT
               // Let's look at all of the attributes.
               std::cout << iElem << " will now list attributes of these nodes " << std::endl;
               for (XMLSize_t lcv = 0; lcv < element_attrs->getLength(); lcv++)
               {
                   char * tc_attr = xercesc_3_0::XMLString::transcode(element_attrs->item(lcv)->getTextContent());
                   char * tc_attr_name = xercesc_3_0::XMLString::transcode(element_attrs->item(lcv)->getNodeName());
                   std::cout << "attribute # " << lcv << " " << tc_attr_name << " : " << tc_attr << std::endl;
                   xercesc_3_0::XMLString::release(&tc_attr);
                   xercesc_3_0::XMLString::release(&tc_attr_name);
               }
#endif
               // Is there a class node?
               xercesc_3_0::DOMAttr * attr_node_to_change =
                   dynamic_cast< xercesc_3_0::DOMAttr* > (element_attrs->getNamedItem(class_str));

               // Is this class node the one we are looking for?
               if (attr_node_to_change != NULL)
               {
#ifdef VERBOSE_NODE_TEXT
                   char * attr_val_as_read_str = xercesc_3_0::XMLString::transcode(attr_node_to_change->getValue());
                   std::cout << "we are getting a value of " << attr_val_as_read_str << " from our attribute node." << std::endl;
                   xercesc_3_0::XMLString::release(&attr_val_as_read_str);
#endif
                   // If this element has a class name of class_from, then change
                   // it to class_to; otherwise leave the node alone.
                   if (xercesc_3_0::XMLString::compareString(attr_node_to_change->getValue(), class_from_str) == 0)
                   {
                       // xercesc_3_0::DOMElement::setAttribute (const XMLCh *name, const XMLCh * value);
                       attr_node_to_change->setValue (class_to_str);
                   }
#ifdef VERBOSE_NODE_TEXT
                   attr_val_as_read_str = xercesc_3_0::XMLString::transcode(attr_node_to_change->getValue());
                   std::cout << "now we are getting a value of " << attr_val_as_read_str << " from our attribute node." << std::endl;
                   xercesc_3_0::XMLString::release(&attr_val_as_read_str);
#endif
               }
           }

#ifdef VERBOSE_NODE_TEXT
           // Output the text content of our node.
           char * text_con = xercesc_3_0::XMLString::transcode(div_nodes->item(iElem)->getTextContent());
           std::cout << text_con << std::endl;
           xercesc_3_0::XMLString::release(&text_con);

           // To list all of the attributes of our node.
           // This code does not work. It is an exercise to make it work, but
           // it is doing the same thing we do above in a different way (hopefully).
           //   xercesc_3_0::DOMAttr * attr_node = ( div_nodes->item(iElem)->getAttributes())->getAttributeNode(class_str);
           //   XMLCh* class_value_str = attr_nodes->getValue(class_str);
           //   if (class_value_str)
           //   {
           //       char * cv_cstr = xercesc_3_0::XMLString::transcode(class_value_str);
           //       std::cout << "Our string is " << cv_cstr << std::endl;
           //       xercesc_3_0::XMLString::release(&class_value_str);
           //       xercesc_3_0::XMLString::release(&cv_cstr);
           //       std::cout << "class found YAAA" << std::endl;
           //   }
           //   for(int i = 0 ; i<attrs.getLength() ; i++) {
           //       Attr attribute = (Attr)attrs.item(i);
           //       std::cout << " " + attribute.getName()+" = "+attribute.getValue() << std::endl;
           //   }
#endif
//#undef VERBOSE_NODE_TEXT
           // thisChild = domParent.childNodes.item( iElem );
       }

       xercesc_3_0::XMLString::release(&class_str);
       xercesc_3_0::XMLString::release(&class_to_str);
       xercesc_3_0::XMLString::release(&class_from_str);

       // delete [] div_nodes; don't see delete used like this for DOMNodeList *

       // Modify each of those so that the class property has its name changed
       // from class_from to class_to.
   }  // end of changeClass

 

This sample is here for my own review. Naturally the function will need to be within a class. I snipped the code from the sample and will use it for my own purposes. All I am really using from the original code is the way that the author opens and saves the XML. The rest of the class that he crafted in this demo was not needed for my purposes.

OK, so I made my modifications so that the code is, as I said, not just a change-class tool but a tool for changing any attribute. It does not add attributes. It does not delete attributes. I have crafted a small script that I can now use to modify the documents easily. And now I will save this and test, and then, hopefully, post this about ten minutes from now.

April 4, 2011