Lodahl's blog: LibreOffice now has a built in XML-parser

03 December 2013

LibreOffice now has a built in XML-parser

LibreOffice is using a XML based document format so of cause there is a built in XML parser. But until now it has been quite cumbersome to deal with XML in macros. You need to manually traverse through the entire XML structure like this example (thanks to Andrew Pitonyak):

Function CreateDocumentHandler()
oDocHandler = CreateUnoListener( "DocHandler_", "com.sun.star.xml.sax.XDocumentHandler" )
glLocatorSet = False
CreateDocumentHandler() = oDocHandler
End Function

' Methods of our document handler call these
' global functions.
' These methods look strangely similar to
' a SAX event handler. ;-)
' These global routines are called by the Sax parser
' as it reads in an XML document.
' These subroutines must be named with a prefix that is
' followed by the event name of the com.sun.star.xml.sax.XDocumentHandler interface.

Sub DocHandler_characters( cChars As String )

if xNode = "lipsum" then
cChars= Left(cChars,len(cChars)-1)
if len(cChars)>1 then
cChars= cChars+ Chr$(13)
WriteLoremipsum (cChars, oWrite)
Else oWrite=0
End Sub

Sub DocHandler_ignorableWhitespace( cWhitespace As String )
End Sub

Sub DocHandler_processingInstruction( cTarget As String, cData As String )
End Sub

Sub DocHandler_startDocument()
End Sub

Sub DocHandler_endDocument()
End Sub

Sub DocHandler_startElement( cName As String, oAttributes As com.sun.star.xml.sax.XAttributeList )
xNode = cName
End Sub

Sub DocHandler_endElement( cName As String )
End Sub

Sub DocHandler_setDocumentLocator( oLocator As com.sun.star.xml.sax.XLocator )
' Save the locator object in a global variable.
' The locator object has valuable methods that we can
' call to determine
goLocator = oLocator
glLocatorSet = True
End Sub

This example above is from the extension Lorem Ipsum generator that you can download from here: http://extensions.libreoffice.org/extension-center/magenta-lorem-ipsum-generator

But now its much easier as LibreOffice 4.2 comes with two new spreadsheet functions called WEBSERVICE and FILTERXML. In a macro it is possible to call and use such built in spreadsheet functions even when you are working with text documents.

The example below does pretty much the same as the one above

Sub Main
svc = createUnoService( "com.sun.star.sheet.FunctionAccess" ) 'Create a service to use Calc functions
XML_String = svc.callFunction("WEBSERVICE",array("http://www.lipsum.com/feed/xml?amount=2&what=paras&start=Yes"))
Lipsum = svc.callFunction("FILTERXML", array(XML_String, "/feed/lipsum" ))
Print Lipsum
End Sub

I'm really looking forward play around with these nifty little features in Calc.

1 comment:

Olivier Hallot said...

Hello Leif

Can we write a bit about XML parsing and enhance our LibreOffice LocalHelp with basic instructions on how to make it work?

A feature like this with no help page is a hidden feature...