ActiveXML documentation

Introduction

This document is a report on the development of ActiveXML software. It is supposed to be useful for both: end users and developers.

ActiveXML is based on XML and Web services. An active document is an XML document with embedded service calls to Web services, which enrich the document upon activation. This technology in a peer-to-peer setting relies on the fact that peers communicate solely by means of Web services. The ActiveXML language has a predefined set of Web services for every peer, which enables distributed data management. Still, any peer can also provide other services, which actually causes the network to become a distributed database.

We can distinguish among several kinds of service calls:

Service calls to the Web services that are known to be in the peer-to-peer network. These Web services consist the interface of a specific peer, and information about it can be meaningfully used, for example, by optimizers. An example of such a service is a QUERY service
Service calls to unknown Web services. These can be any Web services on the Web. Though a service call behaves similarly (i.e. it enriches the active document) the information about the Web service is usually unavailable and cannot be used by the distributed system. For example, it is not trivial to optimize a call to some Amazon.com service
Continuous service calls. These are the calls to Web services that know how to stream back data, and notify about the end-of-stream. Basically, it is a matter of the communication protocol between a service call and the Web service. For instance, current implementation ships information about a service call (with the endpoint where to reply) to the Web service as a SOAP header
Simple service calls. These are the calls to one-shot Web services. Such Web services do not stream back the data asynchronously, but rather return it directly (if there is any) within the response message

Calling a Web service in any way is not a problem these days. There are libraries that allow to call a Web service synchronously, asynchronously, one-way or two-way. However, the promise of ActiveXML is to be able to call Web services declaratively. For instance, instead of creating software for calling some SOAP Web service, one could simply create an active document with a service call to that Web service and pass it to the engine to evaluate. Moreover, instead of creating software for piping Web services one could create an active document where one service call has another service call as a parameter. This shows that the correct usage of ActiveXML lies in creating, transforming, sending, receiving active documents and materializing service calls inside them.

In the following sections the current architecture of ActiveXML will be presented. The details about installation and usage will be provided. The developers will be given a more indepth information about the specific implementation points and organization of the code in the repository.

Architecture

In this section a high level architecture of ActiveXML is presented.

Organization of Components

There are 4 software components that make up an ActiveXML peer:

Web server. We use Tomcat 5.5
XML database. Currently we use eXist
Axis2 Web service engine
ActiveXML service calls execution engine

Axis2 and ActiveXML make up a Web application that is deployed in the Tomcat 5.5 Web server. Then the Web application is configured to connect to eXist database where each peer has its own repository of documents (a separate collection).

Actually the components can be organized in several ways. In the simplest (and most expensive) case everything is in the Web container (Tomcat) as Web applications. Hence the peer has its own Web server and its own database instance. It is, nevertheless, possible that several peers share one Web server and the same database instance. Possible configurations are depicted bellow (see how to create new peers). As the picture shows there can be several peers residing in one Web server while the database can be inside or outside the Web server. The peer-to-peer network is a network of such possible configurations where, as already mentioned, the peers communicate only by the means of Web services.

Deeper Look into a Peer

First of all, one can use arbitrary XML database for peer's repository. However, there is still no standard way for accessing XML databases, querying or updating them. Relational databases have JDBC, which is a standard way to connect to SQL data sources and post queries. XQJ, the analog of JDBC for XML datasources, is on the road, as well as standard for XML updates.

Taking such situation into account the decision to create our own API is reasonable. Therefore above XML database there is a DB access API layer (see the picture). This API can be seen as a very modest XQJ with updates using XUpdate. We currently have an implementation of this API for eXist, other databases will need their own implementations (drivers). It is important that other databases support XUpdate since the core ActiveXML services are using this language for updating active documents. Worth mentioning that it is relatively easy to write a driver for XML databases that support XML:DB interface, like eXist does.

Above the DB access API is the document management layer. A peer has a bunch of documents in its repository. Those documents may have service calls inside them. This roughly describes what constitutes the model part of the software (in MVC architecture). The controller, on the other hand, is a piece of software that manages the model. It sits in between the repository and ActiveXML clients (seet the picture) and updates the state of the model. For instance, document management layer is responsible for marking service calls as active/terminated, also for executing a scheduled service call, etc. In previous version of ActiveXML this layer also kept all the repository in memory as DOM objects. Now it only keeps references to the interesting parts of active documents using XPath and XQuery.

A peer is visible to the outside world only by the exposed interface, i.e. the Web services. This can be treated as the view part in MVC architecture. So when someone pushes a button (invokes peer's Web service), the state of the peer changes, and the model (some document) is updated by the controller.

Web services in ActiveXML are facilitated by Axis2. Its beautiful design allows plugging-in new Web services quite easily. This is essential as the functionality of a peers is exposed only by Web services. For instance, if peer is able to answer a query over streams it exposes this functionality as a Web service (let's call it GenericQueryService), so that other peers could call this Web service from within an active document. There are several Web services that are inherent from ActiveXML (shown as buttons in the picture):

Three algebra Web services, that enable distributed data management:
1. ReceiveOperator [shown as RECEIVE], through this Web service results come to a service call. Roughly speaking, when a Web service is invoked from within an active document, the response is redirected to this Web service. It is responsible for preparing the incoming result to be correctly merged into the document. For example, it assigns the timestamp and origin attributes, ensures that the ids of incoming and existing service calls don't overlap, etc.
2. SendOperator [shown as SEND], this Web service is sending data to an address specified. More specifically, it calls the respective ReceiveOperator and passes data to it. Normally this Web service is not used, as the ReceiveOperator can be called directly, therefore any Web service that would call RECEIVE can be titled as SEND. Although SendOperator usually is not invoked, service calls to this Web service are indispensable since they bare special semantics
3. NewNodeOperator [shown as NEW NODE], this is a utility Web service that allows to install (active) data in some peer's repository. It can install new document or append data as children for a specific identifiable node. The semantics of this Web service impose that the newly installed node is evaluated
MaterializationService [shown as MATERIALIZE] has the following operations:
1. evaluate, will materialize the document in a depth first manner. This means that when a service call has other service calls as parameters, the parameters will be materialized earlier. The ordering of execution can be explicitly overridden, which makes the active document look like a graph
2. evaluateNode, behaves much like evaluate but materializes the document starting at a specified node
3. activate, will activate a specified service call and will follow all the explicitly defined constraints. This operation will not follow a default (depth first) execution order

The above listed basic interface can be extended and is still under discussion. For example, do we need a Web service like InstallDocument, or should we add an input parameter to NewNodeOperator that would install a document without evaluating it; do we need an evaluateData operation for MaterializationService that would accept as input active document and evaluate it? All the other Web services are extensions of peer's functionality and can simply be considered as plug-ins. Beside the core ones currently there are following services available:

GenericQueryService, this Web service is a relatively simple implementation of stream processing engine. As an input it gets a query declaration and parameter streams. It can also be used to answer XQuery queries over the database
DummyStreamService is useful for testing. When called it streams back result of a specified query
OptimaxService, distributed query optimizer. Given an active document the Web service transforms it into a distributed plan. The documents that have service calls to GenericQueryService are candidates for optimization with this Web service

For more details on peer's Web services find examples and schemas presented later.

What is a Stream?

Now some clarification about streams. A stream is a channel of data between a service call and the Web service it is calling. The important thing is that data keeps arriving to the service call continuously (and asynchronously) until the end of stream. The procedure is as follows (see the picture):

A request comes to materialize some service call
The service call invokes the Web service and together with the message attaches a header with information about itself
The Web service repsonds with a message marked as STREAM (if it has what to stream and if it recognizes the header)
The service call stays active if the Web service response was marked as STREAM
Then it depends on the Web service when it sends data to the requesting service call.

In the picture above SEND is shown as a separate service, still, the streaming service can directly call RECEIVE. It is then said as behaving like SEND. So to put it simple, a stream is a channel between SEND and RECEIVE. A streaming service is responsible for SEND whereas a service call is the RECEIVE'er.

Worth noticing that a stream can be seen as a communication protocol, nothing else. Current implementation works as described (using markers), however, other implementations or extensions are apropriate. For example, it looks natural to use WS-Addressing for addressing the service call. Also a mechanism of handling sessions can be fruitfully utilized, because a service call is simply a client for a specific Web service.

Actually the concept of streams also captures non-streaming communication between service calls and Web services. In this case the stream consists of one item that is sent directly as a response to the service call and is implicitly marked with END_OF_STREAM. Therefore, materializing a service call ends up in 1) service call termination if the Web service responds with the END_OF_STREAM (it is implicitly assumed), or 2) it stays active if the Web service responds with the STREAM marker and terminates only when its RECEIVE detects the END_OF_STREAM.

A General Picture

To put things together what we have is the architecture shown in the picture. First, there is a peer-to-peer network of ActiveXML peers. The peers are extensible by plugged-in Web services. Note that Web services are only the interface to the underlying software, which can vary from a very complex system to a simple function. Although the plug-in architecture of a peer resembles much an ESB (Enterprise Service Bus) its purpose is quite different from just orchestrating Web services. In particular, the plugged-in systems are supposed to make use of ActiveXML. This means that the exposed Web services:

can stream data. For instance, it can send a message on some event ("alerter")
can try to materialize some service calls before responding to produce more results ("lazy")
can transform the requested query into an optimized one in order to bring results from the whole peer-to-peer network quicker ("optimax")
can evaluate an active document in a special way
can serve as an integrator of results from several Web services, i.e. by specifying several service calls inside an active document and doing a query over it
etc.

In essence using ActiveXML means 1) using special syntax to declaratively specify calls to Web services and 2) using the engine for materializing those service calls. Hence ideally service consumers would write active documents and materialize service calls inside them. While service providers would develop systems and expose them as Web services potentially internally also using ActiveXML.

Besides the peer-to-peer network, XML and Web services allow ActiveXML to enter the Web in general. As it is shown in the picture RECEIVE can get data from everywhere, that is, an active document can contain service calls to any Web service, be it on another peer or somewhere on the Web. Similarly it is not forbidden for external systems to call Web services inside the peer-to-peer network if it makes sense to them (because the returned data may be active). Moreover, since streaming is just a communication protocol nothing stops Web services outside the peer-to-peer network to stream data back to a requesting service call through its RECEIVE.

User Guide

In this section you will find information about configuration, creation of active documents and materialization of service calls. It will be shown what is visible to the user and what is happening behind the scenes.

Installation and Configuration

It is quite easy to start using ActiveXML when you know about what it is made of. The distribution is created in such a way that it includes everything that is needed for the start (except Java Runtime Environment), namely the 4 components: Tomcat, eXist, Axis2 and ActiveXML (see the compenents representation).

First thing to know is that the distribution is simply the Tomcat Web server with several Web applications included, specifically: eXist and MyPeer. The latter is actually integrated into Axis2 Web application in order to make use of the Web service repository management.

From the file structure you can easily see what was changed in Tomcat:

the root folder was renamed into 'ActiveXML'
unneeded Web applications under webapps/ were removed, added 'exist' and 'MyPeer'
some libraries were put to the shared classpath under shared/. The most important ones are Axis2 libraries and database libraries

The following configuration changes were performed for Tomcat:

in conf/catalina.properties paths to new shared folders were added (shared/db/*.jar, shared/axis2/*.jar)
in conf/server.xml ports were changed into 6969 (for HTTP requests) and 6960 (for Web server shutdown request)

So generally not much changed in Tomcat, therefore for things like changing a port, starting, stopping or anyhow configuring the Web server please follow Tomcat's documentation.

Perhaps it is also appropriate consider another Web container instead of Tomcat, for instance, Jetty to make the distribution smaller.

Running

Only 3 things are needed to know:

JAVA_HOME environment variable must be set. It must point to the JDK or JRE root directory. JRE_HOME variable name is also fine (e.g. JRE_HOME=C:\Program Files\Java\jre1.5.0_09). Usually for this purpose one can use set command line tool. This point implies that you have a JVM on your computer.
use bin/startup.bat or bin/startup.sh to start the Web server
use bin/shutdown.bat or bin/shutdown.sh to stop the Web server

That's it. 'ActiveXML' folder can be coppied to any writable media like the hard disc or a USB key and you can consider to have ActiveXML installed.

NOTE: on Linux be sure to make bin/*.sh files executable
NOTE: if you have another instance of Tomcat running on your machine, you will need additional steps, for example, setting the CATALINA_BASE variable.

Web Application: eXist [ ActiveXML/webapps/exist ]

In the full distribution eXist is included as downloaded (*.war=~23M, unzipped ~50M). Hence using an external database instance could considerably reduce the weight of the distribution and not only in size. For instance, think that you would want to cary the ActiveXML peer on a PDA or a USB key. Therefore one should consider the (3) type of component organization.

You can browse eXist database at http://localhost:6969/exist/admin/admin.xql (login: admin/exist). Alternatively you can try the Java interface at http://demo.exist-db.org/webstart/exist.jnlp which is, however, not always working.

Web Application: MyPeer [ ActiveXML/webapps/MyPeer ]

As it was mentioned ActiveXML is merged with Axis2 Web application. This choice is made in order to utilize the internals of Axis2 when dealing with Web services, which is crucial for ActiveXML. For deeper information please read Axis2 documentation.

For now there are 4 important things to know about Axis2 in ActiveXML distribution:

MyPeer is an extension of axis2.war
it has a Web services repository by default configured under WEB-INF/
- conf/axis2.xml, the repository configuration file. Here one can specify in/out phases in order to engage modules for a message flow, also one can configure transports, ports and everything else related to Web service mechanics
- modules/*.mar, a module is a collection of handlers that are used to process an incoming/outgoing message. The modules are pluggable and engageable during a specified phase of a message flow. *.mar (module archive) is simply a zipped collection of resources, like classfiles, libraries and module.xml. There are 2 tiny modules that belong to ActiveXML:
  - ServiceCallWrapper.mar, responsible for creating a valid message for the Web service refered in the service call. For instance, a user does not write a SOAP messages, it is left to this module. It also attaches information about the service call in headers
  - ActiveXMLServiceEntry.mar, makes the ActiveXML context available for Web services. Without it the engine nor repository would not be easily accessible
- services/*.aar, *.aar (axis archive) is a zipped collection of resources that consist a Web service. It can also be kept unzipped (the same holds for modules) like it is in case of services/OptimaxService/. So a Web service is indeed an independent piece of software, that has its own implementation, libraries and configuration, which enables pluggable architecture of ActiveXML
it has a user interface for managing Web services under axis2-web/; normally it will be accessible at http://localhost:6969/MyPeer/axis2-web (login: admin/axis2). Consult it for service endpoint, WSDL, XSD and other information about web services of the peer
all the Axis2 related libraries (from WEB-INF/lib/) are shared

The important files related only to ActiveXML are:

WEB-INF/lib/axml.jar, this is the engine
WEB-INF/web.xml, this is the standard Web application configuration file, it mostly has specification for servlets (both Axis2 and ActiveXML)
WEB-INF/classes/log4j.xml is the configuration of the logging system. The logs can be found under ActiveXML/logs/ together with standard Tomcat logs
META-INF/context.xml contains context specific configuration. Currently it only has the connection specification to the XML database. This connection is then accessed using JNDI (the procedure is kept to be very similar to connecting to SQL data sources). Note that other Web containers do not necessarilly have this file, so the JNDI resource must be specified elsewhere

index.jsp, axml-web/ contains the GUI for the user that wants to evaluate a document, browse the database, etc. However, it is very modest. This is due to the fact that it is still not clear what the user interface should look like. It is clear that one thing is the administration tools, still there also should be user tools. One of possible solutions could resemble the one proposed for Web mashups.
docs/ contains this documentation with some additional resources
examples/ have a collection of example active documents that are a good start to playing with the peer
xsd/*.xsd are the XML Schema files that declare the types known to ActiveXML engine. For instance, axml.xsd declares the grammar for service calls. The schemas can be used for validating active documents. JAXB is used for doing marshalling from XML to Java and vice versa

As one could guess to open a simple user interface of MyPeer the following path should be used: http://localhost:6969/MyPeer (after starting the Web server, of course).

context.xml, pay attention when using another Web container

logs

Shared Libraries [ ActiveXML/shared ]

Beside MyPeer there can be more peers configured to live in the same Web container (and possibly using the same database instance). Therefore it makes sense to share the heaviest libraries among all of them.

ActiveXML/shared/axis2/, ActiveXML/shared/classes/ contain everything that initially could be found under axis2.war/WEB-INF/lib/ and axis2.war/WEB-INF/classes/ (~17M when unzipped)
ActiveXML/shared/db/ contains libraries that enable uniform access to an XML database, i.e.: database access API and the implementations for a certain providers
ActiveXML/shared/lib/ keeps anything related but not compulsary to ActiveXML engine. For instance, it can contain libraries related to GUI or for some Web services

Creating a New Peer

A peer consists of a collection of (active) documents and a set of Web services. In the installation and configuration section it was shown, how the pieces are put together to have MyPeer up and running. If one wants to add more peers to his/her Web container the procedure is rather easy (can even be automatized):

make sure the Web server is stopped (use ActiveXML/bin/shutdown.*)
copy MyPeer inside ActiveXML/webapps with a given name (e.g. AnotherPeer)
change the following resources:
- ActiveXML/MyPeer/META-INF/context.xml, it was mentioned that here connection to the database is configured. In case of eXist it looks as follows:
  <Context> <Resource name="axml/repository" factory="fr.inria.gemo.axml.db.DataSourceFactory" type="fr.inria.gemo.axml.db.IDataSource" driverClassName="fr.inria.gemo.axml.db.exist.DataSourceImpl" url="xmldb:exist://localhost:6969/exist/xmlrpc/db/MyPeer" userName="admin" password="" /> </Context>
  It is enough to change the url of peer's repository into: url="xmldb:exist://localhost:6969/exist/xmlrpc/db/AnotherPeer". What actually happens is that MyPeer and AnotherPeer will have their own document repositories (collections) on the same instance of eXist (see the picture)
- ActiveXML/MyPeer/WEB-INF/classes/log4j.xml, replace all occurances of 'MyPeer' into 'AnotherPeer' in order to get separate log files for AnotherPeer
- ActiveXML/MyPeer/WEB-INF/web.xml, replace
  <display-name>MyPeer</display-name> into <display-name>AnotherPeer</display-name>
  because currently it is the only peer identification method. Usually in a peer-to-peer environment peers have unique ids. Although it is not implemented, normally a peer should obtain an id when it first joins the network. This is crucial for catalogging available Web services in the network
start the Web server (use ActiveXML/bin/startup.*). Now AnotherPeer should be accessible at http://localhost:6969/AnotherPeer

What is important to understand that after creation of AnotherPeer it lives totally in its own context: it has its own repository of documents, and its own repository of Web services.

Service Calls and Active Documents

ActiveXML documents (or simply active documents) are XML documents with embedded service calls. Embedding a service call is just a matter of following the syntax developed for service calls so that the ActiveXML documents engine would be able to recognize them and apply semantics. Without an engine active documents are normal XML documents. The syntax for service calls is given in axml.xsd. For example, a simple active document looks like this:

BEFORE:

As shown in the example, service call is an element in a special namespace and it has the following features:

id that, formally speaking, represents current nodeID (current peerID and current docID are the peer and document where this example lies in the repository) of the service call node. The id must be unique within the document, therefore incoming data may require fixing to conform to this requirement
return statement that specifies what to do with results. In the example it is shown that the results should be appended. One can also specify other things, e.g., for how long the results are valid.
the specification of how to reach the Web service. In the example it is shown that it will be called using a SOAP message with the provided payload (the root normally corresponds to the operation name). Besides one can call a Web service REST style, which roughly corresponds to submitting an XForm via GET or POST. Note that the payload can contain active data!

Beside the listed features service call can be setup to execute after other service call(s). It is also possible to impose periodical execution, however, this is not advised, because this feature still needs much thinking and may later be removed because it is very expensive.

AFTER:

<example xmlns:axml="http://futurs.inria.fr/gemo/axml/"> <axml:sc axml:id="version" activated="2007-10-22T19:17:33.921+02:00"> <axml:activation status="TERMINATED"/> <axml:return> <axml:append/> </axml:return> <axml:ws-soap endpoint="http://localhost:6969/MyPeer/services/Version"> <v:getVersion xmlns:v="unknown"/> </axml:ws-soap> </axml:sc> <ns:return xmlns:ns="http://axisversion.sample" axml:origin="version" axml:timestamp="2007-10-22T19:17:35.484+02:00"> Hello I am Axis2 version service , My version is 1.3 </ns:return> </example>

After materializing the service call in the example the document changed in the following way:

a result arrived. It was marked with the origin and timestamp. Note that the origin refers to the id of service call and this reference should not be broken when data is traveling around the network.
the service call was marked as TERMINATED and was assigned the activation timestamp

For information on where to put a document, how to activate a service call, evaluate the whole document, etc. please see the demo. Worth mentioning that it is done by calling peer's Web services.

Internals of Service Call Materialization[TODO]

Demo [TODO]

Interface of a Peer [TODO]

Developer Guide

Developers that want to dive into the code and better understand the architecture of ActiveXML should find useful information here. Before going into details it is worth mentioning that it would be a good idea to transform the project into a Maven2 project which would easen the way dependencies are downloaded and the project is built.

Organization of Sources

The project is developed using Eclipse, therefore it is to start contributing to the code one needs to checkout the whole project from the CVS(or SVN) repository and configure the used libraries. Currently (2007-11-27) the new version of ActiveXML is branched (almost forked) into the branch 'eXist' and also tagged as 'V2'. In other words to get the latest changes use the branch 'eXist' and to get the last working version use the tag 'V2'.

ActiveXML depends on the following libraries:

AXIS2 - contains all the .jar files from axis2.war/WEB-INF/lib/. As it was mentioned ActiveXML is using Axis2 internally to realize the server side
EXIST - contains three .jar files (exist.jar, xmldb.jar, xmlrpc-1.2-patched.jar). These contain the classes needed to implement a driver for the active documents repository using eXist XML database
GWT - contains gwt-dev-windows.jar (for Windows), gwt-user.jar. These libraries are used for creating AJAX Web application (servlets and GUI) using Google Web Toolkit. It is of course optional to do it this way, and the dependency is not very tight.
OPTIMAX - optimax.jar. Actually the dependency should be other way, i.e. Optimax should depend on ActiveXML because it is a plug-in. However, since it was developed in a hurry now Optimax is split into two parts, this should be fixed the sooner the better

When the code is downloaded, libraries are configured and finally the project compiles in Eclipse without errors it is fine to analyse the sources deeper. The code (end the whole software) can be divided into the following parts:

core - it is the implementation of the ActiveXML engine. It does not (and should not) depend on other parts. Although it is not a lot of code still it is very painful to change it.
- src contains all the interfaces that represent the general concepts of ActiveXML, such as a service call, materialization, merge function, result, etc. This is the most sensitive area because here the interfaces are defined. Changing an interface means possibly breaking the contract causing compatibility issues.
- impl is an implementation of what's inside src. The most important aspect of the implementation is a heavy usage of XQuery: in order to avoid loading all the documents into memory tiny skeleton objects are created; when data is needed a query is posted to the repository. This might become a bottleneck at some point due to excessive querying and parsing, therefore some careful investigation is needed.
- peer realizes the concept of a peer that has documents and services
- jaxb contains the generated code with some meta information. The code is generated out of ActiveXML related .xsd files using ActiveXML/binding.xml (it is an Ant file with tasks).
components - these are loosly coupled components that could have different implementations.
- components/web component is the Web application that helps to interact with the peer, in other words this is a GUI
- components/db component contains the DB access API and eXist driver
services - are the ActiveXML Web services (plug-ins). They are deployed into Axis2 repository
modules - are the collections of handlers (in Axis2 terms)
- modules/sc_wrapper is responsible for dealing with the outgoing message when a service call is activated. Currently the main purpose of it is to enrich the SOAP message with information headers about which service call is being called
- modules/svc_entry deals with an incoming message. Its main purpose is to initialize the ActiveXML context so that plug-ins (Web services) are able to reach information about documents and services available on the peer

Every listed item have associated build.xml in a respective folder. There are separate output folders (where the .class files go) for every component, service, module, and the core. So development goes like this:

Decide where the source should go. In most of the cases a new component or a service will need to be developed
Create a folder in respective place and make it as source (Eclipse: [right click]->Build Path...->Use as Source Folder)
Configure the output folder for the new sources (Eclipse: [right click]->Build Path...->Configure Output Folder). Normally it should be a bin folder inside
Update the respective build.xml to get a .jar, .aar or .mar out of the sources (see how to build)

Inside ActiveXML/resources there are all the .xsd files that define syntax for different things. The schemas are quite documented so they will not be covered here:

axml.xsd defines syntax for service calls
axml_related.xsd defines some types that are used by the core of the engine. For instance communication headers for SOAP messages
algebra.xsd defines syntax of SEND, RECEIVE, NEWNODE algebra services
materialization.xsd defines syntax for materialization services, e.g. activation of a service call, evaluation of a document
query.xsd defines syntax for query services, e.g. generic query, declarative query

Since the most difficult is axml.xsd it is advisable to use this schema when creating active documents from the scratch. For instance, Eclipse shows the autocomplete and documentation for elements and types when the the .xsd is included using xsi:schemaLocation attribute. Any other XML editing tool should also provide that.

Building and Creating a Distribution

When everything compiles in Eclipse still specific build procedure must be held in order to properly deploy ActiveXML. To understand the building process it is enough to go through all the build.xml files in the project. Before doing any builds or bindings ActiveXML/build.properties must be updated. Bellow the comments on build files is given:

ActiveXML/build.xml used for building everything into .jar, .aar, .mar and web components. By default the resulting resources are put under ActiveXML/build/. This build file is actually calling targets from several other build files:
- ActiveXML/services/build.xml used for creating Axis archives of the services that are being developed
- ActiveXML/modules/build.xml used for creating Axis module archives of the modules (collections of handlers) that are being developed
- ActiveXML/components/build.xml used for building other independent components. Moreover this build file creates actual peer webapps. It puts all the .jar, .aar, .mar files into right places of a webapp and this is how MyPeer is born
ActiveXML/distribution/build.xml used for creating the whole distribution that can be zipped and shipped for the end users. It transforms Tomcat into ActiveXML distribution by including certain built resources from ActiveXML/build/ and eXist into right places (see installation and configuration for details)

Development scripts[TODO]

For development use ant file buildDev.xml and buildDev.properties.

Static Architecture [TODO]

Functionality of ActiveXML

An active xml document contains service calls. They could be calls to any web service or to specific ActiveXML web services (MaterializationService, OptimaxService, GenericQueryService, DummyStreamService, algebra service: ReceiveOperator, SendOperator, NewNodeOperator). We would like to explain in this section the functionality of each web service, and the chain of execution from when a service call is activated until the results are received and appended to the active document.
We identify two parties:

Client: It contains an active axml document with service calls. It represents the place where a service call(sc) is materialized (the corresponding web service is called) and where the results are collected and saved into the document.
Server: The place where service calls are executed;

AXML syntax

It explains the syntax of xml schema files (.xsd) described in section Organization of sources

axml.xsd
The activation status of a service call has multiple possible value:

ENABLED is always set by default, when you don't specify anything it is there.
ACTIVATED:
FAILED is set when the service call execution fails, for instance, when the materializer catches an exception while calling the Web service.
DISABLED is used to disable a service call !!!! TODO

Materializers' description

A service call is materialized (call Web service, consume result) by materializers. A service call (sc) one can set the materializer class. This class is instantiated when creating a service call object. All the materializer related classes are in fr.inria.gemo.axml.model.sc.materialization:

DefaultOutOnlyMaterializer is used for simple service calls that do not have <return>, or have <return><void></return>
DefaultOutInMaterializer is used for simple service calls that have merge function specified inside <return>.
MaterializerForRECEIVE is assigned for sc that are RECEIVE. This materializer does not call any Web service but simply makes the service call active
MaterializerForSEND is assigned to SEND service calls. According to SEND semantics it sends everything underneath it until the end of stream. This means, that it is listening to the service calls bellow it and on new result it this materializer calls respective RECEIVE Web service.
MaterializerForContinuousQuery behaves similarly to SEND, i.e. it is watching the service calls underneath it, but instead of calling a RECEIVE on the server, it calls the respective GenericQueryService.

MaterializerUsingConstraints and DeepTreeMaterializerUsingConstraints is a helper for MaterializationService. These materializers are not designed for service calls.

Q: What happens if the materializer is not specified in the activexml document?

There is a IMaterializer interface which has one method: materialize(). There is a class implementing this interface which is able to deal with a batch of materializers, i.e. given a batch of IMaterializer instances it calls in parallel their materialize() method, it returns when all the threads join. This way you can combine simple and batched materializers to have any order of materialization.
There is also another flavor of batch materializers: materializer with constraints. It keeps a buffer of materializers which constrain its execution with afterActivated and afterTerminated, so every time something is activated or terminated it tries to execute or waits till the constraints are satisfied.
The materializers with constraints are used by MaterializationService. When you want to evaluate a document such a materializer is created by collecting the afterActivated and afterTerminated constraints, then materialize() is called and all the service calls are materialized with the given constraints.
So to materialize service calls you simply create materializer objects for them. The materializers can extend functionality of a service call, as it is with SEND and RECEIVE (therefore you need to specify the special materializers for them), but in general there is a default materializer which looks at the service call, creates the SOAP message, calls the Web service and deals with the result.

ActiveXML Web services' implementation

GenericQueryService

Activation: Chain of execution

Creating a Plug-In Web Service [TODO]

How to Develop Without Interfering With the Existing Code

ActiveXML currently requires two kinds of development: 1) debugging or optimization and 2) plug-ins or applications. The former kind implies changes in the current code, however, one should perfectly know what she's doing. The latter is the final goal of how code should be contributed to ActiveXML. Good systems are always pluggable platforms and ActiveXML must become one of those because of its SOA nature.

As it was mentioned earlier plug-ins in ActiveXML correspond to Web services. Every Web service can utilize the internals of the ActiveXML (activation of service calls) and bring new functionality. For example, Optimax is heavily using the engine for optimizing active documents in a distributed environment (please refer to Optimax documentation for details). Another example, could be a catalog service that would crawl the distributed network and catalog all the Web services that it finds.

It is a good place here to remind that ActiveXML also is extensible on a service call level. Axis2 allows to plug modules that would intercept a traveling SOAP message. This is one way of decorating the behavior of a Web service (and therefore a service call). On the other hand it is not always appropriate to use Axis2 handlers because the chain of execution design pattern is not always suitable which is the case, for instance, in algebra services. Therefore it is possible to attach a specific implementation of the AbstractPrimitiveMaterializer that would capture the logic of materialization of a service call. For instance, MaterializerForSEND, MaterializerForReceive, MaterializerForContinuousQuery use this feature.

ActiveXML documentation [under development]

Table of Contents

Referenced Resources