SOAP over binary XML
Abstract:
As one of the core components of Web service technologies,
SOAP has evolved into the most widely supported messaging format
and protocol for use with XML Web services.
SOAP is generally bound to HTTP,
over which the SOAP message, encoded as a textual XML document,
is sent between client and server.
However, XML processing can be slow and memory-intensive,
especially for scientific data.
Consequently, SOAP has been regarded as a poorly performing
messaging protocol for scientific applications.
Binary XML offers an alternative: a more efficient
encoding of XML, and thus of SOAP messages.
By having SOAP use the binary XML encoding,
we can obtain high-performance Web services
while sacrificing little of the interoperability brought by XML and SOAP.
In this paper we present a generic implementation of a SOAP message system
that supports both textual XML and binary XML as
the encoding of the SOAP message.
We show that its performance is comparable to, and can even rival,
that of the common practice of handling control and data
separately in most scientific applications.
The internal data model is based on the XML infoset,
but has been augmented with atomic, typed values.
This allows our API to represent numbers in their native,
machine form,
rather than as a character string.
Our API is DOM-like,
but more closely follows the XML infoset.
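To illustrate what keeping numbers in their native machine form can mean, here is a minimal C++17 sketch. It is not the API described in this paper: the names AtomicValue, LeafElement, encode_binary, decode_double and encode_text are hypothetical, and the byte layout is only an assumption chosen for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical atomic value: character data, or a number kept in machine form.
using AtomicValue = std::variant<std::string, std::int32_t, double>;

// Hypothetical leaf of an infoset-like tree: an element name plus one typed value.
struct LeafElement {
    std::string name;
    AtomicValue value;
};

// Binary path: copy the value's bytes without any text conversion.
// Host byte order is used for brevity; a real format would fix the byte order.
std::vector<unsigned char> encode_binary(const AtomicValue& v) {
    return std::visit([](const auto& x) {
        using T = std::decay_t<decltype(x)>;
        std::vector<unsigned char> out;
        if constexpr (std::is_same_v<T, std::string>) {
            out.assign(x.begin(), x.end());
        } else {
            out.resize(sizeof(T));
            std::memcpy(out.data(), &x, sizeof(T));
        }
        return out;
    }, v);
}

// Read a double back from its binary form; no parsing is involved.
double decode_double(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() == sizeof(double));
    double d = 0.0;
    std::memcpy(&d, bytes.data(), sizeof(double));
    return d;
}

// Text path, for contrast: textual XML has to render the number as characters.
std::string encode_text(const AtomicValue& v) {
    return std::visit([](const auto& x) -> std::string {
        using T = std::decay_t<decltype(x)>;
        if constexpr (std::is_same_v<T, std::string>) {
            return x;
        } else {
            return std::to_string(x);
        }
    }, v);
}
```

In this sketch a double always occupies sizeof(double) bytes and round-trips bit for bit, whereas the text path must format the number on output and parse it again on input.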
Concepts:
| Message stream type | X::MessageStream | the type of the stream of messages |
| Server connection type | X::ServerConnection | the connection type on the server side |
| Client connection type | X::ClientConnection | the connection type on the client side |
| Server singleton type | X::ServerSingleton | the singleton type that represents the server instance |
| send the SOAP request to the server | x.send_request(ch, env) | ch is of type X::ClientChannel; env is the SOAP envelope to be sent via ch |
| receive the SOAP response from the server | SoapEnvelope x.receive_response(ch) | ch is of type X::ClientChannel, the channel from which the response is received; the return value is the envelope of the SOAP response |
| receive the SOAP request from the client | SoapEnvelope x.receive_reqeust(ch) | ch is of type X::ServerChannel, the channel from which the request is received; the return value is the envelope of the SOAP request |
| send the SOAP response back to the client | x.send_response(ch, env) | ch is of type X::ServerChannel, representing the channel on the server side; env is the envelope of the SOAP response message |
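As a rough illustration of how a type could model the four operations in the table, the following sketch uses an in-memory loopback instead of a real transport. Everything in it is an assumption made for the example: LoopbackBinding, Wire, round_trip and the one-field SoapEnvelope are hypothetical stand-ins, and the server-side receive operation is spelled receive_request here (the table spells it receive_reqeust).

```cpp
#include <cassert>
#include <deque>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-in for the envelope type; the real SoapEnvelope is not shown in the text.
struct SoapEnvelope {
    std::string body;
};

// Hypothetical binding: requests and responses travel through in-memory queues.
struct LoopbackBinding {
    struct Wire {
        std::deque<SoapEnvelope> requests;
        std::deque<SoapEnvelope> responses;
    };
    struct ClientChannel { Wire* wire; };
    struct ServerChannel { Wire* wire; };

    void send_request(ClientChannel& ch, const SoapEnvelope& env) {
        assert(ch.wire != nullptr);
        ch.wire->requests.push_back(env);
    }
    SoapEnvelope receive_request(ServerChannel& ch) {
        assert(ch.wire != nullptr);
        return pop(ch.wire->requests);
    }
    void send_response(ServerChannel& ch, const SoapEnvelope& env) {
        assert(ch.wire != nullptr);
        ch.wire->responses.push_back(env);
    }
    SoapEnvelope receive_response(ClientChannel& ch) {
        assert(ch.wire != nullptr);
        return pop(ch.wire->responses);
    }

private:
    // Take the oldest queued envelope, or fail if nothing has been sent yet.
    static SoapEnvelope pop(std::deque<SoapEnvelope>& q) {
        if (q.empty()) throw std::runtime_error("no message available");
        SoapEnvelope env = std::move(q.front());
        q.pop_front();
        return env;
    }
};

// Generic code written only against the four operations, so any binding
// that provides them (textual, binary, loopback) can be plugged in.
template <class Binding>
SoapEnvelope round_trip(Binding& x,
                        typename Binding::ClientChannel& client,
                        typename Binding::ServerChannel& server,
                        const SoapEnvelope& request) {
    x.send_request(client, request);
    SoapEnvelope received = x.receive_request(server);
    x.send_response(server, SoapEnvelope{"echo: " + received.body});
    return x.receive_response(client);
}
```

The point of the sketch is that round_trip never names a concrete transport or encoding; swapping the binding type changes the wire format without touching the calling code.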
Figure: serialization size
The BXSA file and the netCDF file are almost the same size.
The client machine is bleu;
the server machine is brick.
Two solutions are compared:
- BXSA:
The client sends the request and the data in one SOAP request,
which is encoded in a binary format using BXSA.
The encoded binary data is sent to the server via XBS, a raw binary transport protocol.
The server gets the request and verifies the request and the data.
If everything is OK, the server sends back a SOAP response to indicate the result.
- Mixed solution:
The client generates the data and saves it in a netCDF file,
which remote machines can access via various protocols (such as HTTP, GridFTP, or FTP).
The client then sends the server a request whose only content is the URL of the netCDF file.
When the server gets the request, it extracts the URL and retrieves the netCDF file to its local file system.
The server then reads the local netCDF file and verifies its content.
If everything is OK, the server sends back a SOAP response to indicate the result.
Note that the mixed solution above is pull-based (i.e., the server pulls the data from the client).
A push-based variant, in which the client pushes the data file to the server, is also possible.
Our tests show that the two approaches have the same performance.
When the data size is small (fewer than 1k doubles and integers in the data),
the invocation performance compares as follows:
Figure: invocation performance for small binary data
GridFTP takes far more time than the other two solutions
because its transport layer uses SSL.
For a meaningful comparison, we therefore compare only BXSA against the HTTP + SOAP mixed solution.
Figure: invocation performance for small binary data
The diagram above shows that, for the mixed solution, when the data size is relatively small,
the extra file I/O is the performance killer.
When the data size is large (up to 1M doubles and integers in the data),
the invocation performance compares as follows:
Figure: invocation performance for big binary data
Now, for both solutions, network transfer dominates the performance cost.
Since the BXSA and netCDF data sizes are similar,
the performance of the BXSA solution is close to that of the mixed solution
(via either GridFTP or HTTP).
When BXSA is used to invoke a Web service with 1M elements
(the BXSA file size is around 12.58M),
the basic steps are:
- client:
  - data model (DM) building
  - serialization + socket I/O writing
- server:
  - socket I/O reading + deserialization + DM building
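The 12.58M figure is consistent with a simple back-of-the-envelope count. The sketch below assumes, although the text does not say so, that "1M" means 2^20 elements and that each element holds one 8-byte double and one 4-byte integer. All names are illustrative.

```cpp
#include <cstddef>

// Assumption: "1M elements" means 2^20 elements.
constexpr std::size_t element_count = std::size_t{1} << 20;

// Assumption: one 8-byte double plus one 4-byte integer per element.
constexpr std::size_t bytes_per_double = 8;
constexpr std::size_t bytes_per_integer = 4;
constexpr std::size_t bytes_per_element = bytes_per_double + bytes_per_integer;

// Raw payload size, ignoring any header or framing overhead.
constexpr std::size_t payload_bytes = element_count * bytes_per_element;

// 1,048,576 elements * 12 bytes = 12,582,912 bytes, i.e. about 12.58 million.
static_assert(payload_bytes == 12582912, "payload estimate changed");
```

Under these assumptions, almost all of the serialized size is raw numeric payload, which would also explain why the BXSA and netCDF sizes nearly coincide.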
In the SOAP + HTTP + netCDF solution,
the netCDF file size for 1M elements is around 12.58M
(almost the same as the BXSA serialization result).
The basic steps on the client and the server are:
- client:
  - DM building + serialization + file I/O writing
  - send SOAP request
  - file I/O reading + socket I/O writing
- server:
  - process SOAP request
  - socket I/O reading + file I/O writing
  - file I/O reading + deserialization + DM building
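The two step lists can be written down as data to make the difference explicit. The following sketch only transcribes the steps listed above and counts how often the payload passes through the file system; the names Op, bxsa_steps, mixed_steps and count_file_passes are hypothetical, and no timings are implied.

```cpp
#include <array>
#include <cstddef>

// One entry per operation named in the step lists.
enum class Op {
    BuildModel, Serialize, Deserialize,
    SocketWrite, SocketRead,
    FileWrite, FileRead,
    SendSoapRequest, ProcessSoapRequest
};

// BXSA: control and data travel together in one binary SOAP message.
constexpr std::array<Op, 6> bxsa_steps = {
    Op::BuildModel, Op::Serialize, Op::SocketWrite,    // client
    Op::SocketRead, Op::Deserialize, Op::BuildModel    // server
};

// Mixed: the data goes through a netCDF file on both machines.
constexpr std::array<Op, 12> mixed_steps = {
    Op::BuildModel, Op::Serialize, Op::FileWrite,      // client writes the file
    Op::SendSoapRequest,                               // client sends the URL
    Op::FileRead, Op::SocketWrite,                     // client side serves the file
    Op::ProcessSoapRequest,                            // server extracts the URL
    Op::SocketRead, Op::FileWrite,                     // server downloads the file
    Op::FileRead, Op::Deserialize, Op::BuildModel      // server loads the file
};

// Count how many times the payload is written to or read from a file.
template <std::size_t N>
constexpr std::size_t count_file_passes(const std::array<Op, N>& steps) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (steps[i] == Op::FileWrite || steps[i] == Op::FileRead) ++n;
    }
    return n;
}

static_assert(count_file_passes(bxsa_steps) == 0, "BXSA path has no file I/O");
static_assert(count_file_passes(mixed_steps) == 4, "mixed path has four file passes");
```

Read this way, the mixed solution adds four file passes and a separate SOAP exchange on top of the socket transfer that both solutions share, which matches the observation that file I/O hurts most when the payload is small.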
Wei Lu
2005-10-19