Broken by design: MTOM processing in Spring-WS
Introduction
There are several known issues related to MTOM (or more generally XOP) processing in Spring-WS. This article identifies the common root cause of these issues and discusses a possible (long term) solution.
Since Spring-WS has two implementations, one based on SAAJ and another one based on Apache Axiom let’s start by examining how these two libraries handle MTOM.
MTOM processing in SAAJ
SAAJ doesn’t perform any kind of XOP decoding (beyond MIME processing): xop:Include
elements are simply represented as
SOAPElement
instances in the DOM tree, which means that it is the responsibility of the
application code to perform XOP decoding. The only support for XOP/MTOM in SAAJ is provided by the
SOAPMessage#getAttachment(SOAPElement)
method which can be used to retrieve the
AttachmentPart
referenced by an xop:Include
element.
MTOM processing in Apache Axiom
Apache Axiom has a well defined XOP processing model (at least since version 1.2.9 which fixed AXIOM-122 and AXIOM-255):
-
XOP decoding is performed by Axiom and XOP unaware code will simply see base64 encoded data wherever
xop:Include
elements appeared in the original message:-
In the object model created by Axiom,
xop:Include
elements are represented asOMText
nodes (which produce base64 encoded data on demand, but internally store references to MIME parts). -
XMLStreamReader
instances returned bygetXMLStreamReader
will produceCHARACTERS
events forxop:Include
elements (and perform base64 encoding on demand). -
SAXSource
instances returned bygetSAXSource
will producecharacters
events forxop:Include
elements.
-
-
At the same time, Axiom provides APIs that allow XOP aware code to access the binary data in an optimized way:
-
OMText
has agetDataHandler
method to retrieve the binary data directly from the MIME part. -
XMLStreamReader
instances returned bygetXMLStreamReader
implement an extension that allows application code to check whether aCHARACTERS
event corresponds to binary data and to retrieve that binary data if applicable.
-
-
Axiom also has an API to get an XOP encoded
XMLStreamReader
which can be used when integrating Axiom based code with libraries that are XOP aware, but that don’t support the Axiom API directly. A good example for this is JAXB2.
This is the optimal processing model because XOP unaware code will just work (although not necessarily with the best performance) and it is still possible to write code that processes MTOM messages in a highly optimized way (including full streaming support for binary data, introduced in Axiom 1.2.13).
What’s the problem with Spring-WS?
The problem with Spring-WS is that in contrast to SAAJ and Axiom, it doesn’t have a well defined MTOM processing model.
Namely, it is unspecified whether the getPayloadSource
method defined by the WebServiceMessage
interface should
return a Source
object representing XOP decoded or encoded data. This is not just a missing detail in the documentation;
the problem is that this method is both implemented and used inconsistently in Spring-WS itself:
-
The SAAJ based implementation returns a
DOMSource
that points directly to the relevant part of the DOM tree produced by SAAJ, i.e. it returns XOP encoded data. -
The Axiom based implementation returns a
Source
constructed from theXMLStreamReader
instance returned by Axiom’sgetXMLStreamReader
method, i.e. it returns XOP decoded data. -
MarshallingUtils
(which is used byMarshallingPayloadMethodProcessor
) passes the return value ofgetPayloadSource
to an XOP aware API, i.e. it assumes that it represents XOP encoded data. That assumption is not correct if the Axiom based implementation is used, with as consequence that binary data is retrieved from Axiom as base64 encoded strings, only to be immediately decoded again by the unmarshaller, resulting in poor performance and out of memory conditions for large attachments. -
PayloadValidatingInterceptor
basically passes the return value ofgetPayloadSource
directly to ajavax.xml.validation.Validator
. Since that API is not XOP aware, this means thatPayloadValidatingInterceptor
implicitly assumes thatgetPayloadSource
returns XOP decoded data. For SAAJ that assumption is incorrect, which is the root cause for SWS-242.
Possible solutions
In the previous section we have seen that the problems with MTOM processing in Spring-WS are caused by a flaw in the
design of the Spring-WS API. There is therefore no easy fix and a proper solution will require changes to the API.
Namely the getPayloadSource
method (or a new, overloaded version of that method) should have an argument that allows
the caller to specify whether it expects it to return XOP encoded or decoded data, i.e. whether it is prepared to handle
xop:Include
elements or expects to get base64 encoded data instead.
That new API would be easy to implement in the Axiom based implementation because Axiom already provides the necessary
APIs for that. The case is less trivial for SAAJ because that API doesn’t perform any XOP decoding itself. A solution
would be to let the SAAJ based implementation return a SAXSource
or StAXSource
that performs the necessary
transformations if the caller requests an XOP decoded Source
. Note that this would only be necessary if the message
is actually an MTOM message. In all other cases, the implementation could simply return a DOMSource
as usual.
Note that the problem related to MTOM is not the only issue with the getPayloadSource
method. There are at least two
other issues that could be addressed at the same time as the XOP decoding problem:
-
The documentation of the
getPayloadSource
specifies the following:Depending on the implementation, [the payload source] can be retrieved multiple times, or just a single time.
This doesn’t make sense. The decision whether the payload is to be preserved (so that a subsequent call succeeds) or not should not be left to the implementation. Instead, this should be specified by the caller. E.g. if the calling code intents to replace the payload with something else (as would be the case for
PayloadTransformingInterceptor
), then it knows that there is no need to preserve the original payload. On the other hand, interceptors such asPayloadValidatingInterceptor
must preserve the original payload and should be able to instructgetPayloadSource
to take care of that. -
In the current Spring-WS API, it is completely up to the implementation of the
getPayloadSource
method to decide which type ofSource
object (DOMSource
,SAXSource
orStAXSource
) to return. However, that choice may have implications for the performance of the calling code because the “wrong” choice may require the caller to perform additional conversion to get the representation it needs. To avoid this problem, the caller should be given the opportunity to specify a preference for the type of the returnedSource
.
Additional issues on the client side
In order to support large attachments a Web service stack needs to provide mechanisms to process them without copying them in their entirety into memory. There are two techniques commonly used for this:
-
Streaming. This means that the Web service stack hands an
InputStream
to the application code that reads the encoded content directly from the HTTP response stream and decodes it on the fly. Note that this requires that the Web service stack keeps the HTTP request active until the application code has finished reading the attachments. -
Offloading to disk. In this case, the content of the attachment is copied to disk instead of keeping it in memory. This is typically implemented using a threshold so that small attachments can still be kept in memory, thus avoiding the I/O overhead. Note that this requires a reliable cleanup mechanism that removes the temporary files once they are no longer needed.
None of this is supported by the SAAJ API. On the other hand, Axiom has always supported offloading to disk, and streaming support was added in 1.2.13. However, the design issues described in the previous sections prevent Spring-WS from leveraging these capabilities. In addition to that (i.e. even if these design flaws were fixed), there are two other issues that occur on the client side:
-
SWS-707 causes the HTTP transport used by
WebServiceTemplate
to read the entire response into a byte array. Strictly speaking this only occurs if the response has noContent-Length
header, but since MTOM messages with large attachments are typically sent used chunked encoding, this is almost always the case. That issue makes streaming or offloading to disk pointless because processing the attachments still requires an amount of heap memory equal to the size of the attachments. Note that this is not a design problem though, because the issue would be easy to fix. -
Both streaming and offloading to disk require cleanup after the application code has finished processing the attachments. With the current design of the
marshalSendAndReceive
methods inWebServiceTemplate
there is no reliable way to do that because they return control to the application code before the cleanup can happen and at the same time there is no mechanism that allows the application code to inform theWebServiceTemplate
instance that it is done processing the attachments. A possible solution here would be to havemarshalSendAndReceive
methods that instead of returning the response, pass the response to a callback provided by the application code. The cleanup would then be performed after the callback exits.