Broken by design: MTOM processing in Spring-WS
There are several known issues related to MTOM (or more generally XOP) processing in Spring-WS. This article identifies the common root cause of these issues and discusses a possible (long term) solution.
Since Spring-WS has two implementations, one based on SAAJ and another based on Apache Axiom, let’s start by examining how these two libraries handle MTOM.
MTOM processing in SAAJ
SAAJ doesn’t perform any kind of XOP decoding (beyond MIME processing): xop:Include elements are simply represented as SOAPElement instances in the DOM tree, which means that it is the responsibility of the application code to perform XOP decoding. The only support for XOP/MTOM in SAAJ is provided by the SOAPMessage#getAttachment(SOAPElement) method, which can be used to retrieve the AttachmentPart referenced by an xop:Include element.
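To make concrete what "the responsibility of the application code" means here, the following is a minimal sketch of XOP decoding on a plain DOM tree, using only JDK classes. The class name XopDomDecoder and the Map of content IDs to byte arrays are hypothetical stand-ins for the MIME parts; with SAAJ the content would instead come from the AttachmentPart returned by getAttachment. The cid: handling is simplified and ignores URL escaping.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper, not part of SAAJ or Spring-WS. */
final class XopDomDecoder {
    static final String XOP_NS = "http://www.w3.org/2004/08/xop/include";

    /** Replaces every xop:Include element with the base64 text of the part it references. */
    static void inlineIncludes(Document doc, Map<String, byte[]> partsByContentId) {
        // Copy the matches first: the NodeList is live and the loop below modifies the tree.
        NodeList found = doc.getElementsByTagNameNS(XOP_NS, "Include");
        List<Element> includes = new ArrayList<>();
        for (int i = 0; i < found.getLength(); i++) {
            includes.add((Element) found.item(i));
        }
        for (Element include : includes) {
            String href = include.getAttribute("href");
            // Simplification: strip the cid: scheme, no URL unescaping.
            String contentId = href.startsWith("cid:") ? href.substring(4) : href;
            byte[] content = partsByContentId.get(contentId);
            if (content == null) {
                throw new IllegalStateException("No MIME part for " + href);
            }
            String base64 = Base64.getEncoder().encodeToString(content);
            include.getParentNode().replaceChild(doc.createTextNode(base64), include);
        }
    }

    /** Parses a namespace-aware DOM document from a string (demo helper). */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Note that this approach materializes the binary content as a base64 string in memory, so it is only reasonable for small attachments.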
MTOM processing in Apache Axiom
XOP decoding is performed by Axiom, and XOP unaware code will simply see base64 encoded data wherever xop:Include elements appeared in the original message:
In the object model created by Axiom, xop:Include elements are represented as OMText nodes (which produce base64 encoded data on demand, but internally store references to MIME parts).
XMLStreamReader instances returned by getXMLStreamReader report CHARACTERS events in place of xop:Include elements (and perform base64 encoding on demand).
At the same time, Axiom provides APIs that allow XOP aware code to access the binary data in an optimized way:
OMText nodes have a getDataHandler method to retrieve the binary data directly from the MIME part.
XMLStreamReader instances returned by getXMLStreamReader implement an extension that allows application code to check whether a CHARACTERS event corresponds to binary data and to retrieve that binary data if applicable.
Axiom also has an API to get an XOP encoded XMLStreamReader, which can be used when integrating Axiom based code with libraries that are XOP aware but don’t support the Axiom API directly. A good example of this is JAXB2.
This is the optimal processing model because XOP unaware code will just work (although not necessarily with the best performance) and it is still possible to write code that processes MTOM messages in a highly optimized way (including full streaming support for binary data, introduced in Axiom 1.2.13).
What’s the problem with Spring-WS?
The problem with Spring-WS is that in contrast to SAAJ and Axiom, it doesn’t have a well defined MTOM processing model.
Namely, it is unspecified whether the getPayloadSource method defined by the WebServiceMessage interface should return a Source object representing XOP decoded or XOP encoded data. This is not just a missing detail in the documentation; the problem is that this method is both implemented and used inconsistently in Spring-WS itself:
The SAAJ based implementation returns a DOMSource that points directly to the relevant part of the DOM tree produced by SAAJ, i.e. it returns XOP encoded data.
The Axiom based implementation returns a Source constructed from the XMLStreamReader instance returned by Axiom’s getXMLStreamReader method, i.e. it returns XOP decoded data.
MarshallingUtils (which is used by MarshallingPayloadMethodProcessor) passes the return value of getPayloadSource to an XOP aware API, i.e. it assumes that it represents XOP encoded data. That assumption is not correct if the Axiom based implementation is used. As a consequence, binary data is retrieved from Axiom as base64 encoded strings, only to be immediately decoded again by the unmarshaller, resulting in poor performance and out of memory conditions for large attachments.
PayloadValidatingInterceptor basically passes the return value of getPayloadSource directly to a javax.xml.validation.Validator. Since that API is not XOP aware, this means that PayloadValidatingInterceptor implicitly assumes that getPayloadSource returns XOP decoded data. For SAAJ that assumption is incorrect, which is the root cause of SWS-242.
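The effect of handing XOP encoded data to an XOP unaware validator can be reproduced with the JDK alone. The sketch below is an illustration, not the Spring-WS code path: the class name XopValidationDemo, the urn:example namespace and the one-element schema are made up for the demo, and a StreamSource is used instead of a DOMSource for brevity. An element declared as xs:base64Binary validates when it contains base64 text, but is rejected when it contains an xop:Include child element.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical demo class; schema and namespace are invented for illustration. */
final class XopValidationDemo {
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' "
        + "targetNamespace='urn:example' elementFormDefault='qualified'>"
        + "<xs:element name='data' type='xs:base64Binary'/>"
        + "</xs:schema>";

    /** Returns true if the payload validates against the demo schema. */
    static boolean isValid(String payload) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, the Validator throws on the first validation error.
            schema.newValidator().validate(new StreamSource(new StringReader(payload)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String decoded = "<data xmlns='urn:example'>AQID</data>";
        String encoded = "<data xmlns='urn:example'>"
            + "<xop:Include xmlns:xop='http://www.w3.org/2004/08/xop/include' href='cid:part1'/>"
            + "</data>";
        System.out.println("XOP decoded payload valid: " + isValid(decoded));
        System.out.println("XOP encoded payload valid: " + isValid(encoded));
    }
}
```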
In the previous section we have seen that the problems with MTOM processing in Spring-WS are caused by a flaw in the design of the Spring-WS API. There is therefore no easy fix, and a proper solution will require changes to the API.
The getPayloadSource method (or a new, overloaded version of that method) should have an argument that allows the caller to specify whether it expects XOP encoded or XOP decoded data, i.e. whether it is prepared to handle xop:Include elements or expects to get base64 encoded data instead.
That new API would be easy to implement in the Axiom based implementation because Axiom already provides the necessary APIs for that. The case is less trivial for SAAJ because that API doesn’t perform any XOP decoding itself. A solution would be to let the SAAJ based implementation return a StAXSource that performs the necessary transformations if the caller requests an XOP decoded Source. Note that this would only be necessary if the message is actually an MTOM message. In all other cases, the implementation could simply return a DOMSource as usual.
Note that the problem related to MTOM is not the only issue with the getPayloadSource method. There are at least two other issues that could be addressed at the same time as the XOP decoding problem:
The documentation of the getPayloadSource method specifies the following:
Depending on the implementation, [the payload source] can be retrieved multiple times, or just a single time.
This doesn’t make sense. The decision whether the payload is to be preserved (so that a subsequent call succeeds) should not be left to the implementation. Instead, it should be specified by the caller. E.g. if the calling code intends to replace the payload with something else (as would be the case for PayloadTransformingInterceptor), then it knows that there is no need to preserve the original payload. On the other hand, interceptors such as PayloadValidatingInterceptor must preserve the original payload and should be able to instruct getPayloadSource to take care of that.
In the current Spring-WS API, it is completely up to the implementation of the getPayloadSource method to decide which type of Source (e.g. DOMSource or StAXSource) to return. However, that choice may have implications for the performance of the calling code because the “wrong” choice may require the caller to perform additional conversion to get the representation it needs. To avoid this problem, the caller should be given the opportunity to specify a preference for the type of the returned Source.
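The three caller-side choices discussed in this section (XOP encoded vs. decoded, preserve vs. consume, preferred Source type) could be expressed in a single request object. The following is only a sketch of what such an API might look like. None of these types exist in Spring-WS: XopMode, PayloadRetention, PayloadRequest, PayloadAccess and InMemoryPayload are all hypothetical names. The toy implementation honors only the retention setting and ignores the other two, which a real implementation would have to act on. The choice of DECODED for transformation is an assumption based on the fact that javax.xml.transform is not XOP aware.

```java
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;

/** Whether the caller can handle xop:Include elements or wants base64 text instead. */
enum XopMode { ENCODED, DECODED }

/** Whether the message must still be able to serve the payload after this call. */
enum PayloadRetention { PRESERVE, CONSUME }

/** Hypothetical request object describing what the caller expects. */
final class PayloadRequest {
    final XopMode xopMode;
    final PayloadRetention retention;
    final Class<? extends Source> preferredType; // Source.class means "no preference"

    PayloadRequest(XopMode xopMode, PayloadRetention retention,
                   Class<? extends Source> preferredType) {
        this.xopMode = xopMode;
        this.retention = retention;
        this.preferredType = preferredType;
    }

    /** What a validating interceptor would ask for: decoded data, payload kept. */
    static PayloadRequest forValidation() {
        return new PayloadRequest(XopMode.DECODED, PayloadRetention.PRESERVE, DOMSource.class);
    }

    /** What a transforming interceptor would ask for: the payload is replaced anyway. */
    static PayloadRequest forTransformation() {
        return new PayloadRequest(XopMode.DECODED, PayloadRetention.CONSUME, Source.class);
    }
}

/** Hypothetical replacement for the no-argument getPayloadSource() method. */
interface PayloadAccess {
    Source getPayloadSource(PayloadRequest request);
}

/** Toy implementation that only demonstrates the retention semantics. */
final class InMemoryPayload implements PayloadAccess {
    private Source payload; // null once a caller has consumed it

    InMemoryPayload(Source payload) {
        this.payload = payload;
    }

    @Override
    public Source getPayloadSource(PayloadRequest request) {
        if (payload == null) {
            throw new IllegalStateException("Payload already consumed");
        }
        Source result = payload;
        if (request.retention == PayloadRetention.CONSUME) {
            payload = null; // the caller promised to replace the payload
        }
        return result;
    }
}
```

With such a contract, a failure on a second call is the result of an explicit caller decision rather than an implementation detail.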
Additional issues on the client side
In order to support large attachments, a Web service stack needs to provide mechanisms to process them without copying them in their entirety into memory. There are two techniques commonly used for this:
Streaming. This means that the Web service stack hands an InputStream to the application code that reads the encoded content directly from the HTTP response stream and decodes it on the fly. Note that this requires that the Web service stack keeps the HTTP request active until the application code has finished reading the attachments.
Offloading to disk. In this case, the content of the attachment is copied to disk instead of keeping it in memory. This is typically implemented using a threshold so that small attachments can still be kept in memory, thus avoiding the I/O overhead. Note that this requires a reliable cleanup mechanism that removes the temporary files once they are no longer needed.
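As an illustration of the threshold technique, here is a minimal sketch using only the JDK (Java 11 or later for readNBytes). The class name AttachmentBuffer is hypothetical and this is not how Axiom implements it. Content up to the threshold stays in a byte array, anything larger is spilled to a temporary file, and close() stands in for the cleanup mechanism.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical buffer that keeps small content in memory and offloads large content to disk. */
final class AttachmentBuffer implements AutoCloseable {
    private byte[] memory; // set when the content fits within the threshold
    private Path file;     // set when the content was offloaded to disk

    private AttachmentBuffer() {
    }

    /** Copies the stream, holding at most threshold + 1 bytes in memory before spilling. */
    static AttachmentBuffer copyOf(InputStream in, int threshold) throws IOException {
        AttachmentBuffer buffer = new AttachmentBuffer();
        byte[] head = in.readNBytes(threshold + 1);
        if (head.length <= threshold) {
            buffer.memory = head;
            return buffer;
        }
        Path tmp = Files.createTempFile("attachment", ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            out.write(head);
            in.transferTo(out); // remaining bytes go straight to disk
        } catch (IOException | RuntimeException e) {
            Files.deleteIfExists(tmp);
            throw e;
        }
        buffer.file = tmp;
        return buffer;
    }

    boolean isOnDisk() {
        return file != null;
    }

    InputStream openStream() throws IOException {
        if (file != null) {
            return Files.newInputStream(file);
        }
        if (memory != null) {
            return new ByteArrayInputStream(memory);
        }
        throw new IllegalStateException("Buffer already closed");
    }

    /** Releases the content; removes the temporary file if one was created. */
    @Override
    public void close() throws IOException {
        memory = null;
        if (file != null) {
            Files.deleteIfExists(file);
            file = null;
        }
    }
}
```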
None of this is supported by the SAAJ API. On the other hand, Axiom has always supported offloading to disk, and streaming support was added in 1.2.13. However, the design issues described in the previous sections prevent Spring-WS from leveraging these capabilities. In addition to that (i.e. even if these design flaws were fixed), there are two other issues that occur on the client side:
SWS-707 causes the HTTP transport used by WebServiceTemplate to read the entire response into a byte array. Strictly speaking this only occurs if the response has no Content-Length header, but since MTOM messages with large attachments are typically sent using chunked encoding, this is almost always the case. That issue makes streaming or offloading to disk pointless because processing the attachments still requires an amount of heap memory equal to the size of the attachments. Note that this is not a design problem though, because the issue would be easy to fix.
Both streaming and offloading to disk require cleanup after the application code has finished processing the attachments. With the current design of WebServiceTemplate there is no reliable way to do that: its methods return control to the application code before the cleanup can happen, and there is no mechanism that allows the application code to inform the WebServiceTemplate instance that it is done processing the attachments. A possible solution would be to add marshalSendAndReceive methods that, instead of returning the response, pass it to a callback provided by the application code. The cleanup would then be performed after the callback exits.
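The callback idea can be sketched independently of Spring-WS. In the following illustration, ResponseHandler and CallbackStyleClient are hypothetical names, the Supplier stands in for "send the request and open the response", and closing the stream stands in for whatever cleanup the transport needs (releasing the connection, deleting temporary files). The point is only the control flow: the resource is released after the callback returns, even if it throws.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Supplier;

/** Hypothetical callback invoked while the response is still open. */
interface ResponseHandler<T> {
    T handle(InputStream response) throws IOException;
}

/** Hypothetical client; the Supplier is a stand-in for the real transport. */
final class CallbackStyleClient {
    private final Supplier<InputStream> transport;

    CallbackStyleClient(Supplier<InputStream> transport) {
        this.transport = transport;
    }

    /** Passes the open response to the handler and cleans up once the handler exits. */
    <T> T sendAndReceive(ResponseHandler<T> handler) throws IOException {
        try (InputStream response = transport.get()) {
            return handler.handle(response);
        }
    }
}
```

Application code would then read the attachment inside the handler and return only the value it needs, for example `client.sendAndReceive(in -> in.readAllBytes().length)`.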