When serializing a MessageLite to bytes, one can call toByteString to get a ByteString, which is effectively an immutable wrapper over a byte[]. This ByteString can be passed around and deserialized back into a message using MyMessage.Builder.mergeFrom(ByteString).

ByteString#toStringUtf8 decodes the bytes inside the ByteString as UTF-8 and copies the result into a java.lang.String, replacing any invalid UTF-8 byte sequences with � (U+FFFD, the Unicode replacement character).

In this case, a protocol message is serialized to a ByteString and then immediately converted to a Java String using the toStringUtf8 method. However, serialized protocol buffers are arbitrary binary data, not UTF-8-encoded text. Every invalid byte sequence is replaced during the conversion, so the resulting String may not match the serialized bytes of the protocol message, and the original bytes cannot in general be recovered from it.

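
The mismatch can be reproduced with the JDK alone. The sketch below is illustrative rather than taken from the protobuf library: the class name Utf8RoundTripDemo and its helper method are hypothetical, and it uses new String(bytes, UTF_8) as a stand-in for toStringUtf8, on the assumption that both substitute U+FFFD for malformed input. The three bytes are the wire encoding of a message whose field 1 is a varint with value 150.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8RoundTripDemo {
    // Wire bytes for a message with field 1 (varint) set to 150: 0x08 0x96 0x01.
    // 0x96 is a lone UTF-8 continuation byte, so it is not valid UTF-8 on its own.
    static final byte[] SERIALIZED = {0x08, (byte) 0x96, 0x01};

    // Decodes the bytes as UTF-8 text, then encodes that text back to bytes.
    // This is a JDK stand-in for the toStringUtf8 round trip (an assumption).
    static byte[] roundTripThroughString(byte[] bytes) {
        String asText = new String(bytes, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] roundTripped = roundTripThroughString(SERIALIZED);
        System.out.println(Arrays.toString(SERIALIZED));            // [8, -106, 1]
        System.out.println(Arrays.toString(roundTripped));          // [8, -17, -65, -67, 1]
        System.out.println(Arrays.equals(SERIALIZED, roundTripped)); // false
    }
}
```

The byte 0x96 comes back as the three-byte encoding of U+FFFD (0xEF 0xBF 0xBD), so a parser reading the round-tripped bytes no longer sees the value 150.
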
Instead of holding the serialized protocol message in a Java String, carry the actual bytes in a ByteString, a byte[], or some other equivalent container for arbitrary binary data.
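
As a rough illustration of what an equivalent container might look like when ByteString is not available, here is a minimal JDK-only sketch. The class SerializedPayload and its methods are hypothetical names, not part of any protobuf API. The Base64 helpers go beyond what the text above recommends: they are one possible option, offered as an assumption, for cases where a channel only accepts text.

```java
import java.util.Arrays;
import java.util.Base64;

public final class SerializedPayload {
    private final byte[] bytes;

    public SerializedPayload(byte[] bytes) {
        // Defensive copy so later changes to the caller's array do not leak in.
        this.bytes = bytes.clone();
    }

    // Returns a copy of the exact bytes that were stored.
    public byte[] toByteArray() {
        return bytes.clone();
    }

    // Lossless text form for text-only channels (an assumption, not from the source).
    public String toBase64() {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static SerializedPayload fromBase64(String text) {
        return new SerializedPayload(Base64.getDecoder().decode(text));
    }

    public static void main(String[] args) {
        byte[] serialized = {0x08, (byte) 0x96, 0x01};
        SerializedPayload payload = new SerializedPayload(serialized);
        SerializedPayload restored = fromBase64(payload.toBase64());
        System.out.println(Arrays.equals(serialized, restored.toByteArray())); // true
    }
}
```

Unlike a UTF-8 decode, both paths here return the stored bytes unchanged, so the payload can still be parsed on the receiving side.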