As described in Compatibility for Graphs and Checkpoints, TensorFlow marks each kind of data with version information in order to maintain backwards compatibility, in some cases even across major releases.
This document describes the versioning mechanism in more detail, and explains how to use it to change data formats safely.
Consider the case of TensorFlow graphs serialized via the GraphDef protobuf. We have a number of competing constraints:
For GraphDefs, we support backwards compatibility for 6 months and forwards compatibility for 3 weeks in limited situations. For backwards compatibility, this means that we can only remove functionality 6 months after we stop producing data using that functionality. Similarly, in the limited situations where we support forwards compatibility, we can add functionality only 3 weeks after TensorFlow can consume data using that functionality.
In order to implement these semantics, we need to know when data is produced so that we can know when to enforce changes in formats. The versioning system described below achieves that goal in a manner that supports both backwards and forwards compatibility (when they apply).
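The two time windows above can be sketched as simple date arithmetic. This is a minimal illustration, not TensorFlow code: the function names are hypothetical, and the day counts are assumed approximations, since the policy is stated in months and weeks rather than exact days.

```python
from datetime import date, timedelta

# Assumed approximations of the windows named in the text.
BACKWARD_WINDOW = timedelta(days=183)  # "6 months" (assumption: about half a year)
FORWARD_WINDOW = timedelta(weeks=3)    # "3 weeks"


def earliest_removal(stopped_producing: date) -> date:
    """Earliest date consumers may drop a feature, given the date
    producers stopped emitting data that uses it."""
    return stopped_producing + BACKWARD_WINDOW


def earliest_production(consumers_support_since: date) -> date:
    """Earliest date producers may start emitting a feature, given the
    date consumers became able to read it (forwards-compatible case)."""
    return consumers_support_since + FORWARD_WINDOW
```

Both functions need to know when data was produced or when support landed, which is the information the version numbers described below are meant to carry.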
For checkpoints, we have no plans to make either backwards or forwards incompatible changes, but still attach versions to checkpoints in case we ever do have to make a change.
Since different data formats evolve at different rates, we have a separate integer versioning scheme for each kind of data, and these schemes are separate from the overall version of TensorFlow.
For now, there are data versions for GraphDefs (serialized computation graphs) and checkpoints (serialized variable state). Both versioning schemes are defined in core/public/version.h. Whenever a new version is added, a note should be made in that header recording what changed and when.
In the discussion below, we consider version information for data, binaries that produce that data (producers), and binaries that consume that data (consumers):
Producers have a version (producer) and a minimum consumer version that they are compatible with (min_consumer).
Consumers have a version (consumer) and a minimum producer version that they are compatible with (min_producer).
Each piece of versioned data has a VersionDef versions field which records the producer that made the data, the min_consumer that it is compatible with, and a list of bad_consumers versions that are disallowed.
By default, when a producer makes some data, the data inherits the producer's producer and min_consumer versions. bad_consumers can be set if specific consumer versions are known to contain bugs and must be avoided. A consumer can accept a piece of data if all of the following hold:
consumer >= data's min_consumer
data's producer >= consumer's min_producer
consumer not in data's bad_consumers
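The three acceptance conditions can be sketched in a few lines of Python. This is an illustrative model only: the VersionDef class here mirrors the field names from the text, while Binary, stamp, and can_consume are hypothetical names and not part of TensorFlow's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VersionDef:
    """Version information attached to a piece of data."""
    producer: int
    min_consumer: int
    bad_consumers: List[int] = field(default_factory=list)


@dataclass
class Binary:
    """Hypothetical stand-in for a binary that produces or consumes data."""
    version: int       # plays the role of producer or consumer
    min_consumer: int  # used when the binary produces data
    min_producer: int  # used when the binary consumes data


def stamp(producer: Binary) -> VersionDef:
    """By default, data inherits the producer's version and min_consumer."""
    return VersionDef(producer=producer.version,
                      min_consumer=producer.min_consumer)


def can_consume(consumer: Binary, data: VersionDef) -> bool:
    """Apply the three acceptance conditions listed above."""
    return (consumer.version >= data.min_consumer
            and data.producer >= consumer.min_producer
            and consumer.version not in data.bad_consumers)
```

For example, data stamped by a binary with min_consumer 10 is rejected by a consumer at version 8, while data stamped by that older binary remains readable by a newer consumer as long as the newer consumer's min_producer has not passed version 8.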
Since both producers and consumers come from the same TensorFlow code base, core/public/version.h contains a main binary version which is treated as either producer or consumer depending on context, along with both min_consumer and min_producer (needed by producers and consumers, respectively). Specifically,
For GraphDef versions, we have TF_GRAPH_DEF_VERSION, TF_GRAPH_DEF_VERSION_MIN_CONSUMER, and TF_GRAPH_DEF_VERSION_MIN_PRODUCER.
For checkpoint versions, we have TF_CHECKPOINT_VERSION, TF_CHECKPOINT_VERSION_MIN_CONSUMER, and TF_CHECKPOINT_VERSION_MIN_PRODUCER.
We now discuss examples of using this versioning mechanism to make various changes to the GraphDef format. Our goal is to be backwards compatible for six months, which means that data produced by TensorFlow at time T must be consumable by TensorFlow at time T + 6 months. If forwards compatibility is desired, the data must also be consumable by TensorFlow as of time T - 3 weeks.
Adding a new op:
Adding a new op and switching existing Python wrappers to use it:
Do not increment min_consumer, since models which do not use this op should not break.
Removing an op or restricting the functionality of an op:
Add REGISTER_OP(...).Deprecated(deprecated_at_version, message) to the op's registration.
Increase min_producer to the GraphDef version from (2) and remove the functionality entirely.
Changing the functionality of an op:
Add a new op named SomethingV2 or similar and go through the process of adding it and switching existing Python wrappers to use it (may take 3 weeks if forwards compatibility is desired).
Increase min_consumer to rule out consumers with the old op, add back the old op as an alias for SomethingV2, and go through the process to switch existing Python wrappers to use it (may take 3 weeks).
Go through the process to remove SomethingV2.
Banning a single consumer version that cannot run safely:
Add the bad version to bad_consumers for all new GraphDefs. If possible, add to bad_consumers only for GraphDefs which contain a certain op or similar.
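Two of the steps above, deprecating an op at a given GraphDef version and scoping bad_consumers to graphs that contain a particular op, can be modeled as follows. This is a rough sketch under stated assumptions: the tables, op names, version numbers, and function names are all hypothetical, and the deprecation rule (reject the op only in graphs produced at or after the deprecation version) is an assumed reading of how a deprecated_at_version is meant to be applied.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical registry: op name -> GraphDef version at which it was deprecated.
DEPRECATED_AT: Dict[str, int] = {"OldOp": 20}

# Hypothetical table: op name -> consumer versions known to mishandle that op.
UNSAFE_CONSUMERS: Dict[str, Set[int]] = {"FragileOp": {17}}


def check_ops(op_names: Iterable[str], graph_producer: int) -> None:
    """Reject deprecated ops, but only in graphs produced at or after the
    deprecation version, so that older graphs keep loading."""
    for name in op_names:
        version = DEPRECATED_AT.get(name)
        if version is not None and graph_producer >= version:
            raise ValueError(
                f"{name} is deprecated at GraphDef version {version}")


def bad_consumers_for(op_names: Iterable[str]) -> List[int]:
    """Collect bad consumer versions only for the ops a graph actually uses."""
    bad: Set[int] = set()
    for name in op_names:
        bad |= UNSAFE_CONSUMERS.get(name, set())
    return sorted(bad)
```

Scoping bad_consumers by op keeps graphs that never touch the problematic op readable by the affected consumer version, which is the narrower form of banning that the text recommends where possible.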