TensorFlow Transform 0.3.0

@zoyahav zoyahav released this 11 Oct 03:20

Release 0.3.0

Major Features and Improvements

  • Added hash_strings mapper.
  • Write vocabularies as asset files instead of constants in the SavedModel.

Bug Fixes and Other Changes

  • tft.tfidf now adds 1 to idf values so that terms that appear in every
    document in the corpus have a non-zero tfidf value.
  • Performance and memory usage improvement when running with Beam runners that
    use multi-threaded workers.
  • Performance optimizations in ExampleProtoCoder.
  • Depends on apache-beam[gcp]>=2.1.1,<3.
  • Depends on protobuf>=3.3.0,<4.
  • Depends on six>=1.9,<1.11.
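
To illustrate the tft.tfidf change listed above, here is a minimal pure-Python sketch of why adding 1 to the idf matters. It assumes the plain textbook form idf = log(N / df); the exact formula used inside tft.tfidf may differ (for example, it may apply smoothing), and the helper name idf_values is hypothetical, not part of the library.

```python
import math


def idf_values(documents, add_one):
    """Return {term: idf} for a list of tokenized documents.

    Assumption: idf = log(N / df), optionally shifted by +1.
    This is an illustrative formula, not the library's implementation.
    """
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        # Count each term at most once per document.
        for term in set(doc):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    offset = 1.0 if add_one else 0.0
    return {t: math.log(n_docs / df) + offset for t, df in doc_freq.items()}


docs = [["the", "cat"], ["the", "dog"], ["the", "bird"]]
plain = idf_values(docs, add_one=False)
shifted = idf_values(docs, add_one=True)

# "the" occurs in every document, so log(3 / 3) is 0 without the shift.
print(plain["the"], shifted["the"])
```

Under this assumed formula, a term that occurs in every document has an idf of exactly 0, so its tfidf weight vanishes regardless of term frequency; the +1 shift keeps it non-zero.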

Breaking changes

  • Requires pre-installed TensorFlow >= 1.3.
  • Removed tft.map; use tft.apply_function instead (as needed).
  • Removed tft.tfidf_weights; use tft.tfidf instead.
  • beam_metadata_io.WriteMetadata now requires a second pipeline argument
    (see examples).
  • A Beam bug will now affect users who call AnalyzeAndTransformDataset in
    certain circumstances. Roughly speaking, if you call beam.Pipeline() at
    some point (as all our examples do) you will not experience this bug. The
    bug is characterized by an error similar to
    KeyError: (u'AnalyzeAndTransformDataset/AnalyzeDataset/ComputeTensorValues/Extract[Maximum:0]', None)
    This bug will be fixed in Beam 2.2.