Add bfloat to client #4521

Closed
wants to merge 2 commits

3 changes: 2 additions & 1 deletion Dockerfile.QA
@@ -334,7 +334,8 @@ RUN rm -f /usr/bin/python && \
     ln -s /usr/bin/python3 /usr/bin/python

 RUN pip3 install --upgrade wheel setuptools && \
-    pip3 install --upgrade numpy pillow attrdict future grpcio requests gsutil awscli six grpcio-channelz
+    pip3 install --upgrade numpy pillow attrdict future grpcio requests gsutil awscli six grpcio-channelz && \
+    pip3 install --upgrade bfloat16

 # L0_http_fuzz is hitting similar issue with boofuzz with latest version (0.4.0):
 # https://github.com/jtpereyda/boofuzz/issues/529

1 change: 1 addition & 0 deletions Dockerfile.sdk
@@ -205,6 +205,7 @@ COPY qa/images/mug.jpg images/mug.jpg
 # are not needed for building but including them allows this image to
 # be used to run the client examples.
 RUN pip3 install --upgrade numpy pillow attrdict && \
+    pip3 install --upgrade bfloat16 && \
     find install/python/ -maxdepth 1 -type f -name \
         "tritonclient-*linux*.whl" | xargs printf -- '%s[all]' | \
         xargs pip3 install --upgrade

27 changes: 16 additions & 11 deletions qa/L0_backend_identity/identity_test.py
@@ -31,18 +31,25 @@
 import sys
 import requests as httpreq
 from builtins import range
-import tritongrpcclient as grpcclient
-import tritonhttpclient as httpclient
-from tritonclientutils import np_to_triton_dtype
+import tritonclient.grpc as grpcclient
+import tritonclient.http as httpclient
+from tritonclient.utils import np_to_triton_dtype

 FLAGS = None

-def test_bf16_raw_http(shape):
+def test_bf16_http(shape):

Review comment (Contributor):

> # FIXME: Use identity_bf16 model in test above once proper python client
> # support is added, and remove this raw HTTP test. See DLIS-3720.

This helper was only meant to be a temporary workaround until python client bf16 support was added.

Can we instead add to the existing test below, now that there is bf16 support in the client? I believe there's an identity_bf16 model that will work now; if not, we could make a very simple python identity model.

This would:

  1. test both HTTP/GRPC
  2. test the bf16 support in client library workflow: triton_to_np_dtype, np_to_triton_dtype, etc. (see the sketch below)
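
A minimal sketch of what that suggested test could look like, assuming np_to_triton_dtype() from this PR maps the pypi bfloat16 dtype to "BF16", and that identity_bf16 exposes INPUT0/OUTPUT0 like the other identity models here (both assumptions, not verified against this PR; test_bf16_identity is a hypothetical name):

import sys
import numpy as np
from bfloat16 import bfloat16
import tritonclient.http as httpclient
from tritonclient.utils import np_to_triton_dtype

def test_bf16_identity(shape):
    # bf16 test data via the pypi bfloat16 numpy extension dtype
    input_data = np.random.randn(*shape).astype(bfloat16)
    client = httpclient.InferenceServerClient("localhost:8000")
    # np_to_triton_dtype(bfloat16) -> "BF16" is the client behavior under test
    inputs = [httpclient.InferInput("INPUT0", shape,
                                    np_to_triton_dtype(input_data.dtype))]
    inputs[0].set_data_from_numpy(input_data)
    result = client.infer("identity_bf16", inputs)
    output_data = result.as_numpy("OUTPUT0")
    if not np.array_equal(input_data, output_data):
        print("error: identity output does not match input")
        sys.exit(1)

Running the same flow with tritonclient.grpc as grpcclient would cover the GRPC half of point 1.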

if ("tensorflow" not in sys.modules):
from bfloat16 import bfloat16
else:
# known incompatability issue here:
# https://github.com/GreenWaves-Technologies/bfloat16/issues/2
# Can solve when numpy officially supports bfloat16
# https://github.com/numpy/numpy/issues/19808
print("error: tensorflow is included in module. This module cannot " \

Review comment (Contributor):

Is this an error/check that should be done by the client as well? The comment is helpful, but users may run into issues if this results in an incompatibility.

Review comment (Contributor @rmccorm4, Jun 21, 2022):

I think it makes more sense to try/except on the specific error we're expecting here. Maybe it won't always happen when tensorflow is installed.
try:
  from bfloat16 import bfloat16
except <specific error we expect with tensorflow>:
  print("known issue tensorflow pypi, link to github issue")
except Exception as e:
  print("unexpected error importing bf16:", e)

"co-exist with pypi version of bfloat16")
sys.exit(1)
model = "identity_bf16"
# Using fp16 data as a WAR since it is same byte_size as bf16
# and is supported by numpy for ease-of-use. Since this is an
# identity model, it's OK that the bytes are interpreted differently
input_data = (16384 * np.random.randn(*shape)).astype(np.float16)
input_data = (np.random.randn(*shape)).astype(bfloat16)
input_bytes = input_data.tobytes()
headers = {'Inference-Header-Content-Length': '0'}
r = httpreq.post("http://localhost:8000/v2/models/{}/infer".format(model),
@@ -264,7 +271,5 @@ def test_bf16_raw_http(shape):
print("error: expected 'param2' == False")
sys.exit(1)

# FIXME: Use identity_bf16 model in test above once proper python client
# support is added, and remove this raw HTTP test. See DLIS-3720.
test_bf16_raw_http([2, 2])
test_bf16_http([2, 2])