Added support for new model databricks/dbrx-base #82
base: main
Conversation
Signed-off-by: Ann <[email protected]>
Please add a test case in tests/transformers/test_transformer_pytorch_transforms.py.
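For illustration, a rough sketch of what such a test could look like. The QEffDbrxAttention class name, the KVCacheTransform.apply() signature, and the tiny DbrxConfig values are assumptions inferred from this conversation, not the repository's actual test fixtures:

```python
# Rough sketch only: the QEffDbrxAttention name and the KVCacheTransform.apply()
# signature are assumptions; adapt to the project's real test utilities.
from transformers import AutoModelForCausalLM
from transformers.models.dbrx.configuration_dbrx import DbrxConfig

from QEfficient.transformers.pytorch_transforms import KVCacheTransform  # assumed import path


def test_dbrx_kv_cache_transform_swaps_attention():
    # Tiny DBRX config so the test model builds quickly; "eager" keeps plain DbrxAttention modules.
    config = DbrxConfig(d_model=64, n_heads=4, n_layers=2, max_seq_len=64, vocab_size=128)
    model = AutoModelForCausalLM.from_config(config, attn_implementation="eager")

    model, transformed = KVCacheTransform.apply(model)  # assumed classmethod returning (model, bool)
    assert transformed

    # After the transform, the attention modules should be the QEff replacements.
    attn_class_names = {type(m).__name__ for m in model.modules() if "Attention" in type(m).__name__}
    assert "QEffDbrxAttention" in attn_class_names  # assumed QEff class name
```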
```diff
@@ -114,6 +123,12 @@
     GPT2Block: QEffGPT2Block,
     GPT2Attention: QEffGPT2Attention,
     GPT2LMHeadModel: QEffGPT2LMHeadModel,
+    # Dbrx model layers
```
Please add this in QEfficient/transformers/pytorch_transforms.py::KVCacheTransform too. We will be deprecating this after the 1.18 release.
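A minimal sketch of the requested registration, assuming KVCacheTransform keeps a class-to-class _module_mapping dict and that the QEff DBRX wrappers follow the naming pattern of the GPT2 entries in the hunk above; the module path and class names are assumptions, not part of this PR. In practice the entries would be added directly to the mapping inside pytorch_transforms.py; the update() call below only keeps the snippet self-contained:

```python
# Sketch only: the QEffDbrx* classes, their module path, and the _module_mapping
# attribute are assumptions mirroring the GPT2Block -> QEffGPT2Block pattern above.
from transformers.models.dbrx.modeling_dbrx import DbrxAttention, DbrxBlock, DbrxForCausalLM

from QEfficient.transformers.models.dbrx.modeling_dbrx import (  # hypothetical module path
    QEffDbrxAttention,
    QEffDbrxBlock,
    QEffDbrxForCausalLM,
)
from QEfficient.transformers.pytorch_transforms import KVCacheTransform  # assumed import path

# Register the DBRX layers alongside the existing Llama/Mistral/GPT2 entries.
KVCacheTransform._module_mapping.update(
    {
        # Dbrx model layers
        DbrxAttention: QEffDbrxAttention,
        DbrxBlock: QEffDbrxBlock,
        DbrxForCausalLM: QEffDbrxForCausalLM,
    }
)
```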
```python
from QEfficient.transformers.modeling_attn_mask_utils import _create_causal_mask

DBRX_ATTENTION_CLASSES = {
```
Not used anywhere?
```diff
@@ -219,7 +219,16 @@ def get_padding_shape_from_config(config, batch_size, seq_len):
     ):  # Check for num_key_value_heads (Llama/Mistral)
         n_heads = config.num_key_value_heads
         d_head = config.hidden_size // config.num_attention_heads
-    elif hasattr(config, "n_heads"):  # Check for n_heads and d_model in the config (MPT Model)
+    elif (
```
Isn't this condition the same as lines 231-233? Couldn't we move those lines here and remove lines 223-226?
Actually no, the MPT model needs a more specific check here: simply testing for n_heads lets DBRX satisfy the condition too, because MPT and DBRX have similar config.json file attributes. With only that check, the wrong configuration would be picked for the model.
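To illustrate the point, a sketch of one way to disambiguate the two configs. The attribute names come from the Hugging Face MptConfig and DbrxConfig (DBRX exposes ffn_config and attn_config.kv_n_heads in addition to n_heads/d_model, MPT does not have ffn_config); this is not necessarily the exact condition used in the PR:

```python
# Sketch only: shows why hasattr(config, "n_heads") alone is ambiguous between MPT and
# DBRX, and one way to tighten the check. Not necessarily the PR's exact condition.
def get_heads_and_head_dim(config):
    if hasattr(config, "ffn_config") and hasattr(config, "attn_config"):
        # DBRX: has n_heads/d_model like MPT, but also an MoE ffn_config block
        # and attn_config.kv_n_heads for the KV cache heads.
        n_heads = config.attn_config.kv_n_heads
        d_head = config.d_model // config.n_heads
    elif hasattr(config, "n_heads") and hasattr(config, "d_model"):
        # MPT: plain n_heads/d_model, no ffn_config block.
        n_heads = config.n_heads
        d_head = config.d_model // config.n_heads
    else:
        raise ValueError("Unsupported model config")
    return n_heads, d_head
```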
```python
# Save model to single weight file
params = sum(p.numel() for p in pt_model.parameters())
model_size = math.ceil((params * 4) / Constants.GB)  # fp32 size estimate in GB
if model_size < 380:
    info("ONNX model uses external data. Saving external data as single weight file.")
    loaded_model = onnx.load(f"{gen_models_path}_tmp/{model_base_name}.onnx")
    os.makedirs(f"{gen_models_path}", exist_ok=True)
    shutil.rmtree(f"{gen_models_path}_tmp")
    info("Clearing files .. ")
    onnx.save_model(
        loaded_model,
        os.path.join(gen_models_path, f"{model_base_name}.onnx"),
        save_as_external_data=True,
        all_tensors_to_one_file=True,
        location=f"{model_base_name}.onnxweights.data",
        size_threshold=1024,
        convert_attribute=False,
    )
    onnx.checker.check_model(os.path.join(gen_models_path, f"{model_base_name}.onnx"))
else:
    info("Skip saving external data as a single file.")
    if os.path.exists(f"{gen_models_path}"):
        shutil.rmtree(f"{gen_models_path}")
    shutil.move(f"{gen_models_path}_tmp", f"{gen_models_path}")
```
This can be done with SplitTensorsTransform, which is now merged into main. Has this change been tested with other models as well?