From fd6d6ecc3e7e7614f17f4752e1716226d08ab468 Mon Sep 17 00:00:00 2001 From: Ali Tarik Date: Thu, 30 Nov 2023 13:06:27 +0300 Subject: [PATCH 1/5] update notebooks --- demo/tutorials/misc/HF_Callback_NER.ipynb | 658 ++++++++++++++++++ .../HF_Callback_Text_Classification.ipynb | 96 +-- 2 files changed, 712 insertions(+), 42 deletions(-) create mode 100644 demo/tutorials/misc/HF_Callback_NER.ipynb diff --git a/demo/tutorials/misc/HF_Callback_NER.ipynb b/demo/tutorials/misc/HF_Callback_NER.ipynb new file mode 100644 index 000000000..08c357cbe --- /dev/null +++ b/demo/tutorials/misc/HF_Callback_NER.ipynb @@ -0,0 +1,658 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![image.png]()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Comparing_Models_Notebook.ipynb)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**LangTest** is an open-source python library designed to help developers deliver safe and effective Natural Language Processing (NLP) models. Whether you are using **John Snow Labs, Hugging Face, Spacy** models or **OpenAI, Cohere, AI21, Hugging Face Inference API and Azure-OpenAI** based LLMs, it has got you covered. You can test any Named Entity Recognition (NER), Text Classification model using the library. We also support testing LLMS for Question-Answering, Summarization, Clinical-Tests and Security tasks on benchmark datasets. The library supports 50+ out of the box tests. These tests fall into robustness, accuracy, bias, representation, toxicity, translation, performance, security, clinical and fairness test categories.\n", + "\n", + "Metrics are calculated by comparing the model's extractions in the original list of sentences against the extractions carried out in the noisy list of sentences. 
The original annotated labels are not used at any point; we are simply comparing the model against itself in two settings." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Getting started with LangTest" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!pip install \"langtest[johnsnowlabs,transformers,spacy]\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# LangTestCallback and Its Parameters\n", + "\n", + "The LangTestCallback class is a testing class for Natural Language Processing (NLP) models. It evaluates the performance of an NLP model on a given task using test data and generates a report with the test results. It can be imported from the LangTest library as follows." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "# Import LangTestCallback from the LangTest library\n", + "from langtest.callback import LangTestCallback" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This imports the callback class, which provides a framework for conducting NLP testing. Instances of the callback can be configured for different testing scenarios or environments and then passed to the trainer.\n", + "\n", + "Here is a list of the different parameters that can be passed to the LangTestCallback constructor:\n", + "\n", + "
\n", + "\n", + "| Parameter | Description |\n", + "| --------------------- | ----------- |\n", + "| **task** | Task for which the model is to be evaluated (text-classification or ner) |\n", + "| **data** | The data to be used for evaluation, given as a dictionary. The `data_source` key is mandatory; `subset`, `feature_column`, `target_column`, `split` and `source` are optional. |\n", + "| **config** | Configuration for the tests to be performed, specified as a dictionary or as a YAML file. |\n", + "| **print_reports** | A bool value that specifies whether the reports should be printed. |\n", + "| **save_reports** | A bool value that specifies whether the reports should be saved. |\n", + "| **run_each_epoch** | A bool value that specifies whether the tests should be run after each epoch or only at the end of training |\n", + "\n", + "
\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Preparing for training" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "H_93RQeoDVsW" + }, + "outputs": [], + "source": [ + "!pip install datasets\n", + "!pip install transformers[torch]\n", + "!pip install tensorflow -U" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "vFzwOHkqC7tQ", + "outputId": "d7dccbc0-1691-43a5-879a-0fc04e6b5a60" + }, + "outputs": [], + "source": [ + "import torch\n", + "from torch.utils.data import Dataset, DataLoader\n", + "from transformers import BertForTokenClassification, BertTokenizerFast, Trainer, TrainingArguments\n", + "from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score\n", + "import numpy as np\n", + "\n", + "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n", + "\n", + "# Load dataset\n", + "file_path = \"conll03.conll\"\n", + "\n", + "def read_conll_file(file_path):\n", + " with open(file_path, \"r\") as file:\n", + " lines = file.readlines()\n", + " return lines[::2]\n", + "\n", + "lines = read_conll_file(file_path)\n", + "\n", + "# Preprocess dataset\n", + "def preprocess_conll(lines):\n", + " tokens = []\n", + " labels = []\n", + " token_list = []\n", + " label_list = []\n", + "\n", + " for line in lines:\n", + " if line.startswith(\"-DOCSTART-\") or line == \"\\n\":\n", + " if token_list:\n", + " tokens.append(token_list)\n", + " labels.append(label_list)\n", + " token_list = []\n", + " label_list = []\n", + " else:\n", + " token, _, _, label = line.strip().split()\n", + " token_list.append(token)\n", + " label_list.append(label)\n", + "\n", + " return tokens, labels\n", + "\n", + "tokens, labels = preprocess_conll(lines)\n", + "\n", + "class NERDataset(Dataset):\n", + " def __init__(self, tokens, labels, tokenizer, max_length=128):\n", + " self.tokens = 
tokens\n", + " self.labels = labels\n", + " self.tokenizer = tokenizer\n", + " self.max_length = max_length\n", + "\n", + " self.label_map = {label: i for i, label in enumerate(sorted(set([lbl for doc_labels in labels for lbl in doc_labels])))}\n", + " self.id2label = {v: k for k, v in self.label_map.items()}\n", + "\n", + " def __len__(self):\n", + " return len(self.tokens)\n", + "\n", + " def __getitem__(self, idx):\n", + " token_list = self.tokens[idx]\n", + " label_list = self.labels[idx]\n", + "\n", + " encoded = self.tokenizer(token_list, is_split_into_words=True, padding=\"max_length\", truncation=True, max_length=self.max_length, return_tensors=\"pt\")\n", + " token_ids = encoded.input_ids.squeeze(0)\n", + " attention_mask = encoded.attention_mask.squeeze(0)\n", + "\n", + " # Align labels with word pieces: special and padding tokens get -100 so the loss ignores them\n", + " word_ids = encoded.word_ids(batch_index=0)\n", + " label_ids = [-100 if word_id is None else self.label_map[label_list[word_id]] for word_id in word_ids]\n", + "\n", + " return {\n", + " \"input_ids\": token_ids,\n", + " \"attention_mask\": attention_mask,\n", + " \"labels\": torch.tensor(label_ids, dtype=torch.long),\n", + " }\n", + "\n", + "# Initialize tokenizer and dataset\n", + "tokenizer = BertTokenizerFast.from_pretrained(\"dslim/bert-base-NER\")\n", + "train_dataset = NERDataset(tokens, labels, tokenizer)\n", + "\n", + "# Initialize model\n", + "model = BertForTokenClassification.from_pretrained(\n", + " \"dslim/bert-base-NER\",\n", + " num_labels=len(train_dataset.label_map),\n", + " id2label=train_dataset.id2label,\n", + " label2id=train_dataset.label_map,\n", + " ignore_mismatched_sizes=True,\n", + ")\n", + "\n", + "# Initialize the classifier layer with the correct number of labels\n", + "model.classifier = torch.nn.Linear(model.config.hidden_size, len(train_dataset.label_map))\n", + "\n", + "# Move the model to the appropriate device\n", + "model.to(device)" + ] + }, + { + "cell_type": "markdown", + 
"metadata": {}, + "source": [ + "## Creating a LangTestCallback instance" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After loading the model and tokenizer from Hugging Face, we can move on to training. We will use `transformers.Trainer` to integrate our callback into the training process, and `transformers.TrainingArguments` to specify the training arguments.\n", + "\n", + "We can store the config in a dictionary and pass it to the LangTestCallback constructor, which keeps the setup easy to read. The config used in this notebook is shown below:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "id": "ZqT9vZQiC7tS" + }, + "outputs": [], + "source": [ + "config = {\n", + " \"tests\": {\n", + " \"defaults\": {\n", + " \"min_pass_rate\": 1.0\n", + " },\n", + " \"robustness\": {\n", + " \"add_typo\": {\"min_pass_rate\": 0.7},\n", + " \"uppercase\": {\"min_pass_rate\": 0.7},\n", + " \"american_to_british\": {\"min_pass_rate\": 0.7},\n", + " },\n", + " \"accuracy\": {\n", + " \"min_micro_f1_score\": {\n", + " \"min_score\": 0.7\n", + " }\n", + " },\n", + " \"bias\": {\n", + " \"replace_to_female_pronouns\": {\n", + " \"min_pass_rate\": 0.7\n", + " },\n", + " \"replace_to_low_income_country\": {\n", + " \"min_pass_rate\": 0.7\n", + " }\n", + " }\n", + " }\n", + "}\n", + "my_callback = LangTestCallback(task=\"ner\", data={\"data_source\":\"sample.conll\"}, config=config, save_reports=True, run_each_epoch=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Creating the Trainer" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As mentioned earlier, we create a TrainingArguments object to specify the training arguments. We will also create a Trainer object to train our model. Then we can pass the LangTestCallback object to the Trainer object as a callback. 
LangTestCallback initializes the harness object and generates the test cases using .generate() after the trainer is initialized." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "id": "325jnkfxfCPF" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Test Configuration : \n", + " {\n", + " \"tests\": {\n", + " \"defaults\": {\n", + " \"min_pass_rate\": 1.0\n", + " },\n", + " \"robustness\": {\n", + " \"add_typo\": {\n", + " \"min_pass_rate\": 0.7\n", + " },\n", + " \"uppercase\": {\n", + " \"min_pass_rate\": 0.7\n", + " },\n", + " \"american_to_british\": {\n", + " \"min_pass_rate\": 0.7\n", + " }\n", + " },\n", + " \"accuracy\": {\n", + " \"min_micro_f1_score\": {\n", + " \"min_score\": 0.7\n", + " }\n", + " },\n", + " \"bias\": {\n", + " \"replace_to_female_pronouns\": {\n", + " \"min_pass_rate\": 0.7\n", + " },\n", + " \"replace_to_low_income_country\": {\n", + " \"min_pass_rate\": 0.7\n", + " }\n", + " }\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "# Training arguments\n", + "training_args = TrainingArguments(\n", + " output_dir=\"./results\",\n", + " num_train_epochs=2,\n", + " per_device_train_batch_size=64,\n", + " logging_dir=\"./logs\",\n", + " logging_steps=100,\n", + " save_steps=1000,\n", + " learning_rate=3e-5,\n", + " weight_decay=0.01,\n", + ")\n", + "\n", + "# Initialize trainer\n", + "trainer = Trainer(\n", + " model=model,\n", + " args=training_args,\n", + " train_dataset=train_dataset,\n", + " tokenizer=tokenizer,\n", + " callbacks=[my_callback],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Training" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The actual training step is very simple: we just call the train() method of the Trainer object. train() also accepts optional arguments such as resume_from_checkpoint, but the defaults are fine in this case." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We have the reports printed and also saved under the reports folder. The reports are saved in the form of a MD file. The reports folder is created in the same directory as the notebook." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 1000 + }, + "id": "PzAaW4CPC7tV", + "outputId": "f11d85ee-36f2-4341-a761-1d305391790c" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generating testcases...: 100%|██████████| 3/3 [00:00\n", + "\n", + "\n", + "
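|   | category | test_type | fail_count | pass_count | pass_rate | minimum_pass_rate | pass |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | robustness | add_typo | 65 | 147 | 69% | 70% | False |
| 1 | robustness | uppercase | 129 | 69 | 35% | 70% | False |
| 2 | accuracy | min_micro_f1_score | 0 | 1 | 100% | 100% | True |
| 3 | bias | replace_to_female_pronouns | 11 | 17 | 61% | 70% | False |
| 4 | bias | replace_to_low_income_country | 44 | 44 | 50% | 70% | False |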
\n", + "" + ], + "text/plain": [ + " category test_type fail_count pass_count \\\n", + "0 robustness add_typo 65 147 \n", + "1 robustness uppercase 129 69 \n", + "2 accuracy min_micro_f1_score 0 1 \n", + "3 bias replace_to_female_pronouns 11 17 \n", + "4 bias replace_to_low_income_country 44 44 \n", + "\n", + " pass_rate minimum_pass_rate pass \n", + "0 69% 70% False \n", + "1 35% 70% False \n", + "2 100% 100% True \n", + "3 61% 70% False \n", + "4 50% 70% False " + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Running testcases... : 100%|██████████| 527/527 [00:13<00:00, 39.39it/s]\n" + ] + }, + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + "
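|   | category | test_type | fail_count | pass_count | pass_rate | minimum_pass_rate | pass |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | robustness | add_typo | 65 | 147 | 69% | 70% | False |
| 1 | robustness | uppercase | 129 | 69 | 35% | 70% | False |
| 2 | accuracy | min_micro_f1_score | 0 | 1 | 100% | 100% | True |
| 3 | bias | replace_to_female_pronouns | 11 | 17 | 61% | 70% | False |
| 4 | bias | replace_to_low_income_country | 44 | 44 | 50% | 70% | False |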
\n", + "
" + ], + "text/plain": [ + " category test_type fail_count pass_count \\\n", + "0 robustness add_typo 65 147 \n", + "1 robustness uppercase 129 69 \n", + "2 accuracy min_micro_f1_score 0 1 \n", + "3 bias replace_to_female_pronouns 11 17 \n", + "4 bias replace_to_low_income_country 44 44 \n", + "\n", + " pass_rate minimum_pass_rate pass \n", + "0 69% 70% False \n", + "1 35% 70% False \n", + "2 100% 100% True \n", + "3 61% 70% False \n", + "4 50% 70% False " + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'train_runtime': 2029.1679, 'train_samples_per_second': 1.67, 'train_steps_per_second': 0.027, 'train_loss': 0.7498808260317202, 'epoch': 2.0}\n" + ] + }, + { + "data": { + "text/plain": [ + "TrainOutput(global_step=54, training_loss=0.7498808260317202, metrics={'train_runtime': 2029.1679, 'train_samples_per_second': 1.67, 'train_steps_per_second': 0.027, 'train_loss': 0.7498808260317202, 'epoch': 2.0})" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Training the model\n", + "trainer.train()" + ] + } + ], + "metadata": { + "accelerator": "TPU", + "colab": { + "machine_shape": "hm", + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.11" + }, + "orig_nbformat": 4 + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/demo/tutorials/misc/HF_Callback_Text_Classification.ipynb b/demo/tutorials/misc/HF_Callback_Text_Classification.ipynb index f09e05301..6697acf5f 100644 --- a/demo/tutorials/misc/HF_Callback_Text_Classification.ipynb +++ b/demo/tutorials/misc/HF_Callback_Text_Classification.ipynb @@ -142,7 +142,23 @@ }, "outputs": 
[], "source": [ - "train_dataset = dataset[\"train\"].select(range(25)).map(tokenize, batched=True)\n" + "train_dataset = dataset[\"train\"].select(range(25)).map(tokenize, batched=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Creating a LangTestCallback Instance" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After loading the model and tokenizer from Hugging Face, we can move on to training. We will use `transformers.Trainer` to integrate our callback into the training process, and `transformers.TrainingArguments` to specify the training arguments.\n", + "\n", + "We can store the config in a dictionary and pass it to the LangTestCallback constructor, which keeps the setup easy to read. The config used in this notebook is shown below:" ] }, { @@ -191,9 +207,23 @@ "my_callback = LangTestCallback(task=\"text-classification\", data={\"data_source\":\"sample.csv\"}, config=config, save_reports=True, run_each_epoch=True)" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Creating the Trainer" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As mentioned earlier, we create a TrainingArguments object to specify the training arguments. We will also create a Trainer object to train our model. Then we can pass the LangTestCallback object to the Trainer object as a callback. LangTestCallback initializes the harness object and generates the test cases using .generate() after the trainer is initialized." 
+ ] + }, { "cell_type": "code", - "execution_count": 29, + "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" @@ -201,46 +231,7 @@ "id": "1ID52Si2C7tT", "outputId": "ebb08896-b6a5-4d80-ef1f-9c0bdaec7730" }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Test Configuration : \n", - " {\n", - " \"tests\": {\n", - " \"defaults\": {\n", - " \"min_pass_rate\": 1.0\n", - " },\n", - " \"robustness\": {\n", - " \"add_typo\": {\n", - " \"min_pass_rate\": 0.7\n", - " },\n", - " \"uppercase\": {\n", - " \"min_pass_rate\": 0.7\n", - " },\n", - " \"american_to_british\": {\n", - " \"min_pass_rate\": 0.7\n", - " }\n", - " },\n", - " \"accuracy\": {\n", - " \"min_micro_f1_score\": {\n", - " \"min_score\": 0.7\n", - " }\n", - " },\n", - " \"bias\": {\n", - " \"replace_to_female_pronouns\": {\n", - " \"min_pass_rate\": 0.7\n", - " },\n", - " \"replace_to_low_income_country\": {\n", - " \"min_pass_rate\": 0.7\n", - " }\n", - " }\n", - " }\n", - "}\n" - ] - } - ], + "outputs": [], "source": [ "# Training arguments\n", "training_args = TrainingArguments(\n", @@ -257,6 +248,27 @@ ")" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Training" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The actual training step is very simple: we just call the train() method of the Trainer object. train() also accepts optional arguments such as resume_from_checkpoint, but the defaults are fine in this case." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Below, we have the reports printed and also saved under the reports folder. The reports are saved in the form of a MD file. The reports folder is created in the same directory as the notebook." 
+ ] + }, { "cell_type": "code", "execution_count": 30, From bca83b6cfdb07d3311a1eb6afa8bb30b22d56065 Mon Sep 17 00:00:00 2001 From: Ali Tarik Date: Thu, 30 Nov 2023 14:41:49 +0300 Subject: [PATCH 2/5] add website page for langtestcallback --- docs/pages/docs/hf_callback.md | 40 ++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 docs/pages/docs/hf_callback.md diff --git a/docs/pages/docs/hf_callback.md b/docs/pages/docs/hf_callback.md new file mode 100644 index 000000000..329cd843a --- /dev/null +++ b/docs/pages/docs/hf_callback.md @@ -0,0 +1,40 @@ +--- +layout: docs +seotitle: LangTestCallback | LangTest | John Snow Labs +title: LangTestCallback +permalink: /docs/pages/docs/hf-callback +key: docs-callback +modify_date: "2023-03-28" +header: true +--- + +
+ +LangTest also provides a callback class that can be used during training to evaluate the model after each epoch or at the end of training. This callback class is called `LangTestCallback` and is imported from `langtest.callback`. + +```python +from langtest.callback import LangTestCallback +my_callback = LangTestCallback(task=task, data=data, config=config) + +trainer = Trainer( + model=model, + args=training_args, + train_dataset=train_dataset, + eval_dataset=eval_dataset, + callbacks=[my_callback] +) +``` + +LangTestCallback takes the following parameters: + +| Parameter | Description | +| ------------------ | ----------- | +| **task** | Task for which the model is to be evaluated (text-classification or ner) | +| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys: <br>- **data_source** (mandatory): The source of the data. <br>- **subset** (optional): The subset of the data. <br>- **feature_column** (optional): The column containing the features. <br>- **target_column** (optional): The column containing the target labels. <br>- **split** (optional): The data split to be used. <br>- **source** (optional): Set to 'huggingface' when loading a Hugging Face dataset. | +| **config** | Configuration for the tests to be performed, specified as a dictionary or as a YAML file. | +| **print_reports** | A bool value that specifies whether the reports should be printed. | +| **save_reports** | A bool value that specifies whether the reports should be saved. If `True`, all generated reports will be saved under `reports/reportXXX.md` | +| **run_each_epoch** | A bool value that specifies whether the tests should be run after each epoch or only at the end of training | + + +
\ No newline at end of file From 0cbe241092e265295007ce016e43b14c01fe77b5 Mon Sep 17 00:00:00 2001 From: Ali Tarik Date: Thu, 30 Nov 2023 14:48:30 +0300 Subject: [PATCH 3/5] add langtestcallback to navigation --- docs/_data/navigation.yml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/_data/navigation.yml b/docs/_data/navigation.yml index 284c053f0..21055af36 100644 --- a/docs/_data/navigation.yml +++ b/docs/_data/navigation.yml @@ -56,6 +56,8 @@ docs-menu: url: /docs/pages/docs/report - title: MlFlow Tracking url: /docs/pages/docs/ml_flow + - title: LangTestCallback + url: /docs/pages/docs/hf-callback - title: Saving & Loading url: /docs/pages/docs/save From a73627451b3b3e1b48253cb0d1493c5b134f8f5d Mon Sep 17 00:00:00 2001 From: Ali Tarik Date: Thu, 30 Nov 2023 15:06:50 +0300 Subject: [PATCH 4/5] update notebooks --- demo/tutorials/misc/HF_Callback_NER.ipynb | 2 +- .../HF_Callback_Text_Classification.ipynb | 6002 ++++++++++++++++- 2 files changed, 5868 insertions(+), 136 deletions(-) diff --git a/demo/tutorials/misc/HF_Callback_NER.ipynb b/demo/tutorials/misc/HF_Callback_NER.ipynb index 08c357cbe..6f45325e7 100644 --- a/demo/tutorials/misc/HF_Callback_NER.ipynb +++ b/demo/tutorials/misc/HF_Callback_NER.ipynb @@ -36,7 +36,7 @@ "metadata": {}, "outputs": [], "source": [ - "!pip install \"langtest[johnsnowlabs,transformers,spacy]\"" + "!pip install \"langtest[transformers]\"" ] }, { diff --git a/demo/tutorials/misc/HF_Callback_Text_Classification.ipynb b/demo/tutorials/misc/HF_Callback_Text_Classification.ipynb index 6697acf5f..4b0155374 100644 --- a/demo/tutorials/misc/HF_Callback_Text_Classification.ipynb +++ b/demo/tutorials/misc/HF_Callback_Text_Classification.ipynb @@ -2,21 +2,27 @@ "cells": [ { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "jivErFCkFgxe" + }, "source": [ "![image.png]()" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "VGFU1I5BFgxg" + }, "source": [ "[![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Comparing_Models_Notebook.ipynb)" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "V48ExCf-Fgxg" + }, "source": [ "**LangTest** is an open-source python library designed to help developers deliver safe and effective Natural Language Processing (NLP) models. Whether you are using **John Snow Labs, Hugging Face, Spacy** models or **OpenAI, Cohere, AI21, Hugging Face Inference API and Azure-OpenAI** based LLMs, it has got you covered. You can test any Named Entity Recognition (NER), Text Classification model using the library. We also support testing LLMS for Question-Answering, Summarization, Clinical-Tests and Security tasks on benchmark datasets. The library supports 50+ out of the box tests. These tests fall into robustness, accuracy, bias, representation, toxicity, translation, performance, security, clinical and fairness test categories.\n", "\n", @@ -25,7 +31,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "y-cUyOo7Fgxg" + }, "source": [ "# Getting started with LangTest" ] @@ -33,15 +41,19 @@ { "cell_type": "code", "execution_count": null, - "metadata": {}, + "metadata": { + "id": "5PSN8rcjFgxg" + }, "outputs": [], "source": [ - "!pip install \"langtest[johnsnowlabs,transformers,spacy]\"" + "!pip install \"langtest[transformers]\"==1.9.0rc1" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "L0Iqr61XFgxh" + }, "source": [ "# LangTestCallback and Its Parameters\n", "\n", @@ -50,8 +62,10 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": {}, + "execution_count": 1, + "metadata": { + "id": "Aw_gRAOsFgxh" + }, "outputs": [], "source": [ "#Import Harness from the LangTest library\n", @@ -60,7 +74,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "POQ5a8EhFgxh" + }, "source": [ "It 
imports the callback class from within the module, that is designed to provide a blueprint or framework for conducting NLP testing, and instances of the callback class can be customized or configured for different testing scenarios or environments then provided to the trainer.\n", "\n", @@ -96,22 +112,317 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "metadata": { "colab": { - "base_uri": "https://localhost:8080/" + "base_uri": "https://localhost:8080/", + "height": 401, + "referenced_widgets": [ + "0f339b7dda014a65acfe241384167c53", + "6db1f4ec80fb4fa7a5f77c2909b3c755", + "49b76e2c8d4c484d9e8e78f55dd34f2b", + "65d02c0924da4eca835fa9cb89c7d7dd", + "c7d2e34de0714971984e8a4e72c85f16", + "028376ac680b43c1a294407e9ab6a5af", + "fadcdd015f6541fbaadf975dd13cd95f", + "5c8ad99ebaa347f79905a557c9212381", + "4b4dad9da390409d81fed270278bf4c7", + "e7c48e205424478a8f015c77902fb134", + "d97d6cbf86544494885c3d347febc852", + "2e978e6bd19047f78c897106c31966ed", + "04886efe65104ef1ade8715a59ef1f41", + "8144f7eb5c6e4882b0cd2eb2f968c882", + "6c316019bb8a4097b9a2a88dda964bc7", + "91f16706c6134d6a8ef4fa07aa38ad83", + "9f1b265bba014211a8eae834cb4727bd", + "27ca1877f22148808a9ac400a42a7a26", + "d868c8883fee4bfc9b589da2fc7275fd", + "b36a094e04ab41e59e61a0614b1c7b26", + "b3fe91b4f396478eb71c1a3e6fbd0d6f", + "22deb7749a3940048448989ef5b91ab4", + "a3aac8ab3e1440388759e76e0b8d400b", + "fb68bc3a68c74db09891f09a797fad4d", + "fd3c248bc0d64c1b9852f47d4ee7eb17", + "33c598afbc834c938618d85007f8b36c", + "2cc56448beb44502ba5cbc7f7b5ae057", + "484d6ef0cb9548c495b7f89adddb8799", + "f49c8b3f47ff4488bb611ab129609a06", + "49bb38a9f972468a96fdd539b0b5a6d6", + "4ce641041f28400c9b3ba584cef00e1a", + "aa368056a83d44819674ac55b679fad1", + "be594efe148c496e92175408918e9eff", + "c0fdf2412b154e5f8400e7ad50321f61", + "f6564aa137504aa6ac8752464146498e", + "a4e85dc8d4ab42d190d09afbab4d9b67", + "b262b3042a6a4c21a990fe49a8d2c4ef", + "5a426c228731404bbcf202a6c103c14b", + 
"af95713290f34a4d83088951fcaa8d38", + "658c4a4bdb35461d8f06a5cdfdccb877", + "02c486f6d14549068dc96d68618a0ec6", + "b8d5164256ac40bb8633f8317aab279a", + "31f2a68fc9194a5b80ede2bef35e2bd7", + "973ecb3108c84ca89b28f2597a8ff474", + "59753f8944b9414d927c7e2eea445b51", + "8ffc2b1fb14449d1bb03a3b5bed928bc", + "e75a36f8a97847c4a52c6f139ab660d0", + "a1cc0e10853c47a1901c823ce8d008de", + "ec96e2943e3c4907874f391aafe78628", + "701c8d36283248f4a0df3dd72941e603", + "d84c5dc84e7e440ca66250a705fe4a16", + "5b0a2ab061cc402ebf12db7d4dceb7dd", + "f7d45e8dd0eb4146af7d2ccbea4c4edd", + "4456b5fe19474e7cb4cfe70679d6005f", + "9613e943891c428bbf3f8847d9c61a31", + "bca602866a86460ca5cc99c5d1a5a415", + "90737d28781c47909746ac3fd524769a", + "b66e3b7156ee4bd295243141c20019bc", + "6cecb165f14441ecae43cc59a0a0b56f", + "ef91b2f8818044ca8ed144dbaa963fcd", + "440c6c8b303343c8a8cf53d4d9768a6e", + "93577d92cf684871b5026252e8123c84", + "0c91977ea08a4efb91280ff178e4850b", + "8406d74b98094a32b2f1028f6e317ff4", + "3f30534555e34458983889694fda3671", + "e0924770d76a49b499f3e08b9f24202a", + "aaf2bee8382c445b9f02631e1a46adda", + "f61ca1a64d294128a06f88afee1ca0e1", + "35e06f0a72bb456c81b846a9eae1b91f", + "ace61e4ac7cd4f64ba93aa173ce87287", + "8d0f1ffbf93b45d5950426321b82b785", + "9ecc27897dba4231bd6b82031a1cc3db", + "2e69383236104fb38191d990e3a54c81", + "8cdb6d85160747a1a451322f2e60cbc9", + "b6cfacc547d4499a9e887b6314a0e00f", + "e095bb6ae9514753a6f6e90d72028857", + "41e09c4061784455a67eb0b7cf2d31fc", + "6d44d06737f64b85a0aa9efc813bcf61", + "3763a876e5184a4da9b7816839d0b523", + "45b6355d8db24edbb5614c2f1d944619", + "cd4d230cf56843afa1eab8939ba8c5df", + "d9ea2f0150654e60bd4cbf8b6d8c5fcf", + "4921d93aa19a4b9f9664463b743aa5e4", + "b3442e661da6452ba7cff46bf9c01ef4", + "3ebc999924bd46dea7f8e21677edbf07", + "4611ce6627b640d8b6cfc95242839b3c", + "3bd4ebc1706640eea35d8c1c56541288", + "2b5e9cecebb54328a3020a80ff0d9a9a", + "08a6428e9c9f48689a847dd8063487a2", + "30635c8d36244136b2c9485fce6a3675", + 
"5ed3e59771104d849dff50eeb6c8b519", + "4c9d20de836b49aa87724addef5bd550", + "b2ab93acd0104da0a8b4b2ab18a51e22", + "9b2695098c88496aa33967a2b76b5578", + "1f100078ce8444478d0c60eeefbff273", + "c2c53496e66048adb1dbb426da8c0bed", + "043d81856e5746708162fa6fbbf4d3c4", + "d0e1c56bccd64f73831bce780c53ec90", + "265e2d76584849479b9e936ed9ce8654", + "79fd37e8fe454e9a83dba65f31c8cb77", + "593c8bc430454eb284169a0686fe6bfb", + "c436532599d04f3b9f53d366a6f59b20", + "1e2fd13373574364b05e8c008e44abff", + "913a768ab4d54c4eb0d963bdeb67fe06", + "c960d9b2bbe94e35afefc7efd9f416ba", + "47caa679e4c34ccc811103714a35ddd3", + "5716068861814d4584a43121242952b1", + "1e2cd1e45e4b4032943aec565dbe6cfe", + "00590bb60bdf448a89e3d0d0f65314ec", + "1cff3f1f268a42ea846e9e3f5f53d8b3", + "f3298d68cecf45fe89b3939d89853594", + "f19e578ca2284ed6a7de268b5175429f", + "8aee44326aae4c3b842277f55b739250", + "1de22c20d44843aaa82ab635f1dd3f9c", + "4bd988dbbb9a4c4195922684d1fd2159", + "5ed1e548a4df4a76a8e2e0c2e49b8d2c", + "9963664b1a2b4137846494d225695932", + "51f764124bbd426b856b7a4b5de965e0", + "2332277384d040cd8a4a62213dd20cda", + "907194999ce6489c9ea9ebc969c7c77e", + "a9dfcd4f75384d76baac18ab4d687ac5", + "155d29c05c1748be98f104c210d23b0f", + "ae80685a02474640ac6325cb61206118", + "e2dda125709548aabdb66af2524b49ea", + "13936ef1cf7241dcacdcd74f788fca0b", + "1151b117e3f543db965ba430560e828f", + "f55a39438d33482faa837412c3eedb28", + "e9d0dab76a69477d8cef7613fd8e25b4", + "29a17d1315ff46caa87c791d87494f07", + "421e40ff53d244c5979d762593f8bccb", + "3882b1ea5f0f494f924e07f44e9e4386", + "143dea23ae7843c7b17a4242a4cee4c5" + ] }, "id": "vFzwOHkqC7tQ", - "outputId": "d7dccbc0-1691-43a5-879a-0fc04e6b5a60" + "outputId": "76bbd4c6-4309-40ed-d7b6-c61a30f55b46" }, "outputs": [ { - "name": "stderr", - "output_type": "stream", - "text": [ - "Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']\n", - 
"You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n" - ] + "output_type": "display_data", + "data": { + "text/plain": [ + "tokenizer_config.json: 0%| | 0.00/28.0 [00:00\n", "
\n", - " \n", - " \n", - " [12/12 01:00, Epoch 3/3]\n", - "
\n", - " \n", + "\n", + "
\n", " \n", - " \n", - " \n", - " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", " \n", " \n", " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", " \n", - "
Step  Training Loss
     category                      test_type  fail_count  pass_count  pass_rate  minimum_pass_rate   pass
0  robustness                       add_typo           0         189       100%                70%   True
1  robustness                      uppercase           0         200       100%                70%   True
2  robustness            american_to_british           0          62       100%                70%   True
3    accuracy             min_micro_f1_score           1           0         0%               100%  False
4        bias     replace_to_female_pronouns           0         171       100%                70%   True
5        bias  replace_to_low_income_country           0          28       100%                70%   True

" - ], - "text/plain": [ - "" + "\n", + "\n", + "

\n", + "\n", + "
\n", + " \n", + "\n", + " \n", + "\n", + " \n", + "
\n", + "\n", + "\n", + "
\n", + " \n", + "\n", + "\n", + "\n", + " \n", + "
\n", + "\n", + "
\n", + " \n" ] }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "Running testcases... : 100%|██████████| 64/64 [00:17<00:00, 3.63it/s]\n" - ] + "metadata": {} }, { - "name": "stdout", "output_type": "stream", - "text": [ - " category test_type fail_count pass_count \\\n", - "0 robustness add_typo 0 16 \n", - "1 robustness uppercase 0 18 \n", - "2 robustness american_to_british 0 9 \n", - "3 accuracy min_micro_f1_score 1 0 \n", - "4 bias replace_to_female_pronouns 0 16 \n", - "5 bias replace_to_low_income_country 0 4 \n", - "\n", - " pass_rate minimum_pass_rate pass \n", - "0 100% 70% True \n", - "1 100% 70% True \n", - "2 100% 70% True \n", - "3 0% 100% False \n", - "4 100% 70% True \n", - "5 100% 70% True \n" - ] - }, - { "name": "stderr", - "output_type": "stream", "text": [ - "\rRunning testcases... : 0%| | 0/64 [00:00\n", + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
     category                      test_type  fail_count  pass_count  pass_rate  minimum_pass_rate   pass
0  robustness                       add_typo           0         189       100%                70%   True
1  robustness                      uppercase           0         200       100%                70%   True
2  robustness            american_to_british           0          62       100%                70%   True
3    accuracy             min_micro_f1_score           1           0         0%               100%  False
4        bias     replace_to_female_pronouns           0         171       100%                70%   True
5        bias  replace_to_low_income_country           0          28       100%                70%   True
\n", + "
\n", + "
\n", + "\n", + "
\n", + " \n", + "\n", + " \n", + "\n", + " \n", + "
\n", + "\n", + "\n", + "
\n", + " \n", + "\n", + "\n", + "\n", + " \n", + "
\n", + "\n", + "
\n", + " \n" + ] + }, + "metadata": {} }, { - "name": "stderr", "output_type": "stream", + "name": "stderr", "text": [ - "\rRunning testcases... : 0%| | 0/64 [00:00\n", + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
     category                      test_type  fail_count  pass_count  pass_rate  minimum_pass_rate   pass
0  robustness                       add_typo           0         189       100%                70%   True
1  robustness                      uppercase           0         200       100%                70%   True
2  robustness            american_to_british           0          62       100%                70%   True
3    accuracy             min_micro_f1_score           1           0         0%               100%  False
4        bias     replace_to_female_pronouns           0         171       100%                70%   True
5        bias  replace_to_low_income_country           0          28       100%                70%   True
\n", + "
\n", + "
\n", + "\n", + "
\n", + " \n", + "\n", + " \n", + "\n", + " \n", + "
\n", + "\n", + "\n", + "
\n", + " \n", + "\n", + "\n", + "\n", + " \n", + "
\n", + "\n", + "
\n", + " \n" + ] + }, + "metadata": {} }, { - "name": "stderr", "output_type": "stream", + "name": "stdout", "text": [ - "\n" + "{'train_runtime': 281.2316, 'train_samples_per_second': 0.267, 'train_steps_per_second': 0.043, 'train_loss': 0.14822853604952493, 'epoch': 3.0}\n" ] }, { + "output_type": "execute_result", "data": { "text/plain": [ - "TrainOutput(global_step=12, training_loss=0.0014002219152947266, metrics={'train_runtime': 67.7653, 'train_samples_per_second': 1.107, 'train_steps_per_second': 0.177, 'total_flos': 19733329152000.0, 'train_loss': 0.0014002219152947266, 'epoch': 3.0})" + "TrainOutput(global_step=12, training_loss=0.14822853604952493, metrics={'train_runtime': 281.2316, 'train_samples_per_second': 0.267, 'train_steps_per_second': 0.043, 'train_loss': 0.14822853604952493, 'epoch': 3.0})" ] }, - "execution_count": 30, "metadata": {}, - "output_type": "execute_result" + "execution_count": 7 } ], "source": [ @@ -459,8 +1741,4458 @@ "pygments_lexer": "ipython3", "version": "3.10.11" }, - "orig_nbformat": 4 + "orig_nbformat": 4, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "0f339b7dda014a65acfe241384167c53": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_6db1f4ec80fb4fa7a5f77c2909b3c755", + "IPY_MODEL_49b76e2c8d4c484d9e8e78f55dd34f2b", + "IPY_MODEL_65d02c0924da4eca835fa9cb89c7d7dd" + ], + "layout": "IPY_MODEL_c7d2e34de0714971984e8a4e72c85f16" + } + }, + "6db1f4ec80fb4fa7a5f77c2909b3c755": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + 
"_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_028376ac680b43c1a294407e9ab6a5af", + "placeholder": "​", + "style": "IPY_MODEL_fadcdd015f6541fbaadf975dd13cd95f", + "value": "tokenizer_config.json: 100%" + } + }, + "49b76e2c8d4c484d9e8e78f55dd34f2b": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5c8ad99ebaa347f79905a557c9212381", + "max": 28, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_4b4dad9da390409d81fed270278bf4c7", + "value": 28 + } + }, + "65d02c0924da4eca835fa9cb89c7d7dd": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_e7c48e205424478a8f015c77902fb134", + "placeholder": "​", + "style": "IPY_MODEL_d97d6cbf86544494885c3d347febc852", + "value": " 28.0/28.0 [00:00<00:00, 2.16kB/s]" + } + }, + "c7d2e34de0714971984e8a4e72c85f16": { + "model_module": "@jupyter-widgets/base", + "model_name": 
"LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "028376ac680b43c1a294407e9ab6a5af": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + 
"justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "fadcdd015f6541fbaadf975dd13cd95f": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "5c8ad99ebaa347f79905a557c9212381": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + 
"overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "4b4dad9da390409d81fed270278bf4c7": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "e7c48e205424478a8f015c77902fb134": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "d97d6cbf86544494885c3d347febc852": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": 
"1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "2e978e6bd19047f78c897106c31966ed": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_04886efe65104ef1ade8715a59ef1f41", + "IPY_MODEL_8144f7eb5c6e4882b0cd2eb2f968c882", + "IPY_MODEL_6c316019bb8a4097b9a2a88dda964bc7" + ], + "layout": "IPY_MODEL_91f16706c6134d6a8ef4fa07aa38ad83" + } + }, + "04886efe65104ef1ade8715a59ef1f41": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_9f1b265bba014211a8eae834cb4727bd", + "placeholder": "​", + "style": "IPY_MODEL_27ca1877f22148808a9ac400a42a7a26", + "value": "config.json: 100%" + } + }, + "8144f7eb5c6e4882b0cd2eb2f968c882": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + 
"_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_d868c8883fee4bfc9b589da2fc7275fd", + "max": 570, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_b36a094e04ab41e59e61a0614b1c7b26", + "value": 570 + } + }, + "6c316019bb8a4097b9a2a88dda964bc7": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_b3fe91b4f396478eb71c1a3e6fbd0d6f", + "placeholder": "​", + "style": "IPY_MODEL_22deb7749a3940048448989ef5b91ab4", + "value": " 570/570 [00:00<00:00, 39.3kB/s]" + } + }, + "91f16706c6134d6a8ef4fa07aa38ad83": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + 
"max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "9f1b265bba014211a8eae834cb4727bd": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "27ca1877f22148808a9ac400a42a7a26": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + 
"description_width": "" + } + }, + "d868c8883fee4bfc9b589da2fc7275fd": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b36a094e04ab41e59e61a0614b1c7b26": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "b3fe91b4f396478eb71c1a3e6fbd0d6f": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": 
"1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "22deb7749a3940048448989ef5b91ab4": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "a3aac8ab3e1440388759e76e0b8d400b": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_fb68bc3a68c74db09891f09a797fad4d", + 
"IPY_MODEL_fd3c248bc0d64c1b9852f47d4ee7eb17", + "IPY_MODEL_33c598afbc834c938618d85007f8b36c" + ], + "layout": "IPY_MODEL_2cc56448beb44502ba5cbc7f7b5ae057" + } + }, + "fb68bc3a68c74db09891f09a797fad4d": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_484d6ef0cb9548c495b7f89adddb8799", + "placeholder": "​", + "style": "IPY_MODEL_f49c8b3f47ff4488bb611ab129609a06", + "value": "vocab.txt: 100%" + } + }, + "fd3c248bc0d64c1b9852f47d4ee7eb17": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_49bb38a9f972468a96fdd539b0b5a6d6", + "max": 231508, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_4ce641041f28400c9b3ba584cef00e1a", + "value": 231508 + } + }, + "33c598afbc834c938618d85007f8b36c": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + 
"description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_aa368056a83d44819674ac55b679fad1", + "placeholder": "​", + "style": "IPY_MODEL_be594efe148c496e92175408918e9eff", + "value": " 232k/232k [00:00<00:00, 5.09MB/s]" + } + }, + "2cc56448beb44502ba5cbc7f7b5ae057": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "484d6ef0cb9548c495b7f89adddb8799": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, 
+ "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f49c8b3f47ff4488bb611ab129609a06": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "49bb38a9f972468a96fdd539b0b5a6d6": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + 
"grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "4ce641041f28400c9b3ba584cef00e1a": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "aa368056a83d44819674ac55b679fad1": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + 
"order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "be594efe148c496e92175408918e9eff": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "c0fdf2412b154e5f8400e7ad50321f61": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_f6564aa137504aa6ac8752464146498e", + "IPY_MODEL_a4e85dc8d4ab42d190d09afbab4d9b67", + "IPY_MODEL_b262b3042a6a4c21a990fe49a8d2c4ef" + ], + "layout": "IPY_MODEL_5a426c228731404bbcf202a6c103c14b" + } + }, + "f6564aa137504aa6ac8752464146498e": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_af95713290f34a4d83088951fcaa8d38", + "placeholder": "​", + "style": "IPY_MODEL_658c4a4bdb35461d8f06a5cdfdccb877", + "value": "tokenizer.json: 100%" + } + }, + 
"a4e85dc8d4ab42d190d09afbab4d9b67": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_02c486f6d14549068dc96d68618a0ec6", + "max": 466062, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_b8d5164256ac40bb8633f8317aab279a", + "value": 466062 + } + }, + "b262b3042a6a4c21a990fe49a8d2c4ef": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_31f2a68fc9194a5b80ede2bef35e2bd7", + "placeholder": "​", + "style": "IPY_MODEL_973ecb3108c84ca89b28f2597a8ff474", + "value": " 466k/466k [00:00<00:00, 23.9MB/s]" + } + }, + "5a426c228731404bbcf202a6c103c14b": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, 
+ "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "af95713290f34a4d83088951fcaa8d38": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "658c4a4bdb35461d8f06a5cdfdccb877": { + "model_module": "@jupyter-widgets/controls", + 
"model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "02c486f6d14549068dc96d68618a0ec6": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b8d5164256ac40bb8633f8317aab279a": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + 
"_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "31f2a68fc9194a5b80ede2bef35e2bd7": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "973ecb3108c84ca89b28f2597a8ff474": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "59753f8944b9414d927c7e2eea445b51": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + 
"_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_8ffc2b1fb14449d1bb03a3b5bed928bc", + "IPY_MODEL_e75a36f8a97847c4a52c6f139ab660d0", + "IPY_MODEL_a1cc0e10853c47a1901c823ce8d008de" + ], + "layout": "IPY_MODEL_ec96e2943e3c4907874f391aafe78628" + } + }, + "8ffc2b1fb14449d1bb03a3b5bed928bc": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_701c8d36283248f4a0df3dd72941e603", + "placeholder": "​", + "style": "IPY_MODEL_d84c5dc84e7e440ca66250a705fe4a16", + "value": "model.safetensors: 100%" + } + }, + "e75a36f8a97847c4a52c6f139ab660d0": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5b0a2ab061cc402ebf12db7d4dceb7dd", + "max": 440449768, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_f7d45e8dd0eb4146af7d2ccbea4c4edd", + "value": 440449768 + } + }, + "a1cc0e10853c47a1901c823ce8d008de": { + "model_module": 
"@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_4456b5fe19474e7cb4cfe70679d6005f", + "placeholder": "​", + "style": "IPY_MODEL_9613e943891c428bbf3f8847d9c61a31", + "value": " 440M/440M [00:01<00:00, 295MB/s]" + } + }, + "ec96e2943e3c4907874f391aafe78628": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "701c8d36283248f4a0df3dd72941e603": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + 
"model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "d84c5dc84e7e440ca66250a705fe4a16": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "5b0a2ab061cc402ebf12db7d4dceb7dd": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", 
+ "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f7d45e8dd0eb4146af7d2ccbea4c4edd": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "4456b5fe19474e7cb4cfe70679d6005f": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + 
"grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "9613e943891c428bbf3f8847d9c61a31": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "bca602866a86460ca5cc99c5d1a5a415": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_90737d28781c47909746ac3fd524769a", + "IPY_MODEL_b66e3b7156ee4bd295243141c20019bc", + "IPY_MODEL_6cecb165f14441ecae43cc59a0a0b56f" + ], + "layout": "IPY_MODEL_ef91b2f8818044ca8ed144dbaa963fcd" + } + }, + "90737d28781c47909746ac3fd524769a": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + 
"_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_440c6c8b303343c8a8cf53d4d9768a6e", + "placeholder": "​", + "style": "IPY_MODEL_93577d92cf684871b5026252e8123c84", + "value": "Downloading builder script: 100%" + } + }, + "b66e3b7156ee4bd295243141c20019bc": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_0c91977ea08a4efb91280ff178e4850b", + "max": 4314, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_8406d74b98094a32b2f1028f6e317ff4", + "value": 4314 + } + }, + "6cecb165f14441ecae43cc59a0a0b56f": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_3f30534555e34458983889694fda3671", + "placeholder": "​", + "style": "IPY_MODEL_e0924770d76a49b499f3e08b9f24202a", + "value": " 4.31k/4.31k [00:00<00:00, 334kB/s]" + } + }, + "ef91b2f8818044ca8ed144dbaa963fcd": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": 
"@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "440c6c8b303343c8a8cf53d4d9768a6e": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + 
"max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "93577d92cf684871b5026252e8123c84": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "0c91977ea08a4efb91280ff178e4850b": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + 
"width": null + } + }, + "8406d74b98094a32b2f1028f6e317ff4": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "3f30534555e34458983889694fda3671": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "e0924770d76a49b499f3e08b9f24202a": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": 
"1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "aaf2bee8382c445b9f02631e1a46adda": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_f61ca1a64d294128a06f88afee1ca0e1", + "IPY_MODEL_35e06f0a72bb456c81b846a9eae1b91f", + "IPY_MODEL_ace61e4ac7cd4f64ba93aa173ce87287" + ], + "layout": "IPY_MODEL_8d0f1ffbf93b45d5950426321b82b785" + } + }, + "f61ca1a64d294128a06f88afee1ca0e1": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_9ecc27897dba4231bd6b82031a1cc3db", + "placeholder": "​", + "style": "IPY_MODEL_2e69383236104fb38191d990e3a54c81", + "value": "Downloading metadata: 100%" + } + }, + "35e06f0a72bb456c81b846a9eae1b91f": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + 
"_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_8cdb6d85160747a1a451322f2e60cbc9", + "max": 2166, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_b6cfacc547d4499a9e887b6314a0e00f", + "value": 2166 + } + }, + "ace61e4ac7cd4f64ba93aa173ce87287": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_e095bb6ae9514753a6f6e90d72028857", + "placeholder": "​", + "style": "IPY_MODEL_41e09c4061784455a67eb0b7cf2d31fc", + "value": " 2.17k/2.17k [00:00<00:00, 170kB/s]" + } + }, + "8d0f1ffbf93b45d5950426321b82b785": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": 
null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "9ecc27897dba4231bd6b82031a1cc3db": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "2e69383236104fb38191d990e3a54c81": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "8cdb6d85160747a1a451322f2e60cbc9": { + 
"model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b6cfacc547d4499a9e887b6314a0e00f": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "e095bb6ae9514753a6f6e90d72028857": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": 
"@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "41e09c4061784455a67eb0b7cf2d31fc": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "6d44d06737f64b85a0aa9efc813bcf61": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_3763a876e5184a4da9b7816839d0b523", + "IPY_MODEL_45b6355d8db24edbb5614c2f1d944619", + 
"IPY_MODEL_cd4d230cf56843afa1eab8939ba8c5df" + ], + "layout": "IPY_MODEL_d9ea2f0150654e60bd4cbf8b6d8c5fcf" + } + }, + "3763a876e5184a4da9b7816839d0b523": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_4921d93aa19a4b9f9664463b743aa5e4", + "placeholder": "​", + "style": "IPY_MODEL_b3442e661da6452ba7cff46bf9c01ef4", + "value": "Downloading readme: 100%" + } + }, + "45b6355d8db24edbb5614c2f1d944619": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_3ebc999924bd46dea7f8e21677edbf07", + "max": 7590, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_4611ce6627b640d8b6cfc95242839b3c", + "value": 7590 + } + }, + "cd4d230cf56843afa1eab8939ba8c5df": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": 
null, + "layout": "IPY_MODEL_3bd4ebc1706640eea35d8c1c56541288", + "placeholder": "​", + "style": "IPY_MODEL_2b5e9cecebb54328a3020a80ff0d9a9a", + "value": " 7.59k/7.59k [00:00<00:00, 608kB/s]" + } + }, + "d9ea2f0150654e60bd4cbf8b6d8c5fcf": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "4921d93aa19a4b9f9664463b743aa5e4": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + 
"flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b3442e661da6452ba7cff46bf9c01ef4": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "3ebc999924bd46dea7f8e21677edbf07": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + 
"height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "4611ce6627b640d8b6cfc95242839b3c": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "3bd4ebc1706640eea35d8c1c56541288": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, 
+ "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "2b5e9cecebb54328a3020a80ff0d9a9a": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "08a6428e9c9f48689a847dd8063487a2": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_30635c8d36244136b2c9485fce6a3675", + "IPY_MODEL_5ed3e59771104d849dff50eeb6c8b519", + "IPY_MODEL_4c9d20de836b49aa87724addef5bd550" + ], + "layout": "IPY_MODEL_b2ab93acd0104da0a8b4b2ab18a51e22" + } + }, + "30635c8d36244136b2c9485fce6a3675": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_9b2695098c88496aa33967a2b76b5578", + "placeholder": "​", + "style": "IPY_MODEL_1f100078ce8444478d0c60eeefbff273", + "value": "Downloading data: 100%" + } + }, + "5ed3e59771104d849dff50eeb6c8b519": { + 
"model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_c2c53496e66048adb1dbb426da8c0bed", + "max": 84125825, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_043d81856e5746708162fa6fbbf4d3c4", + "value": 84125825 + } + }, + "4c9d20de836b49aa87724addef5bd550": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_d0e1c56bccd64f73831bce780c53ec90", + "placeholder": "​", + "style": "IPY_MODEL_265e2d76584849479b9e936ed9ce8654", + "value": " 84.1M/84.1M [00:05<00:00, 22.9MB/s]" + } + }, + "b2ab93acd0104da0a8b4b2ab18a51e22": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + 
"grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "9b2695098c88496aa33967a2b76b5578": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "1f100078ce8444478d0c60eeefbff273": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + 
"model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "c2c53496e66048adb1dbb426da8c0bed": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "043d81856e5746708162fa6fbbf4d3c4": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": 
"StyleView", + "bar_color": null, + "description_width": "" + } + }, + "d0e1c56bccd64f73831bce780c53ec90": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "265e2d76584849479b9e936ed9ce8654": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "79fd37e8fe454e9a83dba65f31c8cb77": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": 
"@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_593c8bc430454eb284169a0686fe6bfb", + "IPY_MODEL_c436532599d04f3b9f53d366a6f59b20", + "IPY_MODEL_1e2fd13373574364b05e8c008e44abff" + ], + "layout": "IPY_MODEL_913a768ab4d54c4eb0d963bdeb67fe06" + } + }, + "593c8bc430454eb284169a0686fe6bfb": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_c960d9b2bbe94e35afefc7efd9f416ba", + "placeholder": "​", + "style": "IPY_MODEL_47caa679e4c34ccc811103714a35ddd3", + "value": "Generating train split: 100%" + } + }, + "c436532599d04f3b9f53d366a6f59b20": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5716068861814d4584a43121242952b1", + "max": 25000, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_1e2cd1e45e4b4032943aec565dbe6cfe", + "value": 25000 + } + }, + "1e2fd13373574364b05e8c008e44abff": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", 
+ "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_00590bb60bdf448a89e3d0d0f65314ec", + "placeholder": "​", + "style": "IPY_MODEL_1cff3f1f268a42ea846e9e3f5f53d8b3", + "value": " 25000/25000 [00:08<00:00, 924.97 examples/s]" + } + }, + "913a768ab4d54c4eb0d963bdeb67fe06": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "c960d9b2bbe94e35afefc7efd9f416ba": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": 
"@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "47caa679e4c34ccc811103714a35ddd3": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "5716068861814d4584a43121242952b1": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": 
null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "1e2cd1e45e4b4032943aec565dbe6cfe": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "00590bb60bdf448a89e3d0d0f65314ec": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + 
"grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "1cff3f1f268a42ea846e9e3f5f53d8b3": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "f3298d68cecf45fe89b3939d89853594": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_f19e578ca2284ed6a7de268b5175429f", + "IPY_MODEL_8aee44326aae4c3b842277f55b739250", + "IPY_MODEL_1de22c20d44843aaa82ab635f1dd3f9c" + ], + "layout": "IPY_MODEL_4bd988dbbb9a4c4195922684d1fd2159" + } + }, + "f19e578ca2284ed6a7de268b5175429f": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": 
"@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5ed1e548a4df4a76a8e2e0c2e49b8d2c", + "placeholder": "​", + "style": "IPY_MODEL_9963664b1a2b4137846494d225695932", + "value": "Generating test split: 100%" + } + }, + "8aee44326aae4c3b842277f55b739250": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_51f764124bbd426b856b7a4b5de965e0", + "max": 25000, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_2332277384d040cd8a4a62213dd20cda", + "value": 25000 + } + }, + "1de22c20d44843aaa82ab635f1dd3f9c": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_907194999ce6489c9ea9ebc969c7c77e", + "placeholder": "​", + "style": "IPY_MODEL_a9dfcd4f75384d76baac18ab4d687ac5", + "value": " 25000/25000 [00:08<00:00, 7100.44 examples/s]" + } + }, + "4bd988dbbb9a4c4195922684d1fd2159": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + 
"_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "5ed1e548a4df4a76a8e2e0c2e49b8d2c": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + 
"min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "9963664b1a2b4137846494d225695932": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "51f764124bbd426b856b7a4b5de965e0": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "2332277384d040cd8a4a62213dd20cda": { + 
"model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "907194999ce6489c9ea9ebc969c7c77e": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "a9dfcd4f75384d76baac18ab4d687ac5": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + 
"_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "155d29c05c1748be98f104c210d23b0f": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_ae80685a02474640ac6325cb61206118", + "IPY_MODEL_e2dda125709548aabdb66af2524b49ea", + "IPY_MODEL_13936ef1cf7241dcacdcd74f788fca0b" + ], + "layout": "IPY_MODEL_1151b117e3f543db965ba430560e828f" + } + }, + "ae80685a02474640ac6325cb61206118": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_f55a39438d33482faa837412c3eedb28", + "placeholder": "​", + "style": "IPY_MODEL_e9d0dab76a69477d8cef7613fd8e25b4", + "value": "Generating unsupervised split: 100%" + } + }, + "e2dda125709548aabdb66af2524b49ea": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": 
"success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_29a17d1315ff46caa87c791d87494f07", + "max": 50000, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_421e40ff53d244c5979d762593f8bccb", + "value": 50000 + } + }, + "13936ef1cf7241dcacdcd74f788fca0b": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_3882b1ea5f0f494f924e07f44e9e4386", + "placeholder": "​", + "style": "IPY_MODEL_143dea23ae7843c7b17a4242a4cee4c5", + "value": " 50000/50000 [00:10<00:00, 6954.67 examples/s]" + } + }, + "1151b117e3f543db965ba430560e828f": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + 
"object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f55a39438d33482faa837412c3eedb28": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "e9d0dab76a69477d8cef7613fd8e25b4": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "29a17d1315ff46caa87c791d87494f07": { + "model_module": "@jupyter-widgets/base", + 
"model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "421e40ff53d244c5979d762593f8bccb": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "3882b1ea5f0f494f924e07f44e9e4386": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + 
"_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "143dea23ae7843c7b17a4242a4cee4c5": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "3718d0ead5f34ff0a06f1eade292ebe8": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HBoxModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_785a0f13d36a4d268908f9b0af68cebe", + "IPY_MODEL_90fe6e4229c148da8066adbd40b0beab", + "IPY_MODEL_208ac9d586804957a001b2d134141b78" + ], + "layout": 
"IPY_MODEL_bf798174994947b5b707b1f119dae0d7" + } + }, + "785a0f13d36a4d268908f9b0af68cebe": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5c84ec53402147fcab06e7078b272746", + "placeholder": "​", + "style": "IPY_MODEL_0ffe58cec1f549438742975bf4154316", + "value": "Map: 100%" + } + }, + "90fe6e4229c148da8066adbd40b0beab": { + "model_module": "@jupyter-widgets/controls", + "model_name": "FloatProgressModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_dafcd2ddd2fa481d8c173d015446640c", + "max": 25, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_df1252165b6041209d072390de947697", + "value": 25 + } + }, + "208ac9d586804957a001b2d134141b78": { + "model_module": "@jupyter-widgets/controls", + "model_name": "HTMLModel", + "model_module_version": "1.5.0", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_9480c6601cd34acc84d1029e3fd36156", + "placeholder": "​", 
+ "style": "IPY_MODEL_98b6d8654f6a42559c413de778557e33", + "value": " 25/25 [00:00<00:00, 295.91 examples/s]" + } + }, + "bf798174994947b5b707b1f119dae0d7": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "5c84ec53402147fcab06e7078b272746": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": 
null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "0ffe58cec1f549438742975bf4154316": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "dafcd2ddd2fa481d8c173d015446640c": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": 
null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "df1252165b6041209d072390de947697": { + "model_module": "@jupyter-widgets/controls", + "model_name": "ProgressStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "9480c6601cd34acc84d1029e3fd36156": { + "model_module": "@jupyter-widgets/base", + "model_name": "LayoutModel", + "model_module_version": "1.2.0", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, 
+ "visibility": null, + "width": null + } + }, + "98b6d8654f6a42559c413de778557e33": { + "model_module": "@jupyter-widgets/controls", + "model_name": "DescriptionStyleModel", + "model_module_version": "1.5.0", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + } + } + } }, "nbformat": 4, "nbformat_minor": 0 -} +} \ No newline at end of file From d6f79e07ea708cc8a7a588f07f663bae4b25b71f Mon Sep 17 00:00:00 2001 From: ArshaanNazir Date: Thu, 30 Nov 2023 19:35:42 +0530 Subject: [PATCH 5/5] update templatic augmentaation NB --- .../Templatic_Augmentation_Notebook.ipynb | 2655 ++++++++++++++++- 1 file changed, 2654 insertions(+), 1 deletion(-) diff --git a/demo/tutorials/misc/Templatic_Augmentation_Notebook.ipynb b/demo/tutorials/misc/Templatic_Augmentation_Notebook.ipynb index 739c9461b..3c6c85f40 100644 --- a/demo/tutorials/misc/Templatic_Augmentation_Notebook.ipynb +++ b/demo/tutorials/misc/Templatic_Augmentation_Notebook.ipynb @@ -1 +1,2654 @@ -{"cells":[{"cell_type":"markdown","metadata":{"id":"e7PsSmy9sCoR"},"source":["![image.png]()"]},{"cell_type":"markdown","metadata":{"id":"MhgkQYQiEvZt"},"source":["[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Templatic_Augmentation_Notebook.ipynb)"]},{"cell_type":"markdown","metadata":{"id":"WJJzt3RWhEc6"},"source":["**LangTest** is an open-source python library designed to help developers deliver safe and effective Natural Language Processing (NLP) models. Whether you are using **John Snow Labs, Hugging Face, Spacy** models or **OpenAI, Cohere, AI21, Hugging Face Inference API and Azure-OpenAI** based LLMs, it has got you covered. 
You can test any Named Entity Recognition (NER), Text Classification model using the library. We also support testing LLMS for Question-Answering and Summarization tasks on benchmark datasets. The library supports 50+ out of the box tests. These tests fall into robustness, accuracy, bias, representation, toxicity and fairness test categories.\n","\n","Metrics are calculated by comparing the model's extractions in the original list of sentences against the extractions carried out in the noisy list of sentences. The original annotated labels are not used at any point, we are simply comparing the model against itself in a 2 settings."]},{"cell_type":"markdown","metadata":{"id":"26qXWhCYhHAt"},"source":["# Getting started with LangTest on John Snow Labs"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"oGIyE43uhTxH"},"outputs":[],"source":["!pip install langtest[johnsnowlabs]"]},{"cell_type":"markdown","metadata":{"id":"yR6kjOaiheKN"},"source":["# Harness and its Parameters\n","\n","The Harness class is a testing class for Natural Language Processing (NLP) models. It evaluates the performance of a NLP model on a given task using test data and generates a report with test results.Harness can be imported from the LangTest library in the following way."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"lTzSJpMlhgq5"},"outputs":[],"source":["#Import Harness from the LangTest library\n","from langtest import Harness"]},{"cell_type":"markdown","metadata":{"id":"sBcZjwJBhkOw"},"source":["It imports the Harness class from within the module, that is designed to provide a blueprint or framework for conducting NLP testing, and that instances of the Harness class can be customized or configured for different testing scenarios or environments.\n","\n","Here is a list of the different parameters that can be passed to the Harness function:\n","\n","
\n","\n","\n","\n","| Parameter | Description |\n","| - | - |\n","| **task** | Task for which the model is to be evaluated (text-classification or ner) |\n","| **model** | Specifies the model(s) to be evaluated. This parameter can be provided as either a dictionary or a list of dictionaries. Each dictionary should contain the following keys:
  • model (mandatory): PipelineModel or path to a saved model or pretrained pipeline/model from hub.
  • hub (mandatory): Hub (library) to use in back-end for loading model from public models hub or from path
|\n","| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys:
  • data_source (mandatory): The source of the data.
  • subset (optional): The subset of the data.
  • feature_column (optional): The column containing the features.
  • target_column (optional): The column containing the target labels.
  • split (optional): The data split to be used.
  • source (optional): Set to 'huggingface' when loading a Hugging Face dataset.
|\n","| **config** | Configuration for the tests to be performed, specified in the form of a YAML file. |\n","\n","\n","
\n","
"]},{"cell_type":"markdown","metadata":{"id":"JFhJ9CcbsKqN"},"source":["# Real-World Project Workflows\n","\n","In this section, we dive into complete workflows for using the model testing module in real-world project settings."]},{"cell_type":"markdown","metadata":{"id":"UtxtE6Y0r4CJ"},"source":["## Robustness Testing\n","\n","In this example, we will be testing a model's robustness. We will be applying 2 tests: add_typo and lowercase. The real-world project workflow of the model robustness testing and fixing in this case goes as follows:\n","\n","1. Train NER model on original CoNLL training set\n","\n","2. Test NER model robustness on CoNLL test set\n","\n","3. Augment CoNLL training set based on test results\n","\n","4. Train new NER model on augmented CoNLL training set\n","\n","5. Test new NER model robustness on the CoNLL test set from step 2\n","\n","6. Compare robustness of new NER model against original NER model"]},{"cell_type":"markdown","metadata":{"id":"I21Jmq79jgC6"},"source":["#### Load Train and Test CoNLL"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":1477,"status":"ok","timestamp":1692342633486,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"6uW22VqJje8E","outputId":"ff7e597d-9ec3-41ce-e006-0c251dc96183"},"outputs":[{"name":"stdout","output_type":"stream","text":["--2023-08-18 07:10:30-- https://raw.githubusercontent.com/JohnSnowLabs/langtest/main/langtest/data/conll/sample.conll\n","Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n","Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n","HTTP request sent, awaiting response... 
200 OK\n","Length: 50519 (49K) [text/plain]\n","Saving to: ‘sample.conll’\n","\n","\rsample.conll 0%[ ] 0 --.-KB/s \rsample.conll 100%[===================>] 49.33K --.-KB/s in 0.003s \n","\n","2023-08-18 07:10:30 (15.6 MB/s) - ‘sample.conll’ saved [50519/50519]\n","\n","--2023-08-18 07:10:30-- https://raw.githubusercontent.com/JohnSnowLabs/langtest/main/demo/data/conll03.conll\n","Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n","Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 827443 (808K) [text/plain]\n","Saving to: ‘conll03.conll’\n","\n","conll03.conll 100%[===================>] 808.05K --.-KB/s in 0.02s \n","\n","2023-08-18 07:10:31 (42.3 MB/s) - ‘conll03.conll’ saved [827443/827443]\n","\n"]}],"source":["# Load test CoNLL\n","!wget https://raw.githubusercontent.com/JohnSnowLabs/langtest/main/langtest/data/conll/sample.conll\n","\n","# Load train CoNLL\n","!wget https://raw.githubusercontent.com/JohnSnowLabs/langtest/main/demo/data/conll03.conll"]},{"cell_type":"markdown","metadata":{"id":"MNtH_HOUt_PL"},"source":["#### Step 1: Train NER Model"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"jRnEmCfPhsZs"},"outputs":[],"source":["from johnsnowlabs import nlp"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":337965,"status":"ok","timestamp":1692342977578,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"bHXeP18sGp-g","outputId":"7ba0e6d9-0675-44d1-b601-98d415230949"},"outputs":[{"name":"stdout","output_type":"stream","text":["Warning::Spark Session already created, some configs may not take.\n","small_bert_L2_128 download started this may take some time.\n","Approximate size to download 16.1 MB\n","[OK!]\n"]}],"source":["ner_model 
= nlp.load('bert train.ner').fit(dataset_path=\"/content/conll03.conll\")\n"]},{"cell_type":"markdown","metadata":{"id":"kKgXC7cvuyar"},"source":["#### Step 2: Test NER Model Robustness "]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":832,"status":"ok","timestamp":1692342978351,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"RVk9NWn7u-Lm","outputId":"73756c32-b1ec-42f7-ddf2-e33204b9a5dc"},"outputs":[{"name":"stdout","output_type":"stream","text":["Test Configuration : \n"," {\n"," \"tests\": {\n"," \"defaults\": {\n"," \"min_pass_rate\": 1.0\n"," },\n"," \"robustness\": {\n"," \"add_typo\": {\n"," \"min_pass_rate\": 0.7\n"," },\n"," \"american_to_british\": {\n"," \"min_pass_rate\": 0.7\n"," }\n"," },\n"," \"accuracy\": {\n"," \"min_micro_f1_score\": {\n"," \"min_score\": 0.7\n"," }\n"," },\n"," \"bias\": {\n"," \"replace_to_female_pronouns\": {\n"," \"min_pass_rate\": 0.7\n"," },\n"," \"replace_to_low_income_country\": {\n"," \"min_pass_rate\": 0.7\n"," }\n"," },\n"," \"fairness\": {\n"," \"min_gender_f1_score\": {\n"," \"min_score\": 0.6\n"," }\n"," },\n"," \"representation\": {\n"," \"min_label_representation_count\": {\n"," \"min_count\": 50\n"," }\n"," }\n"," }\n","}\n"]}],"source":["harness = Harness(task=\"ner\", model={\"model\": ner_model, \"hub\": \"johnsnowlabs\"}, data={\"data_source\":\"sample.conll\"})"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":18,"status":"ok","timestamp":1692342978353,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"mynkAUwZyuFN","outputId":"bca2f807-40f2-4767-f176-33103c31a9e3"},"outputs":[{"data":{"text/plain":["{'tests': {'defaults': {'min_pass_rate': 0.65},\n"," 'robustness': {'add_typo': {'min_pass_rate': 0.73},\n"," 'lowercase': {'min_pass_rate': 
0.65}}}}"]},"execution_count":7,"metadata":{},"output_type":"execute_result"}],"source":["harness.configure({\n"," 'tests': {\n"," 'defaults': {'min_pass_rate': 0.65},\n","\n"," 'robustness': {\n"," 'add_typo': {'min_pass_rate': 0.73},\n"," 'lowercase':{'min_pass_rate': 0.65},\n"," }\n"," }\n","})"]},{"cell_type":"markdown","metadata":{"id":"ZPU46A7WigFr"},"source":["Here we have configured the harness to perform two robustness tests (add_typo and lowercase) and defined the minimum pass rate for each test."]},{"cell_type":"markdown","metadata":{"id":"MomLlmTwjpzU"},"source":["\n","#### Generating the test cases.\n","\n","\n"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":27812,"status":"ok","timestamp":1692343006155,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"UiUNzTwF89ye","outputId":"4dc12bb6-808c-4d6b-824b-439cb3e81128"},"outputs":[{"name":"stderr","output_type":"stream","text":["Generating testcases...: 100%|██████████| 1/1 [00:00<00:00, 263.51it/s]\n"]},{"data":{"text/plain":[]},"execution_count":8,"metadata":{},"output_type":"execute_result"}],"source":["harness.generate()"]},{"cell_type":"markdown","metadata":{"id":"UiMIF-o49Bg_"},"source":["harness.generate() method automatically generates the test cases (based on the provided configuration)"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":423},"executionInfo":{"elapsed":25,"status":"ok","timestamp":1692343006156,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"p0tTwFfc891k","outputId":"b8741a7a-c1cd-4b30-d081-0a92c9c522f7"},"outputs":[{"data":{"text/html":["\n","
\n"],"text/plain":[" category test_type original \\\n","0 robustness add_typo SOCCER - JAPAN GET LUCKY WIN , CHINA IN SURPRI... \n","1 robustness add_typo Nadim Ladki \n","2 robustness add_typo AL-AIN , United Arab Emirates 1996-12-06 \n","3 robustness add_typo Japan began the defence of their Asian Cup tit... \n","4 robustness add_typo But China saw their luck desert them in the se... \n",".. ... ... ... \n","447 robustness lowercase Portuguesa 1 Atletico Mineiro 0 \n","448 robustness lowercase CRICKET - LARA ENDURES ANOTHER MISERABLE DAY . \n","449 robustness lowercase Robert Galvin \n","450 robustness lowercase MELBOURNE 1996-12-06 \n","451 robustness lowercase Australia gave Brian Lara another reason to be... \n","\n"," test_case \n","0 SOCCER - JABAN GET LUCKY WIN , CHINA IN SURPRI... \n","1 Nadim Ladkl \n","2 AL-AIN , United Atab Emirates 1996-12-06 \n","3 Japan began the defence of their Asian Cup tit... \n","4 But China saw their luck desert them in the se... \n",".. ... \n","447 portuguesa 1 atletico mineiro 0 \n","448 cricket - lara endures another miserable day . \n","449 robert galvin \n","450 melbourne 1996-12-06 \n","451 australia gave brian lara another reason to be... 
\n","\n","[452 rows x 4 columns]"]},"execution_count":9,"metadata":{},"output_type":"execute_result"}],"source":["harness.testcases()"]},{"cell_type":"markdown","metadata":{"id":"nRgq7e-g9Gev"},"source":["The harness.testcases() method returns the generated test cases as a pandas DataFrame."]},{"cell_type":"markdown","metadata":{"id":"IaPBjl_R9slh"},"source":["#### Saving test configurations, data, test cases"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"ba0MYutC96CN"},"outputs":[],"source":["harness.save(\"saved_test_configurations\")"]},{"cell_type":"markdown","metadata":{"id":"groBqKuD9I34"},"source":["#### Running the tests"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":81932,"status":"ok","timestamp":1692343088818,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"CHQHRbQb9EDi","outputId":"44621987-fd79-46bf-cf6e-beba8cc7dcee"},"outputs":[{"name":"stderr","output_type":"stream","text":["Running testcases... : 100%|██████████| 452/452 [01:22<00:00, 5.51it/s]\n"]},{"data":{"text/plain":[]},"execution_count":11,"metadata":{},"output_type":"execute_result"}],"source":["harness.run()"]},{"cell_type":"markdown","metadata":{"id":"71zHGe2q9O6G"},"source":["Call harness.run() after harness.generate() to run all the tests. Each test case is assigned a pass/fail flag."]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":545},"executionInfo":{"elapsed":51,"status":"ok","timestamp":1692343088821,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"keBNodfJ894u","outputId":"4f0aea52-ae9a-4bad-b0a7-d87a42a324b1"},"outputs":[{"data":{"text/html":["\n","
\n"],"text/plain":[" category test_type original \\\n","0 robustness add_typo SOCCER - JAPAN GET LUCKY WIN , CHINA IN SURPRI... \n","1 robustness add_typo Nadim Ladki \n","2 robustness add_typo AL-AIN , United Arab Emirates 1996-12-06 \n","3 robustness add_typo Japan began the defence of their Asian Cup tit... \n","4 robustness add_typo But China saw their luck desert them in the se... \n",".. ... ... ... \n","447 robustness lowercase Portuguesa 1 Atletico Mineiro 0 \n","448 robustness lowercase CRICKET - LARA ENDURES ANOTHER MISERABLE DAY . \n","449 robustness lowercase Robert Galvin \n","450 robustness lowercase MELBOURNE 1996-12-06 \n","451 robustness lowercase Australia gave Brian Lara another reason to be... \n","\n"," test_case \\\n","0 SOCCER - JABAN GET LUCKY WIN , CHINA IN SURPRI... \n","1 Nadim Ladkl \n","2 AL-AIN , United Atab Emirates 1996-12-06 \n","3 Japan began the defence of their Asian Cup tit... \n","4 But China saw their luck desert them in the se... \n",".. ... \n","447 portuguesa 1 atletico mineiro 0 \n","448 cricket - lara endures another miserable day . \n","449 robert galvin \n","450 melbourne 1996-12-06 \n","451 australia gave brian lara another reason to be... \n","\n"," expected_result \\\n","0 japan: LOC, china: LOC \n","1 nadim ladki: PER \n","2 al-ain: LOC, united arab emirates: LOC \n","3 japan: LOC, asian cup: MISC, syria: LOC \n","4 china: LOC, uzbekistan: LOC \n",".. ... \n","447 portuguesa: ORG, atletico: ORG, mineiro: ORG \n","448 lara: PER \n","449 robert galvin: PER \n","450 melbourne: LOC \n","451 australia: LOC, brian lara: PER, west: LOC \n","\n"," actual_result pass \n","0 jaban: PER, china: LOC False \n","1 nadim ladkl: PER True \n","2 al-ain: LOC, united atab emirates: LOC True \n","3 japan: LOC, asian cup: MISC, syria: LOC, champ... True \n","4 china: LOC, uzbekistan: LOC True \n",".. ... ... 
\n","447 portuguesa: ORG, atletico: ORG, mineiro: ORG True \n","448 lara: PER True \n","449 robert galvin: PER True \n","450 melbourne: LOC True \n","451 australia: LOC, brian lara: PER, west: LOC True \n","\n","[452 rows x 7 columns]"]},"execution_count":12,"metadata":{},"output_type":"execute_result"}],"source":["harness.generated_results()"]},{"cell_type":"markdown","metadata":{"id":"57lqGecA9UXG"},"source":["This method returns the generated results in the form of a pandas dataframe, which provides a convenient and easy-to-use format for working with the test results. You can use this method to quickly identify the test cases that failed and to determine where fixes are needed."]},{"cell_type":"markdown","metadata":{"id":"jPvPCr_S9Zb8"},"source":["#### Report of the tests"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":112},"executionInfo":{"elapsed":43,"status":"ok","timestamp":1692343088822,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"gp57HcF9yxi7","outputId":"b29fc543-331d-4b7e-c599-1e23b2cd6982"},"outputs":[{"data":{"text/html":["\n","
\n"],"text/plain":[" category test_type fail_count pass_count pass_rate minimum_pass_rate \\\n","0 robustness add_typo 58 168 74% 73% \n","1 robustness lowercase 0 226 100% 65% \n","\n"," pass \n","0 True \n","1 True "]},"execution_count":13,"metadata":{},"output_type":"execute_result"}],"source":["harness.report()"]},{"cell_type":"markdown","metadata":{"id":"7rpJ3QbPinkT"},"source":["harness.report() summarizes the results, giving the pass and fail counts and an overall pass/fail flag for each test."]},{"cell_type":"markdown","metadata":{"id":"3g-s1Gikv65h"},"source":["#### Step 3: Augment CoNLL Training Set Based on Robustness Test Results"]},{"cell_type":"markdown","metadata":{"id":"JqMbXhF11rmX"},"source":["Templatic Augmentation is a technique that allows you to generate new training data by applying a set of predefined templates to the original training data. The templates are designed to introduce noise into the training data in a way that simulates real-world conditions. The augmentation process is controlled by a configuration file that specifies the augmentation templates to be used and the proportion of the training data to be augmented. The augmentation process is performed by the augment() method of the **Harness** class.\n","\n","**Augmentation with templates**\n","\n","Templatic augmentation is controlled by the templates that are applied to the training data to be augmented.\n","\n","```\n","templates = [\"The {ORG} company is located in {LOC}\", \"The {ORG} company is located in {LOC} and is owned by {PER}\"]\n","\n","```\n"]},{"cell_type":"markdown","metadata":{"id":"PI75iT-F1rmX"},"source":["The `.augment()` method takes the following parameters:\n","\n","- `training_data` (dict): (Required) Specifies the source of the original training data. 
It should be a dictionary containing the necessary information about the dataset.\n","- `save_data_path` (str): (Required) Name of the file to store the augmented data. The augmented dataset will be saved in this file.\n","- `templates` (list): List of template strings, or a CoNLL file, to be used for augmentation."]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":7166,"status":"ok","timestamp":1692343095954,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"EBTz4Fqev7xX","outputId":"5828a60c-04f6-4018-e4e9-ff79b43558a5"},"outputs":[{"data":{"text/plain":[]},"execution_count":14,"metadata":{},"output_type":"execute_result"}],"source":["data_kwargs = {\n"," \"data_source\" : \"conll03.conll\",\n"," }\n","\n","harness.augment(\n"," training_data=data_kwargs,\n"," save_data_path='augmented_conll03.conll',\n"," templates=[\"The {ORG} company is located in {LOC}\", \"The {ORG} company is located in {LOC} and is owned by {PER}\"],\n"," )"]},{"cell_type":"markdown","metadata":{"id":"O2HL6Gip0ST0"},"source":["Essentially, it applies perturbations to the input data based on the recommendations from the harness reports. 
Then this augmented_dataset is used to retrain the original model so as to make the model more robust and improve its performance."]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":35,"status":"ok","timestamp":1692343095957,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"tKOgWXL145WR","outputId":"1a739981-5444-48a8-8832-c24c1b1511c2"},"outputs":[{"name":"stdout","output_type":"stream","text":["The -X- -X- O\n","LG -X- -X- B-ORG\n","company -X- -X- O\n","is -X- -X- O\n","located -X- -X- O\n","in -X- -X- O\n","Iraq -X- -X- B-LOC\n","\n","The -X- -X- O\n","Charlton -X- -X- B-ORG\n","company -X- -X- O\n","is -X- -X- O\n","located -X- -X- O\n","in -X- -X- O\n","Afghanistan -X- -X- B-LOC\n","\n","The -X- -X- O\n","Dow -X- -X- B-ORG\n","Chemical -X- -X- I-ORG\n","Co -X- -X- I-ORG\n"]}],"source":["!head -n 20 augmented_conll03.conll"]},{"cell_type":"markdown","metadata":{"id":"z4aCF0kYwL4w"},"source":["#### Step 4: Train New NER Model on Augmented CoNLL"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":171669,"status":"ok","timestamp":1692343267610,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"WvRFmf3PGz3k","outputId":"a09ac6ea-7eb3-4c98-c839-f0925cdde057"},"outputs":[{"name":"stdout","output_type":"stream","text":["Warning::Spark Session already created, some configs may not take.\n","Warning::Spark Session already created, some configs may not take.\n","small_bert_L2_128 download started this may take some time.\n","Approximate size to download 16.1 MB\n","[OK!]\n"]}],"source":["augmented_ner_model = nlp.load('bert train.ner').fit(dataset_path= \"augmented_conll03.conll\")"]},{"cell_type":"markdown","metadata":{"id":"QK8o7XaI_ZAf"},"source":["#### Load saved test configurations, 
data"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":20448,"status":"ok","timestamp":1692343287998,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"UpaSjj05_fPd","outputId":"cec4e7a9-a81e-46ac-f5b9-81df3991e012"},"outputs":[{"name":"stdout","output_type":"stream","text":["Test Configuration : \n"," {\n"," \"tests\": {\n"," \"defaults\": {\n"," \"min_pass_rate\": 0.65\n"," },\n"," \"robustness\": {\n"," \"add_typo\": {\n"," \"min_pass_rate\": 0.73\n"," },\n"," \"lowercase\": {\n"," \"min_pass_rate\": 0.65\n"," }\n"," }\n"," }\n","}\n"]},{"name":"stderr","output_type":"stream","text":["Generating testcases...: 100%|██████████| 1/1 [00:00<00:00, 506.68it/s]\n"]}],"source":["harness = Harness.load(\"saved_test_configurations\",model=augmented_ner_model, task=\"ner\")"]},{"cell_type":"markdown","metadata":{"id":"9aif5bl_G0GZ"},"source":["#### Step 5: Test New NER Model Robustness"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":70937,"status":"ok","timestamp":1692343358875,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"StrOVtMoAQpf","outputId":"2b264ad3-ce80-458e-91dc-8f13672fe95f"},"outputs":[{"name":"stderr","output_type":"stream","text":["Running testcases... : 100%|██████████| 452/452 [01:10<00:00, 6.42it/s]\n"]},{"data":{"text/plain":[]},"execution_count":18,"metadata":{},"output_type":"execute_result"}],"source":["harness.run()"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":562},"executionInfo":{"elapsed":82,"status":"ok","timestamp":1692343358877,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"znh2xqQmAWHf","outputId":"513f8838-2ba6-4cb1-adf8-20f19afea37b"},"outputs":[{"data":{"text/html":["\n","
\n"],"text/plain":[" category test_type original \\\n","0 robustness add_typo SOCCER - JAPAN GET LUCKY WIN , CHINA IN SURPRI... \n","1 robustness add_typo Nadim Ladki \n","2 robustness add_typo AL-AIN , United Arab Emirates 1996-12-06 \n","3 robustness add_typo Japan began the defence of their Asian Cup tit... \n","4 robustness add_typo But China saw their luck desert them in the se... \n",".. ... ... ... \n","447 robustness lowercase Portuguesa 1 Atletico Mineiro 0 \n","448 robustness lowercase CRICKET - LARA ENDURES ANOTHER MISERABLE DAY . \n","449 robustness lowercase Robert Galvin \n","450 robustness lowercase MELBOURNE 1996-12-06 \n","451 robustness lowercase Australia gave Brian Lara another reason to be... \n","\n"," test_case \\\n","0 SOCCER - JAPAN GET LUCKY WIN , CHINA IN SURYRI... \n","1 Nadin Ladki \n","2 AL-AIN , United Arab Rmirates 1996-12-06 \n","3 Japan began the defence of their Asian Cyp tit... \n","4 But China saw their luck desert them in the se... \n",".. ... \n","447 portuguesa 1 atletico mineiro 0 \n","448 cricket - lara endures another miserable day . \n","449 robert galvin \n","450 melbourne 1996-12-06 \n","451 australia gave brian lara another reason to be... \n","\n"," expected_result \\\n","0 soccer - japan get lucky win , china in surpri... \n","1 nadim ladki: ORG \n","2 al-ain: PER, , united arab emirates 1996-12-06... \n","3 japan began: ORG, defence of their asian cup t... \n","4 but china saw their luck desert them in the se... \n",".. ... \n","447 portuguesa 1 atletico mineiro 0: ORG \n","448 cricket - lara endures another miserable day: ORG \n","449 robert galvin: PER \n","450 melbourne: PER, 1996-12-06: ORG \n","451 australia gave brian lara another reason to be... \n","\n"," actual_result pass \n","0 soccer - japan get lucky win , china in suryri... True \n","1 nadin ladki: ORG True \n","2 al-ain , united arab rmirates 1996-12-06: ORG False \n","3 japan began: ORG, defence of their asian cyp t... 
True \n","4 but china saw their luck desert them in the se... True \n",".. ... ... \n","447 portuguesa 1 atletico mineiro 0: ORG True \n","448 cricket - lara endures another miserable day: ORG True \n","449 robert galvin: PER True \n","450 melbourne: PER, 1996-12-06: ORG True \n","451 australia gave brian lara another reason to be... True \n","\n","[452 rows x 7 columns]"]},"execution_count":19,"metadata":{},"output_type":"execute_result"}],"source":["harness.generated_results()"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/","height":112},"executionInfo":{"elapsed":31,"status":"ok","timestamp":1692343358879,"user":{"displayName":"Prikshit sharma","userId":"07819241395213139913"},"user_tz":-330},"id":"JSqkrBOZ-TeG","outputId":"24a29834-ca8f-4e4d-b976-ad86f264e485"},"outputs":[{"data":{"text/html":["\n","
\n"],"text/plain":[" category test_type fail_count pass_count pass_rate minimum_pass_rate \\\n","0 robustness add_typo 57 169 75% 73% \n","1 robustness lowercase 0 226 100% 65% \n","\n"," pass \n","0 True \n","1 True "]},"execution_count":20,"metadata":{},"output_type":"execute_result"}],"source":["harness.report()"]}],"metadata":{"colab":{"machine_shape":"hm","provenance":[]},"gpuClass":"standard","kernelspec":{"display_name":"Python 3","name":"python3"},"language_info":{"name":"python","version":"3.8.9"}},"nbformat":4,"nbformat_minor":0} +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "e7PsSmy9sCoR" + }, + "source": [ + "![image.png]()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "MhgkQYQiEvZt" + }, + "source": [ + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/langtest/blob/main/demo/tutorials/misc/Templatic_Augmentation_Notebook.ipynb)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WJJzt3RWhEc6" + }, + "source": [ + "**LangTest** is an open-source python library designed to help developers deliver safe and effective Natural Language Processing (NLP) models. Whether you are using **John Snow Labs, Hugging Face, Spacy** models or **OpenAI, Cohere, AI21, Hugging Face Inference API and Azure-OpenAI** based LLMs, it has got you covered. You can test any Named Entity Recognition (NER), Text Classification model using the library. We also support testing LLMS for Question-Answering and Summarization tasks on benchmark datasets. The library supports 50+ out of the box tests. These tests fall into robustness, accuracy, bias, representation, toxicity and fairness test categories.\n", + "\n", + "Metrics are calculated by comparing the model's extractions in the original list of sentences against the extractions carried out in the noisy list of sentences. 
The original annotated labels are not used at any point; we are simply comparing the model against itself in two settings." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "26qXWhCYhHAt" + }, + "source": [ + "# Getting started with LangTest on John Snow Labs" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "oGIyE43uhTxH", + "outputId": "b6bc6b0e-7206-4685-a73f-5e4f3406c280" + }, + "outputs": [], + "source": [ + "!pip install \"langtest[johnsnowlabs,openai]\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yR6kjOaiheKN" + }, + "source": [ + "# Harness and its Parameters\n", + "\n", + "The Harness class is a testing class for Natural Language Processing (NLP) models. It evaluates the performance of an NLP model on a given task using test data and generates a report with test results. Harness can be imported from the LangTest library in the following way." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "id": "lTzSJpMlhgq5" + }, + "outputs": [], + "source": [ + "# Import Harness from the LangTest library\n", + "from langtest import Harness" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "sBcZjwJBhkOw" + }, + "source": [ + "This imports the Harness class, which provides a framework for conducting NLP testing. Instances of the Harness class can be customized or configured for different testing scenarios or environments.\n", + "\n", + "Here is a list of the different parameters that can be passed to the Harness class:\n", + "\n", + "
\n", + "\n", + "\n", + "\n", + "| Parameter | Description |\n", + "| - | - |\n", + "| **task** | Task for which the model is to be evaluated (text-classification or ner) |\n", + "| **model** | Specifies the model(s) to be evaluated. This parameter can be provided as either a dictionary or a list of dictionaries. Each dictionary should contain the following keys:
  • model (mandatory): \tPipelineModel or path to a saved model or pretrained pipeline/model from hub.
  • hub (mandatory): Hub (library) to use in back-end for loading model from public models hub or from path
|\n", + "| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys:
  • data_source (mandatory): The source of the data.
  • subset (optional): The subset of the data.
  • feature_column (optional): The column containing the features.
  • target_column (optional): The column containing the target labels.
  • split (optional): The data split to be used.
  • source (optional): Set to 'huggingface' when loading a Hugging Face dataset.
|\n", + "| **config** | Configuration for the tests to be performed, specified in the form of a YAML file. |\n", + "\n", + "\n", + "
\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "JFhJ9CcbsKqN" + }, + "source": [ + "# Real-World Project Workflows\n", + "\n", + "In this section, we dive into complete workflows for using the model testing module in real-world project settings." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UtxtE6Y0r4CJ" + }, + "source": [ + "## Robustness Testing\n", + "\n", + "In this example, we will be testing a model's robustness. We will be applying 2 tests: add_typo and lowercase. The real-world project workflow of the model robustness testing and fixing in this case goes as follows:\n", + "\n", + "1. Train NER model on original CoNLL training set\n", + "\n", + "2. Test NER model robustness on CoNLL test set\n", + "\n", + "3. Augment CoNLL training set based on test results\n", + "\n", + "4. Train new NER model on augmented CoNLL training set\n", + "\n", + "5. Test new NER model robustness on the CoNLL test set from step 2\n", + "\n", + "6. Compare robustness of new NER model against original NER model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "I21Jmq79jgC6" + }, + "source": [ + "#### Load Train and Test CoNLL" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "6uW22VqJje8E", + "outputId": "0870162e-f3be-41b5-8764-ac464d7aa6a9" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "--2023-11-30 13:43:59-- https://raw.githubusercontent.com/JohnSnowLabs/langtest/main/langtest/data/conll/sample.conll\n", + "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.109.133, ...\n", + "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.\n", + "HTTP request sent, awaiting response... 
200 OK\n", + "Length: 50519 (49K) [text/plain]\n", + "Saving to: ‘sample.conll.1’\n", + "\n", + "sample.conll.1 100%[===================>] 49.33K --.-KB/s in 0.04s \n", + "\n", + "2023-11-30 13:44:00 (1.10 MB/s) - ‘sample.conll.1’ saved [50519/50519]\n", + "\n", + "--2023-11-30 13:44:00-- https://raw.githubusercontent.com/JohnSnowLabs/langtest/main/demo/data/conll03.conll\n", + "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.108.133, ...\n", + "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.\n", + "HTTP request sent, awaiting response... 200 OK\n", + "Length: 827443 (808K) [text/plain]\n", + "Saving to: ‘conll03.conll.1’\n", + "\n", + "conll03.conll.1 100%[===================>] 808.05K 4.30MB/s in 0.2s \n", + "\n", + "2023-11-30 13:44:02 (4.30 MB/s) - ‘conll03.conll.1’ saved [827443/827443]\n", + "\n" + ] + } + ], + "source": [ + "# Load test CoNLL\n", + "!wget https://raw.githubusercontent.com/JohnSnowLabs/langtest/main/langtest/data/conll/sample.conll\n", + "\n", + "# Load train CoNLL\n", + "!wget https://raw.githubusercontent.com/JohnSnowLabs/langtest/main/demo/data/conll03.conll" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "MNtH_HOUt_PL" + }, + "source": [ + "#### Step 1: Train NER Model" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "id": "jRnEmCfPhsZs" + }, + "outputs": [], + "source": [ + "from johnsnowlabs import nlp" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "bHXeP18sGp-g", + "outputId": "17793e40-704e-4f89-fa14-965e77288db3" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Warning::Spark Session already created, some configs may not take.\n", + "Warning::Spark Session already created, some configs may not take.\n", + "small_bert_L2_128 download started 
this may take some time.\n", + "Approximate size to download 16.1 MB\n", + "[OK!]\n" + ] + } + ], + "source": [ + "ner_model = nlp.load('bert train.ner').fit(dataset_path=\"/content/conll03.conll\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "kKgXC7cvuyar" + }, + "source": [ + "#### Step 2: Test NER Model Robustness " + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "RVk9NWn7u-Lm", + "outputId": "54d635c8-528a-424a-9abf-abc65ffc4ff3" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Test Configuration : \n", + " {\n", + " \"tests\": {\n", + " \"defaults\": {\n", + " \"min_pass_rate\": 1.0\n", + " },\n", + " \"robustness\": {\n", + " \"add_typo\": {\n", + " \"min_pass_rate\": 0.7\n", + " },\n", + " \"american_to_british\": {\n", + " \"min_pass_rate\": 0.7\n", + " }\n", + " },\n", + " \"accuracy\": {\n", + " \"min_micro_f1_score\": {\n", + " \"min_score\": 0.7\n", + " }\n", + " },\n", + " \"bias\": {\n", + " \"replace_to_female_pronouns\": {\n", + " \"min_pass_rate\": 0.7\n", + " },\n", + " \"replace_to_low_income_country\": {\n", + " \"min_pass_rate\": 0.7\n", + " }\n", + " },\n", + " \"fairness\": {\n", + " \"min_gender_f1_score\": {\n", + " \"min_score\": 0.6\n", + " }\n", + " },\n", + " \"representation\": {\n", + " \"min_label_representation_count\": {\n", + " \"min_count\": 50\n", + " }\n", + " }\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "harness = Harness(task=\"ner\", model={\"model\": ner_model, \"hub\": \"johnsnowlabs\"}, data={\"data_source\":\"sample.conll\"})" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "mynkAUwZyuFN", + "outputId": "6ebb9251-5e34-409e-9604-09cadfd11e65" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'tests': {'defaults': {'min_pass_rate': 0.65},\n", + " 'robustness': 
{'add_typo': {'min_pass_rate': 0.73},\n", + " 'lowercase': {'min_pass_rate': 0.65}}}}" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "harness.configure({\n", + " 'tests': {\n", + " 'defaults': {'min_pass_rate': 0.65},\n", + "\n", + " 'robustness': {\n", + " 'add_typo': {'min_pass_rate': 0.73},\n", + " 'lowercase': {'min_pass_rate': 0.65},\n", + " }\n", + " }\n", + "})" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZPU46A7WigFr" + }, + "source": [ + "Here we have configured the harness to perform two robustness tests (add_typo and lowercase) and defined the minimum pass rate for each test." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "MomLlmTwjpzU" + }, + "source": [ + "\n", + "#### Generating the test cases\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "UiUNzTwF89ye", + "outputId": "3d577348-9d1b-4152-a95a-23c8e9ae5633" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n", + "Generating testcases...: 100%|██████████| 1/1 [00:00<00:00, 5184.55it/s]\n", + "WARNING:root:[W009] Removing samples where no transformation has been applied:\n", + "[W010] - Test 'add_typo': 15 samples removed out of 226\n", + "[W010] - Test 'lowercase': 3 samples removed out of 226\n", + "\n" + ] + }, + { + "data": { + "text/plain": [] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "harness.generate()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UiMIF-o49Bg_" + }, + "source": [ + "The harness.generate() method automatically generates the test cases based on the provided configuration." + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 423 + }, + "id": "p0tTwFfc891k", +
"outputId": "882c68c1-a913-4f1a-9e71-dc3bcdba0d77" + }, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n" + ], + "text/plain": [ + " category test_type original \\\n", + "0 robustness add_typo SOCCER - JAPAN GET LUCKY WIN , CHINA IN SURPRI... \n", + "1 robustness add_typo Nadim Ladki \n", + "2 robustness add_typo AL-AIN , United Arab Emirates 1996-12-06 \n", + "3 robustness add_typo Japan began the defence of their Asian Cup tit... \n", + "4 robustness add_typo But China saw their luck desert them in the se... \n", + ".. ... ... ... \n", + "429 robustness lowercase Portuguesa 1 Atletico Mineiro 0 \n", + "430 robustness lowercase CRICKET - LARA ENDURES ANOTHER MISERABLE DAY . \n", + "431 robustness lowercase Robert Galvin \n", + "432 robustness lowercase MELBOURNE 1996-12-06 \n", + "433 robustness lowercase Australia gave Brian Lara another reason to be... \n", + "\n", + " test_case \n", + "0 SOCCER - JAPAN GET LUCOY WIN , CHINA IN SURPRI... \n", + "1 Nadim Ladoi \n", + "2 AL-AIN , Unitev Arab Emirates 1996-12-06 \n", + "3 Japan began the defence of their Asian Cup tit... \n", + "4 But China saw their luck desert them in the se... \n", + ".. ... \n", + "429 portuguesa 1 atletico mineiro 0 \n", + "430 cricket - lara endures another miserable day . \n", + "431 robert galvin \n", + "432 melbourne 1996-12-06 \n", + "433 australia gave brian lara another reason to be... \n", + "\n", + "[434 rows x 4 columns]" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "harness.testcases()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nRgq7e-g9Gev" + }, + "source": [ + "The harness.testcases() method returns the generated test cases as a pandas DataFrame."
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "IaPBjl_R9slh" + }, + "source": [ + "#### Saving test configurations, data, test cases" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": { + "id": "ba0MYutC96CN" + }, + "outputs": [], + "source": [ + "harness.save(\"saved_test_configurations\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "groBqKuD9I34" + }, + "source": [ + "#### Running the tests" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "CHQHRbQb9EDi", + "outputId": "7c813f3c-8ce7-4795-d603-9dba712f1aaa" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Running testcases... : 100%|██████████| 434/434 [00:51<00:00, 8.35it/s]\n" + ] + }, + { + "data": { + "text/plain": [] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "harness.run()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "71zHGe2q9O6G" + }, + "source": [ + "harness.run() is called after harness.generate() and runs all the tests. Each test case is assigned a pass/fail flag." + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 510 + }, + "id": "keBNodfJ894u", + "outputId": "c0bcee1a-8e76-462d-a883-2639bdc69df9" + }, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n" + ], + "text/plain": [ + " category test_type original \\\n", + "0 robustness add_typo SOCCER - JAPAN GET LUCKY WIN , CHINA IN SURPRI... \n", + "1 robustness add_typo Nadim Ladki \n", + "2 robustness add_typo AL-AIN , United Arab Emirates 1996-12-06 \n", + "3 robustness add_typo Japan began the defence of their Asian Cup tit... \n", + "4 robustness add_typo But China saw their luck desert them in the se... \n", + ".. ... ... ... \n", + "429 robustness lowercase Portuguesa 1 Atletico Mineiro 0 \n", + "430 robustness lowercase CRICKET - LARA ENDURES ANOTHER MISERABLE DAY . \n", + "431 robustness lowercase Robert Galvin \n", + "432 robustness lowercase MELBOURNE 1996-12-06 \n", + "433 robustness lowercase Australia gave Brian Lara another reason to be... \n", + "\n", + " test_case \\\n", + "0 SOCCER - JAPAN GET LUCOY WIN , CHINA IN SURPRI... \n", + "1 Nadim Ladoi \n", + "2 AL-AIN , Unitev Arab Emirates 1996-12-06 \n", + "3 Japan began the defence of their Asian Cup tit... \n", + "4 But China saw their luck desert them in the se... \n", + ".. ... \n", + "429 portuguesa 1 atletico mineiro 0 \n", + "430 cricket - lara endures another miserable day . \n", + "431 robert galvin \n", + "432 melbourne 1996-12-06 \n", + "433 australia gave brian lara another reason to be... \n", + "\n", + " expected_result \\\n", + "0 japan: LOC, lucky: LOC, china: LOC \n", + "1 nadim ladki: PER \n", + "2 al-ain: LOC, united arab emirates: LOC \n", + "3 japan: LOC, asian cup: MISC, syria: LOC \n", + "4 china: LOC, uzbekistan: LOC \n", + ".. ... \n", + "429 portuguesa: ORG, atletico mineiro: ORG \n", + "430 lara: PER \n", + "431 robert galvin: PER \n", + "432 melbourne: LOC \n", + "433 australia: LOC, brian lara: PER, west indies: ... \n", + "\n", + " actual_result pass \n", + "0 japan: LOC, lucoy: PER, china: LOC False \n", + "1 nadim ladoi: PER True \n", + "2 al-ain: LOC, unitev: PER, arab emirates: LOC False \n", + "3 japan: LOC, asian cup: MISC, lucuy: PER, syria... 
True \n", + "4 china: LOC, yzbekistan: LOC True \n", + ".. ... ... \n", + "429 portuguesa: ORG, atletico mineiro: ORG True \n", + "430 lara: PER True \n", + "431 robert galvin: PER True \n", + "432 melbourne: LOC True \n", + "433 australia: LOC, brian lara: PER, west indies: ... True \n", + "\n", + "[434 rows x 7 columns]" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "harness.generated_results()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "57lqGecA9UXG" + }, + "source": [ + "This method returns the generated results in the form of a pandas dataframe, which provides a convenient and easy-to-use format for working with the test results. You can use this method to quickly identify the test cases that failed and to determine where fixes are needed." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jPvPCr_S9Zb8" + }, + "source": [ + "#### Report of the tests" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 112 + }, + "id": "gp57HcF9yxi7", + "outputId": "1f4a3c0d-d4e2-42a1-9673-07c885657eec" + }, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n" + ], + "text/plain": [ + " category test_type fail_count pass_count pass_rate minimum_pass_rate \\\n", + "0 robustness add_typo 56 155 73% 73% \n", + "1 robustness lowercase 0 223 100% 65% \n", + "\n", + " pass \n", + "0 True \n", + "1 True " + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "harness.report()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7rpJ3QbPinkT" + }, + "source": [ + "The report summarizes the results, giving the pass and fail counts and an overall pass/fail flag for each test." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3g-s1Gikv65h" + }, + "source": [ + "#### Step 3: Augment CoNLL Training Set Based on Robustness Test Results" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "JqMbXhF11rmX" + }, + "source": [ + "Templatic Augmentation is a technique that allows you to generate new training data by applying a set of predefined templates to the original training data. The templates are designed to introduce noise into the training data in a way that simulates real-world conditions. The augmentation process is controlled by a configuration file that specifies the augmentation templates to be used and the proportion of the training data to be augmented. The augmentation process is performed by the augment() method of the **Harness** class.\n", + "\n", + "**Augmentation with templates**\n", + "\n", + "Templatic augmentation is controlled by the templates that are applied to the training data to be augmented. 
\n", + "\n", + "```\n", + "templates = [\"The {ORG} company is located in {LOC}\"]\n", + "\n", + "```\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "PI75iT-F1rmX" + }, + "source": [ + "The `.augment()` method takes the following parameters:\n", + "\n", + "- `training_data` (dict): (Required) Specifies the source of the original training data. It should be a dictionary containing the necessary information about the dataset.\n", + "- `save_data_path` (str): (Required) Name of the file to store the augmented data. The augmented dataset will be saved in this file.\n", + "- `templates` (list): List of template strings, or a CoNLL file, to be used for augmentation.\n", + "- `generate_templates` (bool): If set to True, generates additional sample templates from the given ones.\n", + "- `show_templates` (bool): If set to True, displays the templates used." + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "EBTz4Fqev7xX", + "outputId": "47a61a3e-580e-4e27-c5ac-bd2d3d4b0b6c" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The {ORG} company is located in {LOC}\n", + "'The {ORG} organization is based in {LOC}',\n", + " '{ORG} is headquartered in {LOC}',\n", + " '{LOC} is the home of {ORG}',\n", + " '{ORG} is situated in {LOC}',\n", + " '{LOC} is where {ORG} is located',\n", + " '{ORG} is found in {LOC}',\n", + " '{LOC} is the location of {ORG}',\n", + " '{ORG} is based in {LOC}',\n", + " '{LOC} is the home of the {ORG} company',\n", + " '{ORG} is situated in the city of {LOC}'\n" + ] + }, + { + "data": { + "text/plain": [] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data_kwargs = {\n", + " \"data_source\" : \"conll03.conll\",\n", + " }\n", + "\n", + "import openai\n", + "openai.api_key = \"YOUR
OPENAI KEY\"\n", + "harness.augment(\n", + " training_data=data_kwargs,\n", + " save_data_path='augmented_conll03.conll',\n", + " templates=[\"The {ORG} company is located in {LOC}\"],\n", + " generate_templates=True,\n", + " show_templates=True,\n", + " )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "O2HL6Gip0ST0" + }, + "source": [ + "Essentially, it applies perturbations to the input data based on the recommendations from the harness reports. The augmented dataset is then used to retrain the original model, to make it more robust and improve its performance." + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "tKOgWXL145WR", + "outputId": "5e5aff93-254d-48e5-c27a-9336177de64f" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "The -X- -X- O\n", + "Dinamo -X- -X- B-ORG\n", + "company -X- -X- O\n", + "is -X- -X- O\n", + "located -X- -X- O\n", + "in -X- -X- O\n", + "Yugoslavia -X- -X- B-LOC\n", + "\n", + "The -X- -X- O\n", + "Red -X- -X- B-ORG\n", + "Star -X- -X- I-ORG\n", + "company -X- -X- O\n", + "is -X- -X- O\n", + "located -X- -X- O\n", + "in -X- -X- O\n", + "Ghana -X- -X- B-LOC\n", + "\n", + "The -X- -X- O\n" + ] + } + ], + "source": [ + "!head -n 20 augmented_conll03.conll" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "z4aCF0kYwL4w" + }, + "source": [ + "#### Step 4: Train New NER Model on Augmented CoNLL" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "WvRFmf3PGz3k", + "outputId": "6d4ffed5-2951-4544-fbf7-a9c4127ad905" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Warning::Spark Session already created, some configs may not take.\n", + "Warning::Spark Session already created, some configs may not take.\n", + "small_bert_L2_128 
download started this may take some time.\n", + "Approximate size to download 16.1 MB\n", + "[OK!]\n" + ] + } + ], + "source": [ + "augmented_ner_model = nlp.load('bert train.ner').fit(dataset_path= \"augmented_conll03.conll\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QK8o7XaI_ZAf" + }, + "source": [ + "#### Load saved test configurations, data" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "UpaSjj05_fPd", + "outputId": "83f5a3bf-5f1d-4119-d359-dfc75cb792c8" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Test Configuration : \n", + " {\n", + " \"tests\": {\n", + " \"defaults\": {\n", + " \"min_pass_rate\": 0.65\n", + " },\n", + " \"robustness\": {\n", + " \"add_typo\": {\n", + " \"min_pass_rate\": 0.73\n", + " },\n", + " \"lowercase\": {\n", + " \"min_pass_rate\": 0.65\n", + " }\n", + " }\n", + " }\n", + "}\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n", + "Generating testcases...: 100%|██████████| 1/1 [00:00<00:00, 5127.51it/s]\n", + "WARNING:root:[W009] Removing samples where no transformation has been applied:\n", + "[W010] - Test 'add_typo': 11 samples removed out of 226\n", + "[W010] - Test 'lowercase': 3 samples removed out of 226\n", + "\n" + ] + } + ], + "source": [ + "harness = Harness.load(\"saved_test_configurations\",model={\"model\":augmented_ner_model,\"hub\":\"johnsnowlabs\"}, task=\"ner\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9aif5bl_G0GZ" + }, + "source": [ + "#### Step 5: Test New NER Model Robustness" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "StrOVtMoAQpf", + "outputId": "77ffb7ff-9413-4d38-d1f8-fecf0bd68852" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Running testcases... 
: 100%|██████████| 438/438 [00:50<00:00, 8.63it/s]\n" + ] + }, + { + "data": { + "text/plain": [] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "harness.run()" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 527 + }, + "id": "znh2xqQmAWHf", + "outputId": "32c10a0f-6737-4dad-8d58-d47bde83bb1a" + }, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n" + ], + "text/plain": [ + " category test_type original \\\n", + "0 robustness add_typo SOCCER - JAPAN GET LUCKY WIN , CHINA IN SURPRI... \n", + "1 robustness add_typo Nadim Ladki \n", + "2 robustness add_typo AL-AIN , United Arab Emirates 1996-12-06 \n", + "3 robustness add_typo Japan began the defence of their Asian Cup tit... \n", + "4 robustness add_typo But China saw their luck desert them in the se... \n", + ".. ... ... ... \n", + "433 robustness lowercase Portuguesa 1 Atletico Mineiro 0 \n", + "434 robustness lowercase CRICKET - LARA ENDURES ANOTHER MISERABLE DAY . \n", + "435 robustness lowercase Robert Galvin \n", + "436 robustness lowercase MELBOURNE 1996-12-06 \n", + "437 robustness lowercase Australia gave Brian Lara another reason to be... \n", + "\n", + " test_case \\\n", + "0 SOCCER - JAPAN GET LUCKY WIN , CHINA IN WURPRI... \n", + "1 Nadum Ladki \n", + "2 AL-AIG , United Arab Emirates 1996-12-06 \n", + "3 Japan began the defence of their Asian Cup tit... \n", + "4 But China saw their luck desert them in the se... \n", + ".. ... \n", + "433 portuguesa 1 atletico mineiro 0 \n", + "434 cricket - lara endures another miserable day . \n", + "435 robert galvin \n", + "436 melbourne 1996-12-06 \n", + "437 australia gave brian lara another reason to be... \n", + "\n", + " expected_result \\\n", + "0 soccer - japan get lucky win , china in surpri... \n", + "1 nadim ladki: ORG \n", + "2 al-ain , united arab: ORG, emirates: LOC, 1996... \n", + "3 japan began the defence of their asian cup tit... \n", + "4 but china saw their luck desert them in the se... \n", + ".. ... \n", + "433 portuguesa 1 atletico mineiro 0: ORG \n", + "434 cricket - lara endures another miserable: ORG \n", + "435 robert galvin: ORG \n", + "436 melbourne: LOC, 1996-12-06: ORG \n", + "437 australia gave brian lara another reason to be... \n", + "\n", + " actual_result pass \n", + "0 soccer - japan get lucky win , china in wurpri... 
True \n", + "1 nadum ladki: ORG True \n", + "2 al-aig , united arab: ORG, emirates: LOC, 1996... True \n", + "3 japan began the defence of their asian cup tit... True \n", + "4 but china saw their luck desert them in the se... True \n", + ".. ... ... \n", + "433 portuguesa 1 atletico mineiro 0: ORG True \n", + "434 cricket - lara endures another miserable: ORG True \n", + "435 robert galvin: ORG True \n", + "436 melbourne: LOC, 1996-12-06: ORG True \n", + "437 australia gave brian lara another reason to be... True \n", + "\n", + "[438 rows x 7 columns]" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "harness.generated_results()" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 112 + }, + "id": "JSqkrBOZ-TeG", + "outputId": "918b4337-af90-4385-bba5-37577b850665" + }, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
| | category | test_type | fail_count | pass_count | pass_rate | minimum_pass_rate | pass |
| 0 | robustness | add_typo | 39 | 176 | 82% | 73% | True |
| 1 | robustness | lowercase | 0 | 223 | 100% | 65% | True |
\n", + "
\n" + ], + "text/plain": [ + " category test_type fail_count pass_count pass_rate minimum_pass_rate \\\n", + "0 robustness add_typo 39 176 82% 73% \n", + "1 robustness lowercase 0 223 100% 65% \n", + "\n", + " pass \n", + "0 True \n", + "1 True " + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "harness.report()" + ] + } + ], + "metadata": { + "colab": { + "machine_shape": "hm", + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.8.9" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}