
Commit

Merge pull request #983 from JohnSnowLabs/release/2.0.0
Release/2.0.0
ArshaanNazir authored Feb 20, 2024
2 parents 1c3e32d + fdf173f commit 69aeca2
Showing 339 changed files with 55,050 additions and 60,113 deletions.
33 changes: 30 additions & 3 deletions CITATION.cff
@@ -1,7 +1,34 @@
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "John Snow Labs"
- family-names: "John Snow Labs"

title: "LangTest"
date-released: 2022-11-18
url: "https://github.com/JohnSnowLabs/langtest"
version: 1.10.0
url: "https://www.langtest.org"
repository-code: "https://github.com/JohnSnowLabs/langtest"
keywords:
- LLM
- NLP
- Evaluation
- Harness
- Robustness
- Bias
preferred-citation:
type: article
authors:
- given-names: "Arshaan Nazir"
- given-names: "Thadaka Kalyan Chakravarthy"
- given-names: "David Amore Cecchini"
- given-names: "Rakshit Khajuria"
- given-names: "Prikshit Sharma"
- given-names: "Ali Tarik Mirik"
- given-names: "Veysel Kocaman"
- given-names: "David Talby"
doi: "10.1016/j.simpa.2024.100619"
journal: "Software Impacts"
title: "LangTest: A comprehensive evaluation library for custom LLM and NLP models"
issue: 100619
volume: 19
year: 2024
24 changes: 21 additions & 3 deletions README.md
@@ -1,5 +1,5 @@
<p align="center">
<img src="https://github.com/RakshitKhajuria/test/assets/71117423/4e759227-de04-4ba6-8f41-bf33b948d614" alt="johnsnowlabs_logo" width="360" style="text-align:center;">
<img src="docs/assets/images/langtest/langtest_logo.png" alt="johnsnowlabs_logo" width="360" style="text-align:center;">
</p>

<div align="center">
@@ -35,7 +35,7 @@
<img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-v2.0%20adopted-ff69b4.svg">
</a>

![screenshot](https://raw.githubusercontent.com/JohnSnowLabs/langtest/gh-pages/docs/assets/images/langtest/langtest_flow_graphic.jpeg)
![Langtest Workflow](docs/assets/images/langtest/langtest_flow_graphic.jpeg)

<p align="center">
<a href="https://langtest.org/">Project's Website</a> •
@@ -102,7 +102,7 @@ You can check out the following LangTest articles:
| [**Streamlining ML Workflows: Integrating MLFlow Tracking with LangTest for Enhanced Model Evaluations**](https://medium.com/john-snow-labs/streamlining-ml-workflows-integrating-mlflow-tracking-with-langtest-for-enhanced-model-evaluations-4ce9863a0ff1) | In this blog post, we dive into the growing need for transparent, systematic, and comprehensive tracking of models. Enter MLFlow and LangTest: two tools that, when combined, create a revolutionary approach to ML development. |
| [**Testing the Question Answering Capabilities of Large Language Models**](https://medium.com/john-snow-labs/testing-the-question-answering-capabilities-of-large-language-models-1bc424d61740) | In this blog post, we dive into enhancing the QA evaluation capabilities using LangTest library. Explore about different evaluation methods that LangTest offers to address the complexities of evaluating Question Answering (QA) tasks. |
| [**Evaluating Stereotype Bias with LangTest**](https://medium.com/john-snow-labs/evaluating-stereotype-bias-with-langtest-8286af8f0f22) | In this blog post, we are focusing on using the StereoSet dataset to assess bias related to gender, profession, and race.|
| [**Unveiling Sentiments: Exploring LSTM-based Sentiment Analysis with PyTorch on the IMDB Dataset**](To be Published) | Explore the robustness of custom models with LangTest Insights.|
| [**Testing the Robustness of LSTM-Based Sentiment Analysis Models**](https://medium.com/john-snow-labs/testing-the-robustness-of-lstm-based-sentiment-analysis-models-67ed84e42997) | Explore the robustness of custom models with LangTest Insights.|
| [**LangTest Insights: A Deep Dive into LLM Robustness on OpenBookQA**](https://medium.com/john-snow-labs/langtest-insights-a-deep-dive-into-llm-robustness-on-openbookqa-ab0ddcbd2ab1) | Explore the robustness of Language Models (LLMs) on the OpenBookQA dataset with LangTest Insights.|
| [**LangTest: A Secret Weapon for Improving the Robustness of Your Transformers Language Models**](https://medium.com/john-snow-labs/langtest-a-secret-weapon-for-improving-the-robustness-of-your-transformers-language-models-9693d64256cc) | Explore the robustness of Transformers Language Models with LangTest Insights.|

@@ -154,6 +154,24 @@ Feel free to ask questions on the [Q&A](https://github.com/JohnSnowLabs/langtest

As contributors and maintainers to this project, you are expected to abide by LangTest's code of conduct. More information can be found at: [Contributor Code of Conduct](https://github.com/JohnSnowLabs/langtest/blob/release/1.8.0/CODE_OF_CONDUCT.md)


## Citation

We have published a [paper](https://www.sciencedirect.com/science/article/pii/S2665963824000071) that you can cite for
the LangTest library:

```bibtex
@article{nazir2024langtest,
title={LangTest: A comprehensive evaluation library for custom LLM and NLP models},
  author={Arshaan Nazir and Thadaka Kalyan Chakravarthy and David Amore Cecchini and Rakshit Khajuria and Prikshit Sharma and Ali Tarik Mirik and Veysel Kocaman and David Talby},
journal={Software Impacts},
pages={100619},
year={2024},
publisher={Elsevier}
}
```


## Contributors

We would like to acknowledge all contributors of this open-source community project.
44 changes: 11 additions & 33 deletions demo/blogposts/KDnuggets_spacy_workflow.ipynb
@@ -449,30 +449,7 @@
},
"outputs": [],
"source": [
"h = Harness(model=spacy_model, data=\"sample.conll\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "i6sQSBHjK1WT",
"outputId": "c7d9b6de-bbad-451e-8750-3212ce62ade4"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Generating testcases... (robustness): 100%|██████████| 5/5 [00:14<00:00, 2.90s/it]\n"
]
}
],
"source": [
"h = Harness.load(save_dir='saved_test_configurations', model=spacy_model)"
"h = Harness(model=spacy_model, data={\"data_source\":\"sample.conll\"}, task=\"ner\")"
]
},
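The replacement line above reflects a 2.0.0 API change: `Harness` now takes the task explicitly and a dictionary for `data` instead of a bare file path. A minimal sketch of the new call shape, using only the values shown in the diff; the `Harness` call itself is commented out because it requires `langtest` and a loaded spaCy pipeline:

```python
# langtest 1.x accepted a bare path:
#   h = Harness(model=spacy_model, data="sample.conll")
# langtest 2.0.0, per the diff above, takes a dict plus an explicit task:
data_config = {"data_source": "sample.conll"}

# Requires `pip install langtest` and a loaded spaCy model, so shown commented:
# from langtest import Harness
# h = Harness(model=spacy_model, data=data_config, task="ner")
```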
{
@@ -1646,16 +1623,14 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "1PgESWMTJupx",
"outputId": "8b0544d1-5525-49d0-9bac-fd21f01c2281"
},
"metadata": {},
"outputs": [],
"source": [
"h.augment(\"conll03.conll\", \"augmented_conll03.conll\", inplace=False)"
"data_kwargs = {\n",
" \"data_source\" : \"conll03.conll\",\n",
" }\n",
"\n",
"h.augment(training_data=data_kwargs, save_data_path=\"augmented_conll03.conll\", export_mode=\"add\")"
]
},
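The hunk above also changes `Harness.augment`: the 1.x positional signature with an `inplace` flag becomes keyword arguments plus an `export_mode`. A hedged sketch of the 2.0.0 shape, built from the values in the diff (the semantics of `export_mode="add"` — appending augmented examples to the original data — are assumed from the name; the call needs a constructed `Harness`, so it is commented out):

```python
# 1.x: h.augment("conll03.conll", "augmented_conll03.conll", inplace=False)
# 2.0.0 keyword form, as shown in the diff above:
training_data = {"data_source": "conll03.conll"}
save_data_path = "augmented_conll03.conll"

# Requires a built Harness `h` with generated test cases:
# h.augment(training_data=training_data,
#           save_data_path=save_data_path,
#           export_mode="add")
```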
{
@@ -1794,7 +1769,10 @@
}
],
"source": [
"new_h = Harness.load(\"saved_testsuite\", model=augmented_spacy_model)"
"new_h = Harness.load(\"saved_testsuite\",\n",
" model={\"model\": augmented_spacy_model,\"hub\":\"spacy\"},\n",
" task=\"ner\", \n",
" load_testcases=True)"
]
},
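`Harness.load` changes in the same direction: the model argument becomes a dict naming the hub, the task is passed explicitly, and `load_testcases=True` restores the previously generated test cases. A sketch of the 2.0.0 shape from the diff — `"en_core_web_sm"` is an illustrative stand-in for the retrained spaCy pipeline used in the notebook:

```python
# 1.x: Harness.load("saved_testsuite", model=augmented_spacy_model)
# 2.0.0 form, per the diff above; the model name here is a placeholder:
model_config = {"model": "en_core_web_sm", "hub": "spacy"}

# Requires langtest installed and the saved test suite on disk:
# from langtest import Harness
# new_h = Harness.load("saved_testsuite",
#                      model=model_config,
#                      task="ner",
#                      load_testcases=True)
```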
{
5,278 changes: 2,639 additions & 2,639 deletions demo/tutorials/RAG/RAG_HF.ipynb

