Description

This Jupyter Notebook computes three widely used word-embedding-based metrics in Python. They evaluate the answers of a generative conversational chatbot by comparing each answer with a reference answer from the dialogue texts.

The 3 metrics implemented:

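Greedy Matching score: each word in one answer is matched to its most similar word in the other answer by cosine similarity of their 300d word vectors; the best-match scores are averaged, and the result is averaged over both directions (reference answer to chatbot's answer and back)
Embedding Average score: cosine similarity between the mean word vectors of the reference answer and the chatbot's answer
Vector Extrema score: cosine similarity between two sentence vectors, each built by taking, for every dimension, the most extreme (largest-magnitude) value among the answer's word vectors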
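
The three scores above can be sketched with NumPy as follows. This is a minimal illustration of the standard definitions, not the notebook's actual code: the function names are hypothetical, and each answer is assumed to be already tokenized and looked up as a 2-D array with one word vector per row (the toy vectors below are 3d instead of 300d).

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return float(np.dot(u, v) / (nu * nv))


def _unit_rows(vecs):
    """Scale each row to unit length, leaving all-zero rows unchanged."""
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.where(norms == 0.0, 1.0, norms)


def greedy_matching(ref_vecs, hyp_vecs):
    """Average best-match cosine similarity, symmetrized over both directions."""
    sim = _unit_rows(ref_vecs) @ _unit_rows(hyp_vecs).T  # shape (n_ref, n_hyp)
    ref_to_hyp = sim.max(axis=1).mean()  # best match for each reference word
    hyp_to_ref = sim.max(axis=0).mean()  # best match for each answer word
    return float((ref_to_hyp + hyp_to_ref) / 2.0)


def embedding_average(ref_vecs, hyp_vecs):
    """Cosine similarity between the mean word vectors of the two answers."""
    return cosine(ref_vecs.mean(axis=0), hyp_vecs.mean(axis=0))


def extrema(vecs):
    """Per dimension, keep the value with the largest absolute magnitude."""
    vmax = vecs.max(axis=0)
    vmin = vecs.min(axis=0)
    return np.where(np.abs(vmin) > vmax, vmin, vmax)


def vector_extrema(ref_vecs, hyp_vecs):
    """Cosine similarity between the extrema vectors of the two answers."""
    return cosine(extrema(ref_vecs), extrema(hyp_vecs))


# Toy word vectors standing in for real pretrained embeddings (assumption).
reference = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
answer = np.array([[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]])

print("greedy matching  :", greedy_matching(reference, answer))
print("embedding average:", embedding_average(reference, answer))
print("vector extrema   :", vector_extrema(reference, answer))
```

In practice the rows would come from a pretrained 300d embedding model, and words missing from its vocabulary would have to be skipped or mapped to a fallback vector before calling these functions.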

Example Usage:

(see "EMBEDDING_METRICS_TEST_EXAMPLE")

References:

A Comparison of Greedy and Optimal Assessment of Natural Language Student Input Using Word-to-Word Similarity Metrics. Vasile Rus, Mihai Lintean. 2012. Proceedings of the Seventh Workshop on Building Educational Applications Using NLP, NAACL 2012.
Bootstrapping Dialog Systems with Word Embeddings. G. Forgues, J. Pineau, J. Larcheveque, R. Tremblay. 2014. Workshop on Modern Machine Learning and Natural Language Processing, NIPS 2014.
A Survey of Evaluation Metrics Used for NLG Systems. A. B. Sai, A. K. Mohankumar, M. M. Khapra. 2022. ACM Computing Surveys (CSUR), 55(2):1–39.
