Learn about Natural Language Processing and find resources, models, repos, and datasets.
- Sequential models: RNN (1985), LSTM (1997), GRU (2014)
- Word embeddings: Word2vec (2013), GloVe (2014), FastText (2016); a minimal Word2vec training sketch follows this list
- Word embeddings with context: ELMo (2018)
- Attention: Transformer (2017); a scaled dot-product attention sketch follows this list
- Pre-training: ULMFiT (2018), GPT (2018)
- Combining the above: BERT (2018)
- Improving BERT: DistilBERT, ALBERT, RoBERTa, XLNet (2019); Big Bird, Multilingual embeddings (2020)
- Everything is text-to-text: T5 (2019)
Source: NLP for Supervised Learning - A Brief Survey
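To make the word-embedding milestone concrete, here is a minimal gensim sketch of training skip-gram Word2vec on a toy corpus; the corpus, hyperparameters, and printed queries are illustrative assumptions, not material from the survey.

```python
# Minimal sketch: skip-gram Word2vec on a toy corpus (assumes gensim is installed).
# The corpus and hyperparameters below are illustrative, not from the survey.
from gensim.models import Word2Vec

corpus = [
    ["natural", "language", "processing", "with", "embeddings"],
    ["word2vec", "learns", "vectors", "from", "context"],
    ["similar", "words", "get", "similar", "vectors"],
]

# sg=1 selects the skip-gram objective; vector_size is the embedding dimension.
model = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1, sg=1)

# Each vocabulary word now maps to a dense vector learned from its contexts.
print(model.wv["vectors"].shape)                 # (50,)
print(model.wv.most_similar("vectors", topn=2))  # noisy on a corpus this small
```

FastText exposes the same gensim interface (gensim.models.FastText) but composes word vectors from character n-grams, so it can also produce vectors for words never seen in training.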
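The Transformer bullet likewise comes down to one formula, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V; the NumPy sketch below implements single-head scaled dot-product attention, with toy shapes chosen only for illustration.

```python
# Minimal sketch: single-head scaled dot-product attention (Vaswani et al., 2017).
# Toy shapes and random inputs are illustrative only.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v) -> output: (n_q, d_v)."""
    d_k = Q.shape[-1]
    # Dot-product similarity of every query with every key, scaled by sqrt(d_k)
    # to keep softmax inputs in a range with usable gradients.
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax: weights in each row sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is an attention-weighted average of the value vectors.
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(2, 4)), rng.normal(size=(3, 4)), rng.normal(size=(3, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 8)
```

The full Transformer runs several of these heads in parallel over learned linear projections of Q, K, and V and concatenates the results.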
"This document aims to track the progress in Natural Language Processing (NLP) and give an overview of the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets.
It aims to cover both traditional and core NLP tasks such as dependency parsing and part-of-speech tagging as well as more recent ones such as reading comprehension and natural language inference. The main objective is to provide the reader with a quick overview of benchmark datasets and the state-of-the-art for their task of interest, which serves as a stepping stone for further research. To this end, if there is a place where results for a task are already published and regularly maintained, such as a public leaderboard, the reader will be pointed there."
Source: NLP Progress
"Natural Language Processing (NLP) uses algorithms to understand and manipulate human language. This technology is one of the most broadly applied areas of machine learning. As AI continues to expand, so will the demand for professionals skilled at building models that analyze speech and language, uncover contextual patterns, and produce insights from text and audio."
Source: Coursera