What is NLP?
NLP stands for Natural Language Processing.
What kind of data do we deal with?
We deal with textual data. Text does not come in any fixed format: if you read a book, a paragraph, or even a few lines, no two lines will be alike. The length of each line and the number of words in it will differ from one line to the next.
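To make the point about varying lines concrete, here is a minimal sketch that measures a few lines of text. The sample lines and the helper name line_stats are made up for illustration; any passage from a book would behave the same way.

```python
# Hypothetical sample lines; they stand in for lines taken from any book.
sample_lines = [
    "NLP deals with text.",
    "Every line in a book can have a different length.",
    "Short line.",
]

def line_stats(lines):
    """Return a (character_count, word_count) pair for each line."""
    return [(len(line), len(line.split())) for line in lines]

stats = line_stats(sample_lines)
for line, (chars, words) in zip(sample_lines, stats):
    print(f"{chars:3d} chars, {words:2d} words: {line}")
```

Each line reports a different character count and word count, which is why text cannot be dropped straight into fixed columns the way tabular data can.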
We won't get tabular data; instead, we deal almost entirely with unstructured data.
Whenever you work on an NLP problem statement, the first problem you will face is the dataset.
NLP data can be gathered from speech, text, images, and almost anywhere else.
Pre-defined tasks in NLP:
-------------------------
Just as in other areas, not every algorithm works on every kind of data, and the same goes for NLP.
We have to do a huge amount of data pre-processing when we deal with NLP, such as:
- Lemmatization
- Stop word removal
- Stemming
- Regex
- Normalization, and so on, depending on the NLP task
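Some of the pre-processing steps listed above can be sketched with only the Python standard library. This is a rough illustration, not a production pipeline: the stop-word list is a tiny hand-written assumption, toy_stem is a crude suffix stripper rather than a real stemming algorithm, and lemmatization is left out because it needs a dictionary or a dedicated library.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "on"}

def normalize(text):
    """Normalization: lowercase the text and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def tokenize(text):
    """Regex: keep only alphabetic tokens, dropping digits and punctuation."""
    return re.findall(r"[a-z]+", text)

def remove_stop_words(tokens):
    """Stop word removal: drop very common words that carry little meaning."""
    return [t for t in tokens if t not in STOP_WORDS]

def toy_stem(token):
    """Stemming (toy version): strip a few common suffixes.

    A crude stand-in for illustration only, not a real stemmer.
    """
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Run normalization, regex tokenization, stop word removal, then stemming."""
    tokens = tokenize(normalize(text))
    tokens = remove_stop_words(tokens)
    return [toy_stem(t) for t in tokens]

print(preprocess("The cats are JUMPING on the 2 tables!"))
# prints ['cat', 'jump', 'table']
```

Which steps are needed, and in what order, depends on the NLP task, so treat this ordering as one possible choice.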
Tasks:
--------
- Parts of Speech tagging (POS)
- Text Generation
- Speech to text
NLP Approaches:
--------------
- Traditional Approach
- State of the Art (SOTA) approach (Transformer-based models, following the paper "Attention Is All You Need", published in 2017)
Most problems are now solved with the SOTA approach rather than with traditional models. Traditional NLP libraries are still relevant to a certain extent, but not for everything.
Text generation: if you type something in Gmail, you have seen that after a few words it tries to complete the line by suggesting generated words. It generates those words based on what you have typed so far.
Text generation is possible with several different kinds of models.
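As a very small illustration of suggesting the next word from what has been typed, here is a bigram counting sketch. It is a toy under stated assumptions: the corpus is made up, the function names are hypothetical, and this is not how Gmail or any SOTA model actually works.

```python
from collections import Counter, defaultdict

# Hypothetical mini-corpus standing in for previously written emails.
corpus = [
    "thank you for your email",
    "thank you for your time",
    "thank you for the update",
    "looking forward to your reply",
]

def build_bigram_model(sentences):
    """Count, for each word, which words follow it and how often."""
    model = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            model[current][following] += 1
    return model

def suggest_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

model = build_bigram_model(corpus)
print(suggest_next(model, "thank"))  # prints you
print(suggest_next(model, "for"))    # prints your
```

A model like this only looks at the single previous word. Transformer-based models condition on a much longer context, which is one reason they produce better completions.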
Speech to text: we can use existing models that already give good results, so there is no need to create new models from scratch.
These models play a major role in speech-to-text conversion and conversational AI, because a user can give input in the form of speech.