What is Natural Language Processing?
Natural Language Processing (NLP) deals with the interaction between computers and human language. It is commonly estimated that about 80% of data is in unstructured formats, such as emails, call transcripts, speech transcripts, interviews, audio and video, social media posts, and surveys. Only about 20% is in structured formats, such as contact details, credit card data, and product data.
NLP In Daily Life
- Google search engine
- Gmail spam filter
- Alexa
- Image or video to text converters
NLTK Library
To install NLTK for Python, open a command prompt (CMD) and run "pip install nltk" or, if you use Anaconda, "conda install -c conda-forge nltk". If the pip command for Python 3 is named pip3 on your system, run "pip3 install nltk" instead.
Bag of words model
Bag of words is an NLP technique for extracting the meaningful words from documents. First, tokenize the data, that is, split the sentences into words. Each row is called a document, including empty rows, and the collection of documents is called a corpus. Next, build a word frequency dictionary that maps every word to its frequency. Finally, create a document frequency matrix (often called a document-term matrix) with one row per document and one column per word.
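The steps above can be sketched in plain Python. This is a minimal illustration, not the code from the full example: the toy corpus, the tokenize helper, and the regular expression used for tokenization are all assumptions made for this sketch.

```python
import re
from collections import Counter

# Hypothetical toy corpus: each string is one document (one row).
corpus = [
    "NLP is fun",
    "NLP is useful and fun",
    "",  # an empty row still counts as a document
]

def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())

# Step 1: tokenize every document.
tokenized = [tokenize(doc) for doc in corpus]

# Step 2: word frequency dictionary over the whole corpus.
word_freq = Counter(token for doc in tokenized for token in doc)

# Step 3: matrix with one row per document and one column per word.
vocabulary = sorted(word_freq)
matrix = [[doc.count(word) for word in vocabulary] for doc in tokenized]

print(vocabulary)
for row in matrix:
    print(row)
```

Each column of the matrix follows the alphabetical order of the vocabulary, and the empty document produces a row of zeros.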
NLP Algorithm
CountVectorizer (scikit-learn): Convert a collection of text documents to a matrix of token counts
TfidfTransformer (scikit-learn): Transform a count matrix to a normalized tf or tf-idf representation

import nltk

# Check word frequencies (clean_tokens is assumed to be the list of tokens left after preprocessing)
freq = nltk.FreqDist(clean_tokens)
freq
# See https://www.nltk.org/api/nltk.html for more parameters
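The frequency check relies on a clean_tokens list built earlier. For a self-contained run, a minimal sketch could look like the following; the sample text, the whitespace tokenization, and the hand-written stop word set are assumptions for illustration and stand in for whatever preprocessing the full code performs.

```python
import nltk

# Hypothetical sample text.
text = "NLP is fun and NLP is useful"

# Assumption: simple whitespace tokenization and a tiny hand-written
# stop word set, so no NLTK data downloads are needed.
tokens = text.lower().split()
stop_words = {"is", "and"}
clean_tokens = [t for t in tokens if t not in stop_words]

# Count how often each remaining token occurs.
freq = nltk.FreqDist(clean_tokens)

# Print tokens from most to least frequent.
for word, count in freq.most_common():
    print(word, count)
```

FreqDist behaves like a dictionary of counts, so freq["nlp"] returns the count for a single word and freq.N() returns the total number of tokens counted.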
Full code: Click here
Conclusion
In NLP, first preprocess the data according to what the data and the task require, then perform sentiment analysis or any other analysis.