Natural Language Processing in Python

What Is Natural Language Processing?

Natural Language Processing (NLP) deals with the interaction between computers and human language. An estimated 80% of data is in unstructured formats such as emails, call transcripts, speech transcripts, interviews, audio and video, social media posts, and surveys. Only about 20% is in structured formats such as contact details, credit card data, and product data.

NLP in Daily Life

  • Google search engine
  • Gmail spam filter
  • Alexa
  • Image or video to text converters

NLTK Library

To install NLTK for Python, open the Command Prompt (CMD) and type "pip install nltk" or, if you use Conda, "conda install -c conda-forge nltk". If Python 3 is installed alongside Python 2, type "pip3 install nltk" instead.
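After installing, it can help to confirm that Python can actually find the package. The following is a minimal sketch, not part of the original tutorial; the variable names are illustrative.

```python
import importlib.util

# Look up the package without importing it, so this runs even if NLTK is missing.
nltk_spec = importlib.util.find_spec("nltk")
nltk_installed = nltk_spec is not None

if nltk_installed:
    import nltk
    print("NLTK version:", nltk.__version__)
else:
    print("NLTK not found; run 'pip install nltk' first.")
```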

Bag of Words Model

Bag of Words is an NLP technique for extracting the meaningful words from documents. First, tokenize the data, which means splitting sentences into words. Each row is called a document, including empty rows, and the collection of documents is called a corpus. Next, create a word frequency dictionary that holds every word and its frequency. Finally, create the document frequency matrix, with one row per document and one column per word.
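The steps above can be sketched in plain Python. This is an illustration only: the sample corpus, the tokenize helper, and the simple regex tokenizer are assumptions made for the example, not the code from the original tutorial.

```python
import re
from collections import Counter

# Hypothetical sample corpus: each string is one document.
corpus = [
    "The cat sat on the mat",
    "The dog sat on the log",
    "",  # an empty row still counts as a document
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# Step 1: tokenize every document.
tokenized = [tokenize(doc) for doc in corpus]

# Step 2: build the word frequency dictionary over the whole corpus.
word_freq = Counter(token for tokens in tokenized for token in tokens)

# Step 3: build the document frequency matrix,
# one row per document and one column per word.
vocabulary = sorted(word_freq)
matrix = [[tokens.count(word) for word in vocabulary] for tokens in tokenized]

print(vocabulary)
for row in matrix:
    print(row)
```

The empty document produces a row of zeros, which is why empty rows are still counted as documents in the matrix.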

NLP Algorithm

CountVectorizer (scikit-learn): converts a collection of text documents to a matrix of token counts.

TfidfTransformer (scikit-learn): transforms a count matrix to a normalized tf or tf-idf representation.
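As a rough illustration, TfidfTransformer can be applied to a small count matrix. The matrix below is invented for the example, and the sketch assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfTransformer

# Hypothetical count matrix: 3 documents (rows) by 3 words (columns).
counts = np.array([
    [2, 1, 0],
    [1, 0, 1],
    [0, 0, 3],
])

transformer = TfidfTransformer()  # defaults: use_idf=True, norm="l2"
tfidf = transformer.fit_transform(counts).toarray()

print(tfidf.round(3))
```

With the default settings each row is scaled to unit length, and passing use_idf=False gives the normalized tf representation without the idf weighting.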

import nltk

# Check word frequencies; clean_tokens is the list of tokens left after preprocessing
freq = nltk.FreqDist(clean_tokens)
freq
# Visit https://www.nltk.org/api/nltk.html for more parameters
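The snippet above relies on a clean_tokens list built elsewhere in the full code. To try FreqDist on its own, here is a small sketch with a hypothetical token list standing in for the real preprocessed tokens.

```python
import nltk

# Hypothetical stand-in for the tokens left after preprocessing.
clean_tokens = ["nlp", "python", "nlp", "text", "python", "nlp"]

freq = nltk.FreqDist(clean_tokens)

print(freq.most_common(2))  # the two most frequent tokens with their counts
print(freq["text"])         # count for a single token
print(freq.N())             # total number of tokens counted
```

FreqDist behaves like a dictionary of counts, so individual words can be looked up directly by key.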

Full Code: Click Here

Conclusion

In NLP, first preprocess the text in whatever way the data requires, then perform sentiment analysis or other analysis.
