Tokenisation

This is the fourth blog in the Lexical Processing series. Please go through the first three posts, as by the end of this module we will build a Spam/Ham detector model that uses all the algorithms we are covering one by one: https://kite4sky.wordpress.com/2019/09/16/introduction-to-nlp/ , https://kite4sky.wordpress.com/2019/09/18/lexical-processing-part-i/ , https://kite4sky.wordpress.com/2019/09/24/lexical-processing-part-ii/ In the spam detector model, which… Continue reading “Tokenisation”
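A minimal sketch of the kind of tokenisation step the Spam/Ham detector would apply to each raw message; this is not code from the post itself, and the sample message and the use of NLTK's word_tokenize are illustrative assumptions.

```python
# Tokenising one raw message, as a spam/ham pipeline would do for every input.
# Assumes: pip install nltk, plus nltk.download('punkt') for the tokenizer models.
from nltk.tokenize import word_tokenize

message = "WINNER!! You have won a free ticket. Call now to claim."
tokens = word_tokenize(message.lower())  # lowercase, then split into word tokens
print(tokens)
# e.g. ['winner', '!', '!', 'you', 'have', 'won', 'a', 'free', 'ticket', '.', ...]
```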

Lexical Processing Part II

In this part of Lexical Processing we will first focus on word frequency and stop words, and then walk through a practical demonstration. While working with any kind of data, whether structured or unstructured, we should have a proper understanding of it, and thus we have to do some… Continue reading “Lexical Processing Part II”
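A minimal sketch of the two ideas the post previews, word frequency counts and stop-word removal; the sample sentence is made up, and the use of NLTK's stop-word list and collections.Counter is an assumption rather than the post's own code.

```python
# Count word frequencies after dropping common English stop words.
# Assumes: nltk.download('punkt') and nltk.download('stopwords').
from collections import Counter
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

text = "The free offer is a limited offer, so claim the offer today."
tokens = [t for t in word_tokenize(text.lower()) if t.isalpha()]  # keep only words

stop_words = set(stopwords.words('english'))
content_tokens = [t for t in tokens if t not in stop_words]       # drop stop words

print(Counter(content_tokens).most_common(3))
# e.g. [('offer', 3), ('free', 1), ('limited', 1)]
```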

Lexical Processing Part I

As discussed in our last post, we will be focusing on Lexical Processing. It generally means extracting the raw text and identifying and analyzing the structure of words. Lexical analysis breaks the whole document into sentences and the sentences into words; in other words, it splits a chunk of text into tokens or smaller… Continue reading “Lexical Processing Part I”
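A minimal sketch of that document-to-sentences-to-tokens breakdown; the sample document is invented, and using NLTK's sent_tokenize and word_tokenize for the two splitting steps is an assumption, not the post's own implementation.

```python
# Break a document into sentences, then each sentence into word tokens.
# Assumes: nltk.download('punkt') for the sentence and word tokenizer models.
from nltk.tokenize import sent_tokenize, word_tokenize

document = ("Lexical processing starts with raw text. "
            "We split it into sentences. Each sentence becomes tokens.")

for sentence in sent_tokenize(document):
    print(word_tokenize(sentence))
# ['Lexical', 'processing', 'starts', 'with', 'raw', 'text', '.']
# ['We', 'split', 'it', 'into', 'sentences', '.']
# ['Each', 'sentence', 'becomes', 'tokens', '.']
```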