Keen Learner and Exp... • 20d
Day 9 of learning AI/ML as a beginner. Topic: Bag of Words practical. Yesterday I shared the theory behind bag of words, and today I am sharing the practical I did. I know there's still a lot to learn and I am not fully satisfied with my grasp of the topic yet, but I would like to share my progress. I first created a file and stored various ham and spam messages in it along with their labels. I then imported pandas and used the pandas.read_csv function to load it into a table with label and message columns. Next I started cleaning and preprocessing the text. I used the Porter stemmer for stemming but quickly realised that it is less accurate, so I switched to lemmatization, which was slower but gave me accurate results. I then imported CountVectorizer from sklearn to create a bag of words model and used fit_transform to convert the documents in the corpus into an array of word counts (I used normal BOW rather than binary BOW, which would give only 0s and 1s). Here's what my code looks like and I would appreciate your suggestions and recommendations.
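For anyone following along, here is a rough sketch of the steps described in this post. It is not the original code: the three sample messages, the tab-separated layout, and the `clean` helper are all made up for illustration, and the lemmatization step is left out to keep the example free of extra downloads.

```python
import io
import re

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-in for the labelled ham/spam file (tab-separated).
raw = io.StringIO(
    "ham\tAre we still meeting for lunch today?\n"
    "spam\tWIN a FREE prize now! Call now!\n"
    "ham\tCall me when you are free\n"
)
df = pd.read_csv(raw, sep="\t", names=["label", "message"])


def clean(text):
    """Keep letters only, lowercase, and collapse repeated spaces."""
    text = re.sub(r"[^a-zA-Z]", " ", text).lower()
    return " ".join(text.split())


# A real pipeline would also remove stopwords and lemmatize here.
corpus = [clean(message) for message in df["message"]]

# Normal BOW: each cell holds how many times a word occurs in a message.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus).toarray()

# Binary BOW: each cell is 1 if the word occurs at all, else 0.
X_binary = CountVectorizer(binary=True).fit_transform(corpus).toarray()

print(vectorizer.vocabulary_)  # word -> column index
print(X)
print(X_binary)
```

The difference between the two arrays shows up in the second message, where "now" appears twice: normal BOW stores 2 in that column, while binary BOW stores 1.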
Keen Learner and Exp... • 19d
Day 10 of learning AI/ML as a beginner. Topic: N-Grams in Bag of Words (BOW). Yesterday I talked about an amazing text-to-vector converter in machine learning, i.e. Bag of Words (BOW). N-grams are just a part of BOW. In BOW the program sees sente
Keen Learner and Exp... • 21d
Day 8 of learning AI/ML as a beginner. Topic: Bag of Words (BOW). Yesterday I told you guys about One Hot Encoding, which is one way to convert text into vectors but has serious disadvantages, and to address those disadvantages there's another on
Keen Learner and Exp... • 17d
Day 12 of learning AI/ML as a beginner. Topic: TF-IDF practical. Yesterday I shared my theory notes and today I did the practical for TF-IDF. For the practical I reused my spam classifier code, and for TF-IDF I first imported it from the sklear
Keen Learner and Exp... • 16d
Day 13 of learning AI/ML as a beginner. Topic: Word Embedding. I have discussed one hot encoding, Bag of Words and TF-IDF in my recent posts. These are the count or frequency tools that are a part of word embedding but before moving forward l
Keen Learner and Exp... • 24d
Day 5 of learning AI/ML as a beginner. Topic: lemmatization and stopwords. Lemmatization is similar to stemming; however, in lemmatization a word is reduced to its base form, also known as its lemma. This is a dictionary-based process. This is more accurate than
Keen Learner and Exp... • 18d
Day 11 of learning AI/ML as a beginner. Topic: TF-IDF (Term Frequency - Inverse Document Frequency). Yesterday I talked about N-grams and how they are useful in Bag of Words (BOW); however, it has some serious drawbacks and for that reason I am
Keen Learner and Exp... • 15d
Day 14 of learning AI/ML as a beginner. Topic: Word2vec. I think I am getting lost and that I have skipped some core concepts, as there are many things I believe I am unfamiliar with, and I am looking for some guidance. Can anybody please tell me wh