
Sanskar

Keen Learner and Exp... • 21d

Day 8 of learning AI/ML as a beginner.

Topic: Bag of Words (BOW)

Yesterday I told you about One Hot Encoding, which is one way to convert text into vectors but comes with serious disadvantages. To address those disadvantages there is another technique known as Bag of Words (BOW).

Bag of Words is an NLP technique that treats a text as a collection of words and represents it numerically by counting how often each vocabulary word appears in it. In the version I learned, the vocabulary is ordered so that the highest-frequency words come first. BOW ignores grammar and the order of the words.

There are two types of Bag of Words (BOW):
1. Binary BOW: each vocabulary word is marked 1 if it appears in the text and 0 if it does not.
2. Normal BOW: each vocabulary word gets its frequency, so the count goes up every time the word appears.

Just like One Hot Encoding, Bag of Words has advantages and disadvantages.

Its advantages are that it is simple and intuitive to use, and that it gives fixed-size inputs: a text of any length becomes a numerical vector whose length equals the vocabulary size. This helps ML algorithms process text data efficiently and uniformly.

Its disadvantages include sparse matrices (most entries are 0 when the vocabulary is large) and overfitting, i.e. the model just memorizes the data instead of learning the bigger picture. Because BOW does not care about word order, the words are rearranged to follow the vocabulary, which can completely change the meaning of the text. It also means no real semantic meaning is captured: two texts with the same words but different meanings are still treated as similar. Finally, there is the out-of-vocabulary (OOV) problem: any word outside the vocabulary is ignored.

Here are my notes, which will help you understand Bag of Words (BOW) in more detail.
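To make the counting concrete, here is a minimal sketch in plain Python. The function names `build_vocabulary` and `bow_vector` and the example sentences are made up for illustration and do not come from any library. It assumes simple whitespace tokenization and a vocabulary sorted by frequency; real tools such as scikit-learn's `CountVectorizer` tokenize and order the vocabulary differently.

```python
from collections import Counter

def build_vocabulary(corpus):
    """Collect every word in the corpus, most frequent first (ties broken alphabetically)."""
    counts = Counter(word for text in corpus for word in text.lower().split())
    return sorted(counts, key=lambda w: (-counts[w], w))

def bow_vector(text, vocabulary, binary=False):
    """Turn a text into a fixed-length vector with one slot per vocabulary word."""
    counts = Counter(text.lower().split())
    if binary:
        # Binary BOW: 1 if the word is present, 0 otherwise.
        return [1 if counts[word] > 0 else 0 for word in vocabulary]
    # Normal BOW: how many times each vocabulary word appears.
    return [counts[word] for word in vocabulary]

corpus = ["the food is good good", "the food is not good"]
vocab = build_vocabulary(corpus)
print(vocab)  # ['good', 'food', 'is', 'the', 'not']

# Normal vs. binary BOW for the first sentence.
print(bow_vector(corpus[0], vocab))               # [2, 1, 1, 1, 0]
print(bow_vector(corpus[0], vocab, binary=True))  # [1, 1, 1, 1, 0]

# Word order is ignored: both texts get the same vector.
print(bow_vector("the food is good", vocab) == bow_vector("good is the food", vocab))  # True

# Out-of-vocabulary: "pizza" is not in the vocabulary, so it is silently dropped.
print(bow_vector("the pizza is good", vocab))  # [1, 0, 1, 1, 0]
```

Every vector has the same length as the vocabulary no matter how long the text is, which is the fixed-size advantage. The last two checks show the word-order and out-of-vocabulary disadvantages.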

