Sanskar

Keen Learner and Exp... • 15h

Day 8 of learning AI/ML as a beginner. Topic: Bag of Words (BOW)

Yesterday I told you guys about One Hot Encoding, which is one way to convert text into vectors, but it comes with serious disadvantages. To deal with those disadvantages there's another technique known as Bag of Words (BOW). Bag of Words is an NLP technique that turns text into a collection of words and represents it numerically by counting how often each word appears (in the version I learned, the highest-frequency words come first in the vocabulary). It ignores grammar and the order of the words.

There are two types of Bag of Words (BOW):

1. Binary BOW: each vocabulary word is marked 1 if it appears in the text and 0 if it doesn't.
2. Normal BOW: each vocabulary word gets the number of times it appears in the text.

Just like One Hot Encoding, Bag of Words also has some advantages and disadvantages.

Its advantages are that it is simple and intuitive to use, and that it gives fixed-size inputs, i.e. it can convert a text of any length into a numerical vector of fixed length (the size of the vocabulary). This helps ML algorithms process text data efficiently and uniformly.

Its disadvantages include the sparse matrix problem (most entries in each vector are 0 when the vocabulary is large) and overfitting, i.e. the computer just memorizes the data instead of learning the bigger picture. Because BOW doesn't care about the order of the words, it rearranges them to follow the vocabulary, which can completely change the meaning of the text. It also means no real semantic meaning is captured: two texts that use almost the same words but mean different things are still treated as similar. And it has the out-of-vocabulary problem, i.e. any word outside the vocabulary gets ignored.

Here are my notes, which will help you understand Bag of Words (BOW) in more detail.
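To make the two variants and the drawbacks above concrete, here is a minimal sketch in plain Python. It is not from my notes or from any library: the helper names (`tokenize`, `build_vocabulary`, `bow_vector`), the tiny two-sentence corpus, and the naive whitespace tokenizer are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption for this sketch): lowercase, split on whitespace.
    return text.lower().split()


def build_vocabulary(corpus):
    # Count every word across all documents.
    counts = Counter(word for doc in corpus for word in tokenize(doc))
    # Highest-frequency words first; ties are broken alphabetically.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ordered]


def bow_vector(text, vocabulary, binary=False):
    # Words that are not in the vocabulary are simply never looked up,
    # so they are ignored (the out-of-vocabulary problem).
    counts = Counter(tokenize(text))
    if binary:
        return [1 if counts[word] > 0 else 0 for word in vocabulary]
    return [counts[word] for word in vocabulary]


# Hypothetical toy corpus.
corpus = ["the food is good good", "the food is not good"]
vocab = build_vocabulary(corpus)

normal_vec = bow_vector(corpus[0], vocab)               # word counts
binary_vec = bow_vector(corpus[0], vocab, binary=True)  # presence only

# Same words in a different order give the same vector.
same_words = bow_vector("good is the food", vocab) == bow_vector("the food is good", vocab)

# "delicious" is not in the vocabulary, so it is dropped.
oov_vec = bow_vector("the food is delicious", vocab)

print(vocab)       # ['good', 'food', 'is', 'the', 'not']
print(normal_vec)  # [2, 1, 1, 1, 0]
print(binary_vec)  # [1, 1, 1, 1, 0]
print(same_words)  # True
print(oov_vec)     # [0, 1, 1, 1, 0]
```

Notice that the two corpus sentences, "the food is good good" and "the food is not good", differ in only one or two positions of their vectors even though they mean opposite things, which is the semantic-meaning drawback in action. Every vector has length 5 (the vocabulary size) no matter how long the input text is.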

