Sanskar

Keen Learner and Exp... • 3h

Day 11 of learning AI/ML as a beginner. Topic: TF-IDF (Term Frequency - Inverse Document Frequency).

Yesterday I talked about N-grams and how they are useful in Bag of Words (BOW). However, BOW has some serious drawbacks, and for that reason I am going to talk about TF-IDF today. TF-IDF is a tool used to convert text into vectors. It determines how important a word is in a document, i.e. it is capable of capturing word importance.

Term Frequency, as the name suggests, measures how often a word appears in a document (sentence). It is calculated as: number of times the word appears in the sentence / total number of words in the sentence. Then there is Inverse Document Frequency, which assigns less weight to terms that appear in many documents and more weight to terms that appear in only a few documents. The TF-IDF score of a word is the product of the two.

TF-IDF has some major advantages over the earlier tools like BOW and One Hot Encoding. It is intuitive to use, its vocabulary size is fixed (so every document maps to a vector of the same length), and most importantly it is capable of capturing word importance. Its disadvantages are the usual sparsity and the out-of-vocabulary (OOV) problem.

Here are my notes.
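To make the calculation concrete, here is a minimal sketch in plain Python. It uses the TF formula from above and assumes the common textbook form of IDF, log(number of documents / number of documents containing the word), which the post itself does not spell out. The function name `tf_idf`, the whitespace tokenizer, and the three toy sentences are all made up for illustration.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return (vocabulary, vectors) for a list of sentences.

    Assumptions (not from the post): lowercase whitespace tokenization
    and IDF = log(N / df), with no smoothing.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vocab = sorted(df)  # fixed vocabulary, so every vector has the same length
    idf = {term: math.log(n_docs / df[term]) for term in vocab}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against an empty document
        # TF = count of word in sentence / number of words in sentence
        vectors.append([counts[term] / total * idf[term] for term in vocab])
    return vocab, vectors

docs = ["good boy", "good girl", "boy girl good"]
vocab, vectors = tf_idf(docs)
for doc, vec in zip(docs, vectors):
    print(doc, [round(v, 3) for v in vec])
```

In this toy corpus "good" appears in every sentence, so its IDF is log(3/3) = 0 and it gets zero weight everywhere, while "boy" and "girl" keep a positive weight. Libraries such as scikit-learn use a smoothed variant of IDF by default, so their numbers will differ from this sketch.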
