Sanskar

Keen Learner and Exp... • 12h

Day 7 of learning AI/ML as a beginner. Topic: One Hot Encoding and future roadmap.

Now that I have learnt how to clean up the text input a little, it's time to convert that data into vectors (I am so glad that I have learned it despite getting criticism on my approach). There are various ways to convert this data into useful vectors:

1. One hot encoding
2. Bag of words (BOW)
3. TF-IDF
4. Word2Vec
5. AvgWord2Vec

These are some of the ways we can do so. Today let's talk about one hot encoding. This process is pretty much outdated and is rarely used in real world scenarios. However, it is important to know why we don't use it and why the other methods exist.

One hot encoding is a technique for converting a categorical variable (here, each word) into a binary vector: the vector is as long as the vocabulary, with a 1 at that word's position and 0 everywhere else. Its advantage is that it is easy to use in Python via the scikit-learn and pandas libraries.

Its disadvantages, however, include:

- It produces a sparse matrix (mostly zeros), which can lead to overfitting (when a model performs well on the data it has been trained on and poorly on new data).
- It requires fixed-size input in order to train a model.
- It does not capture semantic meaning: every pair of different words looks equally unrelated.
- It cannot represent an out-of-vocabulary word, that is, a word that never appeared in the training data.
- It is not practical in real world scenarios because it does not scale well: the vector length grows with the vocabulary size, which may lead to problems later.

I have also attached my notes here, which explain all of this in more detail.
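To make the idea and its drawbacks concrete, here is a minimal sketch in plain Python. It deliberately does not use scikit-learn or pandas so the mechanics stay visible. The helper names `build_vocabulary` and `one_hot`, the two-sentence corpus, and the choice to return an all-zero vector for an unknown word are all assumptions made for illustration, not part of any library.

```python
def build_vocabulary(sentences):
    """Map each unique lowercase word to a column index, in first-seen order."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def one_hot(word, vocab):
    """Return a binary vector with a single 1 at the word's index.

    An out-of-vocabulary word gets an all-zero vector here. That is only
    one possible convention, chosen for this illustration.
    """
    vector = [0] * len(vocab)
    index = vocab.get(word.lower())
    if index is not None:
        vector[index] = 1
    return vector


# Hypothetical toy corpus, assumed for the example.
corpus = ["the food is good", "the food is bad"]
vocab = build_vocabulary(corpus)

good = one_hot("good", vocab)
bad = one_hot("bad", vocab)
unseen = one_hot("pizza", vocab)  # "pizza" never appears in the corpus

# The dot product of two different one-hot vectors is always 0, so the
# encoding carries no notion of how similar or opposite two words are.
similarity = sum(g * b for g, b in zip(good, bad))

print(vocab)       # {'the': 0, 'food': 1, 'is': 2, 'good': 3, 'bad': 4}
print(good)        # [0, 0, 0, 1, 0]
print(bad)         # [0, 0, 0, 0, 1]
print(unseen)      # [0, 0, 0, 0, 0]
print(similarity)  # 0
```

Each vector has one slot per vocabulary word, so with a realistic vocabulary of tens of thousands of words almost every entry would be 0. That is the sparsity and scalability problem listed above, and the all-zero result for "pizza" shows the out-of-vocabulary problem.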

