Sanskar

Keen Learner and Exp... • 20d

Day 1 of learning data science as a beginner.

Topic: the data science life cycle and reading a JSON data dump.

What is the data science life cycle?

The data science life cycle is the structured process of extracting useful, actionable insights from raw data (which we refer to as a data dump). It has the following steps:

1. Problem Definition: understand the problem you want to solve.
2. Data Collection: gathering relevant data from multiple sources is a crucial step in data science. We can collect data using APIs, web scraping, or third-party datasets.
3. Data Cleaning (Data Preprocessing): here we prepare the raw data (the data dump) collected in step 2, for example by handling missing, duplicate, or inconsistent values.
4. Data Exploration: here we examine and analyse the data to find patterns and relationships.
5. Model Building: here we create and train machine learning models, using algorithms to predict outcomes or classify data.
6. Model Evaluation: here we measure how well our model performs, for example its accuracy.
7. Deployment: integrating our model into a production system.
8. Communicating and Reporting: now that we have deployed our model, it is important to communicate and report its analysis and results to the relevant people.
9. Maintenance & Iteration: keeping our model up to date and accurate is crucial for better results.

As part of my data science learning journey, I decided to start by reading a data dump (obviously a dummy one) from a .json file using pure Python. My goal is to understand why we need so many libraries to analyse and clean data. Why can't we do it in a pure Python script? The obvious answer is to save time; however, I feel I first need to experience the problem in order to understand its solution better.

So first I dumped my raw data into a data.json file, and then I used the json module's load function inside a function to read my data dump from data.json. Then I used an f-string and a for loop to go through each record and print the data in a more readable format.
Here's my code and its result.
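
The original code and output were shared as an image that is not reproduced here, so what follows is only a minimal sketch of the approach described above, not the post's actual script. The record fields (name, age, city), the sample values, and the helper names write_dump, read_dump, and format_record are all assumptions made for illustration.

```python
import json
from pathlib import Path

# Hypothetical dummy records; the fields and values are assumptions,
# not the data used in the original post.
DUMMY_RECORDS = [
    {"name": "Asha", "age": 29, "city": "Pune"},
    {"name": "Ravi", "age": 34, "city": "Delhi"},
    {"name": "Meera", "age": None, "city": "Jaipur"},  # a missing value
]


def write_dump(path, records):
    """Dump the raw records into a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)


def read_dump(path):
    """Read the data dump back using json.load."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def format_record(record):
    """Turn one record into a readable line with an f-string."""
    # Handle the missing age by hand; pure Python gives no help here.
    age = record["age"] if record["age"] is not None else "unknown"
    return f"{record['name']} ({age}) lives in {record['city']}"


if __name__ == "__main__":
    path = Path("data.json")
    write_dump(path, DUMMY_RECORDS)
    for record in read_dump(path):
        print(format_record(record))
```

Even in this small sketch the missing age has to be handled by hand, which hints at why dedicated libraries become attractive once the data grows.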

More like this

Recommendations from Medial

Sudarshan Pal

Data Engineer @Quant... • 1y

Many argue that Data Engineering is a part of data science and analytics. It's different from data science, but they work together closely. Data Engineers come first in the process. They gather and organize data. This data is then used by Data Scien

Sanskar

Keen Learner and Exp... • 3d

Day 11 of learning data science as a beginner. Topic: creating data structures. In my previous post I discussed the difference between pandas Series and DataFrames; we typically use DataFrames more often than Series. There are a lot

Sanskar

Keen Learner and Exp... • 4d

Day 10 of learning data science as a beginner. Topic: data analysis using pandas. Pandas is one of Python's most famous open-source libraries, and it is used for a variety of tasks such as data manipulation, data cleaning, and data analysis. Pand

Mahendra Lochhab

Content creator • 1y

The data science and analytics industry is expected to reach $3.03 billion in 2024, with over 72,000 data science job openings.

Ayush Mishra

A wise man does at o... • 4m

For tech guys: I am pursuing a B.Tech at a private university and I want to make a career in data science. I am very passionate about technology and programming. How should I study, and what roadmap should I follow to improve my skills in programming and data

Dudekula Kasimvali

Hey I am on Medial • 1m

Final Year CS Student | Exploring Opportunities in Generative AI, Data Science & ML Engineering I’m currently in my final year of Computer Science and Business Systems and actively seeking internship opportunities where I can apply my skills in AI,

Sanskar

Keen Learner and Exp... • 1d

Day 13 of learning data science as a beginner. Topic: data cleaning and preprocessing. In most real-world applications we rarely get near-perfect data; most of the time we get a raw data dump which needs to be cleaned and preprocessed before

Sadiq Ali

Building Bridges, No... • 3m

📊 Data Science Reality Check! Ever tracked how much time you actually spend on: ⏳ 60% Data cleaning & preprocessing 📊 20% Exploratory analysis 🤖 15% Model building 💡 5% Delivering insights The hard truth: We spend 12x more time preparing data t

Siddharth Boxi

Hey I am on Medial • 1y

Where can I take a data science course? Any suggestions?
