What Is Data Science? Who Is a Data Scientist?
Data Science is the practice of extracting insights from data, such as trends, behavioral patterns, and statistical inferences, to support informed business decisions. The professionals who do this work are known as Data Scientists or Data Science professionals. Data Science is one of the most in-demand professions; Harvard Business Review called data scientist "the sexiest job of the 21st century."
Data Science is a strong career choice for several reasons. Digitalization across domains is generating vast amounts of data, which increases the demand for Data Science professionals who can evaluate it and extract meaningful insights. This demand is creating millions of jobs in the field, and because demand exceeds supply, job opportunities are ample and salaries are high. Data generation is still growing exponentially, so the need for Data Science professionals is expected to keep rising, which makes this a long-lasting and rewarding career path.
- Professionals from any domain who have logical, mathematical, and analytical skills
- Professionals working on Business Intelligence, Data Warehousing, and reporting tools
- Freshers from any stream with good analytical and logical skills
- Statisticians, Economists, Mathematicians
- Software programmers
- Business analysts
- Six Sigma consultants
- Recap of Demo
- Introduction to Types of Analytics
- Project life cycle
- An introduction to our E-learning platform
- Data Types
- Measures of Central Tendency
- Measures of Dispersion
- Graphical Techniques
- Skewness & Kurtosis
- Box Plot
- Python (Installation and basic commands) and Libraries
- Colab notebook
- Descriptive Stats in Python
- Pandas and Matplotlib / Seaborn
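
The descriptive statistics topics above (central tendency, dispersion, skewness, and the quartiles behind a box plot) can be sketched with Python's standard `statistics` module. This is a minimal illustration, not course material: the `orders` data and the `skewness` helper are hypothetical, and the helper uses the simple moment-based (population) definition.

```python
import statistics

# Hypothetical sample: daily order counts for one week.
orders = [12, 15, 15, 18, 20, 22, 45]

# Measures of central tendency.
mean = statistics.mean(orders)
median = statistics.median(orders)
mode = statistics.mode(orders)

# Measure of dispersion: sample standard deviation (divides by n - 1).
stdev = statistics.stdev(orders)

# Quartiles and interquartile range, the building blocks of a box plot.
q1, q2, q3 = statistics.quantiles(orders, n=4)
iqr = q3 - q1

def skewness(values):
    """Moment-based population skewness: the mean of cubed z-scores."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

print(f"mean={mean:.2f} median={median} mode={mode} stdev={stdev:.2f}")
print(f"IQR={iqr:.2f} skewness={skewness(orders):.2f}")
```

The single large value (45) pulls the mean above the median and produces positive skewness, which is the pattern a box plot would show as a long upper whisker or an outlier point.
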
- Random Variable
- Probability
- Probability Distribution
- Normal Distribution
- Standard Normal Distribution (SND)
- Expected Value
- Sampling Funnel
- Sampling Variation
- Central Limit Theorem (CLT)
- Confidence interval
- Assignments Session-1 (1 hr.)
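
As a rough illustration of the confidence interval topic above, the sketch below builds a 95% interval for a sample mean using the standard normal distribution. The `sample` data is hypothetical, and using a z critical value is an assumption made for simplicity; for a sample this small, a t-based interval would usually be preferred.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample: 10 delivery times in minutes.
sample = [31, 28, 35, 30, 29, 33, 32, 27, 34, 31]

n = len(sample)
mean = statistics.mean(sample)

# Standard error of the mean: sample standard deviation / sqrt(n).
std_err = statistics.stdev(sample) / math.sqrt(n)

# Critical value of the standard normal distribution for a
# two-sided 95% interval (roughly 1.96).
z = NormalDist().inv_cdf(0.975)

lower = mean - z * std_err
upper = mean + z * std_err
print(f"mean={mean:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```
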
- Introduction to Hypothesis Testing
- Hypothesis Testing with examples
- Two-proportion tests
- Two-sample t-test
- ANOVA and Chi-square case studies
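
To make the two-proportion test listed above concrete, here is a minimal sketch of a pooled two-proportion z-test using only the standard library. The function name `two_proportion_z_test` and the conversion counts are hypothetical, and the test relies on the usual large-sample normal approximation.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical A/B test: 120 of 1000 vs. 90 of 1000 conversions.
z, p = two_proportion_z_test(120, 1000, 90, 1000)
print(f"z={z:.3f}, p-value={p:.4f}")
```

A p-value below the chosen significance level (commonly 0.05) would lead to rejecting the null hypothesis that the two proportions are equal.
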
- Visualization
- Data Cleaning
- Imputation Techniques
- Scatter Plot
- Correlation analysis
- Transformations
- Normalization and Standardization
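
The difference between normalization and standardization, the last topic above, can be sketched as follows. The helper names are hypothetical, and both functions assume the input values are not all identical (otherwise the denominators would be zero).

```python
import statistics

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return [(v - m) / s for v in values]

ages = [10, 20, 30]  # hypothetical feature column
print(min_max_normalize(ages))
print(standardize(ages))
```

Normalization bounds the output range, while standardization keeps the shape of the distribution and is less distorted by a single extreme value setting the range.
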
- Time Series Analysis
- ARIMA Models
- Exponential Smoothing
- Seasonal Decomposition
- State Space Models
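
Of the time series methods above, simple exponential smoothing is the easiest to show in a few lines. The sketch below is an assumption-laden illustration: it initializes the smoothed series with the first observation (one common convention among several), and the function name and data are hypothetical.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed series s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    The first smoothed value is set to the first observation.
    """
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

demand = [10, 20, 30]  # hypothetical monthly demand
print(simple_exponential_smoothing(demand, alpha=0.5))
```

A larger `alpha` weights recent observations more heavily; a smaller `alpha` produces a smoother, slower-reacting series.
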
- Data Cleaning
- Handling Missing Data
- Outlier Detection and Treatment
- Data Transformation
- Feature Scaling
- Encoding Categorical Data
- Feature Engineering
- Data Integration
- Introduction to Machine Learning
- Supervised Learning
- Unsupervised Learning
- Regression Algorithms
- Classification Algorithms
- Clustering Algorithms
- Decision Trees
- Support Vector Machines
- Model Evaluation Metrics
- Confusion Matrix
- Precision, Recall, F1-Score
- ROC-AUC Curve
- Cross-Validation Techniques
- Overfitting and Underfitting
- Hyperparameter Tuning
- Grid Search and Random Search
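
The evaluation topics above (confusion matrix, precision, recall, F1-score) reduce to four counts. The following sketch computes them by hand for a binary classifier; the function names and labels are hypothetical, and it assumes at least one predicted and one actual positive so the ratios are defined.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall_f1(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)   # of predicted positives, how many were right
    recall = tp / (tp + fn)      # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical labels and predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
print(confusion_counts(y_true, y_pred))
print(precision_recall_f1(y_true, y_pred))
```
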
- Why dimension reduction
- Advantages of PCA
- Calculation of PCA weights
- 2D Visualization using Principal components
- Basics of Matrix algebra
- Introduction to Feature Engineering
- Feature Creation
- Feature Transformation
- Feature Selection
- Handling Categorical Features
- Handling Date and Time Features
- Text Feature Extraction
- Introduction to Time Series Data
- Decomposition of Time Series
- Time Series Visualization
- Stationarity in Time Series
- Autocorrelation and Partial Autocorrelation
- ARIMA Models
- Seasonal ARIMA (SARIMA)
- Forecasting Future Values
- Ensemble Learning
- Bagging and Boosting
- Random Forest
- Gradient Boosting Machines (GBM)
- XGBoost
- LightGBM
- CatBoost
- Neural Networks
- Introduction to NLP
- Text Preprocessing
- Tokenization
- Stop Words Removal
- Stemming and Lemmatization
- Bag of Words
- TF-IDF
- Sentiment Analysis
- Text Classification
- Named Entity Recognition (NER)
- Word Embeddings
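
Several of the NLP steps above (tokenization, stop word removal, bag of words) can be chained in a short sketch. The regular expression tokenizer and the tiny `STOP_WORDS` set are simplifying assumptions for illustration; real projects typically use a library tokenizer and a much larger stop word list.

```python
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption, not a standard list).
STOP_WORDS = {"the", "is", "a", "and", "of"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def bag_of_words(text):
    """Count each remaining token after stop word removal."""
    return Counter(t for t in tokenize(text) if t not in STOP_WORDS)

review = "The movie is great and the cast is great"
print(bag_of_words(review))
```

The resulting counts ignore word order entirely, which is the defining simplification of the bag-of-words representation.
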
- One-Hot Encoding (OHE)
- Label Encoders
- Outlier Detection: Isolation Forest
- Predictive Power Score
- Introduction to Deep Learning
- Neural Networks Overview
- Activation Functions
- Forward and Backpropagation
- Gradient Descent
- Loss Functions
- Building a Neural Network with TensorFlow/Keras
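
Gradient descent and loss functions, listed above, can be illustrated without a deep learning framework. The sketch below fits a straight line by minimizing mean squared error with batch gradient descent; it stands in for the same idea used to train neural networks and is not a TensorFlow/Keras example. The function name, learning rate, and data are hypothetical.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Forward pass: current predictions.
        preds = [w * x + b for x in xs]
        # Gradients of MSE = (1/n) * sum((pred - y)^2) w.r.t. w and b.
        grad_w = (2 / n) * sum((p - y) * x for p, y, x in zip(preds, ys, xs))
        grad_b = (2 / n) * sum(p - y for p, y in zip(preds, ys))
        # Update step: move against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data generated from y = 2x + 1.
w, b = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
print(f"w={w:.3f}, b={b:.3f}")
```

If the learning rate is too large the updates diverge, and if it is too small convergence is slow, which is why it is usually treated as a hyperparameter to tune.
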
- Introduction to CNN
- Convolution Operation
- Pooling Layers
- Flattening and Fully Connected Layers
- Building a CNN Model
- Image Classification with CNN
- Transfer Learning with Pre-trained Models
- Introduction to RNN
- Recurrent Layers
- Long Short-Term Memory (LSTM)
- Gated Recurrent Units (GRU)
- Building an RNN Model
- Sequence Prediction with RNN
- Applications of RNN
- Introduction to Reinforcement Learning
- Key Concepts: Agent, Environment, State, Action, Reward
- Exploration vs. Exploitation
- Q-Learning
- Deep Q-Networks (DQN)
- Policy Gradients
- Applications of Reinforcement Learning
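
The reinforcement learning concepts above (agent, environment, state, action, reward, exploration vs. exploitation, Q-learning) fit into a small tabular example. The corridor environment below is a hypothetical toy: five states in a row, a reward of 1 for reaching the last one, and arbitrarily chosen values for `ALPHA`, `GAMMA`, and `EPSILON`.

```python
import random

N_STATES, GOAL = 5, 4            # states 0..4; an episode ends at state 4
ACTIONS = (0, 1)                 # 0 = move left, 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

def step(state, action):
    """Environment: return (next_state, reward) for a move in the corridor."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < EPSILON:
                # Exploration: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploitation: pick the best-known action, ties broken randomly.
                action = max(ACTIONS, key=lambda a: (q[state][a], rng.random()))
            nxt, reward = step(state, action)
            # Q-learning update toward reward + discounted best next value.
            target = reward + GAMMA * max(q[nxt])
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q

q_table = train()
# Greedy policy for each non-terminal state.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(policy)
```

After training, the greedy policy should choose "move right" in every non-terminal state, since that is the shortest path to the reward.
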
- Lasso Regression
- Ridge Regression
- Elastic Net
- Dropout in Neural Networks
- Batch Normalization
- Data Augmentation
- Introduction to Model Deployment
- Saving and Loading Models
- Using Flask for Deployment
- Creating APIs for Model Serving
- Containerization with Docker
- Deploying on Cloud Platforms
- Monitoring and Maintaining Models
- Introduction to Ethics in Data Science
- Data Privacy and Protection
- Bias and Fairness in Machine Learning
- Transparency and Explainability
- Responsible AI
- Legal and Regulatory Aspects
- Case Studies on Ethical Issues
- Introduction to Big Data
- Hadoop Ecosystem
- MapReduce
- Apache Spark
- HDFS (Hadoop Distributed File System)
- Data Ingestion with Apache Flume and Sqoop
- NoSQL Databases
- Data Processing with Spark
- What is Data Visualization?
- Why Did Visualization Come into the Picture?
- Importance of Visualizing Data
- Poor Visualizations Vs. Perfect Visualizations
- Principles of Visualizations
- Tufte’s Graphical Integrity Rule
- Tufte’s Principles for Analytical Design
- Visual Rhetoric
- Goal of Data Visualization
- Introduction to Tableau
- What is Tableau? Different Products and their functioning
- Architecture of Tableau
- Pivot Tables
- Split Tables
- Hiding
- Rename and Aliases
- Data Interpretation
- Understanding Data Types and Visual Cues
- Text Tables, Highlight Tables, Heat Map
- Pie Chart, Tree Chart
- Bar Charts, Circle Charts
- Time Series Charts
- Time Series Hands-On
- Dual Lines
- Dual Combination
- Bullet Chart
- Scatter Plot
- Introduction to Correlation Analysis
- Introduction to Regression Analysis
- Trendlines
- Histograms
- Bin Sizes in Tableau
- Box Plot
- Pareto Chart
- Donut Chart, Word Cloud
- Forecasting (Predictive Analysis)
- Types of Maps in Tableau
- Polygon Maps
- Connecting with WMS Server
- Custom Geocoding
- Data Layers
- Radial & Lasso Selection
- Using a Background Image and Highlighting Data on It
- Creating Data Extracts
- Filters and their working at different levels
- Usage of Filters at Extract and Data Source Level
- Worksheet level filters
- Context, Dimension, and Measure Filters
- Joins
- Unions
- Data Blending
- Cross Database Joins
- Sets
- Groups
- Parameters
- Logical Functions
- Case-If Function
- ZN Function
- Else-If Function
- Ad-Hoc Calculations
- Quick Table Calculations
- Level of Detail (LoD)
- Fixed LoD
- Include LoD
- Exclude LoD
- Responsive Tool Tips
- Dashboards
- Actions at Sheet level and Dashboard level
- Story
- Connecting Tableau with Tableau Server
- Publishing our Workbooks in Tableau Server
- Publishing Datasets to Tableau Server
- Setting Permissions on Tableau Server
- Python Introduction - Programming Cycle of Python
- Python IDE and Jupyter notebook
- Variables
- Data type
- Code Practice Platform
- Create, insert, update, and delete operations; handling errors
- Operators - Arithmetic, comparison, assignment, logical, and bitwise operators
- Decision making - Loops
- While loop, for loop and nested loop
- Number type conversion - int(), long(), float()
- Mathematical functions, Random function, Trigonometric function
- Strings - Escape characters, string special operators, string formatting operator
- Built-in string methods - center(), count(), decode(), encode()
- Python Lists - Accessing values in a list, deleting list elements, indexing, slicing & matrices
- Built-in functions - cmp(), len(), min(), max(), list comprehension
- Tuples - Accessing values in tuples, deleting tuple elements, indexing, slicing & matrices
- Built-in tuple functions - cmp(), len()
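
A few of the Python basics above are sketched below for Python 3. Note that `long()` and `cmp()` exist only in Python 2: in Python 3, `int` has arbitrary precision and `cmp()` was removed, so the `cmp` function here is a hypothetical stand-in written for illustration. All variable names and values are made up.

```python
# Number type conversion.
count = int("42")
price = float("19.99")
big = int("12345678901234567890")  # Python 3 ints have arbitrary precision

# Built-in string methods.
title = "data".center(10, "*")      # pad to width 10 with '*'
n_a = "banana".count("a")           # number of occurrences of 'a'

# Lists: indexing, slicing, and a list comprehension.
scores = [55, 72, 88, 91]
top_two = scores[-2:]
squares = [s * s for s in scores if s > 60]

# Tuples are immutable; unpacking reads their values.
point = (3, 4)
x, y = point

def cmp(a, b):
    """Python 3 replacement for Python 2's cmp(): returns -1, 0, or 1."""
    return (a > b) - (a < b)

print(count, price, big)
print(title, n_a, top_two, squares)
print(len(point), min(scores), max(scores), cmp(x, y))
```
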