Data Science Life Cycle

The OSEMN framework

Figure 1: Data Science Process (a.k.a. the O.S.E.M.N. framework)
Image source: https://towardsdatascience.com/5-steps-of-a-data-science-project-lifecycle-26c50372b492

  • The OSEMN framework consists of five major steps that help us focus on and prioritize the right data science tasks at each stage:
  1. Obtaining Data
  2. Scrubbing Data
  3. Exploring Data
  4. Modelling Data
     • Model parameter estimation
     • Hyperparameter tuning
       • Hyperparameters are the settings that define the model architecture.
       • Unlike model parameters, hyperparameters are external to the model: they are set before training and cannot be estimated from the data.
       • Hyperparameter optimization, or tuning, is the process of searching for the ideal model architecture (a set of optimal hyperparameters).

Figure 2: Model Data
Image source: https://towardsdatascience.com/5-steps-of-a-data-science-project-lifecycle-26c50372b492

  5. Interpreting Data
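The difference between estimating a model parameter and tuning a hyperparameter, described under Modelling Data, can be sketched in a few lines. This is a minimal illustration, not a recommended workflow: the toy data, the one-feature ridge model, the candidate values, and the helper names `fit_ridge`, `mse` and `tune` are all assumptions made up for this example. The weight `w` is a parameter estimated from the training data, while the penalty `lam` is a hyperparameter chosen by comparing validation error.

```python
# One-feature ridge regression through the origin: y is approximated by w * x.
# w   : model parameter, estimated from the training data.
# lam : hyperparameter, fixed before fitting and chosen on validation data.


def fit_ridge(xs, ys, lam):
    """Closed-form w that minimises sum((y - w*x)^2) + lam * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(xs, ys, w):
    """Mean squared error of the predictions w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data invented for illustration only.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.1, 3.9, 6.2, 7.8]
valid_x = [5.0, 6.0]
valid_y = [9.8, 11.7]

candidate_lams = [0.0, 0.1, 1.0, 10.0]  # hypothetical search grid


def tune(lams):
    """Grid search: fit once per candidate, keep the lowest validation error."""
    results = []
    for lam in lams:
        w = fit_ridge(train_x, train_y, lam)               # parameter estimation
        results.append((mse(valid_x, valid_y, w), lam, w))  # evaluation
    return min(results)


best_error, best_lam, best_w = tune(candidate_lams)
print(f"best lam={best_lam}, w={best_w:.3f}, validation MSE={best_error:.4f}")
```

Libraries such as Scikit-Learn provide ready-made search utilities for this; the hand-written loop is used here only to keep the two roles visible.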

Tidy workflow in Data Science

Figure 3: Tidy Workflow in data science
Image source: https://r4ds.had.co.nz/introduction.html

R

  • Import: readr, haven, readxl, httr, rvest, xml2
  • Tidy: tidyr, tibble
  • Transform: dplyr, lubridate, forcats, stringr
  • Visualize: ggplot2
  • Model: broom, tidymodels, modelr
  • Communicate: rmarkdown, bookdown, knitr, shiny

Python

  • Import: pandas (tabular data), numpy (numerical data)
  • Tidy: pandas
  • Transform: pandas
  • Visualize: matplotlib, seaborn, plotnine (Grammar of Graphics), plotly
  • Model: Scikit-Learn, statsmodels, TensorFlow, keras
  • Communicate: Jupyter Notebook, JupyterLab, Dash, streamlit, Flask
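As a rough illustration of how the first three stages map onto pandas, the sketch below imports a small table, reshapes it so that each row is one observation, and then summarises it. The city temperatures and the column names `city`, `year` and `temp_c` are made up for this example, and an in-memory CSV string stands in for a real file.

```python
import io

import pandas as pd

# Import: read tabular data (an in-memory CSV stands in for a real file).
raw = io.StringIO(
    "city,2021,2022\n"
    "Oslo,10.5,11.0\n"
    "Lima,19.0,\n"
    "Pune,25.0,26.0\n"
)
wide = pd.read_csv(raw)

# Tidy: reshape from one column per year to one row per (city, year).
long = wide.melt(id_vars="city", var_name="year", value_name="temp_c")
long["year"] = long["year"].astype(int)

# Transform: drop missing measurements, then average per city.
summary = (
    long.dropna(subset=["temp_c"])
    .groupby("city", as_index=False)["temp_c"]
    .mean()
)
print(summary)
```

The resulting `summary` frame could then be passed to a plotting library such as matplotlib or seaborn for the Visualize stage.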

Jupyter Notebook vs JupyterLab

  • Jupyter Notebook is a web-based interactive computational environment for creating Jupyter notebook documents.

  • JupyterLab is the next-generation user interface, which includes notebooks. It has a modular structure in which several notebooks or other files (e.g. HTML, text, Markdown) can be opened as tabs in the same window, giving a more IDE-like experience.

