
Classic Data Science toolbox: EDA, preprocessing, Modeling etc.


Omerdan03/DanzDSTools


DanzDSTools

This repo contains basic tools for data science.

Follow the sklearn_roadmap

main page

SQL tools :


  • connect_to_mysql
  • send_statement
  • get_tables
  • get_columns
  • get_table_as_df
  • get_quarry
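
The helpers listed above are not documented here, so the following is only a rough sketch of the same workflow (connect, send a statement, list tables and columns, run a query). It assumes the standard-library `sqlite3` module as a stand-in for MySQL so that it runs without a server, and the names `list_tables`, `list_columns` and `run_query` are hypothetical, not this repo's API.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database (hypothetical helper)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def list_columns(conn, table):
    """Return the column names of one table (hypothetical helper)."""
    # Table names cannot be bound as parameters, so check them first.
    if table not in list_tables(conn):
        raise ValueError(f"unknown table: {table}")
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


def run_query(conn, sql, params=()):
    """Run a SELECT and return the rows as a list of dicts (hypothetical helper)."""
    cursor = conn.execute(sql, params)
    names = [description[0] for description in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)", [("Ada", 36), ("Linus", 28)]
)
conn.commit()

tables = list_tables(conn)
columns = list_columns(conn, "users")
adults = run_query(conn, "SELECT name FROM users WHERE age > ? ORDER BY name", (30,))
```

With a real MySQL server the connection object would come from a MySQL driver instead, and the table and column listing would query that server's own catalog.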

EDA tools :


Basic EDA workflow

get_numerical_categorical - basic separation of numerical and categorical features

tools for assumption testing (all_values_in_list and all_values_in_range)
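
As an illustration of the feature separation and the assumption checks described above, here is a minimal pandas sketch. The function names and signatures below are assumptions made for this example and may differ from the ones in this repo.

```python
import pandas as pd


def split_numerical_categorical(df):
    """Split column names by dtype: numeric vs. everything else (hypothetical helper)."""
    numerical = df.select_dtypes(include="number").columns.tolist()
    categorical = df.select_dtypes(exclude="number").columns.tolist()
    return numerical, categorical


def values_in_list(series, allowed):
    """True if every non-missing value is one of the allowed values."""
    return bool(series.dropna().isin(allowed).all())


def values_in_range(series, low, high):
    """True if every non-missing value lies in the closed interval [low, high]."""
    return bool(series.dropna().between(low, high).all())


# Toy data used purely for illustration.
df = pd.DataFrame(
    {
        "age": [23, 41, 35],
        "income": [31000.0, 52000.5, 47000.0],
        "city": ["Haifa", "Tel Aviv", "Haifa"],
    }
)

numerical, categorical = split_numerical_categorical(df)
cities_ok = values_in_list(df["city"], ["Haifa", "Tel Aviv"])
ages_ok = values_in_range(df["age"], 0, 120)
```

Checks like these are cheap to run before plotting or modeling, and a failed check usually points at a data-entry or parsing problem.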

plotting functions

  • multi numerical box
  • multi numerical EDA
  • multi categorical countplot
  • multi numerical corr
  • one numerical multi categorical KDE
  • numerical + categorical bars
  • numerical + numerical scatter
  • categorical + categorical

Preprocessing tools :


  • random split
  • one hot encoding
  • ordinal encoding
  • Normalize feature
  • Dimensionality Reduction with PCA
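
The preprocessing steps listed above can be sketched with scikit-learn as follows. This is only an illustration under assumptions: the toy columns are made up, standardization (`StandardScaler`) is used as the normalization step, and the repo's own functions may wrap these calls differently.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler

# Toy data used purely for illustration.
df = pd.DataFrame(
    {
        "size": ["small", "large", "medium", "small", "large", "medium", "small", "large"],
        "color": ["red", "blue", "green", "red", "blue", "green", "blue", "red"],
        "height": [150.0, 182.0, 171.0, 160.0, 190.0, 175.0, 155.0, 185.0],
        "weight": [50.0, 90.0, 72.0, 58.0, 95.0, 77.0, 52.0, 88.0],
    }
)

# Random split: 75% train, 25% test.
train, test = train_test_split(df, test_size=0.25, random_state=0)

# One-hot encoding for an unordered category (fit on the training rows only).
onehot = OneHotEncoder(handle_unknown="ignore")
color_train = onehot.fit_transform(train[["color"]]).toarray()

# Ordinal encoding for an ordered category, with the order given explicitly.
ordinal = OrdinalEncoder(categories=[["small", "medium", "large"]])
size_train = ordinal.fit_transform(train[["size"]])

# Standardize numeric features to zero mean and unit variance.
scaler = StandardScaler()
numeric_train = scaler.fit_transform(train[["height", "weight"]])

# Reduce the two numeric features to a single principal component.
pca = PCA(n_components=1)
reduced_train = pca.fit_transform(numeric_train)

# The fitted transformers are reused, not refitted, on the test rows.
numeric_test = scaler.transform(test[["height", "weight"]])
reduced_test = pca.transform(numeric_test)
```

Fitting every transformer on the training split and only applying it to the test split avoids leaking test information into the preprocessing.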

Feature engineering :


  • feature selection
  • Dimensionality Reduction with PCA
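
One common way to do the feature selection mentioned above is univariate scoring. The sketch below assumes scikit-learn's `SelectKBest` with an ANOVA F-test on the bundled iris data; the method used in this repo may be different.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

data = load_iris()

# Keep the two features with the highest ANOVA F-score against the target.
selector = SelectKBest(score_func=f_classif, k=2).fit(data.data, data.target)

# Map the boolean support mask back to feature names.
selected = [
    name for name, keep in zip(data.feature_names, selector.get_support()) if keep
]
X_selected = selector.transform(data.data)
```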

Model selection :


  • cross validation
  • grid search
  • random search
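
The three techniques listed above map onto standard scikit-learn utilities. The following sketch assumes a logistic regression on the iris data and an arbitrary grid for `C`; both are illustrative choices, not taken from this repo.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Cross validation: one accuracy score per fold.
scores = cross_val_score(model, X, y, cv=5)

# Grid search: try every value in the grid.
grid_values = [0.1, 1.0, 10.0]
grid = GridSearchCV(model, {"C": grid_values}, cv=5).fit(X, y)

# Random search: sample a fixed number of candidates from the search space.
random_values = [0.01, 0.1, 1.0, 10.0, 100.0]
random_search = RandomizedSearchCV(
    model, {"C": random_values}, n_iter=3, cv=5, random_state=0
).fit(X, y)

print("CV accuracy per fold:", scores)
print("Grid search best params:", grid.best_params_)
print("Random search best params:", random_search.best_params_)
```

Random search is usually preferred when the search space is large, since its cost is set by `n_iter` rather than by the size of the grid.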

Classification tools :


  • roc curve
  • confusion matrix
  • simple random forest classifier
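
A minimal sketch of the classification tools listed above, assuming scikit-learn and its bundled breast cancer dataset. It computes the ROC curve points and the confusion matrix for a random forest; plotting is left out to keep the example short.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Simple random forest classifier with default depth settings.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# ROC curve: needs the predicted probability of the positive class.
positive_proba = clf.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, positive_proba)
auc = roc_auc_score(y_test, positive_proba)

# Confusion matrix: rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, clf.predict(X_test))

print(f"AUC: {auc:.3f}")
print(cm)
```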

Regression tools :


  • Linear Regression
  • Polynomial Regression
  • print metrics
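
To illustrate the regression tools listed above, the sketch below fits a linear and a degree-2 polynomial model to synthetic quadratic data and prints two metrics for each. The data, the `report` helper and the choice of MSE and R² are assumptions for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: a noisy quadratic, used purely for illustration.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 60).reshape(-1, 1)
y = 0.5 * x.ravel() ** 2 - x.ravel() + 2 + rng.normal(scale=0.2, size=60)

linear = LinearRegression().fit(x, y)

# Polynomial regression = polynomial feature expansion + linear regression.
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(x, y)


def report(name, model):
    """Print and return MSE and R^2 on the fitted data (hypothetical helper)."""
    predictions = model.predict(x)
    mse = mean_squared_error(y, predictions)
    r2 = r2_score(y, predictions)
    print(f"{name}: MSE={mse:.3f}, R2={r2:.3f}")
    return mse, r2


linear_mse, linear_r2 = report("linear", linear)
poly_mse, poly_r2 = report("polynomial", poly)
```

The metrics here are computed on the training data for brevity; a held-out split would give a fairer comparison.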

Unsupervised learning :


  • number clusters (Elbow method)
  • print clusters on 2 PCA components
  • Anomaly detection (local outlier and Isolation forest)
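
The unsupervised steps listed above can be sketched with scikit-learn as follows. The synthetic blobs, the range of `k` and the 5% contamination rate are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# Synthetic data with three clusters in four dimensions.
X, _ = make_blobs(
    n_samples=300, centers=3, n_features=4, cluster_std=0.8, random_state=0
)

# Elbow method: record the within-cluster sum of squares for each k,
# then look for the k after which the curve flattens.
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in range(1, 7)
]

# Cluster with the chosen k and project onto two principal components,
# which gives 2D coordinates suitable for a scatter plot colored by label.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
coords = PCA(n_components=2).fit_transform(X)

# Anomaly detection: both estimators return -1 for outliers and 1 for inliers.
iso_flags = IsolationForest(contamination=0.05, random_state=0).fit_predict(X)
lof_flags = LocalOutlierFactor(n_neighbors=20, contamination=0.05).fit_predict(X)

print("Inertia per k:", np.round(inertias, 1))
print("Isolation forest outliers:", int((iso_flags == -1).sum()))
print("Local outlier factor outliers:", int((lof_flags == -1).sum()))
```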

Time_Series - work in progress :


  • A walkthrough on solving a time series problem.
  • ARIMA model
  • results evaluation
