@@ -59,10 +59,15 @@ data_with_outlier = pd.DataFrame({
5959 'SepalWidthCm': [1.4, 1.4, 1.3, 1.2, 1.2, 1.3, 1.6, 1.3],
6060 'PetalWidthCm': [0.2, 0.1, 30, 0.2, 0.3, 0.1, 0.4, 0.5]
6161 })
62+
63+ data_with_scale = pd.DataFrame({'SepalLengthCm': [1, 0, 0, 3, 4],
64+ 'SepalWidthCm': [4, 1, 1, 0, 1],
65+ 'PetalWidthCm': [2, 0, 0, 2, 1],
66+ 'Species': ['Iris-setosa', 'Iris-virginica', 'Iris-germanica', 'Iris-virginica', 'Iris-germanica']})
6267```
6368
6469The eda_utils_py will help you to:
65- - Diagnose data quality : Resolve skewed data by identifing missing data and outlier and provide corresponding remedy.
70+ - **Impute**: Resolve skewed data by identifying missing data and outliers and providing a corresponding remedy.
6671
6772``` python
6873imputer(data_with_NA)
@@ -71,14 +76,16 @@ Output:
7176
7277![imputer_output](images/imputer_output.png)
7378
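
As a rough illustration of what imputation means here, the sketch below fills missing values with each column's mean using plain pandas. It is not the package's `imputer` implementation; the `toy` frame and the choice of mean imputation are assumptions made only for this example.

```python
import pandas as pd

# Hypothetical toy data with gaps (not the README's data_with_NA).
toy = pd.DataFrame({
    'SepalLengthCm': [5.1, float('nan'), 4.7],
    'SepalWidthCm': [1.4, 1.4, float('nan')],
})

# Assumption: replace each missing value with the mean of its column.
# DataFrame.mean skips NaN by default, so the means come from observed values only.
filled = toy.fillna(toy.mean(numeric_only=True))
```

Median or most-frequent imputation follows the same pattern by swapping `mean` for another column statistic.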
79+ - **Identify Outliers**: Identify and deal with outliers in the dataset.
80+
7481``` python
7582outlier_identifier(data_with_outlier, method="median")
7683```
7784Output:
7885
7986![outlier_output](images/outlier_output.png)
8087
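
One common way to identify and replace outliers can be sketched as follows. This is not necessarily how `outlier_identifier` works internally: the modified z-score rule, the 3.5 threshold, and the helper name `flag_outliers` are all assumptions for illustration, and "replace with the column median" is only a guess at what `method="median"` implies.

```python
import pandas as pd

def flag_outliers(series, threshold=3.5):
    """Hypothetical helper: flag values by modified z-score (median and MAD)."""
    median = series.median()
    mad = (series - median).abs().median()
    if mad == 0:
        # No spread around the median, so nothing can be flagged.
        return pd.Series(False, index=series.index)
    modified_z = 0.6745 * (series - median).abs() / mad
    return modified_z > threshold

petal_width = pd.Series([0.2, 0.1, 30, 0.2, 0.3, 0.1, 0.4, 0.5])
flags = flag_outliers(petal_width)

# Assumed remedy: overwrite flagged values with the column median.
cleaned = petal_width.where(~flags, petal_width.median())
```

With the `PetalWidthCm` values from `data_with_outlier`, only the entry `30` is flagged.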
81- - This package can help you easily plot a correlation matrix along with its values to help explore data.
88+ - **Correlation Heatmap Plotting**: Easily plot a correlation matrix along with its values to help explore data.
8289
8390``` python
8491numerical_columns = ['SepalLengthCm', 'SepalWidthCm', 'PetalWidthCm']
@@ -90,8 +97,18 @@ Output:
9097
9198![cor_map_output](images/cor_map.output.png)
9299
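
The values shown in such a heatmap can be computed directly with pandas. The sketch below only builds the correlation matrix and leaves the plotting out; the use of Pearson correlation and the variable name `corr_matrix` are assumptions, not details taken from the package.

```python
import pandas as pd

data = pd.DataFrame({
    'SepalLengthCm': [1, 0, 0, 3, 4],
    'SepalWidthCm': [4, 1, 1, 0, 1],
    'PetalWidthCm': [2, 0, 0, 2, 1],
})
numerical_columns = ['SepalLengthCm', 'SepalWidthCm', 'PetalWidthCm']

# Pairwise Pearson correlations, rounded so they fit as cell labels.
corr_matrix = data[numerical_columns].corr(method="pearson").round(2)
```

Each cell holds the correlation between the row and column features, so the diagonal is always 1 and the matrix is symmetric.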
93- - Machine learning pereperation: Perform column transformations, derive scaler automatically to fulfill further machine learning need
94-
100+ - **Scaling**: Scale the data in preparation for future use in machine learning projects.
101+
102+ ``` python
103+ numerical_columns = ['SepalLengthCm', 'SepalWidthCm', 'PetalWidthCm']
104+
105+ scale(data, numerical_columns, scaler="minmax")
106+
107+ ```
108+ Output:
109+
110+ ![scale_output](images/scale_output.png)
111+
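
For readers unfamiliar with min-max scaling, here is a minimal sketch of the arithmetic, assuming the `"minmax"` option rescales each column to the range [0, 1]. The helper `minmax_scale` is hypothetical and is not the package's `scale` function.

```python
import pandas as pd

def minmax_scale(df, columns):
    """Hypothetical helper: return a copy of df with columns rescaled to [0, 1]."""
    scaled = df.copy()
    for col in columns:
        col_min = scaled[col].min()
        span = scaled[col].max() - col_min
        # A constant column has no spread; map it to 0.0 instead of dividing by zero.
        scaled[col] = 0.0 if span == 0 else (scaled[col] - col_min) / span
    return scaled

data_with_scale = pd.DataFrame({
    'SepalLengthCm': [1, 0, 0, 3, 4],
    'SepalWidthCm': [4, 1, 1, 0, 1],
    'PetalWidthCm': [2, 0, 0, 2, 1],
})
scaled = minmax_scale(data_with_scale, ['SepalLengthCm', 'SepalWidthCm', 'PetalWidthCm'])
```

Each value becomes `(x - min) / (max - min)`, so the smallest entry in a column maps to 0 and the largest to 1.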
95112## Documentation
96113
97114The official documentation is hosted on Read the Docs: https://eda_utils_py.readthedocs.io/en/latest/