Commit 8add29c

Fine tune readme and add scale example
1 parent 14dc655 commit 8add29c

1 file changed: +21 -4 lines changed

README.md

@@ -59,10 +59,15 @@ data_with_outlier = pd.DataFrame({
 'SepalWidthCm':[1.4, 1.4, 1.3, 1.2, 1.2, 1.3, 1.6, 1.3],
 'PetalWidthCm':[0.2, 0.1, 30, 0.2, 0.3, 0.1, 0.4, 0.5]
 })
+
+data_with_scale = pd.DataFrame({'SepalLengthCm':[1, 0, 0, 3, 4],
+'SepalWidthCm':[4, 1, 1, 0, 1],
+'PetalWidthCm':[2, 0, 0, 2, 1],
+'Species':['Iris-setosa','Iris-virginica', 'Iris-germanica', 'Iris-virginica','Iris-germanica']})
 ```
 
 The eda_utils_py will help you to:
-- Diagnose data quality: Resolve skewed data by identifing missing data and outlier and provide corresponding remedy.
+- **Impute**: Resolve skewed data by identifying missing data and outlier and provide corresponding remedy.
 
 ```python
 imputer(data_with_NA)
@@ -71,14 +76,16 @@ Output:
 
 ![imputer_output](images/imputer_output.png)
 
+- **Identify Outliers**: Identify and deal with outliers in the dataset.
+
 ```python
 outlier_identifier(data_with_outlier, method = "median")
 ```
 Output:
 
 ![outlier_output](images/outlier_output.png)
 
-- This package can help you easily plot a correlation matrix along with its values to help explore data.
+- **Correlation Heatmap Plotting**: Easily plot a correlation matrix along with its values to help explore data.
 
 ```python
 numerical_columns = ['SepalLengthCm','SepalWidthCm','PetalWidthCm']
@@ -90,8 +97,18 @@ Output:
 
 ![cor_map_output](images/cor_map.output.png)
 
-- Machine learning pereperation: Perform column transformations, derive scaler automatically to fulfill further machine learning need
-
+- **Scaling**: Scale the data in preperation for future use in machine learning projects.
+
+```python
+numerical_columns = ['SepalLengthCm','SepalWidthCm','PetalWidthCm']
+
+scale(data, numerical_columns, scaler="minmax")
+
+```
+Output:
+
+![scale_output](images/scale_output.png)
+
 ## Documentation
 
 The official documentation is hosted on Read the Docs: https://eda_utils_py.readthedocs.io/en/latest/
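
Note on the new scale example: the commit shows only the call `scale(data, numerical_columns, scaler="minmax")` and a screenshot of the result, so the sketch below illustrates what min-max scaling conventionally does, using plain pandas on the `data_with_scale` frame added in this commit. It assumes that `data` in the README call refers to a frame like `data_with_scale`; the diff does not confirm this. `minmax_scale_sketch` is a hypothetical helper written for this note and is not the package's `scale` implementation, whose behavior may differ.

```python
import pandas as pd

# Same frame the commit adds to the README.
data_with_scale = pd.DataFrame({
    'SepalLengthCm': [1, 0, 0, 3, 4],
    'SepalWidthCm': [4, 1, 1, 0, 1],
    'PetalWidthCm': [2, 0, 0, 2, 1],
    'Species': ['Iris-setosa', 'Iris-virginica', 'Iris-germanica',
                'Iris-virginica', 'Iris-germanica'],
})
numerical_columns = ['SepalLengthCm', 'SepalWidthCm', 'PetalWidthCm']


def minmax_scale_sketch(df, columns):
    """Hypothetical helper: rescale each listed column to the range [0, 1].

    Uses (x - min) / (max - min) per column. A constant column would
    produce NaN here because max - min is zero.
    """
    scaled = df.copy()  # leave the caller's frame untouched
    col_min = df[columns].min()
    col_max = df[columns].max()
    scaled[columns] = (df[columns] - col_min) / (col_max - col_min)
    return scaled


scaled = minmax_scale_sketch(data_with_scale, numerical_columns)
print(scaled)
```

Non-numeric columns such as `Species` pass through unchanged, since only the columns named in `numerical_columns` are rescaled.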
