Commit 0894aca

add specification for imputer method
1 parent e2c45c9 commit 0894aca

1 file changed: +24 -0 lines changed

eda_utils_py/eda_utils_py.py

Lines changed: 24 additions & 0 deletions
@@ -1,3 +1,27 @@
+def imputer(dataframe, strategy="mean", fill_value=None):
+    """
+    A function to impute missing values in a dataframe.
+
+    Parameters
+    ----------
+    dataframe : pandas.DataFrame
+        A dataframe that might contain missing data.
+    strategy : string, default="mean"
+        The imputation strategy.
+        - If "mean", then replace missing values using the mean along each column. Can only be used with numeric data.
+        - If "median", then replace missing values using the median along each column. Can only be used with numeric data.
+        - If "most_frequent", then replace missing values using the most frequent value along each column. Can be used with strings or numeric data. If there is more than one such value, only the smallest is used.
+        - If "constant", then replace missing values with fill_value. Can be used with strings or numeric data.
+    fill_value : string or numerical value, default=None
+        When strategy == "constant", fill_value is used to replace all occurrences of missing values. If left at the default (None), fill_value will be 0 when imputing numerical data and "missing_value" for strings or object data types.
+
+    Returns
+    -------
+    pandas.DataFrame
+        A dataframe that contains no missing data.
+    """
+
+
 def cor_map(dataframe, num_col):
     """
     A function to implement a correlation heatmap including coefficients based on given numeric columns of a data frame.
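
This commit adds only the specification for `imputer`, with no function body yet. As a rough illustration of what the docstring describes, the four strategies could be sketched as follows. The name `imputer_sketch` is hypothetical, and the error handling (raising `ValueError` or `TypeError`) and the decision to skip all-missing columns under "most_frequent" are assumptions, not part of the specification.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def imputer_sketch(dataframe, strategy="mean", fill_value=None):
    """Hypothetical sketch of the behaviour described in the imputer docstring."""
    if strategy not in ("mean", "median", "most_frequent", "constant"):
        # Assumption: the spec does not say how unknown strategies are handled.
        raise ValueError(f"Unsupported strategy: {strategy!r}")

    result = dataframe.copy()  # leave the caller's dataframe untouched
    for col in result.columns:
        series = result[col]
        numeric = is_numeric_dtype(series)

        if strategy in ("mean", "median"):
            if not numeric:
                # Assumption: non-numeric columns are rejected rather than skipped.
                raise TypeError(f"Column {col!r} is not numeric")
            value = series.mean() if strategy == "mean" else series.median()
        elif strategy == "most_frequent":
            modes = series.mode()  # missing values are ignored by default
            if modes.empty:
                continue  # column is entirely missing; nothing to impute from
            value = modes.min()  # ties are resolved by taking the smallest mode
        else:  # strategy == "constant"
            if fill_value is not None:
                value = fill_value
            else:
                value = 0 if numeric else "missing_value"

        result[col] = series.fillna(value)
    return result


if __name__ == "__main__":
    df = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", None, "x"]})
    print(imputer_sketch(df, strategy="most_frequent"))
    print(imputer_sketch(df, strategy="constant"))
```

Working column by column keeps the numeric and string cases separate, which is what lets the "constant" default differ by dtype as the docstring requires.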
