@greglandrum
Member

Things in this PR:

  • Get the scripts that operate on datasets II working in Python 3
  • Add scorers for XGB, balanced random forests, and LMNB
  • Swap the RF scorer to use vanilla scikit-learn RFs instead of our monkey-patched implementation of balanced random forests.
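
For illustration, here is a minimal sketch of what a scorer built on a plain scikit-learn random forest might look like. It is not code from this PR: the function name `rf_score`, the hyperparameters, and the toy fingerprints are all hypothetical, and it assumes the scorer ranks test molecules by their predicted probability of being active (label 1).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rf_score(train_fps, train_labels, test_fps, n_estimators=100, seed=42):
    """Rank test items by predicted probability of the active class (label 1).

    Hypothetical helper for illustration; not the scorer used in the PR.
    """
    # Plain scikit-learn random forest, with no patched bootstrap balancing.
    clf = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
    clf.fit(train_fps, train_labels)
    # predict_proba columns follow clf.classes_, so look up the column for label 1.
    active_col = list(clf.classes_).index(1)
    probs = clf.predict_proba(test_fps)[:, active_col]
    # Indices of the test items, most likely active first.
    order = np.argsort(-probs)
    return order, probs


# Toy data (made up): 16-bit "fingerprints" where bit 0 marks the actives.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(40, 16))
y = X[:, 0].copy()
order, probs = rf_score(X[:30], y[:30], X[30:])
```

The returned `order` can be fed to whatever ranking metric the scoring script computes; fixing `random_state` keeps repeated runs comparable.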

Notes:

  • I have not done as much work on the datasets I scripts. With some years of perspective, those datasets are less interesting and useful, so I don't feel strongly compelled to spend time on them.
  • There's significant room for refactoring and removing duplicate code in the scoring scripts. I'll think about doing this.
  • The scoring scripts are quite verbose in their output (generating huge amounts of data). I think it would be worth making the output more compact, but that's a longer-term project.

@greglandrum greglandrum marked this pull request as draft December 19, 2022 16:47
@greglandrum
Member Author

@sriniker: if you have the time and inclination to look at this, I'd love your comments. I have a bit more work to do before marking it as "done", but I wanted to give you a heads-up.
