Proposed changes for test_metrics.py #1577

Conversation
Wow, seems really good, thank you :)

`black` does elongate everything quite a bit, which is a bit annoying. There are workarounds, e.g.:
```python
(
    np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]),
    "roc_auc",
    sklearn.metrics.roc_auc_score,
    *(1, 0, -1, -1.0),
),
```
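For context, here is a minimal sketch of how such an entry could be consumed by `pytest.mark.parametrize`; the parameter names and the meaning of the trailing scalars are illustrative only, not taken from the actual `test_metrics.py`:

```python
# Illustrative sketch only: parameter names and the trailing scalars
# (unpacked from a tuple to keep black from stretching the entry over
# many lines) do not necessarily match test_metrics.py.
import numpy as np
import pytest
import sklearn.metrics


@pytest.mark.parametrize(
    "y_pred, name, score_func, a, b, c, d",
    [
        (
            np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]),
            "roc_auc",
            sklearn.metrics.roc_auc_score,
            # Star-unpacking keeps the scalar arguments on a single line.
            *(1, 0, -1, -1.0),
        ),
    ],
)
def test_parametrize_entry_unpacks(y_pred, name, score_func, a, b, c, d):
    # Trivial checks so the sketch runs end to end.
    assert y_pred.shape == (4, 2)
    assert callable(score_func)
```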
Seeing all these tests like this, it seems like we require a lot of redundant info:

- The `optimum`, `worst_possible_result` and `sign` can all be captured by something like `bounds = (optimum, worst_possible_result)`, where you can infer `sign` from the ordering, i.e. whether `optimum > worst_possible_result` (see the sketch after this list).
- Looking at how most of this test setup seems to be just reconstructing the metrics that already exist, we could just use the ones we already have defined. It should make the `parametrize`s a good bit shorter, remove the `make_scorer` calls everywhere, and make the tests much more readable!
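As a rough illustration of the `bounds` idea from the first bullet (the `infer_sign` helper below is hypothetical, not an existing auto-sklearn function):

```python
# Hypothetical helper, not part of auto-sklearn: derive the scorer sign
# from the ordering of (optimum, worst_possible_result).
from typing import Tuple


def infer_sign(bounds: Tuple[float, float]) -> int:
    optimum, worst_possible_result = bounds
    # Greater is better when the optimum exceeds the worst possible result.
    return 1 if optimum > worst_possible_result else -1


# Accuracy-like metric: best 1.0, worst 0.0 -> sign +1.
assert infer_sign((1.0, 0.0)) == 1
# Loss-like metric: best 0.0, worse values are larger -> sign -1.
assert infer_sign((0.0, 100.0)) == -1
```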
The first part would be for another PR, and if you want to work on making that simpler, it would be a nice simple contribution too. The second part would be pretty nice to have for this PR :)
I'll run the tests now!
Codecov Report

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##           development    #1577     +/-  ##
==============================================
- Coverage       84.75%   84.49%   -0.27%
==============================================
  Files             157      155       -2
  Lines           11981    11898      -83
  Branches         2066     2058       -8
==============================================
- Hits            10155    10053     -102
- Misses           1278     1281       +3
- Partials          548      564      +16
```
Hi @shantam-8,

It looks good to me now, sorry for the delayed responses, we had some deadlines coming up for other papers that kept us away from auto-sklearn! I pinged @mfeurer to have a quick look.

Best,
Hey, that looks good. Could you maybe add a comment stating that we test the assembled scorers instead of assembling the scorer objects ourselves in the test?
Besides that, this looks great. Thanks a lot!
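For illustration, a sketch of what such a test (plus the requested comment) might look like, assuming `autosklearn.metrics` exposes pre-assembled `Scorer` objects such as `accuracy` and `balanced_accuracy` that are callable as `scorer(y_true, y_pred)`:

```python
import numpy as np
import pytest

# Assumption: these pre-assembled Scorer objects exist in autosklearn.metrics.
from autosklearn.metrics import accuracy, balanced_accuracy


# Note: we test the already-assembled scorers shipped with auto-sklearn
# instead of assembling the scorer objects ourselves in the test.
@pytest.mark.parametrize("scorer", [accuracy, balanced_accuracy])
def test_assembled_scorer_on_perfect_predictions(scorer):
    y_true = np.array([0, 1, 0, 1])
    # Perfect predictions should reach the optimum of 1.0 for both metrics.
    assert scorer(y_true, y_true) == pytest.approx(1.0)
```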
Hello @eddiebergman @mfeurer, sorry to reply so late. I had made a few more updates to
Hi @shantam-8,

I looked at the

The other ones look fine to me.

Edit: I think mfeurer is busy for the next few days, so after deleting that class, if all the tests pass, I think this is ready to merge :)
Not sure why the metadata generation tests are failing, to be honest. I'll try rerunning them.
Sorry to keep it going, I guess I didn't unfold the
While the long test list does contain metrics like
Sure, we can just keep them in instead of adding them to the huge list then :)
Hey @shantam-8,

Very sorry for the slow feedback loop on all of this, we've been pretty busy recently! Many thanks for the contribution and we'll be more responsive in the coming months :)

Best,
This PR (part of #1351) proposes changes for `_PredictScorer`, `_ProbaScorer`, and `_ThresholdScorer` in `test_metrics.py`. Following the guidance provided, I have used pytest to make the tests as flexible as possible. Please let me know if I need to add or delete anything else, and I would be glad to do the same for the rest of `test_metrics.py`. Thanks a lot!
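To make the intent concrete, below is a hedged sketch of this parametrized style covering all three scorer types. The `make_scorer` signature used here (metric name first, then the sklearn score function, plus `needs_proba` / `needs_threshold` flags) is my assumption about the `autosklearn.metrics` API and should be checked against the actual code:

```python
import numpy as np
import pytest
import sklearn.metrics

# Assumption: make_scorer in autosklearn.metrics accepts the metric name,
# the sklearn score function, and needs_proba / needs_threshold flags.
from autosklearn.metrics import make_scorer


@pytest.mark.parametrize(
    "scorer, y_pred",
    [
        # _PredictScorer: consumes hard class predictions.
        (
            make_scorer("accuracy", sklearn.metrics.accuracy_score),
            np.array([0, 1, 0, 1]),
        ),
        # _ProbaScorer: consumes class probabilities.
        (
            make_scorer(
                "log_loss",
                sklearn.metrics.log_loss,
                optimum=0,
                greater_is_better=False,
                needs_proba=True,
            ),
            np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]),
        ),
        # _ThresholdScorer: consumes decision values for the positive class.
        (
            make_scorer(
                "roc_auc",
                sklearn.metrics.roc_auc_score,
                needs_threshold=True,
            ),
            np.array([0.0, 1.0, 0.0, 1.0]),
        ),
    ],
)
def test_each_scorer_type_accepts_its_input(scorer, y_pred):
    y_true = np.array([0, 1, 0, 1])
    # Each scorer type should accept its expected prediction format and
    # return a finite float.
    assert np.isfinite(scorer(y_true, y_pred))
```

Keeping the expected prediction format next to each scorer in the parametrization is meant to make it obvious which input each scorer family consumes, without rebuilding the metrics by hand in every test.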