Adjust variance implementation and add new default hybrid algorithm#465
Adjust variance implementation and add new default hybrid algorithm#465jmh530 wants to merge 3 commits into
Conversation
5e67136 to
9308465
Compare
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #465 +/- ##
==========================================
+ Coverage 91.91% 91.94% +0.02%
==========================================
Files 79 79
Lines 18799 18890 +91
==========================================
+ Hits 17279 17368 +89
- Misses 1520 1522 +2 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
@jmh530 please find someone for review, and if it is OK, feel free to merge as long as it doesn't break API |
|
@9il I'm not sure who would want to review. If you're aware of anyone, please let them know. At this point the PR is like 3 years old, so I can't recall if there is any API breakage at the function level, but there likely is some breakage at the accumulator level. People probably shouldn't be depending on that low level of functionality not breaking, IMO. I rarely ever use the mir-algorithm version of these functions. This PR was more about bringing the functionality in line with the version in mir-stat. It's an extra burden to maintain two versions. I would rather just a) publicly import the relevant functions from mir-stat, or b) deprecate the whole package and tell people to use mir-stat instead. |
This makes some adjustments to how variance is implemented and the unit tests and coverage. The more significant change is a new default hybrid algorithm that combines the nice parts about the twoPass and online algorithm. For the purpose of the variance calculation, it is basically the same as twoPass, but offers more flexibility for VarianceAccumulator.
The online algorithm turns out to be much slower than I expected in benchmarks.