Commit e8f7a978 authored by Andrew Jeffery

strgrp: Add cosine fudge-curve to unify filter comparison spaces

If we are to use should_grp_score_cos(x,y) as a filter then the following
relationship must hold (from least to most expensive):

        should_grp_score_len(x,y)
                >= should_grp_score_cos(x,y)
                >= grp_score(x)

should_grp_score_cos(x,y) wasn't holding up its part of the bargain, so
real data was used to generate a fudge curve to bring
should_grp_score_cos(x,y) results into the same space. Really this is a
terrible hack and the problem needs more thought. Evaluation of
should_grp_score_cos(x,y)'s performance benefit (given the relaxation of
the filter under the fudge curve) is sorely needed.
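
The ordering requirement above can be illustrated with a minimal sketch of a
filter cascade. Everything here is hypothetical and not taken from strgrp:
score_fn, cascade_accepts and the demo_* scorers with their fixed values are
stand-ins chosen only to show why each cheaper score has to be an upper bound
on the next one. If a cheap stage could score below a more expensive stage,
it would reject candidates that the full score would have accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stage type: any scorer over two strings. */
typedef double (*score_fn)(const char *x, const char *y);

/*
 * Run the stages from cheapest to most expensive and stop at the first
 * score below the threshold. This is only sound when every stage's score
 * is >= the score of the stage after it; otherwise an early stage can
 * produce false rejections.
 */
static bool
cascade_accepts(const score_fn *stages, size_t n, double threshold,
                const char *x, const char *y, size_t *evaluated)
{
    assert(evaluated != NULL);
    *evaluated = 0;
    for (size_t i = 0; i < n; i++) {
        (*evaluated)++;
        if (stages[i](x, y) < threshold)
            return false;
    }
    return true;
}

/* Stand-in scorers returning fixed values, purely for illustration. */
static double demo_len(const char *x, const char *y)
{ (void)x; (void)y; return 0.9; }
static double demo_cos(const char *x, const char *y)
{ (void)x; (void)y; return 0.7; }
static double demo_full(const char *x, const char *y)
{ (void)x; (void)y; return 0.6; }
```

With a threshold of 0.8 the middle stage rejects the pair and the most
expensive stand-in is never called; with a threshold of 0.5 all three stages
run.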
parent 911a66a7
@@ -74,10 +74,6 @@
  * License: LGPL
  * Author: Andrew Jeffery <andrew@aj.id.au>
  *
- * Ccanlint:
- * tests_pass FAIL
- * tests_pass_without_features FAIL
- *
  * Example:
  * FILE *f;
  * char *buf;
......
@@ -108,10 +108,18 @@ strcossim(const int16_t ref[CHAR_N_VALUES], const int16_t key[CHAR_N_VALUES]) {
 /* Low-cost filter functions */
+static inline double
+cossim_correction(const double s)
+{
+    return -((s - 0.5) * (s - 0.5)) + 0.33;
+}
 static inline bool
 should_grp_score_cos(const struct strgrp *const ctx,
         struct strgrp_grp *const grp, const char *const str) {
-    return ctx->threshold <= strcossim(ctx->pop, grp->pop);
+    const double s1 = strcossim(ctx->pop, grp->pop);
+    const double s2 = s1 + cossim_correction(s1);
+    return ctx->threshold <= s2;
 }
 static inline bool
......
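
To get a feel for how much the fudge curve relaxes the filter, the correction
can be evaluated in isolation. The sketch below copies cossim_correction from
the diff; corrected_cossim and passes_cos_filter are hypothetical helpers
introduced only for illustration, and the [0, 1] input range is an assumption
(it holds for cosine similarity of non-negative frequency vectors, which is
what the pop arrays appear to be).

```c
#include <assert.h>
#include <stdbool.h>

/* Correction term as it appears in the diff: an inverted parabola that
 * peaks at s = 0.5 with a height of 0.33. */
static inline double
cossim_correction(const double s)
{
    return -((s - 0.5) * (s - 0.5)) + 0.33;
}

/* Hypothetical helper: the value that is compared against the threshold. */
static inline double
corrected_cossim(const double s)
{
    /* Assumption: raw similarity lies in [0, 1]. */
    assert(s >= 0.0 && s <= 1.0);
    return s + cossim_correction(s);
}

/* Hypothetical helper mirroring the filter decision for a raw similarity. */
static inline bool
passes_cos_filter(const double threshold, const double s)
{
    return threshold <= corrected_cossim(s);
}
```

Over [0, 1] the correction is always positive, ranging from 0.08 at the
endpoints to 0.33 at s = 0.5, so the corrected value can exceed 1.0 (roughly
1.08 at s = 1). As an example, with a threshold of 0.85 a raw similarity of
0.6 now passes the filter (corrected value about 0.92), whereas the
uncorrected comparison would have rejected it.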