Hello. I'm having some trouble figuring out why I'm getting an unexpected number of samples from APP. I compute the expected number with:

    import quapy.functional as F
    F.num_prevalence_combinations(n_prevpoints, n_classes, n_repeats=10)

Here is a piece of my code:

    from quapy.method.aggregative import CC
    model = CC(newLR())
    for run, data in enumerate(qp.data.Dataset.kFCV(collection, nfolds=3, nrepeats=1, random_state=0)):
This code is a slight adaptation of an example from the QuaPy repository: https://github.com/HLT-ISTI/QuaPy/blob/master/examples/uci_experiments.py

My problem is that my code returns 17570 testing points, as verified by len(true_prevalences), rather than the expected 17710. Even if I set n_repeats=1, it still returns 1757 testing points (with 4 classes and n_prevalences=21). I'd be very grateful if someone could help me.
Replies: 1 comment 1 reply
You are right! The problem was that QuaPy checks for combinations of values that generate plausible prevalence vectors, i.e., prevalence vectors summing up to 1. Unfortunately, when working with floats, rounding errors sometimes accumulate and produce sums slightly greater than 1 (e.g., 1.000000001), which caused some valid combinations to be discarded. I have now fixed it in the master branch (thanks for noticing it!).
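To illustrate the effect, here is a minimal sketch in plain Python. It is not QuaPy's actual code: the helper `count_valid` and the tolerance value are assumptions made for the example. It compares the expected number of grid points for 4 classes and 21 prevalence points (a stars-and-bars count) with the number that survives a strict `sum <= 1` check and a check that allows a small tolerance.

```python
from itertools import product
from math import comb

n_classes = 4
n_prevpoints = 21

# Expected number of prevalence vectors on the grid (stars and bars).
expected = comb(n_prevpoints + n_classes - 2, n_classes - 1)

# Equally spaced grid 0.00, 0.05, ..., 1.00 built from floats.
step = 1 / (n_prevpoints - 1)
grid = [i * step for i in range(n_prevpoints)]


def count_valid(tolerance):
    """Hypothetical helper: count combinations for the first n_classes-1
    classes whose sum does not exceed 1 + tolerance (the last class takes
    whatever mass remains)."""
    return sum(
        1
        for combo in product(grid, repeat=n_classes - 1)
        if sum(combo) <= 1.0 + tolerance
    )


strict = count_valid(0.0)     # may drop vectors whose float sum lands just above 1
tolerant = count_valid(1e-9)  # absorbs accumulated rounding error

print(expected, strict, tolerant)
```

The tolerant count matches the stars-and-bars value, while the strict count can fall short whenever a sum that should equal 1 is rounded slightly above it. How many vectors the strict check loses depends on how the grid and the sums are computed, so it need not match the 1757 reported above.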
In any case, the APP protocol is falling into disuse in favor of more modern protocols like UPP. UPP implements the Kraemer sampling algorithm, which yields samples whose prevalence values are uniformly distributed. In UPP you can specify how many samples you want …
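For intuition, uniform sampling of prevalence vectors can be sketched as follows. This is a plain-Python illustration of the sorted-cut-points idea usually associated with Kraemer sampling, not QuaPy's implementation; `uniform_prevalence` is a hypothetical helper name.

```python
import random


def uniform_prevalence(n_classes, rng):
    """Draw one prevalence vector uniformly at random from the unit simplex."""
    # Sort n_classes-1 uniform cut points in [0, 1]; the gaps between
    # consecutive cut points (with 0 and 1 added at the ends) are the prevalences.
    cuts = sorted(rng.random() for _ in range(n_classes - 1))
    bounds = [0.0] + cuts + [1.0]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


rng = random.Random(0)  # fixed seed for reproducibility
samples = [uniform_prevalence(4, rng) for _ in range(100)]
print(samples[0], sum(samples[0]))
```

Unlike a grid-based protocol, the number of samples here is chosen directly (100 in the sketch) and does not grow combinatorially with the number of classes.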