Generic N-fold data partitioner.
Given a dataset with N chunks and cvtype = 1 (the default), it generates N partition sets. Each chunk is taken out in turn (with replacement, i.e. it is returned before the next chunk is taken out) to form the second partition, while all other samples together form the first partition. For example, if there are 4 chunks, the partition sets for cvtype = 1 are:
[[1, 2, 3], [0]]
[[0, 2, 3], [1]]
[[0, 1, 3], [2]]
[[0, 1, 2], [3]]
If cvtype > 1, all possible combinations of cvtype chunks are taken out. For cvtype = 2, the previous example yields:
[[2, 3], [0, 1]]
[[1, 3], [0, 2]]
[[1, 2], [0, 3]]
[[0, 3], [1, 2]]
[[0, 2], [1, 3]]
[[0, 1], [2, 3]]
Note that the “taken-out” partition is always labeled ‘2’ while the remaining elements are labeled ‘1’.
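The enumeration described above can be sketched in plain Python. This is only an illustration of the combinatorics, not the partitioner's actual implementation: the helper names `nfold_partition_sets` and `partition_labels` are hypothetical, and the sketch assumes chunks are identified by sortable values such as integers.

```python
from itertools import combinations


def nfold_partition_sets(chunks, cvtype=1):
    """Return [remaining, taken_out] pairs for every combination of
    `cvtype` unique chunks (hypothetical helper, for illustration only)."""
    unique_chunks = sorted(set(chunks))
    partition_sets = []
    for taken_out in combinations(unique_chunks, cvtype):
        # Everything that was not taken out forms the first partition.
        remaining = [c for c in unique_chunks if c not in taken_out]
        partition_sets.append([remaining, list(taken_out)])
    return partition_sets


def partition_labels(sample_chunks, taken_out):
    """Label each sample 2 if its chunk was taken out, otherwise 1."""
    return [2 if c in taken_out else 1 for c in sample_chunks]


# Eight samples, two per chunk, four chunks in total.
sample_chunks = [0, 0, 1, 1, 2, 2, 3, 3]
for remaining, taken_out in nfold_partition_sets(sample_chunks, cvtype=2):
    print([remaining, taken_out], partition_labels(sample_chunks, taken_out))
```

Because `itertools.combinations` emits combinations in lexicographic order, the sketch lists the taken-out chunks in the same order as the examples above. The number of partition sets is the binomial coefficient "N choose cvtype", so it grows quickly as cvtype increases.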
Notes
Available conditional attributes:
(Conditional attributes enabled by default are suffixed with +)
Parameters
cvtype : int
enable_ca : None or list of str
disable_ca : None or list of str
count : None or int
selection_strategy : str
attr : str
space : str
postproc : Node instance, optional
descr : str