datacapushe {capushe}    R Documentation
A data frame example for the capushe package, based on a simulated Gaussian mixture dataset in \R^3.
data(datacapushe)
A data frame with 50 rows (models) and the following 4 variables:
model: a character vector of model names.
pen: a numeric vector of model penalty shape values.
complexity: a numeric vector of model complexity values.
contrast: a numeric vector of model contrast values.
The simulated dataset is composed of n=1000 observations in \R^3. It is an equiprobable mixture of three large "bubble" groups centered at ν_1=(0,0,0), ν_2=(6,0,0) and ν_3=(0,6,0) respectively. Each bubble group j is itself simulated from a mixture of seven components with the following density:
x\in\R^3 \rightarrow 0.4 Φ(x|μ_1+ν_j, I_3) + ∑_{k=2}^{7} 0.1 Φ(x|μ_k+ν_j, 0.1 I_3)
with μ_1=(0,0,0), μ_2=(0,0,1.5), μ_3=(0,1.5,0), μ_4=(1.5,0,0), μ_5=(0,0,-1.5), μ_6=(0,-1.5,0) and μ_7=(-1.5,0,0). Thus the distribution of the dataset is actually a 21-component Gaussian mixture (three bubble groups of seven components each).
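The simulation design described above can be sketched in base R. This is only an illustrative sketch, not the code used to produce the packaged dataset: the seed and all variable names (nu, mu, w, s2, x) are assumptions, and the second argument of Φ is assumed to be a covariance matrix, so that 0.1 I_3 corresponds to a standard deviation of sqrt(0.1) per coordinate.

```r
# Sketch: draw n = 1000 points from the 21-component mixture described above.
set.seed(1)  # arbitrary seed (assumption), for reproducibility only
n <- 1000

# Bubble group centers nu_1, nu_2, nu_3 (one per row)
nu <- rbind(c(0, 0, 0), c(6, 0, 0), c(0, 6, 0))

# Within-bubble component means mu_1, ..., mu_7 (one per row)
mu <- rbind(c(0, 0, 0),
            c(0, 0, 1.5), c(0, 1.5, 0), c(1.5, 0, 0),
            c(0, 0, -1.5), c(0, -1.5, 0), c(-1.5, 0, 0))

w  <- c(0.4, rep(0.1, 6))  # within-bubble mixing weights
s2 <- c(1, rep(0.1, 6))    # spherical variances: I_3, then 0.1 * I_3 (assumed covariance)

j <- sample(1:3, n, replace = TRUE)            # equiprobable bubble labels
k <- sample(1:7, n, replace = TRUE, prob = w)  # component labels within each bubble

# Each observation = bubble center + component mean + spherical Gaussian noise.
# Multiplying the n x 3 matrix by a length-n vector scales row i by sqrt(s2[k[i]]).
noise <- matrix(rnorm(n * 3), ncol = 3) * sqrt(s2[k])
x <- nu[j, ] + mu[k, ] + noise
```

The within-bubble weights sum to one (0.4 plus six times 0.1), and crossing the three centers with the seven component means gives the 21 components mentioned above.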
A collection of spherical Gaussian mixture models is considered, and the data frame datacapushe contains the maximum likelihood estimations for each of these models.
The number of free parameters of each model is used as its complexity value, and pen_{shape} is defined as this complexity divided by n.
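As a rough illustration of how the complexity and pen columns relate, the sketch below counts free parameters for a K-component spherical Gaussian mixture in dimension 3 and divides by n. The helper n_free_params is hypothetical, and it assumes one variance per component. This page does not state which spherical parametrization was used, so the resulting values may differ from those stored in datacapushe. The model names are placeholders as well.

```r
# Hypothetical helper (not part of capushe): free-parameter count for a
# K-component spherical Gaussian mixture in dimension d, ASSUMING one
# variance per component.
n_free_params <- function(K, d = 3) {
  (K - 1) +  # mixing proportions (they sum to one)
    K * d +  # component means
    K        # one spherical variance per component (assumption)
}

n <- 1000                        # sample size of the simulated dataset
K <- 1:5                         # a few numbers of components, for illustration
complexity <- n_free_params(K)   # complexity = number of free parameters
pen <- complexity / n            # penalty shape = complexity / n

# Placeholder data frame with the same first three columns as datacapushe;
# the contrast column is omitted because it requires fitting each model.
toy <- data.frame(model = as.character(K), pen = pen, complexity = complexity)
```

Under this assumption a single-component model has 4 free parameters, so its penalty shape value would be 4/1000.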
datapartialcapushe and datavalidcapushe can be used to run the validation function. datapartialcapushe contains only the models with fewer than 21 components. datavalidcapushe contains three models, with 30, 40 and 50 components respectively.
http://www.math.univ-toulouse.fr/~maugis/CAPUSHE.html
Article: Baudry, J.-P., Maugis, C. and Michel, B. (2011) Slope heuristics: overview and implementation. Statistics and Computing, to appear. doi: 10.1007/s11222-011-9236-1
data(datacapushe)
capushe(datacapushe,n=1000)
## BIC, DDSE and Djump all three select the true model
plot(capushe(datacapushe))

## Validation:
data(datapartialcapushe)
capushepartial=capushe(datapartialcapushe)
data(datavalidcapushe)
validation(capushepartial,datavalidcapushe)
## The slope heuristics should not
## be applied for datapartialcapushe.