I'm using trainControl() from the caret package to run leave-one-out cross-validation (LOOCV) on the iris dataset.
When I call caret's train() with the formula set as shown below, I get this error:
Error in cut.default(y, breaks, include.lowest = TRUE) : invalid number of intervals
Code:
require(neuralnet)
require(nnet)
require(caret)
data <- iris
data <- cbind(data, class.ind(data$Species)) # one-hot encode the class labels
formula.bpn <- setosa+versicolor+virginica ~ Sepal.Length+Sepal.Width+Petal.Length+Petal.Width # build the formula for the BPN
# split the dataset 8:2 into a training set and a test set
smp.size <- floor(0.8*nrow(data))
set.seed(777)
train.ind <- sample(seq_len(nrow(data)), smp.size)
train <- data[train.ind,]
test <- data[-train.ind,]
# tuning parameters and training
train.control <- trainControl(method="LOOCV",
                              search="grid",
                              verboseIter=FALSE,
                              returnData=TRUE,
                              returnResamp="final",
                              savePredictions="final",
                              selectionFunction="best",
                              indexFinal=NULL,
                              allowParallel=TRUE
                              )
model <- train(form=formula.bpn,
               data=train,
               method="neuralnet",
               metric="RMSE",
               maximize=FALSE,
               trControl=train.control,
               tuneGrid=expand.grid(layer1=c(1:4), layer2=c(0:4), layer3=c(0)),
               na.action=na.omit,
               # learningrate=0.01,
               startweights=NULL,
               algorithm="rprop+",
               err.fct="sse",
               act.fct="logistic",
               threshold=0.01,
               stepmax=5e10,
               linear.output=FALSE
               )
The traceback is as follows:
11: stop("invalid number of intervals")
10: cut.default(y, breaks, include.lowest = TRUE)
9: cut(y, breaks, include.lowest = TRUE)
8: createFolds(outcome, n, returnTrain = TRUE)
7: make_resamples(trControl, outcome = y)
6: with_preserve_seed({
set_seed(list(seed = seed, rng_kind = rng_kind))
code
})
5: withr::with_seed(rs_seed, make_resamples(trControl, outcome = y))
4: train.default(x, y, weights = w, ...)
3: train(x, y, weights = w, ...)
2: train.formula(form = formula.bpn, data = train, method = "neuralnet",
metric = "RMSE", maximize = FALSE, trControl = train.control,
tuneGrid = expand.grid(.layer1 = c(1:4), .layer2 = c(0:4),
.layer3 = c(0)), na.action = na.omit, startweights = NULL,
algorithm = "rprop+", err.fct = "sse", act.fct = "logistic",
threshold = 0.01, stepmax = 5e+10, linear.output = FALSE)
1: train(form = formula.bpn, data = train, method = "neuralnet",
metric = "RMSE", maximize = FALSE, trControl = train.control,
tuneGrid = expand.grid(.layer1 = c(1:4), .layer2 = c(0:4),
.layer3 = c(0)), na.action = na.omit, startweights = NULL,
algorithm = "rprop+", err.fct = "sse", act.fct = "logistic",
threshold = 0.01, stepmax = 5e+10, linear.output = FALSE)
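If I change the formula so that it has only one response variable, the error does not occur, e.g. setosa ~ Sepal.Length+Sepal.Width+Petal.Length+Petal.Width
or versicolor ~ Sepal.Length+Sepal.Width+Petal.Length+Petal.Width
or virginica ~ Sepal.Length+Sepal.Width+Petal.Length+Petal.Width
However, I want the model to output the probabilities of all three classes from the four predictors, so what is causing this error?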
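One check that might narrow this down (a sketch under assumptions, not a confirmed diagnosis): the traceback ends in createFolds() calling cut(y, breaks, ...), so it may help to look at what y actually is for the three-response formula. The snippet below uses only base R plus nnet::class.ind(). The quantile step is my assumption about how createFolds() bins a numeric outcome; I have not verified it against the caret source.

```r
library(nnet)  # class.ind(), same as in the code above

data <- cbind(iris, class.ind(iris$Species))
f <- setosa + versicolor + virginica ~
  Sepal.Length + Sepal.Width + Petal.Length + Petal.Width

# model.frame() evaluates the left-hand side arithmetically, so the response
# is the row-wise SUM of the three indicator columns, not three columns.
y <- model.response(model.frame(f, data = data))
print(unique(y))

# Assumption: the numeric outcome is binned by quantiles before cut().
# With a constant response the breaks collapse to a single value, and
# cut() then reads that single number as an interval count.
breaks <- unique(quantile(y, probs = seq(0, 1, length.out = 2)))
res <- try(cut(y, breaks, include.lowest = TRUE), silent = TRUE)
print(res)
```

If y turns out to be constant here, that would be consistent with the single-response formulas working, since each indicator column on its own still takes two distinct values.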
I'd appreciate any pointers from anyone who has used caret for LOOCV!