I learn so much from each debugging session. This follows my previous post, where I hit the “new factor levels not present in the training data” error. Here’s some advice I copied from https://stat.ethz.ch/pipermail/r-sig-ecology/2008-September/000320.html, just so I know where to look next time.
And here’s more to read about randomForest from http://www.stat.berkeley.edu/~breiman/RandomForests/
Do str(indata) and str(test) give the same information regarding the types of variables? If any of the variables used are factors, do the factors have the same levels in indata and test?
I'd probably do this differently, and store the test and training data in the same df to start with, and then split it out at random into a training and test set object (or just use the indices on the main object depending on whether I want the training or test rows). This way, the variables will be the same type/format/structure as they came from the same df to begin with.
Also, I really don't follow your loop code. You seem to be indexing indata without reference to columns/rows in the first line within the loop. There also seem to be several syntax errors - too many "]"? So start simple, set y <- 7 and perform the first run of the loop "by hand" and once that works, then do the loop in full.
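For my own reference, here is a minimal sketch of the checks described above, in base R only. The data frame `full`, its columns (`y`, `site`, `depth`), the 14/6 split, and the separately built `test2` are all made up for illustration; only the names `indata` and `test` come from the quoted advice. It is one possible way to do it, not the method from the mailing list thread.

```r
# Hypothetical example data (assumption: not from the original thread).
set.seed(42)
full <- data.frame(
  y     = rnorm(20),
  site  = factor(rep(c("a", "b", "c", "d"), times = 5)),
  depth = runif(20)
)

# Split ONE data frame at random, so both parts inherit the same
# column types and the same factor levels.
train_idx <- sample(nrow(full), size = 14)
indata <- full[train_idx, ]
test   <- full[-train_idx, ]

# Check 1: do the columns have the same classes in both sets?
same_classes <- identical(sapply(indata, class), sapply(test, class))

# Check 2: do all factor columns have identical levels in both sets?
factor_cols <- names(full)[sapply(full, is.factor)]
same_levels <- all(sapply(factor_cols, function(v) {
  identical(levels(indata[[v]]), levels(test[[v]]))
}))

# What goes wrong when test data is built separately:
# "e" is a level the training data has never seen.
test2 <- data.frame(site = factor(c("a", "e")))
new_levels <- setdiff(levels(test2$site), levels(indata$site))

# One possible fix: re-level to the training levels.
# Values with unseen levels become NA and need handling before predict().
test2$site <- factor(test2$site, levels = levels(indata$site))
```

If `new_levels` is not empty, those are the values behind the “new factor levels” error. Running `str(indata)` and `str(test)` side by side shows the same information in a form that is easier to read by eye.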