Deep learning - H2O deeplearning checkpoint model
Folks,
I have a problem when I try to resume training of a checkpointed H2O deep learning model in R with a validation frame provided. It says "validation dataset must be the same as for the checkpointed model", but I believe I am using the same validation dataset. If I leave validation_frame blank, resuming from the checkpoint works fine. I attach the code below:
    library(h2o)
    localH2O <- h2o.init(nthreads = -1)

    # Load MNIST; column 785 is the digit label, so make it a factor
    train_image.hex <- read.csv("mnist_train.csv", header = FALSE)
    train_image.hex[,785] <- factor(train_image.hex[,785])
    train_image.hex <- as.h2o(train_image.hex)

    test_image.hex <- read.csv("mnist_test.csv", header = FALSE)
    test_image.hex[,785] <- factor(test_image.hex[,785])
    test_image.hex <- as.h2o(test_image.hex)

    mnist_model <- h2o.deeplearning(x = 1:784, y = 785,
                                    training_frame = train_image.hex,
                                    validation_frame = test_image.hex,
                                    activation = "RectifierWithDropout",
                                    hidden = c(500,1000),
                                    input_dropout_ratio = 0.2,
                                    hidden_dropout_ratios = c(0.5,0.5),
                                    adaptive_rate = TRUE,
                                    rho = 0.98,
                                    epsilon = 1e-7,
                                    l1 = 1e-8,
                                    l2 = 1e-7,
                                    max_w2 = 10,
                                    epochs = 10,
                                    export_weights_and_biases = TRUE,
                                    variable_importances = FALSE)

    h2o.saveModel(mnist_model, path = "/tmp", force = TRUE)
I then shut down H2O, quit R, and restart H2O in R to resume training, at which point H2O errors out:
    library(h2o)
    localH2O <- h2o.init(nthreads = -1)

    # Reload the same training and test data as before
    train_image.hex <- read.csv("mnist_train.csv", header = FALSE)
    train_image.hex[,785] <- factor(train_image.hex[,785])
    train_image.hex <- as.h2o(train_image.hex)

    test_image.hex <- read.csv("mnist_test.csv", header = FALSE)
    test_image.hex[,785] <- factor(test_image.hex[,785])
    test_image.hex <- as.h2o(test_image.hex)

    # Load the saved model and resume training from it
    startmodel <- h2o.loadModel("/tmp/DeepLearning_model_R_1443812402059_20", localH2O)

    mnist_model <- h2o.deeplearning(x = 1:784, y = 785,
                                    checkpoint = startmodel@model_id,
                                    training_frame = train_image.hex,
                                    validation_frame = test_image.hex,
                                    activation = "RectifierWithDropout",
                                    hidden = c(500,1000),
                                    input_dropout_ratio = 0.2,
                                    hidden_dropout_ratios = c(0.5,0.5),
                                    adaptive_rate = TRUE,
                                    rho = 0.98,
                                    epsilon = 1e-7,
                                    l1 = 1e-8,
                                    l2 = 1e-7,
                                    max_w2 = 10,
                                    epochs = 10,
                                    export_weights_and_biases = TRUE,
                                    variable_importances = FALSE)
Thank you for pointing this out to us. We have added a JIRA ticket, and you can track its progress here: https://0xdata.atlassian.net/browse/pubdev-2182
You can expect the problem to be fixed soon.
Thanks!
Avni
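
Until the fix is available, the workaround mentioned in the question (leaving validation_frame blank when resuming) could look roughly like the sketch below. It is not the official fix, only an illustration under a few assumptions: the names train_hex, test_hex, start_model and resumed are made up for this example, the model path is the one from the question, epochs is raised to 20 on the assumption that the total must exceed the epochs already trained, and h2o.loadModel is called with the path only (older releases also took a connection argument, as in the question). The test frame is scored separately with h2o.performance after training instead of being passed as validation_frame.

```r
library(h2o)
h2o.init(nthreads = -1)

# Same data preparation as in the question
train_df <- read.csv("mnist_train.csv", header = FALSE)
train_df[, 785] <- factor(train_df[, 785])
train_hex <- as.h2o(train_df)

test_df <- read.csv("mnist_test.csv", header = FALSE)
test_df[, 785] <- factor(test_df[, 785])
test_hex <- as.h2o(test_df)

# Load the model saved in the first session (path taken from the question)
start_model <- h2o.loadModel("/tmp/DeepLearning_model_R_1443812402059_20")

# Resume from the checkpoint WITHOUT a validation_frame;
# the remaining settings repeat those of the original model
resumed <- h2o.deeplearning(x = 1:784, y = 785,
                            checkpoint = start_model@model_id,
                            training_frame = train_hex,
                            activation = "RectifierWithDropout",
                            hidden = c(500, 1000),
                            input_dropout_ratio = 0.2,
                            hidden_dropout_ratios = c(0.5, 0.5),
                            adaptive_rate = TRUE,
                            rho = 0.98,
                            epsilon = 1e-7,
                            l1 = 1e-8,
                            l2 = 1e-7,
                            max_w2 = 10,
                            epochs = 20,  # assumption: must exceed epochs already trained
                            export_weights_and_biases = TRUE,
                            variable_importances = FALSE)

# Score the held-out test frame after training instead of during it
perf <- h2o.performance(resumed, test_hex)
print(perf)
```

The trade-off is that no validation metrics are recorded in the scoring history while training runs; the test set error is only available from the final h2o.performance call.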