How do I train from scratch on the CPU in TensorFlow?

python tensorflow

1042 views

1 answer

55 author reputation

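I am trying to train MobileNet on MNIST. I started by installing the models git repo and, after getting the dataset, followed the remaining installation steps. I then converted MNIST to TFRecord format. When I ran train_image_classifier.py from the slim folder of the models repository, I got the log below. (Note: I am using Anaconda Python, aliased as boa, alongside the stock Python.)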

boa train_image_classifier.py --train_dir=${TRAIN_DIR} --dataset_name=mnist --dataset_split_name=train --dataset_dir=${DATASET_DIR} --model_name=mobilenet_v1

WARNING:tensorflow:From train_image_classifier.py:468: softmax_cross_entropy (from tensorflow.contrib.losses.python.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.softmax_cross_entropy instead. Note that the order of the logits and labels arguments has been changed.
WARNING:tensorflow:From /opt/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/losses/python/losses/loss_ops.py:398: compute_weighted_loss (from tensorflow.contrib.losses.python.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.compute_weighted_loss instead.
WARNING:tensorflow:From /opt/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/losses/python/losses/loss_ops.py:151: add_loss (from tensorflow.contrib.losses.python.losses.loss_ops) is deprecated and will be removed after 2016-12-30.
Instructions for updating:
Use tf.losses.add_loss instead.
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-09-14 11:23:12.377137: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-14 11:23:12.377158: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-14 11:23:12.377162: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-14 11:23:12.377165: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-14 11:23:12.377169: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-09-14 11:23:14.890614: I tensorflow/core/common_runtime/simple_placer.cc:669] Ignoring device specification /device:GPU:0 for node 'fifo_queue_Dequeue' because the input edge from 'prefetch_queue/fifo_queue' is a reference connection and already has a device field set to /device:CPU:0
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Cannot assign a device to node 'gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/BroadcastGradientArgs': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
  [[Node: gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _device="/device:GPU:0"](gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/Shape, gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/Shape_1)]]

Caused by op u'gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/BroadcastGradientArgs', defined at:
  File "train_image_classifier.py", line 574, in <module>
    tf.app.run()
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train_image_classifier.py", line 534, in main
    var_list=variables_to_train)
  File "/home/csb/path/to/projects/RnD/mobilenet/tensorflow_models/slim/deployment/model_deploy.py", line 297, in optimize_clones
    optimizer, clone, num_clones, regularization_losses, **kwargs)
  File "/home/csb/path/to/projects/RnD/mobilenet/tensorflow_models/slim/deployment/model_deploy.py", line 261, in _optimize_clone
    clone_grad = optimizer.compute_gradients(sum_loss, **kwargs)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 386, in compute_gradients
    colocate_gradients_with_ops=colocate_gradients_with_ops)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 560, in gradients
    grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 368, in _MaybeCompile
    return grad_fn()  # Exit early
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 560, in <lambda>
    grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/math_grad.py", line 609, in _SubGrad
    rx, ry = gen_array_ops._broadcast_gradient_args(sx, sy)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 411, in _broadcast_gradient_args
    name=name)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

...which was originally created as op 'MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub', defined at:
  File "train_image_classifier.py", line 574, in <module>
    tf.app.run()
[elided 0 identical lines from previous traceback]
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train_image_classifier.py", line 474, in main
    clones = model_deploy.create_clones(deploy_config, clone_fn, [batch_queue])
  File "/home/csb/path/to/projects/RnD/mobilenet/tensorflow_models/slim/deployment/model_deploy.py", line 193, in create_clones
    outputs = model_fn(*args, **kwargs)
  File "train_image_classifier.py", line 457, in clone_fn
    logits, end_points = network_fn(images)
  File "/home/csb/path/to/projects/RnD/mobilenet/tensorflow_models/slim/nets/nets_factory.py", line 114, in network_fn
    return func(images, num_classes, is_training=is_training)
  File "/home/csb/path/to/projects/RnD/mobilenet/tensorflow_models/slim/nets/mobilenet_v1.py", line 323, in mobilenet_v1
    conv_defs=conv_defs)
  File "/home/csb/path/to/projects/RnD/mobilenet/tensorflow_models/slim/nets/mobilenet_v1.py", line 232, in mobilenet_v1_base
    scope=end_point)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
    return func(*args, **current_args)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 927, in convolution
    outputs = normalizer_fn(outputs, **normalizer_params)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
    return func(*args, **current_args)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 528, in batch_norm
    outputs = layer.apply(inputs, training=is_training)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 320, in apply
    return self.__call__(inputs, **kwargs)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 290, in __call__
    outputs = self.call(inputs, **kwargs)

InvalidArgumentError (see above for traceback): Cannot assign a device to node 'gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/BroadcastGradientArgs': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
  [[Node: gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _device="/device:GPU:0"](gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/Shape, gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/Shape_1)]]

Traceback (most recent call last):
  File "train_image_classifier.py", line 574, in <module>
    tf.app.run()
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train_image_classifier.py", line 570, in main
    sync_optimizer=optimizer if FLAGS.sync_replicas else None)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 725, in train
    master, start_standard_services=False, config=session_config) as sess:
  File "/opt/anaconda2/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 960, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 788, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 949, in managed_session
    start_standard_services=start_standard_services)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 706, in prepare_or_wait_for_session
    init_feed_dict=self._init_feed_dict, init_fn=self._init_fn)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/training/session_manager.py", line 262, in prepare_session
    sess.run(init_op, feed_dict=init_feed_dict)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device to node 'gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/BroadcastGradientArgs': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
  [[Node: gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _device="/device:GPU:0"](gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/Shape, gradients/MobilenetV1/MobilenetV1/Conv2d_0/BatchNorm/moments/sufficient_statistics/Sub_grad/Shape_1)]]

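I later found out that I get this error because there is no GPU on the system. Can't we train in TensorFlow without a GPU? If training on the CPU is possible, please tell me what changes to make in the code.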

Author: Bhaskar Chakradhar. Posted: September 15, 2017

Answer 1


1

55 author reputation

Accepted answer

model_deploy.py in the deployment/ folder creates its clones on the GPU by default. To create the clones on the CPU instead, we need to say so explicitly with the flag --clone_on_cpu=True. The command therefore becomes
python train_image_classifier.py --train_dir=${TRAIN_DIR} --dataset_name=mnist --dataset_split_name=train --dataset_dir=${DATASET_DIR} --model_name=mobilenet_v1 --clone_on_cpu=True
(I also got rid of the boa alias.) However, we cannot train MNIST with MobileNet, so the command that actually works is
python train_image_classifier.py --train_dir=${TRAIN_DIR} --dataset_name=mnist --dataset_split_name=train --dataset_dir=${DATASET_DIR} --model_name=lenet --clone_on_cpu=True
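
To illustrate why the flag makes the error go away, here is a minimal pure-Python sketch of the device choice. It is a simplified stand-in and not the actual code in model_deploy.py: the function names clone_device and is_available are hypothetical, and the matching rule in is_available is an assumption made for illustration. The device strings are the ones that appear in the log above.

```python
def clone_device(clone_index, clone_on_cpu=False):
    """Return the device string a model clone would be pinned to.

    Hypothetical, simplified stand-in for the deployment logic.
    """
    if clone_on_cpu:
        # With the flag set, every clone is pinned to the single CPU device.
        return "/device:CPU:0"
    # Without the flag, clone i is pinned to GPU i.
    return "/device:GPU:%d" % clone_index


def is_available(device, available):
    """Check a requested device against the devices registered in the process.

    Assumed rule: case-insensitive match on the trailing "/type:index" part.
    """
    suffix = device.lower().replace("/device:", "/")
    return any(name.lower().endswith(suffix) for name in available)


# The only device listed as available in the error message.
available = ["/job:localhost/replica:0/task:0/cpu:0"]

for flag in (False, True):
    device = clone_device(0, clone_on_cpu=flag)
    print(flag, device, is_available(device, available))
```

Without the flag the requested device is /device:GPU:0, which matches nothing on a CPU-only machine, and that is the situation the InvalidArgumentError describes. With the flag the requested device is /device:CPU:0, which matches the one available device.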

Author: Bhaskar Chakradhar. Posted: September 26, 2017