Single-Hidden-Layer Neural Network: Formula Derivation and C++ Implementation Notes - C++ Programming Basics


Single-Hidden-Layer Neural Network: Formula Derivation and C++ Implementation Notes (Part 1)

2018-03-02 06:57:00 Views: 992

Building on logistic regression, the formulas for a neural network with a single hidden layer are derived below:
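As a compact reference, the derivation can be sketched as follows. This is the standard formulation for one hidden layer with a sigmoid output and a cross-entropy cost, written with symbols that mirror the variable names used in the code (`z1`, `a1`, `dz2`, `dw1`, and so on). The bracketed superscript notation, the choice of cross-entropy as the cost, and the per-sample index $(i)$ are assumptions made for this sketch.

```latex
% Forward propagation for sample i (g^{[1]} is the hidden-layer activation, \sigma the sigmoid)
\begin{aligned}
z^{[1](i)} &= W^{[1]} x^{(i)} + b^{[1]}, & a^{[1](i)} &= g^{[1]}\!\left(z^{[1](i)}\right),\\
z^{[2](i)} &= W^{[2]} a^{[1](i)} + b^{[2]}, & a^{[2](i)} &= \sigma\!\left(z^{[2](i)}\right).
\end{aligned}

% Cost over m training samples
J = -\frac{1}{m}\sum_{i=1}^{m}\left[\,y^{(i)}\log a^{[2](i)} + \left(1-y^{(i)}\right)\log\!\left(1-a^{[2](i)}\right)\right]

% Backward propagation (chain rule); per-sample terms omit the factor 1/m
\begin{aligned}
da^{[2](i)} &= -\frac{y^{(i)}}{a^{[2](i)}} + \frac{1-y^{(i)}}{1-a^{[2](i)}},\\
dz^{[2](i)} &= da^{[2](i)}\,\sigma'\!\left(z^{[2](i)}\right) = a^{[2](i)} - y^{(i)},\\
dW^{[2]} &= \frac{1}{m}\sum_{i=1}^{m} dz^{[2](i)}\left(a^{[1](i)}\right)^{T}, \qquad
db^{[2]} = \frac{1}{m}\sum_{i=1}^{m} dz^{[2](i)},\\
da^{[1](i)} &= \left(W^{[2]}\right)^{T} dz^{[2](i)},\\
dz^{[1](i)} &= da^{[1](i)} \odot g^{[1]\prime}\!\left(z^{[1](i)}\right),\\
dW^{[1]} &= \frac{1}{m}\sum_{i=1}^{m} dz^{[1](i)}\left(x^{(i)}\right)^{T}, \qquad
db^{[1]} = \frac{1}{m}\sum_{i=1}^{m} dz^{[1](i)}.
\end{aligned}

% Gradient-descent update with learning rate \alpha
W^{[l]} := W^{[l]} - \alpha\, dW^{[l]}, \qquad b^{[l]} := b^{[l]} - \alpha\, db^{[l]}, \qquad l \in \{1, 2\}
```

The simplification $dz^{[2]} = a^{[2]} - y$ holds only for the combination of a sigmoid output and the cross-entropy cost; with another output activation the product $da^{[2]}\,g^{[2]\prime}(z^{[2]})$ has to be evaluated explicitly.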

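Some rules of thumb for choosing activation functions: different layers may use different activation functions. If the output is 0 or 1, that is, for binary classification, sigmoid is a suitable activation for the output layer; for the other layers, ReLU is a reasonable default when in doubt. With gradient descent, a network using ReLU usually learns much faster than one using sigmoid or tanh. Deep learning generally requires nonlinear activation functions; the only place where a linear activation is normally acceptable is the output layer.
In deep learning, the weights w must not all be initialized to 0, whereas the biases b may be initialized to 0. With all-zero weights every hidden unit computes the same output and receives the same gradient, so the units never become different from one another.
The derivatives in backpropagation are obtained with the chain rule from calculus.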
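The notes above can be illustrated with a minimal sketch of the four activation functions named in the header's enum, their derivatives (as needed by the chain rule), and a random weight initialization. The names `Activation`, `activate`, `activate_derivative`, and `init_weights` are hypothetical and not taken from the original code; the Leaky ReLU slope of 0.01, the normal distribution with standard deviation 0.01, and the fixed seed are likewise assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Hypothetical enum mirroring the four activation types listed in the header.
enum class Activation { Sigmoid, TanH, ReLU, LeakyReLU };

// g(z): value of the chosen activation function at z.
inline double activate(double z, Activation type) {
    switch (type) {
        case Activation::Sigmoid:   return 1.0 / (1.0 + std::exp(-z));
        case Activation::TanH:      return std::tanh(z);
        case Activation::ReLU:      return std::max(0.0, z);
        case Activation::LeakyReLU: return z > 0.0 ? z : 0.01 * z; // slope 0.01 is an assumption
    }
    return z; // unreachable for valid enum values
}

// g'(z): derivative used in the backward pass (dz = da * g'(z)).
inline double activate_derivative(double z, Activation type) {
    switch (type) {
        case Activation::Sigmoid: {
            const double s = activate(z, Activation::Sigmoid);
            return s * (1.0 - s);
        }
        case Activation::TanH: {
            const double t = std::tanh(z);
            return 1.0 - t * t;
        }
        case Activation::ReLU:      return z > 0.0 ? 1.0 : 0.0; // subgradient 0 chosen at z == 0
        case Activation::LeakyReLU: return z > 0.0 ? 1.0 : 0.01;
    }
    return 1.0; // unreachable for valid enum values
}

// Small random weights break the symmetry between hidden units;
// biases can simply be std::vector<double>(n, 0.0).
inline std::vector<std::vector<double>> init_weights(int rows, int cols, unsigned seed = 42) {
    assert(rows > 0 && cols > 0);
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(0.0, 0.01); // scale 0.01 is an assumption
    std::vector<std::vector<double>> w(rows, std::vector<double>(cols));
    for (auto& row : w)
        for (auto& v : row)
            v = dist(gen);
    return w;
}
```

For a network with 784 inputs and 20 hidden units, `init_weights(20, 784)` would give the hidden-layer weights and `init_weights(1, 20, 7)` the output-layer weights, with both bias vectors left at zero.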

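The code below implements the derivation above exactly and performs binary classification of the digits 0 and 1. The training set consists of 10 images each of 0 and 1, chosen at random from the MNIST train set; the test set consists of 10 images each of 0 and 1, chosen at random from the MNIST test set. In the figure below, the first 10 zeros in the first row are used for training and the last 10 for testing; in the second row, the first 10 ones are used for training and the last 10 for testing: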

single_hidden_layer.hpp:

#ifndef FBC_SRC_NN_SINGLE_HIDDEN_LAYER_HPP_
#define FBC_SRC_NN_SINGLE_HIDDEN_LAYER_HPP_

#include <string>
#include <vector>

namespace ANN {

template<typename T>
class SingleHiddenLayer { // two categories
public:
    typedef enum ActivationFunctionType {
        Sigmoid = 0,
        TanH = 1,
        ReLU = 2,
        Leaky_ReLU = 3
    } ActivationFunctionType;

    SingleHiddenLayer() = default;
    int init(const T* data, const T* labels, int train_num, int feature_length,
        int hidden_layer_node_num = 20, T learning_rate = 0.00001, int iterations = 10000,
        int hidden_layer_activation_type = 2, int output_layer_activation_type = 0);
    int train(const std::string& model);
    int load_model(const std::string& model);
    T predict(const T* data, int feature_length) const;

private:
    T calculate_activation_function(T value, ActivationFunctionType type) const;
    T calcuate_activation_function_derivative(T value, ActivationFunctionType type) const;
    int store_model(const std::string& model) const;
    void init_train_variable();
    void init_w_and_b();

    ActivationFunctionType hidden_layer_activation_type = ReLU;
    ActivationFunctionType output_layer_activation_type = Sigmoid;
    std::vector<std::vector<T>> x; // training set
    std::vector<T> y; // ground truth labels
    int iterations = 10000;
    int m = 0; // train samples num
    int feature_length = 0;
    T alpha = (T)0.00001; // learning rate
    std::vector<std::vector<T>> w1, w2; // weights
    std::vector<T> b1, b2; // threshold
    int hidden_layer_node_num = 10;
    int output_layer_node_num = 1;
    T J = (T)0.;
    std::vector<std::vector<T>> dw1, dw2;
    std::vector<T> db1, db2;
    std::vector<std::vector<T>> z1, a1, z2, a2, da2, dz2, da1, dz1;
}; // class SingleHiddenLayer

} // namespace ANN

#endif // FBC_SRC_NN_SINGLE_HIDDEN_LAYER_HPP_
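The header declares `predict` together with the parameters `w1`, `b1`, `w2`, and `b2`, but the definition is not part of this page. As a rough illustration of what a forward pass through such a network could look like, here is a minimal sketch. The free function `forward_sketch` is hypothetical, the layout `w1[hidden][feature]` with a flat `w2[hidden]` is an assumption (the class stores `w2` as a nested vector), and the activations are fixed to the class defaults, ReLU for the hidden layer and sigmoid for the output.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical forward pass for one sample: returns a2 = sigmoid(w2 . relu(w1 x + b1) + b2).
// Assumed layout: w1[j][k] connects input k to hidden unit j; w2[j] connects
// hidden unit j to the single output node.
inline double forward_sketch(const std::vector<double>& x,
                             const std::vector<std::vector<double>>& w1,
                             const std::vector<double>& b1,
                             const std::vector<double>& w2,
                             double b2) {
    assert(w1.size() == b1.size() && w1.size() == w2.size());
    double z2 = b2;
    for (std::size_t j = 0; j < w1.size(); ++j) {
        assert(w1[j].size() == x.size());
        double z1 = b1[j];                       // z1 = w1[j] . x + b1[j]
        for (std::size_t k = 0; k < x.size(); ++k)
            z1 += w1[j][k] * x[k];
        const double a1 = std::max(0.0, z1);     // hidden activation: ReLU
        z2 += w2[j] * a1;                        // accumulate the output pre-activation
    }
    return 1.0 / (1.0 + std::exp(-z2));          // output activation: sigmoid
}
```

The return value lies in (0, 1) and can be read as the probability of one of the two classes; thresholding it at 0.5 gives a class decision.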

single_hidden_layer.cpp:

#include "single_hidden_layer.hpp"
// four further #include directives appear here; their header names are missing in the source
#include "common.hpp"

namespace ANN {

template<typename T>
int SingleHiddenLayer<T>::init(const T* data, const T* labels, int train_num, int feature_length,
    int hidden_layer_node_num, T learning_rate, int iterations,
    int hidden_layer_activation_type, int output_layer_activation_type)
{
    CHECK(train_num > 2 && feature_length > 0 && hidden_layer_node_num > 0 && learning_rate > 0 && iterations > 0);
    CHECK(hidden_layer_activation_type >= 0 && hidden_layer_activation_type < 4);
    CHECK(output_layer_activation_type >= 0 && output_layer_activation_type < 4);

    this->hidden_layer_node_num = hidden_layer_node_num;
    this->alpha = learning_rate;
    this->iterations = iterations;
    this->hidden_layer_activation_type = static_cast<ActivationFunctionType>(hidden_layer_activation_type);
    this->output_layer_activation_type = static_cast<ActivationFunctionType>(output_layer_activation_type);
    this->m = train_num;
    this->feature_length = feature_length;

    this->x.resize(train_num);
    this->y.resize(train_
