TensorFlow function tf.nn.softmax_cross_entropy_with_logits explained


First, here is the official TensorFlow API documentation (in English):

tf.nn.softmax_cross_entropy_with_logits(_sentinel=None, labels=None, logits=None, dim=-1, name=None)

Computes softmax cross entropy between logits and labels.

Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both.

NOTE: While the classes are mutually exclusive, their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If they are not, the computation of the gradient will be incorrect.

If using exclusive labels (wherein one and only one class is true at a time), see sparse_softmax_cross_entropy_with_logits.

WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.

logits and labels must have the same shape [batch_size, num_classes] and the same dtype (either float16, float32, or float64).

Note that to avoid confusion, it is required to pass only named arguments to this function.

Args:

  • _sentinel: Used to prevent positional parameters. Internal, do not use.
  • labels: Each row labels[i] must be a valid probability distribution.
  • logits: Unscaled log probabilities.
  • dim: The class dimension. Defaulted to -1 which is the last dimension.
  • name: A name for the operation (optional).
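As noted above, when the labels are one-hot (exactly one class is true per example), sparse_softmax_cross_entropy_with_logits accepts integer class indices instead of one-hot rows and yields the same per-example losses. A minimal sketch (the logits and labels below are made-up values, written in the TF 1.x session style used in the experiment further down):

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 2.5, 0.3]])          # unscaled scores, shape [batch_size, num_classes]
onehot_labels = tf.constant([[1.0, 0.0, 0.0],
                             [0.0, 1.0, 0.0]])   # each row is a valid probability distribution
sparse_labels = tf.constant([0, 1])              # the same targets given as class indices

loss_dense = tf.nn.softmax_cross_entropy_with_logits(labels=onehot_labels, logits=logits)
loss_sparse = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=sparse_labels, logits=logits)

with tf.Session() as sess:
    # both ops return one loss value per example, and the two results match here
    print(sess.run(loss_dense))
    print(sess.run(loss_sparse))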
This function requires at least two arguments: labels and logits.

labels: the target (expected) output of the network.

logits: the raw output of the network's last layer.

Warning: this function applies softmax internally and then computes the cross-entropy loss. In other words, logits must NOT already have been passed through tf.nn.softmax, otherwise the training results will be wrong. It is recommended to use this function rather than writing the cross-entropy loss yourself.
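A quick sanity check of that warning, using made-up logits: passing the raw logits reproduces the hand-written formula -sum(labels * log(softmax(logits))), while feeding an output that has already gone through tf.nn.softmax applies softmax twice and gives a different (wrong) loss.

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])   # raw output of the last layer, no softmax applied
labels = tf.constant([[1.0, 0.0, 0.0]])   # one-hot target

# correct: pass the raw logits; the op applies softmax internally
loss_ok = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

# the same quantity written out by hand, for comparison
loss_manual = -tf.reduce_sum(labels * tf.log(tf.nn.softmax(logits)), axis=1)

# WRONG: softmax has already been applied, so the op softmaxes twice
loss_bad = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=tf.nn.softmax(logits))

with tf.Session() as sess:
    print(sess.run([loss_ok, loss_manual, loss_bad]))  # loss_ok == loss_manual; loss_bad differs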

Below is a softmax regression experiment on MNIST using a two-layer CNN:

 
 
#coding=utf-8
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

def compute_accuracy(v_xs, v_ys):
    global prediction
    # keep_prob is the keep probability, i.e. the fraction of ReLU activations kept by dropout
    y_pre = sess.run(prediction, feed_dict={xs: v_xs, keep_prob: 1})
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})
    return result

def weight_variable(shape):
    inital = tf.truncated_normal(shape, stddev=0.1)  # stddev is the standard deviation
    return tf.Variable(inital)

def bias_variable(shape):
    inital = tf.constant(0.1, shape=shape)
    return tf.Variable(inital)

def conv2d(x, W):
    # x is the input tensor, W the convolution kernel
    # strides = [1, x_movement, y_movement, 1]; must have strides[0] = strides[3] = 1
    # padding='SAME' keeps the output the same spatial size as the input (with stride 1)
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # strides = [1, x_movement, y_movement, 1]; the 2nd and 3rd dims of ksize are the pooling window
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# define placeholders for inputs to the network
xs = tf.placeholder(tf.float32, [None, 784]) / 255
ys = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)
# -1 lets this dimension be inferred; reshape to a 4-D tensor whose last dimension
# is the input channel count (grayscale images, so 1)
x_image = tf.reshape(xs, [-1, 28, 28, 1])
# print x_image.shape

## conv1 layer ##
W_conv1 = weight_variable([5, 5, 1, 32])  # patch 5x5, in size 1 (image depth), out size 32 feature maps
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)  # output size 28x28x32 because padding='SAME'
h_pool1 = max_pool_2x2(h_conv1)                           # output size 14x14x32

## conv2 layer ##
W_conv2 = weight_variable([5, 5, 32, 64])  # patch 5x5, in size 32 (depth of conv1), out size 64 feature maps
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)  # output size 14x14x64 because padding='SAME'
h_pool2 = max_pool_2x2(h_conv2)                           # output size 7x7x64

## func1 layer ##
W_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
# [n_samples, 7, 7, 64] ->> [n_samples, 7*7*64]
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)  # dropout to prevent overfitting

## func2 layer ##
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
# prediction = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
prediction = tf.matmul(h_fc1_drop, W_fc2) + b_fc2  # raw logits, no softmax

# the error between prediction and real data
# cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction), reduction_indices=[1]))
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=ys, logits=prediction))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})
    if i % 50 == 0:
        accuracy = 0
        for j in range(10):
            test_batch = mnist.test.next_batch(1000)
            acc_forone = compute_accuracy(test_batch[0], test_batch[1])
            # print 'once=%f' % (acc_forone)
            accuracy = acc_forone + accuracy
        print 'Test result: batch: %g, accuracy: %f' % (i, accuracy / 10)

 

The experiment results are:

Test result: batch: 0, accuracy: 0.090000
Test result: batch: 50, accuracy: 0.788600
Test result: batch: 100, accuracy: 0.880200
Test result: batch: 150, accuracy: 0.904600
Test result: batch: 200, accuracy: 0.927500
Test result: batch: 250, accuracy: 0.929800
Test result: batch: 300, accuracy: 0.939600
Test result: batch: 350, accuracy: 0.942100
Test result: batch: 400, accuracy: 0.950600
Test result: batch: 450, accuracy: 0.950700
Test result: batch: 500, accuracy: 0.956700
Test result: batch: 550, accuracy: 0.956000
Test result: batch: 600, accuracy: 0.957100
Test result: batch: 650, accuracy: 0.958400
Test result: batch: 700, accuracy: 0.961500
Test result: batch: 750, accuracy: 0.963800
Test result: batch: 800, accuracy: 0.965000
Test result: batch: 850, accuracy: 0.966300
Test result: batch: 900, accuracy: 0.967800
Test result: batch: 950, accuracy: 0.967700

The number of training iterations here is fairly small; with more iterations the accuracy would keep improving.

