Recognizing LED Digits with Spark MLlib


I have been learning Spark MLlib lately, and an idea suddenly struck me: could MLlib be used to recognize LED digits?

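No sooner said than done. I first drew the LED displays for 0-9 on paper, then taught my son how to turn each digit into a vector, with one 0/1 entry per segment: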

(Figure: the LED digits my son and I worked out together)

Then prepare a text file, labeled-points.txt, with one (label,[features]) pair per line:

(0,[1,1,1,0,1,1,1])
(1,[0,0,1,0,0,1,0])
(2,[1,0,1,1,1,0,1])
(3,[1,0,1,1,0,1,1])
(4,[0,1,1,1,0,1,0])
(5,[1,1,0,1,0,1,1])
(6,[1,1,0,1,1,1,1])
(7,[1,0,1,0,0,1,0])
(8,[1,1,1,1,1,1,1])
(9,[1,1,1,1,0,1,1])
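
To make the encoding concrete, here is a minimal plain-Scala sketch (no Spark needed) that rebuilds the lines above from the set of lit segments. The segment order is my assumption, inferred from the vectors themselves: top, upper-left, upper-right, middle, lower-left, lower-right, bottom. The names LedEncoding, segmentNames, litSegments, encode and toLine are made up for illustration.

```scala
// Hypothetical helper, not part of Spark. Assumed segment order
// (inferred from the vectors in labeled-points.txt):
// 0 = top, 1 = upper-left, 2 = upper-right, 3 = middle,
// 4 = lower-left, 5 = lower-right, 6 = bottom.
object LedEncoding {
  val segmentNames: Vector[String] =
    Vector("top", "upperLeft", "upperRight", "middle", "lowerLeft", "lowerRight", "bottom")

  // Segments that are lit for each digit on a seven-segment display.
  val litSegments: Map[Int, Set[String]] = Map(
    0 -> Set("top", "upperLeft", "upperRight", "lowerLeft", "lowerRight", "bottom"),
    1 -> Set("upperRight", "lowerRight"),
    2 -> Set("top", "upperRight", "middle", "lowerLeft", "bottom"),
    3 -> Set("top", "upperRight", "middle", "lowerRight", "bottom"),
    4 -> Set("upperLeft", "upperRight", "middle", "lowerRight"),
    5 -> Set("top", "upperLeft", "middle", "lowerRight", "bottom"),
    6 -> Set("top", "upperLeft", "middle", "lowerLeft", "lowerRight", "bottom"),
    7 -> Set("top", "upperRight", "lowerRight"),
    8 -> segmentNames.toSet,
    9 -> Set("top", "upperLeft", "upperRight", "middle", "lowerRight", "bottom")
  )

  // 1 if the segment is lit for this digit, 0 otherwise, in the assumed order.
  def encode(digit: Int): Vector[Int] =
    segmentNames.map(name => if (litSegments(digit).contains(name)) 1 else 0)

  // One line in the "(label,[f1,f2,...])" format used by labeled-points.txt.
  def toLine(digit: Int): String =
    "(" + digit + ",[" + encode(digit).mkString(",") + "])"
}
```

Calling (0 to 9).map(LedEncoding.toLine).foreach(println) should print the ten lines of the file, which is a quick way to double-check a hand-written vector.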

This looks like a classification problem, so I chose LogisticRegressionWithLBFGS:

import org.apache.spark.SparkContext
import org.apache.spark.mllib.classification.{LogisticRegressionWithLBFGS, LogisticRegressionModel}
import org.apache.spark.mllib.evaluation.MulticlassMetrics
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.util.MLUtils

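// Load the ten labeled vectors; sc is the existing SparkContext (e.g. the one in spark-shell)
val data = MLUtils.loadLabeledPoints(sc, "file:///labeled-points.txt")

// Train a multiclass logistic regression model with ten classes (digits 0-9)
val model = new LogisticRegressionWithLBFGS().setNumClasses(10).run(data)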

Finally, pick any digit and let the model guess:

model.predict(Vectors.dense(Array(1.0,1.0,0.0,1.0,1.0,1.0,1.0)))
res24: Double = 6.0

How about that? Not bad at all~~~
This model seems to have a lot of weights:

scala> model.weights
res29: org.apache.spark.mllib.linalg.Vector = [-30.68420353263392,-13.819117851407729,20.27517376062209,-17.878063796956003,-8.641455334351308,30.780101010826108,-17.243132942967932,11.509783689282543,-14.900033352764835,11.684427660291847,12.07060530601555,17.31310120498649,-59.232219286895614,11.386593790510851,5.090436073435982,-25.983226739036247,3.8562537560950205,10.30603523382833,-14.356520662249086,-0.8387336591993536,9.557546094038313,-31.853702036193578,14.47383139964512,8.794502062134377,13.95757471451068,-9.84270147539397,8.893604711671909,-20.982818493822286,6.522677078631281,10.241387441753627,-40.12363598646755,8.698186733644015,-17.906199015916943,3.8333204609982374,8.050490292724493,2.3703073829997034,7.401912395339493,-46.95213122452467,9.951675789189242,19.38502076196...
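
Why so many? As far as I understand MLlib's multinomial logistic regression, class 0 serves as a pivot with an implicit margin of 0, and the weights vector holds one row of per-feature weights for each of the remaining classes, flattened row by row. With 10 classes and 7 features and no intercept, that would be 9 x 7 = 63 values. The plain-Scala sketch below illustrates that layout and the argmax-style prediction under those assumptions; MultinomialSketch, weightCount and predict are hypothetical names, not Spark APIs.

```scala
// Illustrative sketch only, assuming class 0 is the pivot class and the
// weights are stored as (numClasses - 1) rows of numFeatures values each.
object MultinomialSketch {
  // Expected length of the flattened weight vector under that assumption.
  def weightCount(numClasses: Int, numFeatures: Int, withIntercept: Boolean = false): Int =
    (numClasses - 1) * (numFeatures + (if (withIntercept) 1 else 0))

  // Pick the class with the largest margin; class 0 has an implicit margin of 0.
  def predict(weights: Array[Double], features: Array[Double], numClasses: Int): Int = {
    val n = features.length
    require(weights.length == (numClasses - 1) * n, "unexpected weight vector length")
    var bestClass = 0
    var maxMargin = 0.0
    for (k <- 0 until numClasses - 1) {
      // Dot product of row k of the weight matrix with the feature vector.
      var margin = 0.0
      for (j <- 0 until n) margin += weights(k * n + j) * features(j)
      if (margin > maxMargin) { maxMargin = margin; bestClass = k + 1 }
    }
    bestClass
  }
}
```

For example, with toy weights Array(1.0, -1.0, -1.0, 2.0) for 3 classes and 2 features, the input (1, 0) gives class 1 and (0, 1) gives class 2, while (0, 0) falls back to the pivot class 0.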

That really is exhausting ^_^

