Spark MLlib Isotonic Regression

Category: Servers · Published: 7 years ago


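Scan the sequence from its first element. As soon as two adjacent elements are found out of order (the earlier one is larger than the later one), stop scanning and start absorbing: beginning with the out-of-order element, absorb the following elements one at a time into a subsequence until the mean of the subsequence is less than or equal to the next element waiting to be absorbed. Every element of the subsequence is then replaced by that mean.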

Examples:

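Original sequence: <9, 10, 14>

Result sequence: <9, 10, 14>

Analysis: scanning forward from 9 through the last element, 14, no out-of-order pair is found, so nothing needs to change.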

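Original sequence: <9, 14, 10>

Result sequence: <9, 12, 12>

Analysis: scanning forward from 9, an out-of-order pair appears at 14 (14 > 10), so scanning stops and absorption begins. Absorbing 10 gives the subsequence <14, 10>, whose mean is 12, so <12, 12> replaces <14, 10>. Since 10 was the last element, processing is complete.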

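Original sequence: <14, 9, 10, 15>

Result sequence: <11, 11, 11, 15>

Analysis: scanning forward from 14, an out-of-order pair appears at 9 (14 > 9), so scanning stops and absorption begins. Absorbing 9 gives the subsequence <14, 9>, whose mean is 11.5. Because 11.5 is greater than the next element, 10, absorb 10 as well to get <14, 9, 10>, whose mean is 11. Because 11 is less than the next element, 15, absorption stops and <11, 11, 11> replaces <14, 9, 10>.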
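The absorb-and-average procedure described above can be sketched in plain Scala, without Spark. This is a minimal illustration rather than Spark's implementation: the names `PavaSketch`, `Block`, `isotonic`, and `merge` are made up for this example, and all weights are assumed to be 1. It follows the pool-adjacent-violators idea, and it also merges a pooled block backward when its mean drops below the block before it, a case the description above does not cover (for example <1, 5, 6, 2>).

```scala
object PavaSketch {
  // One pooled block: the running sum and the number of elements it covers.
  private final case class Block(sum: Double, count: Int) {
    def mean: Double = sum / count
  }

  // Returns the non-decreasing sequence obtained by pooling adjacent violators.
  def isotonic(values: Seq[Double]): Seq[Double] = {
    // The stack is kept in reverse order: its head is the most recent block.
    val blocks = values.foldLeft(List.empty[Block]) { (stack, v) =>
      merge(Block(v, 1) :: stack)
    }
    // Expand every block back into `count` copies of its mean.
    blocks.reverse.flatMap(b => Seq.fill(b.count)(b.mean))
  }

  // While the previous block's mean exceeds the newest block's mean, pool them.
  @scala.annotation.tailrec
  private def merge(stack: List[Block]): List[Block] = stack match {
    case last :: prev :: rest if prev.mean > last.mean =>
      merge(Block(prev.sum + last.sum, prev.count + last.count) :: rest)
    case _ => stack
  }

  def main(args: Array[String]): Unit = {
    // The three sequences from the examples above.
    println(isotonic(Seq(9.0, 10.0, 14.0)))       // List(9.0, 10.0, 14.0)
    println(isotonic(Seq(9.0, 14.0, 10.0)))       // List(9.0, 12.0, 12.0)
    println(isotonic(Seq(14.0, 9.0, 10.0, 15.0))) // List(11.0, 11.0, 11.0, 15.0)
  }
}
```

Storing each block as a sum and a count keeps every merge constant-time, and the pooled mean is recomputed from exact totals instead of by averaging averages.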

package com.immooc.spark

import org.apache.log4j.{Level, Logger}
import org.apache.spark.mllib.regression.IsotonicRegression
import org.apache.spark.{SparkConf, SparkContext}

object Isotonic_Regression {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("IsotonicRegression").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Quiet Spark's INFO logging so the printed results are easy to read.
    Logger.getRootLogger.setLevel(Level.WARN)

    // Each input line is expected to be "label,feature"; every point gets weight 1.0.
    val data = sc.textFile("file:///Users/walle/Documents/D3/sparkmlib/sample_isotonic_regression_data.txt")
    val parsedData = data.map { line =>
      val parts = line.split(',').map(_.toDouble)
      (parts(0), parts(1), 1.0) // (label, feature, weight)
    }

    // Split the data: 60% for training, 40% for testing.
    val splits = parsedData.randomSplit(Array(0.6, 0.4), seed = 11L)
    val training = splits(0)
    val test = splits(1)

    // setIsotonic(true) fits a non-decreasing function.
    val model = new IsotonicRegression().setIsotonic(true).run(training)

    // Print the fitted step points: feature boundaries and their predicted labels.
    println("boundaries\tpredictions")
    model.boundaries.zip(model.predictions).foreach { case (boundary, prediction) =>
      println(s"$boundary\t$prediction")
    }

    // Predict from the feature and pair each prediction with the true label.
    val predictionAndLabel = test.map { case (label, feature, _) =>
      (model.predict(feature), label)
    }

    println("prediction\tlabel")
    predictionAndLabel.collect().foreach { case (prediction, label) =>
      println(s"$prediction\t$label")
    }

    val meanSquaredError = predictionAndLabel.map { case (p, l) => math.pow(p - l, 2) }.mean()
    println("Mean Squared Error = " + meanSquaredError)

    sc.stop()
  }
}
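
The listing prints the model's `boundaries` and `predictions` arrays and then calls `model.predict`. According to the Spark documentation, a feature that matches a boundary gets that boundary's prediction, a feature outside the boundary range gets the first or last prediction, and a feature between two boundaries is linearly interpolated. The sketch below reproduces that rule in plain Scala as an illustration only: `PredictSketch` and `interpolate` are hypothetical names, not Spark APIs, and the boundaries are assumed to be sorted and strictly increasing.

```scala
object PredictSketch {
  // Hypothetical helper (not part of Spark): piecewise-linear lookup over
  // sorted, strictly increasing boundaries and their predictions.
  def interpolate(boundaries: Array[Double], predictions: Array[Double], x: Double): Double = {
    require(boundaries.nonEmpty && boundaries.length == predictions.length)
    val idx = java.util.Arrays.binarySearch(boundaries, x)
    if (idx >= 0) {
      predictions(idx) // exact match on a boundary
    } else {
      val insertAt = -idx - 1 // position where x would be inserted
      if (insertAt == 0) predictions.head // below every boundary
      else if (insertAt == boundaries.length) predictions.last // above every boundary
      else {
        val (x0, x1) = (boundaries(insertAt - 1), boundaries(insertAt))
        val (y0, y1) = (predictions(insertAt - 1), predictions(insertAt))
        y0 + (y1 - y0) * (x - x0) / (x1 - x0) // linear interpolation
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up boundaries and predictions, for illustration only.
    val boundaries = Array(0.0, 1.0, 3.0)
    val predictions = Array(1.0, 2.0, 4.0)
    println(interpolate(boundaries, predictions, 2.0)) // 3.0
  }
}
```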
