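Task Execution

Machine Learning

On the client, download the dataset used by the sample code in the development program, decompress it into the "/tmp/data/epsilon" directory, and run the task. The steps are as follows: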

  1. Go to the "/tmp/data/epsilon" directory.
    cd /tmp/data/epsilon
  2. Download the training set.
    wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/epsilon_normalized.bz2

  3. Download the test set.
    wget https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/epsilon_normalized.t.bz2

  4. Decompress the training set and test set into the current directory.
    bzip2 -d epsilon_normalized.bz2
    bzip2 -d epsilon_normalized.t.bz2
  5. Upload the training set and test set to HDFS.
    hadoop fs -put /tmp/data/epsilon/epsilon_normalized /tmp/data/epsilon/
    hadoop fs -put /tmp/data/epsilon/epsilon_normalized.t /tmp/data/epsilon/
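  6. Place the kal_examples_2.11-0.1.jar and run_gbdt.sh files generated during task execution in the "/home/test/sophon/" directory on the client described in section 3.2. The content of run_gbdt.sh is as follows: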
    spark-submit \
    --class com.bigdata.examples.GBDTRunner \
    --driver-class-path "./lib/*" \
    --jars "lib/fastutil-8.3.1.jar,lib/sophon-ml-acc_2.11-1.2.0.jar,lib/sophon-ml-core_2.11-1.2.0.jar,lib/sophon-ml-kernel-2.11-1.2.0-aarch_64.jar" \
    --conf "spark.executor.extraClassPath=fastutil-8.3.1.jar:sophon-ml-acc_2.11-1.2.0.jar:sophon-ml-core_2.11-1.2.0.jar:sophon-ml-kernel-2.11-1.2.0-aarch_64.jar" \
    --master yarn \
    --deploy-mode client \
    --driver-cores 40 \
    --driver-memory 50g \
    --executor-cores 19 \
    --num-executors 12 \
    --executor-memory 77g \
    --numPartitions 228 \
    ./kal_examples_2.11-0.1.jar
  7. Run the task.
    sh run_gbdt.sh

    View the printed results on the screen.

  8. Result description: the run performs 100 iterations and generates 100 subtrees in total. Three of the subtrees (Tree 0, 1, and 99) are shown below:
    Test Error = 0.1687594970527827  // predicted classification error
    Learned classification GBT model:
    GBTClassificationModel (uid=gbtc_dbb4de23ca65) with 100 trees
      Tree 0 (weight 1.0):  // weight of each subtree
        If (feature 818 <= 0.0028371200000000003)  // the split point of feature 818 is 0.0028371200000000003
         If (feature 1866 <= 0.0064008599999999995)
          If (feature 315 <= -0.0067819)
           If (feature 789 <= -0.0100215)
            If (feature 936 <= 0.0018099549999999998)
             Predict: -0.21098494850805388  // prediction of subtree 0
            Else (feature 936 > 0.0018099549999999998)
             Predict: 0.15191210648637227
           Else (feature 789 > -0.0100215)
            If (feature 936 <= -4.79549E-4)
             Predict: 0.17726731948384736
            Else (feature 936 > -4.79549E-4)
             Predict: 0.49173760640961445
          Else (feature 315 > -0.0067819)
           If (feature 789 <= -0.0100215)
            If (feature 936 <= 0.00412001)
             Predict: -0.5011764705882353
            Else (feature 936 > 0.00412001)
             Predict: -0.21027097384924862
           Else (feature 789 > -0.0100215)
            If (feature 649 <= -0.008577419999999999)
             Predict: -0.20268122451800152
            Else (feature 649 > -0.008577419999999999)
             Predict: 0.12942312334057446
         Else (feature 1866 > 0.0064008599999999995)
          If (feature 1697 <= 0.02054865)
           If (feature 649 <= -0.00122175)
            If (feature 315 <= 1.620675E-4)
             Predict: 0.31944546321425954
            Else (feature 315 > 1.620675E-4)
             Predict: -0.014997100008285691
           Else (feature 649 > -0.00122175)
            If (feature 315 <= 0.01095895)
             Predict: 0.5328728914862532
            Else (feature 315 > 0.01095895)
             Predict: 0.2712697181277476
          Else (feature 1697 > 0.02054865)
           If (feature 649 <= -0.008577419999999999)
            If (feature 315 <= 1.620675E-4)
             Predict: 0.5659284497444633
            Else (feature 315 > 1.620675E-4)
             Predict: 0.29297616536595655
           Else (feature 649 > -0.008577419999999999)
            If (feature 1519 <= -0.0024157199999999997)
             Predict: 0.5493390716261912
            Else (feature 1519 > -0.0024157199999999997)
             Predict: 0.7277585664885257
        Else (feature 818 > 0.0028371200000000003)
         If (feature 1866 <= 0.008616374999999999)
          If (feature 789 <= -0.0100215)
           If (feature 1794 <= -0.015113399999999999)
            If (feature 315 <= -0.009021685)
             Predict: -0.14581734458940906
            Else (feature 315 > -0.009021685)
             Predict: -0.43055000665867627
           Else (feature 1794 > -0.015113399999999999)
            If (feature 649 <= -0.008577419999999999)
             Predict: -0.6742799137165334
            Else (feature 649 > -0.008577419999999999)
             Predict: -0.466970082323807
          Else (feature 789 > -0.0100215)
           If (feature 755 <= 0.00919656)
            If (feature 315 <= -0.002158415)
             Predict: 0.04372444164831708
            Else (feature 315 > -0.002158415)
             Predict: -0.2799099183635169
           Else (feature 755 > 0.00919656)
            If (feature 1697 <= 0.0110916)
             Predict: -0.5381727158948686
            Else (feature 1697 > 0.0110916)
             Predict: -0.2597821083320546
         Else (feature 1866 > 0.008616374999999999)
          If (feature 789 <= -0.0144677)
           If (feature 315 <= -0.00447581)
            If (feature 755 <= -0.0100993)
             Predict: 0.22025316455696203
            Else (feature 755 > -0.0100993)
             Predict: -0.1234739607479524
           Else (feature 315 > -0.00447581)
            If (feature 936 <= 0.0018099549999999998)
             Predict: -0.517205957883924
            Else (feature 936 > 0.0018099549999999998)
             Predict: -0.18999735379730087
          Else (feature 789 > -0.0144677)
           If (feature 315 <= 0.002566035)
            If (feature 649 <= -0.01302385)
             Predict: 0.11884615384615385
            Else (feature 649 > -0.01302385)
             Predict: 0.43213844252163164
           Else (feature 315 > 0.002566035)
            If (feature 936 <= 0.0018099549999999998)
             Predict: -0.1847658260422028
            Else (feature 936 > 0.0018099549999999998)
             Predict: 0.14976641934597418
      Tree 1 (weight 0.1):
        If (feature 1389 <= -0.0025451)
         If (feature 994 <= -0.00340289)
          If (feature 1814 <= -4.766875E-4)
        .....
      Tree 99 (weight 0.1):
        If (feature 539 <= 0.0201035)
         If (feature 994 <= -0.0271962)
          If (feature 738 <= 0.020128649999999998)
           If (feature 1678 <= 0.001967885)
            If (feature 830 <= -0.01119755)
             Predict: -0.10394440691166018
            Else (feature 830 > -0.01119755)
             Predict: 0.06684711260628788
           Else (feature 1678 > 0.001967885)
            If (feature 958 <= -0.01238975)
             Predict: -0.05886208031154687
            Else (feature 958 > -0.01238975)
             Predict: 0.19195532805094076
        ......
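
Steps 1 to 5 above can be collected into one script. The following is a minimal sketch, not part of the original guide. The `run` helper and the `DRY_RUN` switch are hypothetical names introduced here, and the two `mkdir` calls assume that the local and HDFS directories may not exist yet. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of steps 1-5: download, decompress, and upload the epsilon dataset.
# DRY_RUN is a hypothetical switch (not in the original guide): it defaults to 1,
# so the commands are only printed. Set DRY_RUN=0 to actually execute them.
set -eu

DATA_DIR="/tmp/data/epsilon"   # local directory and HDFS directory used in the guide
BASE_URL="https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

prepare_epsilon() {
  run mkdir -p "$DATA_DIR"                 # assumption: the directory may not exist yet
  run cd "$DATA_DIR"                       # step 1
  for f in epsilon_normalized epsilon_normalized.t; do
    run wget "$BASE_URL/$f.bz2"            # steps 2 and 3
    run bzip2 -d "$f.bz2"                  # step 4
  done
  run hadoop fs -mkdir -p "$DATA_DIR"      # assumption: the HDFS directory may not exist yet
  for f in epsilon_normalized epsilon_normalized.t; do
    run hadoop fs -put "$DATA_DIR/$f" "$DATA_DIR/"   # step 5
  done
}

prepare_epsilon
```

Review the printed commands first, then rerun with `DRY_RUN=0` on the client where `wget`, `bzip2`, and `hadoop` are available.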
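
The result in step 8 can also be summarized from a saved log instead of being read off the screen. The following is a minimal sketch that assumes the output of step 7 was captured to a file, for example with `sh run_gbdt.sh | tee gbdt.log`. The file name `gbdt.log` and the function `summarize_gbdt_log` are hypothetical and not part of the original guide. It reports the test error, the corresponding accuracy (1 minus the test error), and the number of tree headers found in the log.

```shell
#!/bin/sh
# Sketch: summarize a saved GBDT run log.
# Assumes the log contains a "Test Error = <value>" line and
# "Tree N (weight W):" headers, as in the output shown in step 8.
set -eu

summarize_gbdt_log() {
  awk '
    # "Test Error = <value>": the value is the 4th whitespace-separated field.
    /^ *Test Error = / { printf "test_error=%.4f accuracy=%.4f\n", $4, 1 - $4 }
    # Count the tree headers that appear in the log.
    /^ *Tree [0-9]+ \(weight / { trees++ }
    END { printf "trees_printed=%d\n", trees }
  ' "$1"
}

# Usage: sh summarize_gbdt.sh gbdt.log
if [ "$#" -ge 1 ]; then
  summarize_gbdt_log "$1"
fi
```

For the Test Error shown in step 8 (0.1687594970527827), this reports an accuracy of 0.8312.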