Solving an NNLS problem in Spark/Scala with the DataFrame API

Posted by b1payxdu on 2021-07-14 in Spark

I want to solve a non-negative least squares (NNLS) problem in Scala/Spark using the DataFrame-based API.
The DataFrame looks like this:

+----+------------------+-----+
|   Y|                X1|   X2|
+----+------------------+-----+
|-1.7| 50.51276580843747|0.343|
|-1.2| 70.58384514044127|0.244|
| 0.0| 54.56897445757835|1.911|
| 0.8|  95.8955904991304|0.245|
| 1.3| 53.65640599254909|0.327|
| 1.0| 66.18962623993359|0.243|
| 1.7| 52.37644838584729|0.398|
|-0.4| 39.57264337369486|0.343|
| 1.1|62.259779346546004|0.335|
|-0.9| 64.82937381844667|0.412|
| 0.1| 46.48493458603609|0.339|
|-0.1|123.83841012844874|0.247|
| 0.8| 53.12487590858838|0.354|
| 0.3| 79.81381334543124|0.244|
| 1.5|63.301591212465596|1.198|
| 1.0| 77.14840199888344|0.248|
| 0.1| 79.82359203197964|3.697|
| 2.0| 53.36775787142796| 0.36|
| 1.4| 91.65936155280471|0.353|
| 1.0| 84.27888134495655|0.245|
| 2.2|52.094759755920066|0.369|
| 2.0| 58.75991022338376|1.494|
|-0.3| 89.30966602957443| 1.36|
| 0.9|  79.4832840724177|0.245|
|-0.4| 63.94166370298794|0.673|
| 0.4|  67.0297486304631|1.163|
|-2.8|105.07712894936725|0.657|
|-1.7| 98.99383087455234|0.694|
| 3.1| 70.94369835738382| 0.83|
| 1.1| 65.78630103008123|0.268|
| 1.6| 66.03439924959426| 0.46|
| 1.0| 53.43239602321774| 0.35|
| 2.7| 75.61177994048974|3.716|
| 1.2| 51.38108450105215|0.342|
| 0.2| 30.70446522413276|0.356|
|-0.7| 93.42871609767502|0.695|
| 1.2| 59.96642169499171|1.939|
| 2.0| 95.65822191505502|0.241|
|-0.1|28.799947350603603|0.356|
| 1.2| 80.65412483290257| 0.24|
|-1.0| 46.59691564712856|0.344|
| 1.2| 78.21836143100434| 0.34|
| 0.8| 64.25385693074021|0.834|
|-0.8| 64.71168662524684|1.181|
| 2.1| 74.59897179498064|0.246|
| 0.7|  82.6152756544184|0.245|
|-0.3|60.452703549332476|0.384|
|-1.2| 73.43917806749589| 0.25|
| 1.2| 89.90233876812485|4.779|
| 0.9| 72.84472828638712|0.253|
| 1.9| 76.11283800851527|0.964|
| 0.0| 58.81249871781766|0.245|
|-0.3| 36.12192796898518|0.344|
| 0.3| 81.23517427659587| 0.25|
| 0.3| 73.86432662365802| 0.25|
| 0.4| 96.00440766838409| 0.25|
| 1.7| 71.69561319460973|0.245|
| 1.5| 70.76051592579681|0.245|
| 0.0| 56.15608683365151|0.449|
|-2.2| 76.98005048496123|0.282|
|-1.6| 87.96650261841285|1.358|
|-3.8|100.73008777829898|0.711|
| 2.5| 48.16431333803761|1.907|
| 2.7| 55.34503365248355|0.422|
| 1.8|46.798862029472524|3.739|
|-0.2| 73.35711875245644|0.246|
| 1.5|50.482677350284476| 0.25|
|-1.3| 48.44028879763816| 0.34|
| 0.6| 72.34212794345704| 0.26|
| 2.6| 76.80874433102495| 1.98|
|-1.0| 50.04611453613615| 0.35|
| 0.7| 92.89685936608063|0.249|
|-0.8| 71.59326963753377|0.247|
|-0.6| 58.54059176740566|0.251|
| 0.8| 58.80655219352515|0.336|
| 0.8| 93.62222572602967| 3.12|
|-2.1|153.34495241200904|0.246|
| 2.0| 46.00693873201085|0.345|
| 2.3| 81.90048515461571|0.324|
| 0.8| 94.30133033505714|0.251|
|-0.5| 86.38295354196498|0.248|
| 0.1| 64.46055098570676|0.337|
|-0.1| 63.21427366108357|0.293|
| 0.6|55.000537290454666|0.341|
| 0.5| 49.67546929103337|0.247|
| 2.1| 78.78073623445495|0.246|
|-0.9|49.837813042307864|0.346|
| 1.6| 75.31975062363492|7.018|
| 0.9| 65.23383509901653|2.258|
| 1.2|58.082224310844786|0.403|
| 0.4| 44.17460942823538|0.351|
| 0.5| 68.89917283924727|0.483|
| 1.7|  81.7099153755847|0.252|
| 1.0|  72.3261199644686|1.168|
| 0.5| 61.89567735569798|1.859|
| 1.4| 76.02404619115855|0.247|
+----+------------------+-----+

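Here X1 and X2 are my features and Y is my label. The underlying equation is Y = aX1 + bX2, and I want to find the coefficients a and b. Negative values of a and b make no sense for my problem, so both must be constrained to be non-negative.
I found the same question here, but it is still unanswered.
There is also a similar question here, but its example is written in Java.
How can I solve this problem in Spark with the DataFrame API?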
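One possible direction, offered as a minimal sketch rather than a confirmed answer: because the model Y = aX1 + bX2 has only two coefficients and no intercept, the DataFrame can be reduced to five sums (the entries of XᵀX and XᵀY) with a single `agg` call, and the tiny constrained problem can then be solved on the driver. The object name `NnlsSketch` and its methods `solve` and `fit` are hypothetical names introduced for illustration; the sketch assumes the columns `X1`, `X2` and `Y` are of type double and contain no nulls.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, sum}

// Hypothetical helper (not part of Spark): two-variable NNLS for Y = a*X1 + b*X2.
object NnlsSketch {

  // g11, g12, g22 are the entries of X^T X; c1, c2 are the entries of X^T Y.
  // For two variables the constrained optimum is either the unconstrained
  // solution (if it is non-negative) or lies on an axis, so it is enough to
  // compare a handful of candidates.
  def solve(g11: Double, g12: Double, g22: Double,
            c1: Double, c2: Double): (Double, Double) = {
    // Squared residual up to the constant term sum(Y*Y).
    def cost(a: Double, b: Double): Double =
      a * a * g11 + 2 * a * b * g12 + b * b * g22 - 2 * (a * c1 + b * c2)

    val det = g11 * g22 - g12 * g12
    // Unconstrained least-squares solution, kept only if it is feasible.
    val interior: Option[(Double, Double)] =
      if (det > 0) {
        val a = (g22 * c1 - g12 * c2) / det
        val b = (g11 * c2 - g12 * c1) / det
        if (a >= 0 && b >= 0) Some((a, b)) else None
      } else None

    // Boundary candidates: both zero, only a free, only b free.
    val onAxisA = if (g11 > 0) math.max(0.0, c1 / g11) else 0.0
    val onAxisB = if (g22 > 0) math.max(0.0, c2 / g22) else 0.0
    val candidates = Seq((0.0, 0.0), (onAxisA, 0.0), (0.0, onAxisB)) ++ interior

    candidates.minBy { case (a, b) => cost(a, b) }
  }

  // One distributed pass over the DataFrame to collect the five sums,
  // then the small solve on the driver.
  def fit(df: DataFrame): (Double, Double) = {
    val (x1, x2, y) = (col("X1"), col("X2"), col("Y"))
    val r = df.agg(
      sum(x1 * x1), sum(x1 * x2), sum(x2 * x2),
      sum(x1 * y), sum(x2 * y)
    ).first()
    solve(r.getDouble(0), r.getDouble(1), r.getDouble(2),
          r.getDouble(3), r.getDouble(4))
  }
}
```

With the DataFrame above bound to `df`, `val (a, b) = NnlsSketch.fit(df)` would return the two coefficients. The candidate enumeration only works because there are two unknowns; with more features, a general NNLS routine such as an active-set or projected-gradient solver would be needed on the aggregated XᵀX and XᵀY instead.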
