Spark GraphFrames: finding a hierarchy

Posted by wkftcu5l on 2021-05-29 in Spark

I am trying to implement a very simple use case. I have two DataFrames. The vertices:

>>> g.vertices.show(20,False)
+------------------------+
|id                      |
+------------------------+
|Router_UPDATE_INSERT    |
|Seq_Unique_Key          |
|Target_New_Insert       |
|Target_Existing_Update  |
|Target_Existing_Insert  |
|SAMPLE_CUSTOMER         |
|SAMPLE_CUSTOMER_MASTER  |
|Sorter_SAMPLE_CUSTOMER  |
|Sorter_CUSTOMER_MASTER  |
|Join_Source_Target      |
|Exp_DetectChanges       |
|Filter_Unchanged_Records|
+------------------------+

The edges:

>>> g.edges.show(20,False)
+------------------------+----------+------------------------+----------+
|src                     |From Type |dst                     |To Type   |
+------------------------+----------+------------------------+----------+
|Sorter_SAMPLE_CUSTOMER  |sorter    |Join_Source_Target      |joiner    |
|Sorter_CUSTOMER_MASTER  |sorter    |Join_Source_Target      |joiner    |
|Join_Source_Target      |joiner    |Exp_DetectChanges       |expression|
|SAMPLE_CUSTOMER         |source    |Sorter_SAMPLE_CUSTOMER  |sorter    |
|Router_UPDATE_INSERT    |router    |Target_Existing_Update  |target    |
|Seq_Unique_Key          |sequencetx|Target_Existing_Insert  |target    |
|Filter_Unchanged_Records|filter    |Router_UPDATE_INSERT    |router    |
|Exp_DetectChanges       |expression|Filter_Unchanged_Records|filter    |
|Seq_Unique_Key          |sequencetx|Target_New_Insert       |target    |
|Router_UPDATE_INSERT    |router    |Seq_Unique_Key          |sequencetx|
|SAMPLE_CUSTOMER_MASTER  |source    |Sorter_CUSTOMER_MASTER  |sorter    |
+------------------------+----------+------------------------+----------+

from graphframes import GraphFrame

g = GraphFrame(vertices, edges)

Now I can find two different lineages. The first:

>>> filteredPaths = g.bfs(
...   fromExpr = "id = 'SAMPLE_CUSTOMER_MASTER'",
...   toExpr = "id = 'Router_UPDATE_INSERT'",
...   edgeFilter = "src != 'joiner1'",
...   maxPathLength = 10)

The second lineage:

>>> filteredPaths = g.bfs(
...   fromExpr = "id = 'SAMPLE_CUSTOMER'",
...   toExpr = "id = 'Router_UPDATE_INSERT'",
...   edgeFilter = "src != 'joiner1'",
...   maxPathLength = 10)

The two data sources are later joined and then split again. All I need is the distinct vertex names, with the order preserved:

SAMPLE_CUSTOMER
Sorter_SAMPLE_CUSTOMER
SAMPLE_CUSTOMER_MASTER
Sorter_CUSTOMER_MASTER
Join_Source_Target
Exp_DetectChanges
Filter_Unchanged_Records
Router_UPDATE_INSERT
Seq_Unique_Key
Target_New_Insert
Target_Existing_Insert
Target_Existing_Update
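
One possible way to get such a list is a topological sort of the edges rather than merging the two BFS results. The sketch below is not a GraphFrames feature. It is plain Python run on the driver, and it assumes the edge list is small enough to collect and contains no cycles. The names `EDGES` and `topological_order` are made up for illustration, and the edges are copied from the table in the question. It uses a depth-first variant of Kahn's algorithm, so each source is followed through its own chain before the next source starts.

```python
from collections import defaultdict

# (src, dst) pairs copied from g.edges in the question, in the same row order.
EDGES = [
    ("Sorter_SAMPLE_CUSTOMER", "Join_Source_Target"),
    ("Sorter_CUSTOMER_MASTER", "Join_Source_Target"),
    ("Join_Source_Target", "Exp_DetectChanges"),
    ("SAMPLE_CUSTOMER", "Sorter_SAMPLE_CUSTOMER"),
    ("Router_UPDATE_INSERT", "Target_Existing_Update"),
    ("Seq_Unique_Key", "Target_Existing_Insert"),
    ("Filter_Unchanged_Records", "Router_UPDATE_INSERT"),
    ("Exp_DetectChanges", "Filter_Unchanged_Records"),
    ("Seq_Unique_Key", "Target_New_Insert"),
    ("Router_UPDATE_INSERT", "Seq_Unique_Key"),
    ("SAMPLE_CUSTOMER_MASTER", "Sorter_CUSTOMER_MASTER"),
]


def topological_order(edges):
    """Return every vertex once, with each src placed before its dst."""
    successors = defaultdict(list)
    indegree = {}
    for src, dst in edges:
        successors[src].append(dst)
        indegree.setdefault(src, 0)
        indegree[dst] = indegree.get(dst, 0) + 1

    # Start from the vertices with no incoming edge (the sources).
    # The list is reversed so the first source seen is popped first.
    stack = [node for node in indegree if indegree[node] == 0][::-1]
    order = []
    while stack:
        node = stack.pop()
        order.append(node)
        ready = []
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:  # all predecessors already emitted
                ready.append(nxt)
        # Push in reverse so successors are visited in edge order.
        stack.extend(reversed(ready))

    if len(order) != len(indegree):
        raise ValueError("edge list contains a cycle")
    return order


order = topological_order(EDGES)
for name in order:
    print(name)
```

With a real graph, the edge list could be collected with `[(r["src"], r["dst"]) for r in g.edges.select("src", "dst").collect()]`. The result always places a vertex after all of its predecessors, but the order among siblings (for example the three targets) depends on the row order of the edges, so it may not match the list above exactly.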
