Unique ID when creating a Hive table from CSV files

7xzttuei  posted on 2021-06-02  in Hadoop

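I have a list of CSV files that I want to load into a Hive table, but I am fairly sure some of the records in the CSVs are duplicated. Each record (row) in the CSVs is identified by a key, and I want to build the table using that key as the primary key. How can I create the Hive table so that it contains no duplicate rows?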

Answer #1 (osh3o9ms)

ROW_NUMBER() OVER([partition_by_clause] order_by_clause)

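Returns an ascending sequence of integers, starting at 1. When a PARTITION BY clause is given, the sequence restarts at 1 for each partition, so the rows that share a key are numbered 1, 2, 3, and so on.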

select x, row_number() over(order by x, property) as row_number, property from int_t;
+----+------------+----------+
| x  | row_number | property |
+----+------------+----------+
| 1  | 1          | odd      |
| 1  | 2          | square   |
| 2  | 3          | even     |
| 2  | 4          | prime    |
| 3  | 5          | odd      |
| 3  | 6          | prime    |
| 4  | 7          | even     |
| 4  | 8          | square   |
| 5  | 9          | odd      |
| 5  | 10         | prime    |
| 6  | 11         | even     |
| 6  | 12         | perfect  |
| 7  | 13         | lucky    |
| 7  | 14         | lucky    |
| 7  | 15         | lucky    |
| 7  | 16         | odd      |
| 7  | 17         | prime    |
| 8  | 18         | even     |
| 9  | 19         | odd      |
| 9  | 20         | square   |
| 10 | 21         | even     |
| 10 | 22         | round    |
+----+------------+----------+
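
To connect this back to the question, here is a minimal sketch of how ROW_NUMBER() could be used to drop duplicate keys while loading the CSV data. The table names (staging_csv, deduped), the columns (id, property), the comma delimiter, and the HDFS path are all assumptions made for illustration; none of them come from the original post. As far as I know, Hive does not enforce primary key uniqueness on its own, so the duplicates have to be filtered in the query itself.

```sql
-- Hypothetical staging table that points at the raw CSV files.
-- Column names, delimiter, and location are assumptions.
CREATE EXTERNAL TABLE staging_csv (
  id       STRING,
  property STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/csv_input';

-- Number the rows within each key, then keep only the first row per key.
CREATE TABLE deduped AS
SELECT id, property
FROM (
  SELECT id,
         property,
         ROW_NUMBER() OVER (PARTITION BY id ORDER BY property) AS rn
  FROM staging_csv
) t
WHERE rn = 1;
```

The ORDER BY inside the window decides which of the duplicate rows survives. If the duplicates are identical in every column, any ordering works; otherwise it would make sense to order by whatever column marks the preferred record, such as a timestamp.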
