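-- Location (file, block, row slot) of one randomly sampled row.
-- [my_big_table] is a placeholder for the real table name.
select dbms_rowid.rowid_relative_fno(rowid) as fileno,
       dbms_rowid.rowid_block_number(rowid) as blockno,
       dbms_rowid.rowid_row_number(rowid) as offset
from (select rowid from [my_big_table] sample (.01))
where rownum = 1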
I'm working with a subpartitioned table, and I get good randomness even when grabbing multiple rows:
select dbms_rowid.rowid_relative_fno(rowid) as fileno,
       dbms_rowid.rowid_block_number(rowid) as blockno,
       dbms_rowid.rowid_row_number(rowid) as offset
from (select rowid from [my_big_table] sample (.01))
where rownum <= 5
    FILENO    BLOCKNO     OFFSET
---------- ---------- ----------
       152    2454936         11
       152    2463140         32
       152    2335208          2
       152    2429207         23
       152    2746125         28
-- Pick a random extent of the table, then a random block inside that extent,
-- then a random row among the rows stored in that block.
SELECT *
FROM My_User.My_Table
WHERE ROWID = (SELECT MAX(t.ROWID) KEEP(DENSE_RANK FIRST ORDER BY dbms_random.value)
               FROM (SELECT o.Data_Object_Id,
                            e.Relative_Fno,
                            e.Block_Id + TRUNC(Dbms_Random.Value(0, e.Blocks)) AS Block_Id
                     FROM Dba_Extents e
                     JOIN Dba_Objects o ON o.Owner = e.Owner AND o.Object_Type = e.Segment_Type AND o.Object_Name = e.Segment_Name
                     WHERE e.Segment_Name = 'MY_TABLE'
                       AND (e.Segment_Type, e.Owner, e.Extent_Id) =
                           (SELECT MAX(e.Segment_Type) AS Segment_Type,
                                   MAX(e.Owner) AS Owner,
                                   MAX(e.Extent_Id) KEEP(DENSE_RANK FIRST ORDER BY Dbms_Random.Value) AS Extent_Id
                            FROM Dba_Extents e
                            WHERE e.Segment_Name = 'MY_TABLE'
                              AND e.Owner = 'MY_USER'
                              AND e.Segment_Type = 'TABLE')) e
               JOIN My_User.My_Table t
                 ON t.Rowid BETWEEN Dbms_Rowid.Rowid_Create(1, Data_Object_Id, Relative_Fno, Block_Id, 0)
                                AND Dbms_Rowid.Rowid_Create(1, Data_Object_Id, Relative_Fno, Block_Id, 32767))
-- gen returns the ROWID of one random row, or NULL when the chosen block holds no rows.
-- Retries re-evaluates gen while the result is NULL, for at most 10 attempts in total.
WITH gen AS ((SELECT --+ inline leading(e) use_nl(e t) rowid(t)
                     MAX(t.ROWID) KEEP(DENSE_RANK FIRST ORDER BY dbms_random.value) Row_Id
              FROM (SELECT o.Data_Object_Id,
                           e.Relative_Fno,
                           e.Block_Id + TRUNC(Dbms_Random.Value(0, e.Blocks)) AS Block_Id
                    FROM Dba_Extents e
                    JOIN Dba_Objects o ON o.Owner = e.Owner AND o.Object_Type = e.Segment_Type AND o.Object_Name = e.Segment_Name
                    WHERE e.Segment_Name = 'MY_TABLE'
                      AND (e.Segment_Type, e.Owner, e.Extent_Id) =
                          (SELECT MAX(e.Segment_Type) AS Segment_Type,
                                  MAX(e.Owner) AS Owner,
                                  MAX(e.Extent_Id) KEEP(DENSE_RANK FIRST ORDER BY Dbms_Random.Value) AS Extent_Id
                           FROM Dba_Extents e
                           WHERE e.Segment_Name = 'MY_TABLE'
                             AND e.Owner = 'MY_USER'
                             AND e.Segment_Type = 'TABLE')) e
              JOIN MY_USER.MY_TABLE t ON t.ROWID BETWEEN Dbms_Rowid.Rowid_Create(1, Data_Object_Id, Relative_Fno, Block_Id, 0)
                                                     AND Dbms_Rowid.Rowid_Create(1, Data_Object_Id, Relative_Fno, Block_Id, 32767))),
     Retries(Cnt, Row_Id) AS (SELECT 1, gen.Row_Id
                              FROM Dual
                              LEFT JOIN gen ON 1=1
                              UNION ALL
                              SELECT Cnt + 1, gen.Row_Id
                              FROM Retries
                              LEFT JOIN gen ON 1=1
                              WHERE Retries.Row_Id IS NULL AND Retries.Cnt < 10)
SELECT *
FROM MY_USER.MY_TABLE
WHERE ROWID = (SELECT Row_Id
               FROM Retries
               WHERE Row_Id IS NOT NULL)
7 answers
ltqd579y1#
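Using SAMPLE(x) with an appropriate value is the fastest way. It is block-random, and row-random within a block, so if you only want one random row, the first query at the top of the page (the one ending in WHERE rownum = 1) is enough. I'm working with a subpartitioned table, and I get good randomness even when grabbing multiple rows, as the second query (WHERE rownum <= 5) and its output show.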
I suspect you should probably tune your SAMPLE clause so that it uses an appropriate sample size for what you are fetching.
uwopmtnx2#
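Start with Adam's answer first, but if SAMPLE is not fast enough, even with the ROWNUM optimization, you can sample by block using SAMPLE BLOCK. This applies the sampling at the block level instead of to each row. It does mean that large swaths of the table can be skipped, so the sampling percentage will be very rough. It is not unusual for a SAMPLE BLOCK with a low percentage to return zero rows.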
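A minimal sketch of the block-sampling variant described above. The table name my_big_table and the 0.01 percent figure are placeholders chosen for illustration, not values from the answer:

```sql
-- Sketch only: my_big_table and the 0.01 percent figure are placeholders.
-- SAMPLE BLOCK selects whole blocks at random instead of individual rows.
select *
from my_big_table sample block (0.01)
where rownum = 1;
```

If this comes back empty, raising the percentage or simply re-running the statement are the obvious things to try, since a low percentage can easily select no blocks at all.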
mwkjh3gx3#
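Here is the question on AskTom:
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6075151195522
If you know how big your table is, use SAMPLE BLOCK as described above. If you do not, you can modify the routine from the following AskTom follow-up to return however many rows you want:
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6075151195522#56174726207861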
wgeznvg74#
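The solution below is not an exact answer to the question, but in many cases you select a row, try to use it for some purpose, and then update its status to "used" or "done" so that it is not selected again.
Solution:
The query below works, but I just tried it, and if your table is large you will definitely run into performance problems with it.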
SELECT * FROM (SELECT * FROM my_table ORDER BY dbms_random.value) WHERE rownum = 1
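So if you cap the inner query with rownum (for example, at 1000 rows) before sorting, you avoid the performance problem. Raising the rownum limit lowers the chance of two sessions picking the same row. In this case, though, you always pick from the same 1000 rows. If you take one row out of the 1000 and update its status to "USED", then you will almost always get a different row each time you query for "ACTIVE" rows.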
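A sketch of the capped query this paragraph describes. The table name my_table, the status column, and its 'ACTIVE' value are assumptions based on the wording of the answer:

```sql
-- Sketch only: my_table and its status column are assumed names.
select *
from (select *
      from my_table
      where status = 'ACTIVE'
        and rownum <= 1000        -- cap the number of rows that get sorted
      order by dbms_random.value) -- shuffle only those rows
where rownum = 1;
```

Because the rownum filter is applied before the ORDER BY, only the capped set is sorted, which is what keeps the cost bounded on a large table.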
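Update the row's status after selecting it. If the update does not succeed, another transaction has already used the row, so you should fetch a new row and update its status instead. Incidentally, with rownum capped at 1000, the probability that two different transactions pick the same row is 0.001.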
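The claim step could look roughly like this. The names my_table, status, and the bind variable :picked_rowid (holding the ROWID of the row just selected) are assumptions for illustration:

```sql
-- Sketch only: my_table, status and :picked_rowid are assumed names.
update my_table
set status = 'USED'
where rowid = :picked_rowid
  and status = 'ACTIVE';  -- matches no row if another transaction already claimed it
```

If the statement reports zero rows updated, the row was taken in the meantime, and the caller picks another random row and tries again.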
dw1jzc5e5#
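Someone said that sample(x) is the fastest way, but for me this method is slightly faster than sample(x): the query near the top of the page that starts with SELECT * FROM My_User.My_Table and picks a random extent and block through Dba_Extents. It takes a fraction of a second (0.2 in my case) no matter how large the table is. If it takes longer for you, try adding hints (--+ leading(e) use_nl(e t) rowid(t)), which can help.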
zhte4eai6#
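A version that retries when no row is returned is the WITH gen ... Retries query listed near the top of the page.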
apeeds0o7#
Can you use pseudorandom rows?