Optimization: turning a SQLite .db file into a data.frame in R

Posted by owfi6suc on 2023-02-05 in SQLite


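I need to analyse a large amount of data (~40 GB, 3 million rows). The data is too big to open in a spreadsheet or in R.
To work around this, I loaded it into a SQLite database and then used R (with the RSQLite package) to split it into smaller parts that I can manipulate (70,000 rows). I then need the data as a data.frame for analysis, so I used as.data.frame: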

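library(DBI)
library(dplyr)  # dbplyr must also be installed for tbl() on a connection

# Connect to the database
con <- dbConnect(drv = RSQLite::SQLite(), dbname = "path")

# Reference the table (lazy: no rows are read yet)
d <- tbl(con, "Test")

# Filter the table and convert the result
d %>%
  # I filtered using dplyr (placeholder condition)
  filter("reduce number of rows") %>%
  as.data.frame()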

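It works, but it takes a long time to run. Does anyone know how to make this faster (bearing in mind that my RAM is limited)?
I also tried setDT(), but it does not seem to work on SQLite data: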

d %>% setDT()

Error in setDT(.) : 
All elements in argument 'x' to 'setDT' must be of same length, but the profile of input lengths (length:frequency) is: [2:1, 13:1]
The first entry with fewer than 13 entries is 1

Thanks

sqlite

Source: https://stackoverflow.com/questions/75336279/optimization-turning-a-sqlite-db-file-into-data-frame-on-r

1 Answer


rjee0c151#

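To process successive chunks of 70000 rows using the con from the question, replace the print statement below with whatever processing you need (base, dplyr, data.table, etc.).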

rs <- dbSendQuery(con, "select * from Test")
while(!dbHasCompleted(rs)) {
  dat <- dbFetch(rs, 70000)
  print(dim(dat)) # replace with your processing
}
dbClearResult(rs)
dbDisconnect(con)
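
As a rough illustration of what "your processing" might look like, here is a minimal sketch that keeps only a small running summary in memory while streaming chunks. Everything specific in it is an assumption made for the example: the in-memory database, the columns id, grp and value, the WHERE condition, and the chunk size of 4 (standing in for 70000). It also pushes the row filter into the SQL statement so that SQLite discards rows before they reach R, which may help when RAM is tight.

```r
library(DBI)

# Hypothetical in-memory database standing in for the real .db file
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "Test", data.frame(
  id    = 1:10,
  grp   = rep(c("a", "b"), 5),
  value = as.numeric(1:10)
))

# Filter in SQL (assumed condition) and select only the columns needed
rs <- dbSendQuery(con,
                  "SELECT grp, value FROM Test WHERE value > ?",
                  params = list(2))

totals <- numeric(0)  # running sum per group; small enough to keep in RAM
while (!dbHasCompleted(rs)) {
  dat <- dbFetch(rs, n = 4)       # each chunk is an ordinary data.frame
  if (nrow(dat) == 0) break       # guard against an empty final fetch
  chunk_sums <- tapply(dat$value, dat$grp, sum)
  for (g in names(chunk_sums)) {
    # totals[g] is NA the first time a group appears; na.rm drops it
    totals[g] <- sum(totals[g], chunk_sums[[g]], na.rm = TRUE)
  }
}
dbClearResult(rs)
dbDisconnect(con)

print(totals)
```

Because dbFetch() returns a plain data.frame, each chunk can also be handed to data.table::setDT(dat) if data.table is preferred. The setDT() error in the question most likely arises because d is a lazy table reference rather than data held in memory.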

2023-02-05
