Importing and accessing large data files in Shiny


I have an app where I want to pull out values from a lookup table based on user inputs. The reference table holds the results of a statistical test, based on a calculation that'd be too slow to run on the fly for every combination of user inputs. Hence a lookup table covering all the possibilities.

But... right now the table is about 60 MB (as .Rdata) or 214 MB (as .csv), and it'll get much larger if I expand the possible user inputs. I've already reduced the number of significant figures in the data (to 3) and removed the row/column names.

Obviously, I can preload the lookup table outside the reactive server function, but it'll still take a decent chunk of time to load in that data. Does anyone have any tips on dealing with large amounts of data in Shiny? Thanks!

1 Answer

flaneuse, we are still working with a smaller data set than you, but we have been experimenting with:

  1. Use rds for our data

    @jazzurro mentioned rds above, and you seem to know how to do this, but the syntax is below for others.

    The .rds format stores a single R object, so you can assign it to whatever name you need when you read it back in.

    In your prep data code, for example:

    mystorefile <- file.path("/my/path","data.rds")
    # ... do data stuff
    
    # Save down (assuming mydata holds your data frame or table)
    saveRDS(mydata, file = mystorefile)
    

    In your shiny code:

    # Load in my data from the app's data directory
    mystorefile <- file.path("./data", "data.rds")
    x <- readRDS(mystorefile)
    

    Remember to copy your .rds data file into your app directory when you deploy. We use a data directory, /myapp/data, and change the file.path for the store file to "./data" in our shiny code.

  2. global.R

    We put the readRDS calls that load our data in this global file (instead of in server.R before the shinyServer() call), so they run once and the data is available to all sessions, with the added bonus that it can also be seen by ui.R.

    See this scoping explanation for R Shiny.

  3. Slice and dice upfront

    The standard daily reports use the most recent data, so in my global.R I make a small latest.dt holding just that subset of my data. The landing page with the latest charts works with this smaller data set, so its charts render faster.

    The custom data view, which uses full.dt, is on a separate tab. It is slower, but at that stage the user is more patient and is thinking about which dates and other parameters to choose.

    This subset idea may help you.
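
To tie the three points together, here is a rough sketch of how the prep step and a global.R might fit together. It is only an illustration under assumptions: the example data, the temporary app directory, and the 7-day cutoff are all made up for the demo, and it uses a plain data.frame rather than whatever table class you actually store. Only the names mydata, mystorefile, full.dt and latest.dt come from the description above.

```r
# --- Prep script (run once, outside the app) ---
# Hypothetical stand-in for the app directory; in practice this is /myapp.
app_dir  <- file.path(tempdir(), "myapp")
data_dir <- file.path(app_dir, "data")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

# Made-up example data: 100 daily rows, values kept to 3 decimal places.
mydata <- data.frame(
  date  = as.Date("2014-01-01") + 0:99,
  value = round(seq(0.5, 50, by = 0.5), 3)
)

# Save the single object as .rds
mystorefile <- file.path(data_dir, "data.rds")
saveRDS(mydata, file = mystorefile)

# --- global.R (runs once when the app starts) ---
# In a deployed app the path would be relative, e.g. file.path("./data", "data.rds").
full.dt <- readRDS(mystorefile)

# Small subset for the landing page: the most recent 7 days (arbitrary cutoff).
latest.dt <- full.dt[full.dt$date > max(full.dt$date) - 7, ]
```

Because both objects are created in global.R, server.R and ui.R can refer to latest.dt for the landing page and only touch full.dt on the custom tab.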

Would be interested in what others (with more demanding data sets) have tried!

