Writing in the hole of a memory-mapped sparse file


I need a file in which certain byte ranges are laid out contiguously. Let's call these ranges chunks. The chunks need to be contiguous because they eventually get memory-mapped to arrays. A file would hold several chunks (these correspond to different but related arrays), and each chunk needs to be appended to over time. The first thing I thought of is to use a sparse file and leave holes at the inter-chunk boundaries.

Whenever I have new data, I could then write it into the hole. If the space available in the hole is not enough, I intend to move the minimum number of bytes needed to create the space (plus some extra space for future appends) and then write the data.
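
A minimal sketch of the layout described above, assuming Python on a filesystem that supports sparse files. The fixed slot size `CHUNK_STRIDE` and the helpers `create_sparse` and `append_to_chunk` are hypothetical names chosen for illustration, not part of any library:

```python
import mmap
import os
import tempfile

CHUNK_STRIDE = 1 << 20  # assumption: every chunk owns a 1 MiB slot, mostly hole


def create_sparse(path, n_chunks):
    """Set the logical file size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(n_chunks * CHUNK_STRIDE)


def append_to_chunk(mm, chunk_index, used, data):
    """Write data directly after the `used` bytes already stored in the slot."""
    if used + len(data) > CHUNK_STRIDE:
        # This is the case where later chunks would have to be moved.
        raise ValueError("hole exhausted for this chunk")
    start = chunk_index * CHUNK_STRIDE + used
    mm[start:start + len(data)] = data
    return used + len(data)


path = os.path.join(tempfile.mkdtemp(), "chunks.bin")
create_sparse(path, n_chunks=3)

with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
    used0 = append_to_chunk(mm, 0, 0, b"abc")
    used1 = append_to_chunk(mm, 1, 0, b"xyz")
    used0 = append_to_chunk(mm, 0, used0, b"def")  # lands in the hole after chunk 0
    mm.flush()
    chunk0 = bytes(mm[0:used0])
    chunk1 = bytes(mm[CHUNK_STRIDE:CHUNK_STRIDE + used1])
    gap = bytes(mm[used0:used0 + 4])  # untouched part of the hole reads as zeros

logical_size = os.path.getsize(path)
# st_blocks is only available on Unix; it counts 512-byte units actually allocated.
allocated_blocks = getattr(os.stat(path), "st_blocks", None)
```

On a filesystem with sparse-file support, `allocated_blocks * 512` would typically be far smaller than `logical_size`, since only the pages that were written need backing storage.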

  1. Is this a wrong way to do things?

  2. Are there good alternatives to this approach?

  3. How does the OS (Linux) handle a write into the hole? Does it move (shift) all the bytes in the tail, or does it restructure the inodes to somehow accommodate the new data (at the cost of fragmentation)?

  4. Is there an optimal way to do this so that the amortized movement cost is small?

1 answer

  1. Most likely, yes.

  2. Linux already comes with a system for tracking multiple, contiguous byte sequences with efficient appends: the file system. Can't you just use multiple files?

  3. If you use a modern Linux filesystem (i.e. not FAT32), a write into the hole leaves the existing data in place and allocates additional space for the written range. That space may come from a preallocated extent/block or may fragment the file; the filesystem decides.

  4. "Optimal" depends on your usage patterns and how much you value time vs space. It's hard to make general comments, but there are many CS papers on how to allocate and reallocate chunks of bytes given various assumptions.
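
One way points 2 and 4 could be combined is sketched below, assuming Python: each chunk lives in its own file, and the file is extended geometrically so that remapping stays rare. `ChunkFile` and its methods are hypothetical names for illustration, and capacity doubling is just one common growth policy, not the only reasonable one:

```python
import mmap
import os
import tempfile


class ChunkFile:
    """Hypothetical helper: one chunk per file, capacity doubled on demand."""

    def __init__(self, path, initial_capacity=4096):
        self.f = open(path, "w+b")
        self.used = 0
        self.capacity = initial_capacity
        self.remaps = 0
        self.f.truncate(self.capacity)  # extends the file without writing data
        self.mm = mmap.mmap(self.f.fileno(), self.capacity)

    def append(self, data):
        needed = self.used + len(data)
        if needed > self.capacity:
            while self.capacity < needed:
                self.capacity *= 2
            # Existing bytes stay where they are; only the mapping is redone.
            self.mm.close()
            self.f.truncate(self.capacity)
            self.mm = mmap.mmap(self.f.fileno(), self.capacity)
            self.remaps += 1
        self.mm[self.used:needed] = data
        self.used = needed

    def contents(self):
        """Return a copy of the bytes appended so far."""
        return self.mm[:self.used]

    def close(self):
        self.mm.close()
        self.f.close()


directory = tempfile.mkdtemp()
chunk = ChunkFile(os.path.join(directory, "array0.bin"))
for i in range(1000):
    chunk.append(i.to_bytes(8, "little"))
data = chunk.contents()
remaps = chunk.remaps
chunk.close()
```

Because the capacity doubles, the number of remaps grows only logarithmically with the amount of data appended, and no bytes are ever shifted by the application. In this sketch the file's logical size is the capacity rather than the used length, so a real layout would need to record `used` somewhere (for example in a small header or a side file).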

