Writing in the hole of a memory-mapped sparse file


I need a file in which certain byte ranges are laid out contiguously. Let's call these chunks. The chunks must be contiguous because each one eventually gets memory-mapped to an array. A file would hold several chunks (these correspond to different but related arrays), and each chunk needs to grow by appending over time. The first thing I thought of is to use a sparse file with holes at the inter-chunk boundaries.

Whenever I have new data, I could then write it into the hole. If the hole does not have enough space, I intend to move the minimum number of bytes needed to create the space (plus some extra space for future writes) and then write the data.

  1. Is this the wrong way to do things?

  2. Are there good alternatives to this approach?

  3. How does the OS (Linux) handle a write into a hole? Does it move (shift) all the bytes in the tail, or does it restructure the inodes to somehow accommodate the new data (at the cost of fragmentation)?

  4. Is there an optimal way to do this so that the amortized cost of moving data is small?

1 answer

#1


  1. Most likely, yes.

  2. Linux already comes with a system for tracking multiple contiguous byte sequences with efficient appends: the file system. Can't you just use multiple files?

  3. If you use a modern Linux filesystem (i.e. not FAT32), a write into a hole leaves the existing data in place and allocates additional space for the new data. That space may come from a preallocated extent/block, or the file may end up fragmented. It's left to the filesystem to figure that out.

  4. "Optimal" depends on your usage patterns and on how much you value time versus space. It's hard to make general comments, but there are many CS papers on how to allocate and reallocate chunks of bytes under various assumptions.
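To make point 3 concrete, here is a minimal Python sketch that writes two chunks separated by a hole, then writes into the hole and checks that the file size and the tail chunk are unchanged. The names `CHUNK`, `HOLE`, and `demo_hole_write` are made up for this illustration, and the sizes are arbitrary. Whether the gap is really stored as a hole depends on the filesystem, so treat the block counts as informational only.

```python
import os
import tempfile

CHUNK = 4096      # assumed chunk payload size for this demo
HOLE = 1 << 20    # assumed 1 MiB gap left between the two chunks


def demo_hole_write(path):
    """Write two chunks separated by a gap, then fill part of the gap."""
    with open(path, "wb") as f:
        f.write(b"A" * CHUNK)     # first chunk at offset 0
        f.seek(CHUNK + HOLE)      # skip ahead; sparse-capable filesystems leave a hole
        f.write(b"B" * CHUNK)     # second chunk (the "tail")
    before = os.stat(path)

    with open(path, "r+b") as f:
        f.seek(CHUNK)             # first byte of the gap
        f.write(b"a" * CHUNK)     # grow the first chunk in place

    after = os.stat(path)
    with open(path, "rb") as f:
        f.seek(CHUNK + HOLE)
        tail = f.read(CHUNK)

    return {
        "size_before": before.st_size,
        "size_after": after.st_size,
        # st_blocks counts allocated blocks (typically 512-byte units on Linux)
        "blocks_before": before.st_blocks,
        "blocks_after": after.st_blocks,
        "tail_intact": tail == b"B" * CHUNK,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo_hole_write(os.path.join(d, "chunks.bin")))
```

On a filesystem with sparse-file support you would expect the logical size to stay the same and the allocated block count to grow after the second write, while the tail chunk keeps its offset and contents. Nothing is shifted.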

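The multiple-files suggestion in point 2 could look roughly like the sketch below: one file per array, grown with ordinary appends, and memory-mapped as a whole when it needs to be read as an array. The helper names `append_values` and `map_chunk` and the choice of native 64-bit floats are assumptions for illustration, not something from the question.

```python
import mmap
import os
import struct
import tempfile


def append_values(path, values):
    """Append native 64-bit floats to one chunk file; the filesystem finds the space."""
    with open(path, "ab") as f:
        f.write(struct.pack(f"{len(values)}d", *values))


def map_chunk(path):
    """Map a non-empty chunk file read-only and view it as an array of doubles."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the mapping stays valid after f is closed.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return mm, memoryview(mm).cast("d")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "array_a.f64")
        append_values(path, [1.0, 2.0])
        append_values(path, [3.0])      # later append, no data is moved by us
        mm, view = map_chunk(path)
        print(view.tolist())
        view.release()                  # release the view before closing the map
        mm.close()
```

Each array stays contiguous in its own file's address range, so there is no tail to shift when one of them grows. A mapping made before an append does not cover the new bytes, so the file has to be remapped after it grows.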
