The code has been forked and stripped down by Victor Baybekov over at https://github.com/buybackoff/MMDataStructures. Take a look and see what's going on there.

Welcome to Disk Based Data Structures!

Ever worked with so much data that your Dictionary<T,V> or List<T> consumes too much physical memory? This is what this project tries to remedy. By persisting data in memory mapped files and using a fast serializer, the project aims to balance speed and ease of use.
Read about Memory Mapped Files at Wikipedia.

Changes for Release 3

  • Bugfix with AltSerialize
  • Improved handling of iterating a dictionary
  • Dictionary now supports loading an existing file

Background

Last year I published a small project with an Array<T> implementation which used memory mapped files as its storage. The idea for the project came while working on a full text search engine implementation in C#, where the entire dictionary is kept in memory as one long byte array. This array could be quite large, consuming lots of memory. Since the dictionary has hot spots, it is more efficient to keep only the parts in use in memory and page in the other parts as needed. The OS already has paging implemented through the memory mapped file API, and on 64bit systems it is possible to optimize how you map these files thanks to the vast available address space.
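To make the paging idea concrete, here is a minimal sketch of a fixed-length array of long values stored in a memory mapped file. It is an illustration only, not this project's Array<T>: the class name MappedLongArraySketch is hypothetical, and it assumes System.IO.MemoryMappedFiles from .Net 4.0 rather than the Winterdom.IO.Filemap wrapper used in this project.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Hypothetical illustration, not the project's Array<T>.
public sealed class MappedLongArraySketch : IDisposable
{
    private const int ItemSize = sizeof(long);
    private readonly MemoryMappedFile _file;
    private readonly MemoryMappedViewAccessor _view;

    public long Length { get; private set; }

    public MappedLongArraySketch(string path, long length)
    {
        if (length <= 0) throw new ArgumentOutOfRangeException("length");
        Length = length;
        // Creates the backing file if needed and sizes it to hold `length` items.
        // An existing file must not be larger than this capacity.
        _file = MemoryMappedFile.CreateFromFile(
            path, FileMode.OpenOrCreate, null, length * ItemSize);
        // One view over the whole file; the OS pages the touched parts in and out.
        _view = _file.CreateViewAccessor();
    }

    public long this[long index]
    {
        get { CheckIndex(index); return _view.ReadInt64(index * ItemSize); }
        set { CheckIndex(index); _view.Write(index * ItemSize, value); }
    }

    private void CheckIndex(long index)
    {
        if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
    }

    public void Dispose()
    {
        _view.Dispose();
        _file.Dispose();
    }
}
```

Used as `using (var a = new MappedLongArraySketch(path, 1000)) { a[0] = 42; }`, the values end up in the backing file and can be read again by a later instance opened on the same path. A single view over the whole file is the simplest choice and suits the large address space of a 64bit process.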

Around the same time I also created a disk based Dictionary<TKey,TValue> with my own paging system. This project combines those two implementations and adds a List<T> implementation.

In addition, I have a keen interest in how to serialize data as fast as possible, especially when sending it between components. This interest grew into the Serializer included in this project. For structs with a defined size, serialization is done with a direct memory copy, which is the fastest serialization you can achieve in .Net.
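As a rough sketch of what "direct memory copy" means for a size defined struct, the code below copies a struct's bytes into a byte array and back. It is not the Serializer from this project: SamplePoint and StructBytesSketch are hypothetical names, and the sketch uses Marshal with a pinned buffer instead of unsafe pointers so that it compiles without the /unsafe switch.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct with a fixed, known layout (4 + 8 + 8 bytes).
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SamplePoint
{
    public int Id;
    public double X;
    public double Y;
}

// Hypothetical helper, not the project's serializer API.
public static class StructBytesSketch
{
    public static byte[] ToBytes<T>(T value) where T : struct
    {
        byte[] buffer = new byte[Marshal.SizeOf(typeof(T))];
        // Pin the array so the GC cannot move it while we copy into it.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            Marshal.StructureToPtr(value, handle.AddrOfPinnedObject(), false);
        }
        finally
        {
            handle.Free();
        }
        return buffer;
    }

    public static T FromBytes<T>(byte[] buffer) where T : struct
    {
        if (buffer.Length < Marshal.SizeOf(typeof(T)))
            throw new ArgumentException("Buffer is too small for the struct.");
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
        }
        finally
        {
            handle.Free();
        }
    }
}
```

With the layout above, `StructBytesSketch.ToBytes(new SamplePoint { Id = 7, X = 1.5, Y = -2.25 })` yields a 20 byte array, and `FromBytes<SamplePoint>` restores the same field values. No per-field encoding takes place, which is why this style of serialization is fast, and also why it only suits types without references to other objects.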

PS! If you are running this project on a 32bit system, you will most likely run into memory issues (>800 MB for the dictionary, >2 GB for the array and list), as the code is tuned for 64bit: it uses multiple large views on the backing file instead of smaller ones which must be moved around, and moving the view impacts performance quite a lot. Download the source code in the Experimental32bit branch instead; it should work fine with huge files on 32bit, but it is not tested.
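The cost of moving a view can be illustrated with a small sketch. The hypothetical SlidingViewSketch below maps only one fixed-size window of the file at a time, as a design for a small address space might, and counts how often the view has to be remapped. It assumes System.IO.MemoryMappedFiles from .Net 4.0 and does not reflect how this project or its Experimental32bit branch is actually written.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Hypothetical illustration of a single small view that is moved on demand.
public sealed class SlidingViewSketch : IDisposable
{
    private readonly MemoryMappedFile _file;
    private readonly long _capacity;
    private readonly long _windowSize;
    private MemoryMappedViewAccessor _view;
    private long _windowStart;

    // Number of times the view had to be mapped or moved.
    public int Remaps { get; private set; }

    public SlidingViewSketch(string path, long capacity, long windowSize)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException("capacity");
        if (windowSize <= 0) throw new ArgumentOutOfRangeException("windowSize");
        _capacity = capacity;
        _windowSize = windowSize;
        _file = MemoryMappedFile.CreateFromFile(path, FileMode.OpenOrCreate, null, capacity);
    }

    public byte ReadByte(long position)
    {
        MoveViewTo(position);
        return _view.ReadByte(position - _windowStart);
    }

    public void WriteByte(long position, byte value)
    {
        MoveViewTo(position);
        _view.Write(position - _windowStart, value);
    }

    private void MoveViewTo(long position)
    {
        if (position < 0 || position >= _capacity)
            throw new ArgumentOutOfRangeException("position");
        long start = position - (position % _windowSize);
        if (_view != null && start == _windowStart) return; // window already mapped
        if (_view != null) _view.Dispose();                 // unmap the old window
        long size = Math.Min(_windowSize, _capacity - start);
        _view = _file.CreateViewAccessor(start, size);
        _windowStart = start;
        Remaps++;
    }

    public void Dispose()
    {
        if (_view != null) _view.Dispose();
        _file.Dispose();
    }
}
```

Access patterns that jump between distant positions force a remap on almost every call, while a view large enough to cover the whole file never needs to move. That is the reason the main code favors large views on 64bit.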

What's in the project?

mAdcOW.DataStructures - Thread safe generic data types to use in your code
  • Array<T> - with autogrowing capabilities
  • List<T>
  • Dictionary<TKey,TValue>

mAdcOW.Serializer - Implements several options for serializing your objects
  • Factory method which benchmarks the different approaches and picks the fastest one for your data type
  • Unsafe pointer based serializer - extremely fast for size defined structures/classes
  • Marshal Serializing
  • BinaryFormatter
  • BitConverter for primitive types
  • DataContractSerializer from WCF
  • AltSerialize
  • Google Protocol Buffers serialization
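The factory idea in the first bullet, benchmarking the candidates and keeping the fastest one that can handle the type, could look roughly like the sketch below. All names here (ISerializerSketch, FastestSerializerPicker and the two sample serializers) are hypothetical and are not the API of mAdcOW.Serializer.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.Text;

// Hypothetical interface; the project's real serializer interface may differ.
public interface ISerializerSketch<T>
{
    byte[] Serialize(T value);
    T Deserialize(byte[] data);
}

public sealed class BitConverterIntSerializer : ISerializerSketch<int>
{
    public byte[] Serialize(int value) { return BitConverter.GetBytes(value); }
    public int Deserialize(byte[] data) { return BitConverter.ToInt32(data, 0); }
}

public sealed class TextIntSerializer : ISerializerSketch<int>
{
    public byte[] Serialize(int value)
    {
        return Encoding.UTF8.GetBytes(value.ToString(CultureInfo.InvariantCulture));
    }

    public int Deserialize(byte[] data)
    {
        return int.Parse(Encoding.UTF8.GetString(data), CultureInfo.InvariantCulture);
    }
}

public static class FastestSerializerPicker
{
    // Times a serialize/deserialize round trip for each candidate and
    // returns the fastest one that handles the sample correctly.
    public static ISerializerSketch<T> Pick<T>(
        IEnumerable<ISerializerSketch<T>> candidates, T sample, int iterations)
    {
        ISerializerSketch<T> best = null;
        long bestTicks = long.MaxValue;
        foreach (ISerializerSketch<T> candidate in candidates)
        {
            try
            {
                // Warm-up run that also verifies the round trip.
                T copy = candidate.Deserialize(candidate.Serialize(sample));
                if (!EqualityComparer<T>.Default.Equals(copy, sample)) continue;

                Stopwatch watch = Stopwatch.StartNew();
                for (int i = 0; i < iterations; i++)
                    candidate.Deserialize(candidate.Serialize(sample));
                watch.Stop();

                if (watch.ElapsedTicks < bestTicks)
                {
                    bestTicks = watch.ElapsedTicks;
                    best = candidate;
                }
            }
            catch (Exception)
            {
                // This candidate cannot handle the type; try the next one.
            }
        }
        if (best == null)
            throw new InvalidOperationException("No candidate could serialize the sample.");
        return best;
    }
}
```

Which candidate wins depends on the data type and the machine, so the sketch makes no promise about the result; it only guarantees that the returned serializer round-trips the sample. Running the benchmark once per type and caching the winner keeps the measurement cost out of the hot path.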

Winterdom.IO.Filemap - Modified version
AltSerialize - Serialization library

The future

  • Migrate to .Net 4.0 and replace Winterdom.IO.Filemap with System.IO.MemoryMappedFiles
  • Implement Queue<T>
  • Implement Stack<T>
  • Implement more methods on the classes

Last edited Nov 28 at 8:39 PM by Wobba, version 17