libStatGen Software 1
MemoryMap Class Reference

There are a pair of related data structures in the operating system, and also a few simple algorithms that explain why your processes are waiting forever.

#include <MemoryMap.h>

Public Member Functions

void debug_print ()
 
void constructor_clear ()
 
void destructor_clear ()
 
virtual bool allocate ()
 
virtual bool open (const char *file, int flags=O_RDONLY)
 open a previously created mapped vector
 
virtual bool create (const char *file, size_t size)
 create the memory mapped file on disk
 
virtual bool create (size_t size)
 store in allocated memory (malloc), not mmap:
 
bool close ()
 
void test ()
 
size_t length ()
 
char operator[] (unsigned int index)
 
int prefetch ()
 
void useMemoryMap (bool flag=true)
 

Public Attributes

void * data
 

Detailed Description

There are a pair of related data structures in the operating system, and also a few simple algorithms that explain why your processes are waiting forever.

The symptom you have is that they are getting little or no CPU time, as shown in the command 'top'. The machine will appear to have available CPU time (look at the Cpu(s): parameter - if less than 100%, you have available CPU). The real key, however, is to look at the 'top' column with the label 'S' - that is the status of the process, and crucial to understanding what is going on.

In your instance, the 'S' column for your karma jobs is 'D', which means the process is in uninterruptible sleep, in practice waiting for data. This is because the process is doing something that requires the filesystem to return data to it. Usually, this is a C call like read() or write(), but it also happens in large processes where memory was copied to disk and reused for other purposes (this is called paging).

So, a bit of background on the operating system... there is a CPU scheduler that takes a list of waiting processes and picks one to run. If the job is waiting for the disk, there is no point in picking it to run, since it is blocked, waiting for the disk to return data. The scheduler marks the process with 'D' and moves on to the next process to schedule.

There are two data structures that we care about for this example. The first is a linear list of disk buffers that are stored in RAM and controlled by the operating system. This is usually called the disk buffer pool. Usually, when a program asks for data from the disk, this list can be scanned quickly to see if the data is already in RAM - if so, no disk operation needs to take place.

Now in the case of the normal Unix read() and write() calls, when the operating system is done finding the page, it copies the data into a buffer to be used by the process that requested it (in the case of a read() - a write() is the opposite). This copy operation is slow and inefficient, but gets the job done.
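To make the extra copy concrete, here is a minimal sketch of the read() path on a POSIX system. The helper name readWholeFile and the 4096-byte chunk size are assumptions chosen for illustration; neither is part of libStatGen.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper, not part of libStatGen.
// Reads a whole file with read(). Every call copies bytes out of the
// kernel's disk buffer pool into a buffer owned by this process.
// Returns true on success.
bool readWholeFile(const char *path, std::vector<char> &out)
{
    assert(path != NULL);
    int fd = ::open(path, O_RDONLY);
    if (fd == -1) return false;

    out.clear();
    char chunk[4096];                     // assumed chunk size
    for (;;)
    {
        ssize_t n = ::read(fd, chunk, sizeof(chunk));  // kernel -> user copy
        if (n < 0) { ::close(fd); return false; }      // read error
        if (n == 0) break;                             // end of file
        out.insert(out.end(), chunk, chunk + n);
    }
    ::close(fd);
    return true;
}
```

Even when every page of the file is already in the disk buffer pool, each byte is still copied once into chunk and once more into out.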

So overall, you gain some efficiency in a large memory system by having this disk buffer pool data structure, since you aren't re-reading the disk over and over to get the same data that you already have in RAM. However, it is less efficient than it might be because of the extra buffer copying.

Now we come to memory mapped files, and karma. The underlying system call of interest to us is mmap(), and it is called from MemoryMap.cpp. What it does and how it works are important to understanding its benefits, and frankly, most people ignore it because it is seemingly complex.

Two things are important to know: firstly, there is a data structure in the CPU called the page table, which is mostly contained in the CPU hardware itself. All memory accesses for normal user processes like karma go through this hardware page table. Secondly, it is very fast for the operating system to put together a page table that 'connects' a bunch of memory locations in your user program's address space to the disk buffer pool pages.

The combination of those two facts means that you can implement a 'zero copy' approach to reading data: the data in the disk buffer pool is directly readable by the program without the operating system ever having to actually copy it, as it does for read() or write().

So the benefit of mmap() is that when the underlying disk pages are already in the disk buffer pool, a hardware data structure gets built, then the mmap() call returns, and the data is available at full processor speed with no intervening copy of the data, or waiting for disk or anything else. It is as near to instantaneous as you can possibly get. This works whether it is 100 bytes or 100 gigabytes.
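For comparison with the read() path, the following is a minimal sketch of reading a file through mmap() on a POSIX system. The helper name sumMappedBytes is hypothetical and not part of libStatGen; it only demonstrates that the mapped bytes are used in place, with no intermediate buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper, not part of libStatGen.
// Maps a file read-only and sums its bytes directly from the mapping.
// Returns -1 on error or for an empty file.
long sumMappedBytes(const char *path)
{
    assert(path != NULL);
    int fd = ::open(path, O_RDONLY);
    if (fd == -1) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) { ::close(fd); return -1; }
    size_t len = static_cast<size_t>(st.st_size);

    void *p = ::mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    ::close(fd);                  // the mapping stays valid after close()
    if (p == MAP_FAILED) return -1;

    // The bytes are read where they live; nothing is copied into a
    // buffer owned by this process.
    const unsigned char *bytes = static_cast<const unsigned char *>(p);
    long sum = 0;
    for (size_t i = 0; i < len; i++) sum += bytes[i];

    ::munmap(p, len);
    return sum;
}
```

A page that is not yet in RAM is fetched from disk on first access, which is where a process can end up in the 'D' state.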

So, the last part of the puzzle is why your program winds up in 'D' (data wait), and what to do about it.

The disk buffer pool is a linear list of blocks ordered by the time and date of access. A process runs every once in a while to take the oldest of those pages and free them, during which it also has to update the hardware page tables of any processes referencing them.

So on wonderland, most file access (wget, copy, md5sum, anything else) is constantly putting fresh pages at the front of the list, and karma index files, having been opened a while ago, are prime candidates for being paged out. The reason they get paged out, as far as I know, is that in any given second of execution, nowhere near the entire index is getting accessed... so at some point, at least one page gets sent back to disk (well, flushed from RAM). Once that happens, there is a cascading effect: the longer karma waits, the older the other pages get, the more of them get reclaimed, and the slower it gets, until karma is at a standstill, waiting for pages to be brought back into RAM.

Now in an ideal world, karma would rapidly recover, and it can... sometimes. The problem is that your karma job is accessing data all over that index, and to the underlying filesystem it looks essentially like purely random I/O. There is about a 10 to 1 performance difference between accessing the disk sequentially as compared to randomly.

So to make karma work better, the first thing I do when starting karma is force it to read all of the disk pages in order. This causes the entire index to be forced into memory in order, so it is forcing sequential reads, which is the best case possible. There are problems: for example, if three karma jobs start at once, the disk I/O is no longer as purely sequential as we would like. Also, if the filesystem is busy taking care of other programs, even if karma thinks it is forcing sequential I/O, the net result looks more random. This happens when the system is starting to break down (thrashing), and it will certainly stall, or look very slow, or crash.
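The idea of forcing pages in, in order, can be sketched as a loop that reads one byte from every page, lowest address first. This is the same technique MemoryMap::prefetch() uses; the standalone function name touchPagesInOrder below is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper, not part of libStatGen.
// Reads one byte from each page of [data, data + length), lowest address
// first, so any page that is not resident is faulted in sequentially.
// Returns the number of pages touched.
size_t touchPagesInOrder(const void *data, size_t length)
{
    long ps = sysconf(_SC_PAGE_SIZE);
    assert(ps > 0);
    const size_t pageSize = static_cast<size_t>(ps);

    // volatile keeps the compiler from optimizing the reads away
    const volatile char *bytes = static_cast<const volatile char *>(data);
    size_t touched = 0;
    for (size_t i = 0; i < length; i += pageSize)
    {
        char c = bytes[i];        // one read per page is enough to fault it in
        (void) c;
        touched++;
    }
    return touched;
}
```

Calling this once at startup on the mapped index gives the filesystem a sequential access pattern. A tool in the spirit of mapfile would call it repeatedly to keep the pages recently used.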

The upshot of all of this is that when a single reference is shared, it is more likely that all the pages will be in the disk buffer pool to begin with, and thereby reduce startup time to nearly zero. It is also the ideal situation in terms of sharing the same reference among say 24 copies of karma on wonderland - the only cost is the hardware page table that gets set up to point to all of the disk buffers.

As I mentioned a paragraph back, the pages can still get swapped out, even with dozens of karma jobs running. A workaround I created is a program in utilities called mapfile - it simply repeatedly accesses the data in sequential order to help ensure that all of the pages are at the head of the disk buffer pool, and therefore less likely to get swapped out.

The benefit of such a program (mapfile) is greater on wonderland, where a lot of processes are competing for memory and disk buffers.

Definition at line 155 of file MemoryMap.h.

Constructor & Destructor Documentation

◆ MemoryMap()

MemoryMap::MemoryMap ( )

Definition at line 43 of file MemoryMap.cpp.

{
    constructor_clear();
#if defined(_WIN32)
    SYSTEM_INFO sysinfo = {0};
    ::GetSystemInfo(&sysinfo);
    DWORD cbView = sysinfo.dwAllocationGranularity;
#else
    page_size = sysconf(_SC_PAGE_SIZE);
#endif
}

◆ ~MemoryMap()

MemoryMap::~MemoryMap ( )
virtual

Definition at line 55 of file MemoryMap.cpp.

{
    destructor_clear();
};

Member Function Documentation

◆ allocate()

bool MemoryMap::allocate ( )
virtual

Definition at line 118 of file MemoryMap.cpp.

{
    data = (void *) malloc(mapped_length);

    if (data == NULL)
    {
#ifdef __WIN32__
        ::CloseHandle(file_handle);
#else
        ::close(fd);
#endif
        perror("MemoryMap::open");
        constructor_clear();
        return true;
    }

#ifdef __WIN32__
    DWORD resultSize = 0;
    ReadFile(file_handle, data, mapped_length, &resultSize, NULL);
#else
    size_t resultSize = read(fd, data, mapped_length);
#endif

    if ( resultSize != mapped_length)
    {
#ifdef __WIN32__
        ::CloseHandle(file_handle);
#else
        ::close(fd);
#endif
        perror("MemoryMap::open");
        constructor_clear();
        return true;
    }
    return false;
}

◆ close()

bool MemoryMap::close ( )

Definition at line 324 of file MemoryMap.cpp.

{
    destructor_clear();
    return false;
}

◆ constructor_clear()

void MemoryMap::constructor_clear ( )

Definition at line 74 of file MemoryMap.cpp.

{
#if defined(_WIN32)
    file_handle = NULL;
    map_handle = NULL;
#else
    fd = -1;
#endif
    data = (void *) NULL;
    offset = 0;
    mapped_length = 0;
    total_length = 0;
    useMemoryMapFlag = true;
};

◆ create() [1/2]

bool MemoryMap::create ( const char *  file,
size_t  size 
)
virtual

create the memory mapped file on disk

A file will be created on disk with the header filled in. The caller must now populate elements using (*this).set(index, value).

Definition at line 243 of file MemoryMap.cpp.

{
    if (file==NULL)
    {
        data = calloc(size, 1);
        return(data==NULL);
    }

    const char * message = "MemoryMap::create - problem creating file %s";

#ifdef __WIN32__
    file_handle = CreateFile(file,
                             GENERIC_READ | GENERIC_WRITE,
                             FILE_SHARE_READ | FILE_SHARE_WRITE,
                             NULL,
                             CREATE_ALWAYS,
                             FILE_ATTRIBUTE_NORMAL,
                             NULL);

    if (file_handle == INVALID_HANDLE_VALUE)
    {
        fprintf(stderr, message, file);
        constructor_clear();
        return true;
    }

    SetFilePointer(file_handle, size - 1, NULL, FILE_BEGIN);
    char dummy = 0;
    DWORD check = 0;
    WriteFile(file_handle, &dummy, 1, &check, NULL);

    if (check != 0)
    {
        CloseHandle(file_handle);
        DeleteFile(file);
        fprintf(stderr, message, file);
        constructor_clear();
        return true;
    }
    CloseHandle(file_handle);
    open(file, O_RDWR);
#else
    fd = ::open(file, O_RDWR|O_CREAT|O_TRUNC, 0666);
    if(fd == -1)
    {
        fprintf(stderr, message, file);
        constructor_clear();
        return true;
    }

    lseek(fd, (off_t) size - 1, SEEK_SET);
    char dummy = 0;
    if(write(fd, &dummy, 1)!=1)
    {
        fprintf(stderr, message, file);
        constructor_clear();
        return true;
    }

    data = ::mmap(NULL, size, PROT_READ|PROT_WRITE,
                  MAP_SHARED, fd, offset);

    if (data == MAP_FAILED)
    {
        ::close(fd);
        unlink(file);
        fprintf(stderr, message, file);
        constructor_clear();
        return true;
    }
    mapped_length = total_length = size;
#endif
    return false;
}

References open().

Referenced by MemoryMapArray< elementT, indexT, cookieVal, versionVal, accessorFunc, setterFunc, elementCount2BytesFunc, arrayHeaderClass >::create(), and create().
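The Unix branch of the listing above follows a common pattern: create the file, extend it to the requested size by writing a single byte at offset size - 1, then map it read/write. Below is a minimal standalone sketch of that pattern, assuming a POSIX system. The function name createMappedFile is hypothetical, and unlike MemoryMap::create() it returns the mapping (or NULL on failure) rather than a bool.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper, not part of libStatGen.
// Creates (or truncates) a file, extends it to `size` bytes, and maps it
// read/write. Returns the mapping, or NULL on failure.
void *createMappedFile(const char *path, size_t size)
{
    assert(path != NULL && size > 0);
    int fd = ::open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (fd == -1) return NULL;

    // Writing one zero byte at the last offset makes the file `size` bytes
    // long; mmap() cannot usefully map past the end of the file.
    if (lseek(fd, (off_t) size - 1, SEEK_SET) == (off_t) -1 ||
        ::write(fd, "", 1) != 1)
    {
        ::close(fd);
        ::unlink(path);
        return NULL;
    }

    void *p = ::mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    ::close(fd);                  // the mapping stays valid after close()
    if (p == MAP_FAILED)
    {
        ::unlink(path);
        return NULL;
    }
    return p;
}
```

Because the mapping is MAP_SHARED, stores through the returned pointer end up in the file. The caller releases it with munmap(p, size).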

◆ create() [2/2]

bool MemoryMap::create ( size_t  size)
virtual

store in allocated memory (malloc), not mmap:

This is for code that needs to handle more flexibly the case where an mmap() file might be available, but if it is not, we want to load the data into allocated memory as a convenience to the user. GenomeSequence::populateDBSNP does exactly this.

Definition at line 319 of file MemoryMap.cpp.

{
    return create(NULL, size);
}

References create().

◆ debug_print()

void MemoryMap::debug_print ( )

Definition at line 60 of file MemoryMap.cpp.

{
#if defined(_WIN32)
    std::cout << "fd = " << file_handle << std::endl;
#else
    std::cout << "fd = " << fd << std::endl;
#endif
    std::cout << "data = 0x" << std::hex << data << std::endl;
    std::cout << "offset = 0x" << std::hex << offset << std::endl;
    std::cout << "mapped_length = 0x" << std::hex << mapped_length << std::endl;
    std::cout << "total_length = 0x" << std::hex << total_length << std::endl;
    std::cout << "page_size = 0x" << std::hex << page_size << std::endl;
};

◆ destructor_clear()

void MemoryMap::destructor_clear ( )

Definition at line 89 of file MemoryMap.cpp.

{
#if defined(_WIN32)
    if (data!=NULL)
    {
        // free windows mapped object
        ::UnmapViewOfFile((LPVOID) data);
    }
    if (map_handle != NULL)
        ::CloseHandle(map_handle);
    if (file_handle != NULL)
        ::CloseHandle(file_handle);
#else
    if (data!=NULL)
    {
        // free unix mapped object
        munmap(data, mapped_length);
    }
    // free unix resources
    if (fd!=-1)
    {
        ::close(fd);
    }
#endif

    constructor_clear();
}

◆ length()

size_t MemoryMap::length ( )
inline

Definition at line 211 of file MemoryMap.h.

    {
        return mapped_length;
    }

◆ open()

bool MemoryMap::open ( const char *  file,
int  flags = O_RDONLY 
)
virtual

open a previously created mapped vector

useMemoryMapFlag determines whether open() uses mmap() or malloc()/read() to populate the memory. Returns false on success and true on failure.

Reimplemented in MemoryMapArray< elementT, indexT, cookieVal, versionVal, accessorFunc, setterFunc, elementCount2BytesFunc, arrayHeaderClass >, and GenomeSequence.

Definition at line 156 of file MemoryMap.cpp.

{
    const char * message = "MemoryMap::open - problem opening file %s";
#if defined(_WIN32)
    file_handle = CreateFile(file,
                             (flags==O_RDONLY) ? GENERIC_READ : (GENERIC_READ | GENERIC_WRITE),
                             FILE_SHARE_READ | FILE_SHARE_WRITE, // subsequent opens may either read or write
                             NULL,
                             OPEN_EXISTING,
                             FILE_ATTRIBUTE_NORMAL,
                             NULL);

    if(file_handle == INVALID_HANDLE_VALUE)
    {
        fprintf(stderr, message, file);
        constructor_clear();
        return true;
    }

    LARGE_INTEGER file_size = {0};
    ::GetFileSizeEx(file_handle, &file_size);
    mapped_length = total_length = file_size.QuadPart;

#else
    struct stat buf;
    fd = ::open(file, flags);
    if ((fd==-1) || (fstat(fd, &buf) != 0))
    {
        fprintf(stderr, message, file);
        constructor_clear();
        return true;
    }
    mapped_length = total_length = buf.st_size;
#endif

    if(!useMemoryMapFlag)
    {
        return allocate();
    }

#if defined(_WIN32)
    assert(offset == 0);

    map_handle = CreateFileMapping(file_handle, NULL,
                                   (flags==O_RDONLY) ? PAGE_READONLY : PAGE_READWRITE,
                                   file_size.HighPart, // upper 32 bits of map size
                                   file_size.LowPart,  // lower 32 bits of map size
                                   NULL);

    if(map_handle == NULL)
    {
        ::CloseHandle(file_handle);
        fprintf(stderr, message, file);
        constructor_clear();
        return true;
    }

    data = MapViewOfFile(map_handle,
                         (flags == O_RDONLY) ? FILE_MAP_READ : FILE_MAP_ALL_ACCESS,
                         0, 0, mapped_length);

    if (data == NULL)
    {
        CloseHandle(map_handle);
        CloseHandle(file_handle);

        fprintf(stderr, message, file);
        constructor_clear();
        return true;
    }
#else
    data = ::mmap(NULL, mapped_length,
                  (flags == O_RDONLY) ? PROT_READ : PROT_READ | PROT_WRITE,
                  MAP_SHARED, fd, offset);

    if (data == MAP_FAILED)
    {
        ::close(fd);
        fprintf(stderr, message, file);
        constructor_clear();
        return true;
    }
#endif
    return false;
}

References open().

Referenced by create(), open(), and MemoryMapArray< elementT, indexT, cookieVal, versionVal, accessorFunc, setterFunc, elementCount2BytesFunc, arrayHeaderClass >::open().
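The choice between mmap() and malloc()/read() can be sketched in a standalone form as follows, assuming a POSIX system. The names LoadedFile, loadFile and releaseFile are hypothetical and not part of libStatGen. Note that, unlike MemoryMap::open(), this sketch returns true on success.

```cpp
#include <cassert>
#include <cstdlib>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical types and helpers, not part of libStatGen.
struct LoadedFile
{
    void  *data;     // start of the file contents
    size_t length;   // number of bytes available at data
    bool   mapped;   // true: release with munmap(); false: release with free()
};

// Loads a file either by mapping it or by copying it into malloc()ed memory.
// Returns true on success.
bool loadFile(const char *path, bool useMmap, LoadedFile &out)
{
    assert(path != NULL);
    out.data = NULL; out.length = 0; out.mapped = false;

    int fd = ::open(path, O_RDONLY);
    if (fd == -1) return false;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) { ::close(fd); return false; }
    size_t len = static_cast<size_t>(st.st_size);

    if (useMmap)
    {
        void *p = ::mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        ::close(fd);
        if (p == MAP_FAILED) return false;
        out.data = p;
        out.mapped = true;
    }
    else
    {
        char *buf = static_cast<char *>(malloc(len));
        size_t done = 0;
        while (buf != NULL && done < len)      // read() may return short counts
        {
            ssize_t n = ::read(fd, buf + done, len - done);
            if (n <= 0) break;
            done += static_cast<size_t>(n);
        }
        ::close(fd);
        if (buf == NULL || done != len) { free(buf); return false; }
        out.data = buf;
    }
    out.length = len;
    return true;
}

// Releases the memory with the call that matches how it was obtained.
void releaseFile(LoadedFile &f)
{
    if (f.data != NULL)
    {
        if (f.mapped) ::munmap(f.data, f.length);
        else free(f.data);
    }
    f.data = NULL; f.length = 0; f.mapped = false;
}
```

The sketch records how the memory was obtained so that it can be released with the matching call: munmap() for a mapping, free() for a malloc()ed buffer.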

◆ operator[]()

char MemoryMap::operator[] ( unsigned int  index)
inline

Definition at line 216 of file MemoryMap.h.

    {
        return ((char *)data)[index];
    };

◆ prefetch()

int MemoryMap::prefetch ( )

Definition at line 349 of file MemoryMap.cpp.

{
    int sum = 0;
    size_t i;

    for (i=0; i<mapped_length; i += page_size) sum += *(i + (char *) data);

    return sum;
}

◆ test()

void MemoryMap::test ( )

Definition at line 330 of file MemoryMap.cpp.

{
    int result;

    result = this->open("test/test_memmap_data.txt");
    assert(result == 0);
    assert(data!=NULL);
    assert(mapped_length == 183); // length of the above file
    close();

    // now try non memory mapped (direct slow file I/O)
    useMemoryMap(false);
    result = this->open("test/test_memmap_data.txt");
    assert(result == 0);
    assert(data!=NULL);
    assert(mapped_length == 183); // length of the above file
    close();
}

◆ useMemoryMap()

void MemoryMap::useMemoryMap ( bool  flag = true)
inline

Definition at line 227 of file MemoryMap.h.

    {
        useMemoryMapFlag = flag;
    }

Member Data Documentation

◆ data

void* MemoryMap::data

Definition at line 171 of file MemoryMap.h.


The documentation for this class was generated from the following files: