Posts Tagged ‘memory’

Memcached for small objects

Thursday, December 25th, 2008

Memcached quite often ends up as a store for very small objects (a small key and some integer value), though by default it isn’t really designed for this kind of work. Its current memory management is based on slabs (200 of them), where objects are grouped by similar size, though the actual sizes are pre-defined at startup based on a few configuration parameters.

By default, memcached builds its slab classes on the assumption that the smallest object will carry 48 bytes of data (that’s without the item header), and grows the chunk size in +25% steps:

slab class   1: chunk size    104 perslab 10082
slab class   2: chunk size    136 perslab  7710
slab class   3: chunk size    176 perslab  5957
slab class   4: chunk size    224 perslab  4681
...
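
The listing above can be reproduced with a few lines of arithmetic. This is only a sketch, not memcached's actual code: the 56-byte item header is my inference from 104 - 48, and the 1MB slab page and 8-byte chunk alignment are assumptions chosen because they match the printed numbers. The function and parameter names are made up for illustration.

```python
def slab_classes(item_header=56, min_data=48, factor=1.25,
                 page_size=1024 * 1024, align=8, limit=4):
    """Return (class number, chunk size, chunks per slab) tuples.

    item_header, page_size and align are assumptions, see the text above.
    """
    classes = []
    size = item_header + min_data
    for number in range(1, limit + 1):
        # Round the chunk size up to the alignment boundary.
        if size % align:
            size += align - size % align
        classes.append((number, size, page_size // size))
        # Grow by the factor; the fractional part is dropped.
        size = int(size * factor)
    return classes


for number, size, perslab in slab_classes():
    print(f"slab class {number:3d}: chunk size {size:6d} perslab {perslab:5d}")
```

Playing with min_data and factor shows how the smallest chunk and the gaps between classes move.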

So, in this case, it allocates at least 104 bytes per object, and each next step up is a big jump. Fortunately, there are some quick steps to get better efficiency: (more…)

mmap()

Sunday, August 17th, 2008

I’ve seen quite a lot of work done on implementing mmap() in various places, including MySQL.
mmap() is also used for malloc()’ing huge blocks of memory.
The mmap() data cache is part of the VM cache, not the file cache (though the two are tightly coupled inside kernels, their priorities still differ).

If a small program with a low memory footprint maps a file, it will probably make file access faster (as the file will be cached more aggressively in memory, and will put pressure on other cached file data, that’s cheating though).
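
For reference, this is roughly what "a small program maps a file" looks like, sketched in Python with the standard mmap module. The helper name read_via_mmap and the temporary demo file are made up for illustration. It only shows the access pattern and says nothing about how the kernel prioritizes the cached pages.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, offset, length):
    """Return `length` bytes starting at `offset`, read through a read-only mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are faulted in on first access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset:offset + length]


# Demo on a throwaway file (hypothetical data, just to have something to map).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"0123456789" * 1000)

chunk = read_via_mmap(demo_path, 10, 10)
os.remove(demo_path)
```

Note that the read goes through the mapping only, without mixing in read()/write() calls on the same file.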

If a large program with lots and lots of allocated memory maps a file, that will pressure the filesystem cache to flush pages, and then… will pressure existing VM pages of the very same large program to be swapped out. That’s certainly bad.

For now MySQL is using mmap() just for compressed MyISAM files. Vadim wrote a patch to do more of mmap()ing.

If there’s less data than RAM, mmap() may use CPU cycles somewhat more efficiently. If there’s more data than RAM, mmap() will kill the system.

Interestingly though, a few months ago there was a discussion on lkml where Linus wrote:

Because quite frankly, the mixture of doing mmap() and write() system calls is quite fragile – and I’m not saying that just because of this particular bug, but because there are all kinds of nasty cache aliasing issues with virtually indexed caches etc that just fundamentally mean that it’s often a mistake to mix mmap with read/write at the same time.

So, simply, don’t.

Update: Oh well, 5.1: --myisam_use_mmap option… Argh.
Update on update: after a few minutes of internal testing all mmap()ed MyISAM tables went fubar.

Wasting InnoDB memory

Thursday, May 29th, 2008

I usually get strange looks when I complain about memory handling inside InnoDB. It seems as if terabytes of RAM are so common and cheap that nobody should really care about memory efficiency. Unfortunately for me, I do.

Examples:

  • The infamous Bug#15815 – buffer pool mutex contention. The patch for the bug added lots of small mutexes, and by ‘lots’ I mean really, really lots – two mutexes (and an rwlock structure) for each buffer pool page. That makes two million mutexes for a 16GB buffer pool, um, four million mutexes for a 32GB buffer pool, and I guess more for larger buffer pools. Result – a 16GB buffer pool gets a 625MB locking tax to solve an 8-core locking problem. Solution? Between the giant lock and armies of page mutexes there lives a land of mutex pools, where locks are shared happily by multiple entities. I even made a patch; unfortunately it hits some ibuf assertion after server restart, though at first everything works great :)
  • The InnoDB data dictionary always grows, never shrinks. It is not considered a bug, as it isn’t a memory leak – all memory is accounted for by (hidden) dict_sys->size, and valgrind doesn’t print errors. A 1-column table takes 2k of memory in the InnoDB data dictionary, and a table with a few more columns and indexes already takes 10k. 100000 tables, and 1GB of memory is wasted. Who needs 100000 tables? People running application farms do. Actually, there is even code for cleaning up the data dictionary, it just wasn’t finished, and is commented out at the moment. Even worse, the fix for #20877 was a joke – reducing the in-memory structure size, still not caring about the structure count. And of course, do note that every InnoDB partition of a table takes space there too…
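
The numbers in both examples are easy to check on the back of an envelope. Here is a rough sketch of that arithmetic; the 16KB page size is an assumption (it is the InnoDB default, but it isn’t stated above), and the function names are made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
PAGE_SIZE = 16 * 1024  # assumption: default 16KB InnoDB pages


def page_mutex_count(buffer_pool_bytes, mutexes_per_page=2):
    """Mutexes needed when every buffer pool page carries its own pair."""
    return buffer_pool_bytes // PAGE_SIZE * mutexes_per_page


def dictionary_bytes(tables, bytes_per_table):
    """Memory held by a data dictionary that never shrinks."""
    return tables * bytes_per_table


mutexes_16g = page_mutex_count(16 * GIB)  # about two million
mutexes_32g = page_mutex_count(32 * GIB)  # about four million

# A 625MB tax spread over the pages of a 16GB pool, in bytes per page.
tax_per_page = 625 * MIB / (16 * GIB // PAGE_SIZE)

# 100000 tables at roughly 10k each, in bytes (close to 1GB).
dict_100k = dictionary_bytes(100_000, 10 * 1024)

print(mutexes_16g, mutexes_32g, tax_per_page, dict_100k)
```

So the locking tax works out to several hundred bytes of overhead for every 16KB page, and the dictionary figure is linear in table (and partition) count.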

So generally, if you’re running a bigger InnoDB deployment, you may be hitting various hidden memory taxes – in hundreds of megabytes, or gigabytes – that don’t provide too much value anyway. Well, memory is cheap, our next database boxes will be 32GB-class instead of those ‘amnesia’ 16GB types, and I can probably stop ranting :)