Getting apache core dumps in Linux

If you want to get core dumps for intermittent Apache/mod_php crashes on Linux, you will probably need this module (otherwise the Linux kernel will refuse to dump core, whatever you put into your OS or Apache configuration):


/*
* Author: Domas Mituzas
* Released to public domain
*
* apxs -c -i mod_dumpcore.c
* and...
* LoadModule dumpcore_module .../path/to/mod_dumpcore.so
* CoreDumpDirectory /tmp/cores/
* and...
* sysctl -w kernel.core_pattern=/tmp/cores/core.%p.%t
*/

#include "httpd.h"
#include "http_config.h"
#include <sys/prctl.h>

/* Re-set the dumpable flag, which the kernel clears once Apache switches user. */
static int dumpcore_handler(request_rec *r)
{
    prctl(PR_SET_DUMPABLE, 1, 0, 0, 0);
    return DECLINED; /* let the real handlers serve the request */
}

static void dumpcore_register_hooks(apr_pool_t *p)
{
    ap_hook_handler(dumpcore_handler, NULL, NULL, APR_HOOK_MIDDLE);
}

module AP_MODULE_DECLARE_DATA dumpcore_module = {
    STANDARD20_MODULE_STUFF,
    NULL, NULL, NULL, NULL, NULL,
    dumpcore_register_hooks
};

P.S. I was quite astonished to find out that nobody had needed this before; I remember quite a few discussions after which we fixed this in MySQL two years ago.

Update: there’s also ‘echo 1 > /proc/sys/fs/suid_dumpable’ or ‘sysctl -w fs.suid_dumpable=1’ – now I recall the whole story: RHEL3 and RHEL4 didn’t have this, so we had to do the prctl() hack, whereas later Linux kernel versions allow this workaround.
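To check which of these knobs is in the way on a given box, a small diagnostic helps. Below is a minimal Python sketch, assuming the only two blockers of interest are a zero core size limit and fs.suid_dumpable being 0 for a process that switched user (which Apache children do); the function names are made up for illustration, and a prctl(PR_SET_DUMPABLE) call like the one in the module would still override the second check.

```python
import resource


def core_dump_blockers(core_limit, suid_dumpable, changed_uid):
    """Return the reasons why the kernel would not write a core file.

    core_limit    -- soft RLIMIT_CORE in bytes (0 disables core files)
    suid_dumpable -- value of fs.suid_dumpable (0, 1 or 2)
    changed_uid   -- True if the process switched user after starting
    """
    reasons = []
    if core_limit == 0:
        reasons.append("RLIMIT_CORE is 0")
    if changed_uid and suid_dumpable == 0:
        reasons.append("fs.suid_dumpable is 0 and the process changed uid")
    return reasons


def read_current_settings():
    """Read the live values for the current process on a Linux box."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    with open("/proc/sys/fs/suid_dumpable") as f:
        suid_dumpable = int(f.read())
    return soft, suid_dumpable


if __name__ == "__main__":
    limit, dumpable = read_current_settings()
    # Assume an Apache-like process that started as root and dropped privileges.
    for reason in core_dump_blockers(limit, dumpable, changed_uid=True):
        print("no core dump:", reason)
```

Note that this reads the limit of the Python process itself, not of a running httpd, so it is only meaningful when started from the same environment as Apache.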


Dear IT Security Industry…

… You are full of shit.

I don’t know how effective your scare-mongering cash-extortion tactics are, but they help neither your users, nor vendors, nor anyone else.

It all starts when major vulnerability databases start authoritatively spouting out crap like this:

A vulnerability has been reported in MySQL, which can be exploited to compromise a vulnerable system.
The vulnerability is caused due to an unspecified error and can be exploited to cause a buffer overflow. (Secunia)

Or crap like this:

MySQL is prone to a buffer-overflow vulnerability because it fails to perform adequate boundary checks on user-supplied data.
An attacker can leverage this issue to execute arbitrary code within the context of the vulnerable application. Failed exploit attempts will result in a denial-of-service condition. (Securityfocus)



Spikes are not fun anymore

English Wikipedia just scored “three million articles”, so I thought I’d give some more numbers and perspectives :) Four years ago we observed an impressive +50% traffic spike on Wikipedia – people came in to read about the new pope. Back then it was probably twenty additional page views a second, and we were quite happy to sustain that additional load :)

Nowadays big media events can cause some trouble, but generally they don’t bring huge traffic spikes anymore. Say, Michael Jackson’s English Wikipedia article had a peak hour of one million page views (2009-06-25 23:00-24:00) – and that was merely a 10% increase on one of our projects (English Wikipedia got 10.4m pageviews that hour). Our problems back then were caused by the complexity of page content – and costs got inflated because of the lack of rendering farm concurrency control.

Other interesting sources of attention are custom Google logos leading to search results leading to Wikipedia (of course!). The last ones, for Perseids and Hans Christian Ørsted, sent over 1.5m daily visitors each – but that’s a mere 20 article views a second or so.

What makes those spikes boring nowadays is simply the length of the long tail. Our projects serve over five million different articles over the course of an hour (and 20m article views) – around 3.5m articles are opened just once. If our job were serving just hot news, our cluster setup and software infrastructure would be very, very different – but we have to accommodate millions of articles that aren’t just stored in archives, but are also constantly read, even if only once an hour (and the daily hot set is much larger too).
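For a feel of the proportions, here is the back-of-the-envelope arithmetic behind those figures as a short Python sketch. It uses only the rounded numbers quoted above, so the results are rough, and the variable names are mine.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

# Rounded figures quoted in the post.
jackson_peak_hour_views = 1_000_000
enwiki_views_that_hour = 10_400_000
logo_daily_visitors = 1_500_000
hourly_distinct_articles = 5_000_000
hourly_article_views = 20_000_000
articles_opened_once = 3_500_000

# The "spike" as a share of that hour's English Wikipedia traffic.
jackson_share = jackson_peak_hour_views / enwiki_views_that_hour

# A Google logo's daily visitors spread evenly over a day.
logo_views_per_second = logo_daily_visitors / SECONDS_PER_DAY

# The long tail: articles read exactly once in an hour.
once_share_of_articles = articles_opened_once / hourly_distinct_articles
once_share_of_views = articles_opened_once / hourly_article_views

# Everything else: the remaining articles share the remaining views.
avg_views_of_rest = (hourly_article_views - articles_opened_once) / (
    hourly_distinct_articles - articles_opened_once
)

print(f"Jackson peak hour share of enwiki: {jackson_share:.1%}")
print(f"Google logo traffic: {logo_views_per_second:.1f} views/s")
print(f"Total article views: {hourly_article_views / SECONDS_PER_HOUR:.0f} views/s")
print(f"Articles opened once: {once_share_of_articles:.0%} of articles, "
      f"{once_share_of_views:.1%} of views")
print(f"Average views per hour for the other articles: {avg_views_of_rest:.0f}")
```

In other words, most of the distinct articles touched in any hour are touched exactly once, which is why caching only the hot set is not enough.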

All this viewership data is available in raw form, as well as in nice visualizations at trendingtopics, wikirank and stats.grok.se. It is amazing to hear about all the research that is built on this kind of data, and I guess it already needs some improved interfaces and APIs for all the future uses ;-)


plugin and 5.1

You can check it yourself – it seems 5.1 will ship with the InnoDB plugin in the future :-) (oh the joy of open source repositories, always ready to spoil the surprise, eh? :)


MySQL DBA, python edition

In the age of jetsetting and space travel and ORMs and such, MySQL DBAs are the least sophisticated ones nowadays, usually fighting terabytes or petabytes of data with an army of shell scripts – as there are no nice frameworks for expressing what you want to do in MySQL administration. The nice thing about proper object frameworks is that they let you concentrate on the work and the logic, on the process being carried out, rather than on languages/APIs/etc.

For example, moving a slave to another master down a replication topology could be expressed this way (this is working code, actually):

slave = mysql(options.slave)
oldmaster = mysql(slave.get_master())
newmaster = mysql(options.newmaster)

oldmaster.lock()
oldpos = oldmaster.pos()
newmaster.wait(oldpos)
newmaster.lock()
oldmaster.unlock()
slave.wait(oldpos)
slave.change_master(newmaster)
newmaster.unlock()

I’m sure transaction group/global IDs would simplify the process a lot, but still, with such building blocks one can write pretty much self-documenting, narrow code, and shuffle actions around without having to rethink the whole programming logic too much. Implementation of methods like .sync(), .clone(), .promote() ends up environment-specific, but may save quite some time afterwards too.
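The class behind those calls isn’t shown here, so the following is only a guess at how such building blocks could map onto plain SQL statements. The `Mysql` class, the `run` callable and the host names are all hypothetical; the statements are recorded by a stand-in runner instead of being sent to a server, so the sketch runs anywhere.

```python
class Mysql:
    """Hypothetical wrapper around one server.

    `run` is any callable that executes one SQL statement on that server,
    over a single persistent connection, and returns rows as dicts.
    """

    def __init__(self, host, run):
        self.host = host
        self.run = run

    def lock(self):
        # The global read lock lives only as long as the session that took it.
        self.run("FLUSH TABLES WITH READ LOCK")

    def unlock(self):
        self.run("UNLOCK TABLES")

    def pos(self):
        row = self.run("SHOW MASTER STATUS")[0]
        return row["File"], row["Position"]

    def wait(self, pos):
        # Block until this server has applied its master's binary log
        # up to (at least) the given position.
        self.run("SELECT MASTER_POS_WAIT('%s', %d)" % pos)

    def change_master(self, master):
        log_file, log_pos = master.pos()
        self.run("STOP SLAVE")
        self.run(
            "CHANGE MASTER TO MASTER_HOST='%s', MASTER_LOG_FILE='%s', "
            "MASTER_LOG_POS=%d" % (master.host, log_file, log_pos)
        )
        self.run("START SLAVE")


def recording_runner(host, binlog_pos, journal):
    """Stand-in for a real connection: records statements, runs nothing."""
    def run(sql):
        journal.append((host, sql))
        if sql == "SHOW MASTER STATUS":
            return [{"File": binlog_pos[0], "Position": binlog_pos[1]}]
        return []
    return run


journal = []
oldmaster = Mysql("db1", recording_runner("db1", ("db1-bin.000007", 1234), journal))
newmaster = Mysql("db2", recording_runner("db2", ("db2-bin.000003", 567), journal))
slave = Mysql("db3", recording_runner("db3", ("db3-bin.000001", 4), journal))

# Same sequence as in the post.
oldmaster.lock()
oldpos = oldmaster.pos()
newmaster.wait(oldpos)
newmaster.lock()
oldmaster.unlock()
slave.wait(oldpos)
slave.change_master(newmaster)
newmaster.unlock()

for host, sql in journal:
    print(host, sql)
```

One caveat of this particular mapping: with a bare MASTER_POS_WAIT() the slave may run past oldpos once oldmaster is unlocked, so a real wait() would probably have to stop the slave at that position (START SLAVE UNTIL or similar), which is left out here.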

As much as I’d like everyone around to get their data management actions written down into scripts, I’d like every DBA action I do to be written down in such code too :-) I’d love to have code which detects resource shortages, orders servers, deploys software and re-shards data automatically… well, you know what I mean :)


Evil replication management

When one wants to script automated replication chain building, certain things are quite annoying, like immutable replication configuration variables. For example, at certain moments log_slave_updates is badly needed, and this is what the server says:

mysql> show variables like 'log_slave_updates';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| log_slave_updates | OFF   |
+-------------------+-------+
1 row in set (0.00 sec)

mysql> set global log_slave_updates=1;
ERROR 1238 (HY000): Variable 'log_slave_updates' is a read only variable

Of course, there are a few options: roll an in-house fork (heheeeee!), restart your server and keep warming up your tens of gigabytes of cache arenas, or wait for MySQL to ship a feature change in the next major release. Then there are evil tactics:

mysql> system gdb -p $(pidof mysqld) -ex "set opt_log_slave_updates=1" -batch
mysql> show variables like 'log_slave_updates';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| log_slave_updates | ON    |
+-------------------+-------+
1 row in set (0.00 sec)

I don’t guarantee the safety of this when the slave is running, but… stopping and starting slave threads is somewhat cheaper than stopping and starting a big database instance, right?
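If this kind of poking ends up in automation scripts, it is worth building the gdb command line in one place. A minimal Python sketch follows, assuming gdb is on the PATH and that you already know the server’s pid; `gdb_batch_argv` and `run_gdb` are made-up helper names, and only the gdb options already used above (-p, -ex, -batch) appear.

```python
import subprocess


def gdb_batch_argv(pid, *commands):
    """Build the argv for attaching gdb to `pid` and running `commands` once."""
    argv = ["gdb", "-p", str(pid)]
    for command in commands:
        argv += ["-ex", command]
    argv.append("-batch")
    return argv


def run_gdb(pid, *commands):
    """Attach, run the commands and detach; raises if gdb exits non-zero."""
    return subprocess.run(gdb_batch_argv(pid, *commands), check=True)


if __name__ == "__main__":
    # Only prints the command; the pid is a placeholder.
    print(gdb_batch_argv(12345, "set opt_log_slave_updates=1"))
```

Passing the commands as a list avoids the shell quoting games that the one-liner needs.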

What else can we do?

mysql> show slave status \G
...
     Replicate_Do_DB: test
...
mysql> system gdb -p $(pidof mysqld) -ex 'call rpl_filter->add_do_db(strdup("hehehe"))' -batch
mysql> show slave status \G
...
      Replicate_Do_DB: test,hehehe
...

It is actually possible to add all sorts of filters this way; rpl_filter.h can be a good reference :) So now, when you want to throw out some data from your slaves, a restart isn’t needed. Unfortunately, deleting entries isn’t possible via rpl_filter methods, but you can always edit base_ilists, can’t you?

P.S. Having this functionality inside the server would definitely be best.


DBAs of all countries, unite!

I’m observing the process of the most awesome SHOW commands being abolished and destroyed, with some weird information_schema tables introduced instead.

Say, even though you can select configuration variables using @@syntax, you can’t do the same for status variables, which are much more interesting to DBAs, in any more interesting logic.

Apparently instead of doing

SHOW STATUS LIKE "questions"

one has to do this now (I’m being dramatic here: the above hasn’t been removed yet, but it hasn’t been expanded for better usage either):

SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM INFORMATION_SCHEMA.GLOBAL_STATUS
WHERE VARIABLE_NAME="QUESTIONS"

Do note, those SQL standard followers will get their caps-lock button swapped with the space bar soon.

Of course, we, DBAs, know that one can simplify stuff by creating stored routines:

CREATE FUNCTION `gstatus`(v varchar(64)) returns varchar(1024)
return
( SELECT variable_value
  FROM information_schema.global_status
  where variable_name=v LIMIT 1
)

So we can do such simple things as:

mysql> select m.gstatus("questions");
+------------------------+
| m.gstatus("questions") |
+------------------------+
| 140                    |
+------------------------+
1 row in set (0.00 sec)

Of course, this leads to a solution for one of the most common DBA problems, how to get decent status variable values per unit of time:

CREATE PROCEDURE m.report(in timer float)
begin

DROP TEMPORARY TABLE IF EXISTS status_old;
CREATE TEMPORARY TABLE status_old
SELECT * FROM INFORMATION_SCHEMA.GLOBAL_STATUS;

SELECT SLEEP(timer) into @x;
SELECT
    s.variable_name status,
    (s.variable_value-o.variable_value)/timer value
FROM INFORMATION_SCHEMA.GLOBAL_STATUS s
    JOIN status_old o USING (variable_name)
WHERE s.variable_value>0;

DROP TEMPORARY TABLE status_old;

end

So, “show me changes-per-second for values in the last 0.5s” would look like this:

mysql> call m.report(0.5) //
+-----------------------------------+---------+
| status                            | value   |
+-----------------------------------+---------+
| ABORTED_CLIENTS                   |       0 |
| ABORTED_CONNECTS                  |       0 |
| BYTES_RECEIVED                    |  532662 |
| BYTES_SENT                        | 1140894 |
...
| QUERIES                           |    2884 |
| QUESTIONS                         |    2878 |
| SELECT_FULL_JOIN                  |       2 |
| SELECT_RANGE                      |     196 |
| SELECT_SCAN                       |     146 |
...
| THREADS_CACHED                    |      12 |
| THREADS_CONNECTED                 |     -28 |
| THREADS_CREATED                   |       4 |
| THREADS_RUNNING                   |      -2 |
| UPTIME                            |       2 |
| UPTIME_SINCE_FLUSH_STATUS         |       2 |
+-----------------------------------+---------+
125 rows in set (0.53 sec)

Query OK, 0 rows affected, 1 warning (0.54 sec)

So, by spending five minutes on writing a very simple INFORMATION_SCHEMA procedure we can resolve one of the usual nightmares in MySQL DBA environments.
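The same snapshot-and-subtract idea works on the client side too, for servers where you can’t install stored routines. A minimal Python sketch, assuming the two snapshots have already been fetched as dictionaries of name to string value (for example from SHOW GLOBAL STATUS); the function name is mine, and unlike the procedure it skips non-numeric values instead of coercing them.

```python
def per_second_rates(old, new, interval):
    """Return {status_name: change per second} between two status snapshots.

    old, new -- dicts mapping status variable names to their string values
    interval -- seconds elapsed between the two snapshots
    """
    rates = {}
    for name, new_value in new.items():
        try:
            delta = float(new_value) - float(old[name])
        except (KeyError, ValueError):
            continue  # not in the old snapshot, or not a number
        rates[name] = delta / interval
    return rates


if __name__ == "__main__":
    # Made-up snapshot values, taken 0.5s apart.
    before = {"QUESTIONS": "100", "THREADS_CONNECTED": "30", "SSL_CIPHER": ""}
    after = {"QUESTIONS": "1539", "THREADS_CONNECTED": "16", "SSL_CIPHER": ""}
    for name, rate in sorted(per_second_rates(before, after, 0.5).items()):
        print(f"{name:20s} {rate:10.1f}")
```

As in the procedure’s output, gauges such as THREADS_CONNECTED come out as signed “rates”, which mostly tell you the direction of change.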

I can get back now to the initial idea of this post – if one DBA can write such a small neat thing in a few minutes, imagine how useful a collaboratively built repository of DBA-assisting stored procedures and functions could be, and how we could spit at all the SQL standard verbosity and make our systems easy to manage :) I think we shouldn’t keep such utilities to ourselves, as widespread use and an “expect it to be already there” attitude would make overall work much, much easier. Let’s use and reuse (and someone should set up a framework for building such a thing ;-))


Board again (perhaps)

Tomorrow voting in the Wikimedia Foundation Board of Trustees election starts – and yours truly is a candidate.

You can find most of my views on various issues in our question pages (I was somewhat boiling when answering the “What will you do about the WMF mishandling it’s funding?” one – it probably takes great effort to phrase such a bad question, and it is so easy to answer it :), as well as in the Wikipedia Signpost ‘interview’.

I was appointed to the Board back in January 2008, after holding various other volunteer (at some point in time – ‘officer’) positions within the organization since 2004 – and brought the core technology and operational efficiency skill set there. The appointment was supposed to be somewhat temporary, but the board restructure turned out to be a much longer process than we expected – both the chapters part and the nomination committee work. As a community member, after the restructure I was in a ‘community-elected’ seat, though I never participated in any election – so that wasn’t too fair to the actual community, and I need to fix that :)

So, even though I wasn’t too visible to the actual community (people would notice me mostly when things go wrong, and I’m usually not in the best mood then :-), I feel that the values I’ve worked on, evangelized and supported for all these years – efficiency and general availability of our projects – can win the mindshare not only of the read-only users I mostly work for, but also of eligible voters.

And I do think that internal technology expertise has to be represented on the board, as the things we’ve been doing, and the methods we’ve been using, are very much unique in the technology world. Oh, and as I mentioned somewhere, our technology spending is close to 50% – that has to be represented too :-)


Profile guided optimization with gcc

Yesterday I wrote about how certain build options can make a performance difference – and I decided to step a bit deeper into a quite interesting field – profile guided binary optimization. There are quite a few interesting projects out there, like LLVM (I hear it is used extensively on the iPhone?) – which analyze the run-time profile of compiled code and can do just-in-time adjustments of binary code. Apparently, you don’t need that fancy technology, and can use plain old gcc.

The whole plan is:

  1. Compile all code with -fprofile-generate in {C|CXX|LD}FLAGS
  2. Run the binary
  3. Run your application/benchmark against that binary
  4. Recompile all code with -fprofile-use (above steps will place lots of .gcda files in source tree)
  5. PROFIT!!! (note the omission of “???” step)
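The plan above can be sketched as a tiny driver script. This is a minimal Python sketch for a single standalone program rather than a server like mysqld, so “run the binary” and “run the benchmark” collapse into one step; the file names and the `pgo_commands`/`run_pgo` helpers are hypothetical, and only the two gcc flags named in the plan are assumed.

```python
import subprocess


def pgo_commands(sources, binary, workload, opt="-O2"):
    """Return the three command lines of a profile-guided build.

    sources  -- list of C source files
    binary   -- path of the executable to produce
    workload -- argv that exercises the instrumented binary
    """
    instrumented = ["gcc", opt, "-fprofile-generate", *sources, "-o", binary]
    optimized = ["gcc", opt, "-fprofile-use", *sources, "-o", binary]
    # The instrumented binary writes its .gcda profile data when it exits,
    # so the workload must finish before the second compilation starts.
    return [instrumented, workload, optimized]


def run_pgo(sources, binary, workload):
    """Run the three steps in order, stopping at the first failure."""
    for argv in pgo_commands(sources, binary, workload):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Only prints the commands; app.c and --benchmark are placeholders.
    for argv in pgo_commands(["app.c"], "./app", ["./app", "--benchmark"]):
        print(" ".join(argv))
```

For a real build system the same two flags would go into CFLAGS/CXXFLAGS/LDFLAGS, as in step 1, with the rest of the flags kept identical between the two passes.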

How much profit? I measured a ~7% sysbench performance increase (and would probably see a much higher value in CPU-tight benchmarks). YMMV. Can such PGO be useful for every user out there? Maybe – but the best results are achieved by looking at actual use patterns – though of course, lots of them are similar everywhere.

Also, I am painting the actual profiling process far too pink. Apparently gcc/gcov profiles tend to get corrupted in multithreaded applications, so I did multiple profile/build passes until I managed to assemble the final binary. :-)

Now I have to figure out how to use the -combine flag in gcc, and treat the whole MySQL codebase as one huge .c file (apparently compilers can make much, much better decisions then).


On binaries and -fomit-frame-pointer

Over the last few years the 64-bit x86 platform has become ubiquitous, making stupid memory limitations a thing of some forgotten past. 64-bit processors made internal memory structures bigger, but compensated for that with twice as many registers, each twice as large.

But there’s one thing that definitely got worse – gcc, the compiler, changed its default compilation options: it omits frame pointers from binaries on the x86_64 architecture. People advocated against that back in 1997 for very simple reasons that still hold today – frame pointers are needed for efficient stack trace calculations, and stack traces are very, very useful, sometimes.

So, this change means that oprofile is not able to give hierarchical profiles; it also means that MySQL crash information will not include the most useful bit – the failing stack trace. This decision was made purely for performance reasons – the frame pointer takes a whole register (on x86_32 that meant 1 out of 8, on x86_64 it is 1 out of 16), which could otherwise be used to optimize the application.
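Why is a stack trace so cheap with frame pointers? Each frame stores the caller’s frame pointer at [rbp] and the return address at [rbp+8], so the whole call chain is a linked list. Here is a toy Python simulation of that walk, with memory modelled as a dictionary; the addresses are made up and this is an illustration, not what oprofile or mysqld actually runs.

```python
def walk_stack(memory, rbp, max_depth=64):
    """Follow the chain of saved frame pointers and collect return addresses.

    memory -- dict mapping an address to the 8-byte value stored there
    rbp    -- frame pointer of the innermost frame (0 ends the chain)
    """
    trace = []
    while rbp != 0 and len(trace) < max_depth:
        saved_rbp = memory[rbp]        # caller's frame pointer
        return_addr = memory[rbp + 8]  # where this frame returns to
        trace.append(return_addr)
        rbp = saved_rbp
    return trace


if __name__ == "__main__":
    # Three nested frames; the outermost one has a saved rbp of 0.
    toy_memory = {
        0x1000: 0x1020, 0x1008: 0x400A10,
        0x1020: 0x1040, 0x1028: 0x400B20,
        0x1040: 0x0,    0x1048: 0x400C30,
    }
    print([hex(addr) for addr in walk_stack(toy_memory, 0x1000)])
```

Without frame pointers there is no such list to follow, and the unwinder has to consult debugging or unwind tables for every frame, which is much harder to do from a profiler interrupt or a crash handler.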

I tested two MySQL builds, one built with ‘-O3 -g -fno-omit-frame-pointer’ and the other with -fomit-frame-pointer instead – and the performance difference was negligible. It was around 1% in sysbench tests, and slightly over 3% in a tight-loop select benchmark(100000000,(select asin(5+5)+sin(5+5))); on a 2-cpu Opteron box.

The summary suggestion for this flag would be very simple. If you don’t care about fixing your product or making it faster, you can probably use the 1% speed-up. But if getting actual real-time performance data can lead to much better performance fixes, and fast introspection means qualified engineers can diagnose problems much faster, 1% or even 3% is not that much. So, add ‘-fno-omit-frame-pointer’ to CFLAGS and CXXFLAGS, and enjoy things getting back to normal :)

By the way, while I was at it, I did some empirical tests of other options, but the one that irks me most is not using ‘-g’ on production binaries. See, debugging information, symbol tables, etc – they all cause around 0% performance difference. The only difference is that e.g. the mysqld binary will weigh 30M instead of 6M (though that fat will not be loaded into RAM, and will only cost disk space).

Why does debugging information matter? It doesn’t, if you don’t attempt to be a power user. It does, if you enjoy crazy debugger tricks (like one here or here). Oh, and of course, a bonus GDB trick, how to run KILL without connecting to the server:

(gdb) thread apply all bt
...
(gdb) thread 2
[Switching to thread 2 (Thread 0x44a76950 (LWP 23955))]#0  ...
(gdb) bt
#0  0x00007f7821e68e1d in pthread_cond_timedwait...
#1  0x000000000052dc0e in Item_func_sleep::val_int (this=0x12317f0)...
#2  0x0000000000501484 in Item::send (this=0x12317f0, ...
...
#15 0x00000000005caf29 in do_command (thd=0x11da290) ...
...
(gdb) frame 15
#15 0x00000000005caf29 in do_command (thd=0x11da290) at sql_parse.cc:854
854	  return_value= dispatch_command(command, thd, packet+1, ...
(gdb) set thd->killed = THD::KILL_QUERY
(gdb) continue

And the client gets ‘ERROR 1317 (70100): Query execution was interrupted’ :-)
