<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>domas mituzas &#187; oprofile</title>
	<atom:link href="http://mituzas.lt/tag/oprofile/feed/" rel="self" type="application/rss+xml" />
	<link>http://mituzas.lt</link>
	<description></description>
	<lastBuildDate>Thu, 12 Aug 2010 14:09:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1-alpha</generator>
		<item>
		<title>On binaries and -fomit-frame-pointer</title>
		<link>http://mituzas.lt/2009/07/26/on-binaries-and-fomit-frame-pointer/</link>
		<comments>http://mituzas.lt/2009/07/26/on-binaries-and-fomit-frame-pointer/#comments</comments>
		<pubDate>Sun, 26 Jul 2009 21:16:17 +0000</pubDate>
		<dc:creator>Domas Mituzas</dc:creator>
				<category><![CDATA[mysql]]></category>
		<category><![CDATA[compiling]]></category>
		<category><![CDATA[gdb]]></category>
		<category><![CDATA[oprofile]]></category>
		<category><![CDATA[x86_64]]></category>

		<guid isPermaLink="false">http://mituzas.lt/?p=545</guid>
		<description><![CDATA[Over the last few years the 64-bit x86 platform has become ubiquitous, making stupid memory limitations a thing of some forgotten past. 64-bit processors made internal memory structures bigger, but compensated for that with twice as many registers, each twice as wide. But &#8230; <a href="http://mituzas.lt/2009/07/26/on-binaries-and-fomit-frame-pointer/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Over the last few years the 64-bit x86 platform has become ubiquitous, making stupid memory limitations a thing of some forgotten past. 64-bit processors made internal memory structures bigger, but compensated for that with twice as many registers, each twice as wide. </p>
<p>But one thing definitely got worse: gcc, the compiler, changed its default compilation options and now omits frame pointers from binaries on the x86_64 architecture. People advocated against that <a href='http://lkml.indiana.edu/hypermail/linux/kernel/9702.3/0214.html'>back in 1997</a> for very simple reasons that still hold today: frame pointers are needed to compute stack traces efficiently, and stack traces are sometimes very, very useful. </p>
<p>So, this change means that oprofile cannot give hierarchical profiles, and that MySQL crash information will not include its most useful bit &#8211; the failing stack trace. The decision was made purely for performance reasons: the frame pointer takes up a whole register (on x86_32 that meant 1 out of 8, on x86_64 it is 1 out of 16), which could otherwise be used to optimize the application. </p>
<p>I tested two MySQL builds, one built with &#8216;-O3 -g -fno-omit-frame-pointer&#8217; and the other with &#8216;-fomit-frame-pointer&#8217; in its place, and the performance difference was negligible. It was around <b>1%</b> in sysbench tests, and slightly over <b>3%</b> in the tight-loop <code>select benchmark(100000000,(select asin(5+5)+sin(5+5)));</code> on a 2-CPU Opteron box.</p>
<p>The summary suggestion for this flag is very simple. If you don&#8217;t care about fixing your product or making it faster, you can probably use the 1% speed-up. But if actual real-time performance data can lead to much better performance fixes, and fast introspection means qualified engineers can diagnose problems much faster, then 1% or even 3% is not that much to pay. So, add &#8216;-fno-omit-frame-pointer&#8217; to CFLAGS and CXXFLAGS, and enjoy things getting back to normal :) </p>
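<p>A minimal sketch of what that could look like, assuming an autotools-style source tree with a <code>./configure</code> script (an assumption, adjust for whatever build system you actually use). The flags are the ones from the test build above:</p>
<pre>
# Assumption: autotools-style build, run from the top of the source tree
export CFLAGS="-O3 -g -fno-omit-frame-pointer"
export CXXFLAGS="-O3 -g -fno-omit-frame-pointer"
./configure
make
</pre>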
<p>By the way, while I was at it, I did some empirical tests of other options, and the one that irks me most is not using &#8216;-g&#8217; on production binaries. See, debugging information, symbol tables, etc. all cause around 0% performance difference. The only difference is that e.g. the mysqld binary will weigh 30M instead of 6M (though that fat will not be loaded into RAM, and will only cost disk space). </p>
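<p>One rough way to check this on your own binary is sketched below, using the standard binutils tools. The <code>mysqld.nodebug</code> copy is a hypothetical throwaway name, and the numbers will of course differ per build:</p>
<pre>
# Size of the file on disk, debugging information included
ls -lh mysqld
# Sizes of the text, data and bss sections, i.e. the parts that get loaded
size mysqld
# Assumption: work on a throwaway copy, never on the installed binary
cp mysqld mysqld.nodebug
strip --strip-debug mysqld.nodebug
# What is left on disk once the debugging information is gone
ls -lh mysqld.nodebug
</pre>
<p>If the <code>size</code> output stays about the same for both files while the file sizes differ a lot, the difference is only disk space.</p>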
<p>Why does debugging information matter? It doesn&#8217;t, if you don&#8217;t try to be a power user. It does, if you enjoy crazy debugger tricks (like the ones <a href='http://mituzas.lt/2009/05/28/checksums-again-some-io-too/'>here</a> or <a href='http://mituzas.lt/2009/03/14/stupid-innodb-tricks/'>here</a>). Oh, and of course, a bonus GDB trick: how to run KILL without connecting to the server:</p>
<pre>
(gdb) thread apply all bt
...
(gdb) thread 2
[Switching to thread 2 (Thread 0x44a76950 (LWP 23955))]#0  ...
(gdb) bt
#0  0x00007f7821e68e1d in pthread_cond_timedwait...
#1  0x000000000052dc0e in Item_func_sleep::val_int (this=0x12317f0)...
#2  0x0000000000501484 in Item::send (this=0x12317f0, ...
...
#15 0x00000000005caf29 in do_command (thd=0x11da290) ...
...
(gdb) frame 15
#15 0x00000000005caf29 in do_command (thd=0x11da290) at sql_parse.cc:854
854	  return_value= dispatch_command(command, thd, packet+1, ...
(gdb) set thd->killed = THD::KILL_QUERY
(gdb) continue
</pre>
<p>And the client gets <i>&#8216;ERROR 1317 (70100): Query execution was interrupted&#8217;</i> :-)</p>
]]></content:encoded>
			<wfw:commentRss>http://mituzas.lt/2009/07/26/on-binaries-and-fomit-frame-pointer/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
