<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
> <channel><title>Thiago Macieira&#039;s blog</title> <atom:link href="http://www.macieira.org/blog/feed/" rel="self" type="application/rss+xml" /><link>http://www.macieira.org/blog</link> <description>An Open Source hacker&#039;s ramblings</description> <lastBuildDate>Thu, 18 Apr 2013 15:34:17 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3.1</generator> <item><title>The strange world of (release) candidates</title><link>http://www.macieira.org/blog/2012/12/the-strange-world-of-release-candidates/</link> <comments>http://www.macieira.org/blog/2012/12/the-strange-world-of-release-candidates/#comments</comments> <pubDate>Fri, 07 Dec 2012 06:54:02 +0000</pubDate> <dc:creator>Thiago Macieira</dc:creator> <category><![CDATA[Open Governance]]></category> <category><![CDATA[Qt]]></category> <category><![CDATA[opengov]]></category> <category><![CDATA[qt]]></category> <category><![CDATA[qt-project]]></category> <category><![CDATA[releases]]></category> <guid
isPermaLink="false">http://www.macieira.org/blog/?p=463</guid> <description><![CDATA[You&#8217;ve seen the news: the first release candidate for Qt 5.0 has just been released. And if you haven&#8217;t, you can go download it from http://qt-project.org/downloads. I&#8217;d like to first of all congratulate everyone involved in getting it out, with a special nod to the release team. Thanks for all the work! But I&#8217;d like &#8230;</p><p><a
class="more-link block-button" href="http://www.macieira.org/blog/2012/12/the-strange-world-of-release-candidates/">Continue reading &#187;</a>]]></description> <content:encoded><![CDATA[<p>You&#8217;ve seen the news: <a
href="http://blog.qt.digia.com/blog/2012/12/06/qt-5-0-release-candidate/">the first release candidate for Qt 5.0 has just been released</a>. And if you haven&#8217;t, you can go download it from <a
href="http://qt-project.org/downloads">http://qt-project.org/downloads</a>. I&#8217;d like to first of all congratulate everyone involved in getting it out, with a special nod to the release team. Thanks for all the work!</p><p>But I&#8217;d like to talk about what exactly will happen over the next couple of weeks, from now until the 5.0 final.<br
/> <span
id="more-463"></span></p><p>If you&#8217;re familiar with previous Qt releases, you may have noticed that our release candidates weren&#8217;t really release candidates. In particular, the release plan called for exactly one release candidate and we knew the time between that release and the final. We also knew that the release candidate wasn&#8217;t really a candidate because there were still changes we needed to make. Finally, after those changes were in &#8212; some of which were quite significant &#8212; we released the final directly, without a second release candidate.</p><p>No more. This time, we&#8217;re doing it the right way. More importantly, it&#8217;s all done in the proper, Open Source and Openly Governed way.</p><p>The Qt 5.0 Release Team is composed of people from many different companies, not just one, testing many different platforms. We&#8217;ve released the alpha, the two betas and this release candidate as a group. Sure, there have been growing pains, especially in the Beta 1 release, but those have mostly been ironed out now.</p><p>The last mistake we fixed was one that I had unfortunately caused: in order to support the Tier specification for the 5.0 release, I had required that packages produced by the Qt Project be tested for 48 hours before they could be released. The idea was to make sure that everyone got the chance to participate in the release process and especially the opportunity to find and fix bugs on their platforms before the release went out. And that&#8217;s what we were doing for the Release Candidate.</p><p>Then it dawned on me: that <em>IS</em> the release candidate model. In other words, we were doing release candidate release candidates!</p><p>After a discussion at Monday&#8217;s release team meeting, <a
href="http://lists.qt-project.org/pipermail/releasing/2012-December/000890.html">we agreed</a> to drop that indirection and just do regular release candidates. Here&#8217;s what we&#8217;ll do on releases from now on:</p><ol><li>Release team prepares a package set;</li><li>Release team does a sanity check on the packages:<ul><li>did the build succeed?</li><li>do the packages include the latest / correct commits?</li><li>do the installers work?</li><li>etc.</li></ul></li><li>If the sanity check went ok, the release team releases this package set as a new <strong>release candidate</strong>;</li><li>For a week, we&#8217;ll collect bug reports and other issues, as well as fixes;</li><li>Then a subjective decision needs to be made: do we have outstanding showstopper issues? Were there any important changes that require wide testing?<ul><li>If so, go back to the first step and let&#8217;s do a new Release Candidate;</li><li>If not, repeat the packaging and sanity-checking steps but for the Final.</li></ul></li></ol><p>We released RC1 today and I can bet we will find issues that require intrusive fixes. That means you should expect an RC2 package in a week or a week and a half. Hopefully, that RC2 should be the last we need, though.</p><p>So, please, get RC1 now, test it and let us know. There are usually a lot of helpful people in the <a
href="irc://irc.freenode.net/qt-labs">#qt-labs</a> IRC channel on <a
href="http://freenode.net">Freenode</a>, especially during European daytime. And any issues you find that could potentially be showstoppers, file them in our bug tracking system at <a
href="http://bugreports.qt-project.org">http://bugreports.qt-project.org</a>.</p><p>Let&#8217;s try and get the final out by the end of the year.</p> ]]></content:encoded> <wfw:commentRss>http://www.macieira.org/blog/2012/12/the-strange-world-of-release-candidates/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> <item><title>Qt 5.0 beta 1 is out</title><link>http://www.macieira.org/blog/2012/08/qt-5-0-beta-1-is-out/</link> <comments>http://www.macieira.org/blog/2012/08/qt-5-0-beta-1-is-out/#comments</comments> <pubDate>Thu, 30 Aug 2012 13:11:46 +0000</pubDate> <dc:creator>Thiago Macieira</dc:creator> <category><![CDATA[Qt]]></category> <category><![CDATA[linux]]></category> <category><![CDATA[mac]]></category> <category><![CDATA[qt]]></category> <category><![CDATA[qt-project]]></category> <category><![CDATA[qt5]]></category> <category><![CDATA[releases]]></category> <category><![CDATA[windows]]></category> <guid
isPermaLink="false">http://www.macieira.org/blog/?p=456</guid> <description><![CDATA[Lars Knoll, Qt Project&#8217;s Chief Maintainer, blogs to let us know that Qt 5.0 beta 1 is out. Hurray! Go forth and download. You can download the source code and binaries from the official Qt Project CDN. The tarballs for building on Unix systems like Linux are in the split_sources subdir &#8212; Linux distribution packagers &#8230;</p><p><a
class="more-link block-button" href="http://www.macieira.org/blog/2012/08/qt-5-0-beta-1-is-out/">Continue reading &#187;</a>]]></description> <content:encoded><![CDATA[<p><a
href="http://labs.qt.nokia.com/author/lars/">Lars Knoll</a>, Qt Project&#8217;s Chief Maintainer, <a
href="http://labs.qt.nokia.com/2012/08/30/qt-5-beta-is-here/">blogs</a> to let us know that <a
href="http://qt-project.org/wiki/Qt-5-Beta">Qt 5.0 beta 1 is out</a>. Hurray! Go forth and download.</p><p>You can download the source code and binaries from the official <a
href="http://releases.qt-project.org/qt5.0/beta1/">Qt Project CDN</a>. The tarballs for building on Unix systems like Linux are in the <tt><a
href="http://releases.qt-project.org/qt5.0/beta1/split_sources/">split_sources</a></tt> subdir &#8212; Linux distribution packagers will want to use them. Once those packages exist for distributions, they will be listed in the <a
href="http://qt-project.org/wiki/Qt-5-unofficial-builds">Qt 5 unofficial builds</a> page.</p><p>This is the first Qt 5 release to also include installable binaries. Windows and Mac users have had them for ages in Qt 4, and Linux users have enjoyed them in the past in the form of the Qt SDK binary installers. As far as I can remember, this is the first pure Qt library release to contain Linux installers.</p><p>But, as the nature of the beast goes, those installers are known to work only on Ubuntu distributions. The main reason is that Qt 5 depends on the ICU libraries, whose developers went to the &#8220;OpenSSL School of Releasing&#8221; (along with the Boost developers) and haven&#8217;t learned yet to make binary-compatible releases. Sorry about that. If you don&#8217;t have the build capacity to compile the sources yourself, you may want to wait until packaged, binary builds show up for your distribution (in the form of RPMs and DEBs).</p><p>The goal of the beta, as I explained in a <a
href="/blog/2012/01/qt-temperatures-drop-from-january-to-june/">previous blog post</a>, is to gather feedback on the <strong>implementation</strong> and to get bug reports. From this point on, the Qt 5 API is &#8220;soft-frozen&#8221;, meaning that it will not change incompatibly any more, except to fix major issues that we encounter or we&#8217;re told about in the form of feedback. If that happens, we&#8217;ll make sure to make a note of it in the release notes.</p><p>That means that Qt 5.0 beta1 is a suitable starting point for porting applications and writing new code. Your work will not be wasted. But you might run into bugs, so please report them to us, in the <a
href="http://bugreports.qt-project.org">Qt Project Task Tracker</a>. We&#8217;re also very interested in bugs related to packaging, building, the installers, documentation, etc. Just be sure to look first at the <a
href="http://qt-project.org/wiki/Qt500beta1KnownIssues">Known Issues</a> page before reporting anything.</p> ]]></content:encoded> <wfw:commentRss>http://www.macieira.org/blog/2012/08/qt-5-0-beta-1-is-out/feed/</wfw:commentRss> <slash:comments>5</slash:comments> </item> <item><title>forkfd part 4: proposed solutions</title><link>http://www.macieira.org/blog/2012/07/forkfd-part-4-proposed-solutions/</link> <comments>http://www.macieira.org/blog/2012/07/forkfd-part-4-proposed-solutions/#comments</comments> <pubDate>Sat, 21 Jul 2012 00:30:18 +0000</pubDate> <dc:creator>Thiago Macieira</dc:creator> <category><![CDATA[Algorithms]]></category> <category><![CDATA[C++]]></category> <category><![CDATA[Linux]]></category> <category><![CDATA[Programming]]></category> <category><![CDATA[c++11]]></category> <category><![CDATA[linux]]></category> <category><![CDATA[low-level]]></category> <category><![CDATA[optimisation]]></category> <category><![CDATA[unix]]></category> <guid
isPermaLink="false">http://www.macieira.org/blog/?p=434</guid> <description><![CDATA[Last week, I wrote three blogs about the situation with starting child processes on Unix and being notified of their exit. I raised several problems with the current implementation, which I have tried to solve and for which I now have a proposal. If you haven&#8217;t yet, you should take some time to read the previous &#8230;</p><p><a
class="more-link block-button" href="http://www.macieira.org/blog/2012/07/forkfd-part-4-proposed-solutions/">Continue reading &#187;</a>]]></description> <content:encoded><![CDATA[<p>Last week, I wrote three blogs about the situation with starting child processes on Unix and being notified of their exit. I raised several problems with the current implementation, which I have tried to solve and for which I now have a proposal. If you haven&#8217;t yet, you should take some time to read the previous three blogs:</p><ul><li>Part 1: <a
href="../forkfd-part-1-launching-processes-on-unix/">Launching processes on Unix</a>;</li><li>Part 2: <a
href="../forkfd-part-2-finding-out-that-a-child-process-exited-on-unix/">Finding out that a child process exited on Unix</a>;</li><li>Part 3: <a
href="../forkfd-part-3-qprocesss-requirements-and-current-solution/">QProcess&#8217;s requirements and current solution</a>;</li></ul><p><span
id="more-434"></span></p><h2>The road so far</h2><p>I explained in the first blog how one launches processes on Unix, by way of the <tt>fork</tt> and <tt>execve</tt> system calls, and the problem associated with file descriptors being inherited without closing in the child processes. I also showed how Linux has solved this problem. Since no other Unix system has yet done the same, the others are excluded from the discussion going forward. They need to be brought into the 2010s first and leave the 1970s behind.</p><p>In the second blog, I went over the contortions required to be notified that a child process has exited, which uses the <tt>SIGCHLD</tt> signal. Designed in the early Unix times, signal handlers have conceptual problems with two modern requirements: libraries and multi-threading. And in the third blog, I explained the requirements that <a
href="http://qt-project.org/doc/qprocess.html">QProcess</a> presents and how Qt has tried so far to solve those problems.</p><p>Unfortunately, there are two issues that can&#8217;t be solved. One is the race condition involved in the installation of the <tt>SIGCHLD</tt> signal handler, and the other is its uninstallation when the library is being unloaded. With the current API, unless I missed something, it&#8217;s not possible to do this cleanly. That leads me to the conclusion that signal handlers should really have been left in the 1970s.</p><h2>The solution I propose</h2><p>The solution I&#8217;d like to see implemented requires another change to the Linux kernel. Attentive readers may have guessed what I want by the title of the blog: I want a new system call named <tt>forkfd</tt>. Similar to additions to Linux like the <tt>signalfd</tt>, <tt>timerfd_create</tt> and <tt>eventfd</tt>, this would be a function that opens a new file descriptor.</p><p>Its man page would be something like the following:</p><blockquote><p><tt><dl><dt><strong>NAME</strong></dt><dd> forkfd - create a child process and a file descriptor for being notified of its exit</dd><dt><strong>SYNOPSIS</strong></dt><dd> int forkfd(int flags, pid_t *pid);</dd><dt><strong>DESCRIPTION</strong></dt><dd> <strong>forkfd()</strong> creates a file descriptor that can be used to be notified of when a child process exits. This file descriptor can be monitored using <strong>select(2)</strong>, <strong>poll(2)</strong> or similar mechanisms.</p><p>The <u>flags</u> parameter can contain the following values ORed to change the behaviour of <strong>forkfd()</strong>:</p><dl><dt><strong>FFD_NONBLOCK</strong></dt><dd>Set the <strong>O_NONBLOCK</strong> file status flag on the new open file descriptor. 
Using this flag saves extra calls to <strong>fcntl(2)</strong> to achieve the same result.</dd><dt><strong>FFD_CLOEXEC</strong></dt><dd>Set the close-on-exec (<strong>FD_CLOEXEC</strong>) flag on the new file descriptor. This flag applies to the parent process side of the fork and new processes created after that. The child process created by <strong>forkfd()</strong> does not have this file descriptor open.</dd></dl><p>The file descriptor returned by <strong>forkfd()</strong> supports the following operations:</p><dl><dt><strong>read(2)</strong></dt><dd>When the child process exits, the buffer supplied to <strong>read(2)</strong> is used to return information about the status of the child in the form of one <u>siginfo_t</u> structure. The buffer must be at least <u>sizeof(siginfo_t)</u> bytes. The return value of <strong>read(2)</strong> is the total number of bytes read.</dd><dt><strong>poll(2), select(2)</strong> (and similar)</dt><dd>The file descriptor is readable (the <strong>select(2)</strong> readfds argument; the <strong>poll(2) POLLIN</strong> flag) if the child has exited or signalled via SIGCHLD.</dd><dt><strong>close(2)</strong></dt><dd>When the file descriptor is no longer required it should be closed.</dd></dl></dd><dt><strong>RETURN VALUE</strong></dt><dd> On success, in the parent process <strong>forkfd()</strong> returns a new forkfd file descriptor and stores the PID of the child process in <u>*pid</u>; in the child process, it returns <u>FFD_CHILD_PROCESS</u> and sets <u>*pid</u> to zero. 
On error, -1 is returned and <u>errno</u> is set to indicate the error, with no process being created.</dd></dl><p></tt></p></blockquote><p>This solution has the following benefits:</p><ul><li>No signal handler installation or uninstallation is necessary, which avoids both outstanding unfixable issues;</li><li>If no signal handler is needed, there is no need to start a thread for managing the child process status;</li><li>Notification is sent via a read notification on a file descriptor, which all event-driven applications know how to handle, plus it matches the requirements that <tt>QProcess</tt> has in its own <tt>waitFor</tt> functions;</li><li>The child process is automatically reaped by the <tt>read()</tt> call, which avoids the need to call <tt>wait</tt> or <tt>waitpid</tt>.</li></ul><h2>Implementing it in userland</h2><p>I tried to implement the above function in userland, in pure C using only POSIX calls. The idea was that this code could be used in many different libraries to solve their process-management problems. I came up with three implementations:</p><h3>Using pthreads</h3><p><font
size="-2">Source code: <a
href="/~thiago/forkfd.h">header</a>, <a
href="/~thiago/forkfd1.c">source</a></font></p><p>The first attempt was a direct rewrite of the <tt>QProcess</tt> solution in C, using only POSIX calls. The code has a global <tt>pthread_mutex_t</tt> that protects a doubly-linked list of currently-running processes. It installs the <tt>SIGCHLD</tt> handler under a mutex lock, creates a pipe, and forks. In the child side of the fork, it closes the pipe and returns the magic constants. In the parent side, it adds the writing end of the pipe and the PID to the list, and returns the reading end.</p><p>Since I wrote this code before I realised the fatal flaw with <tt>SA_SIGINFO</tt>, this code is still using it. It writes the <tt>siginfo_t</tt> structure received in the signal handler to the process manager, by way of a private pipe. That one, in turn, will read the PID from the structure and proceed to write the structure again to the user, via the writing end of the pipe that was saved in the <tt>forkfd()</tt> call.</p><p>This code is fixable, by making the signal handler write one byte to the pipe (or use <tt>eventfd</tt>) and have the process manager thread loop over the currently-known child processes, calling <tt>waitpid</tt> on each and synthesising <tt>siginfo_t</tt> for the user.</p><h3>Lock-free</h3><p><font
size="-2">Source code: same header, <a
href="/~thiago/forkfd3.c">source code</a></font></p><p>The problem with the first implementation, besides relying on <tt>SA_SIGINFO</tt>, is that it requires pthreads and mutexes. I wanted something lock-free, so I started writing that. I wrote a lock-free structure to replace the doubly-linked list of PID and writing pipe pairs, based on previous experience with Qt&#8217;s lock-free timer ID allocator. I haven&#8217;t done exhaustive testing on it, but it&#8217;s simple enough that it&#8217;s probably correct (pending further reviews).</p><p>This solution is, unfortunately, still based on <tt>SA_SIGINFO</tt> (why? because I hadn&#8217;t realised it was a problem by then; I only did so when writing the blog). The way it works is that the signal handler will read from this structure and figure out, based on the PID from <tt>siginfo_t</tt>, which file descriptor to write to. The signal handler also does the necessary <tt>waitpid</tt> call to reap the child process.</p><p>Unfortunately, this solution doesn&#8217;t work, since it introduces several race conditions. One is an extremely rare situation, in which the library with this code is being unloaded in one thread while the signal handler is still running in another. More importantly, though, there&#8217;s a race condition between the time of the <tt>fork</tt> and the addition of the PID to the list of children. This condition didn&#8217;t happen before because of the mutex: the process manager thread would not read from the list until the <tt>forkfd</tt> function released the lock, after adding the child process.</p><p>This implementation is still salvageable, though. First, it needs to stop relying on <tt>SA_SIGINFO</tt>, which means it must iterate over all the known children inside the signal handler, doing <tt>waitpid</tt> calls on each. Second, in the absence of a lock, it must <strong>prevent</strong> the child process from exiting before its PID and pipe are added to the list. 
That can be done by adding an extra, blocking pipe between the parent and child process: the child process tries to <tt>read()</tt> from it, suspending itself, until the parent process releases it by writing something.</p><h3>Adding a spin lock</h3><p><font
size="-2">Source code: same header, <a
href="/~thiago/forkfd2.c">source code</a></font></p><p>The solution I ended up writing for the race conditions of the previous implementation was to add a spin lock (why this and not the pipe lock I described above? Because it hadn&#8217;t occurred to me until just now). It&#8217;s a step back from the fully lock-free solution, but not all the way back to the pthreads implementation. For one thing, it doesn&#8217;t start a thread for the process management. For another, since it implements the spinlock on its own, it <strong>can</strong> lock inside the signal handler (note that <tt>pthread_mutex_lock</tt> is not a permitted function inside one).</p><p>I just had to be careful about one thing: before locking the spin lock, the calling thread must block <tt>SIGCHLD</tt> using <tt>pthread_sigmask</tt>. If it didn&#8217;t do that, the signal handler could be called asynchronously in the same thread as the one where the spin lock is locked, producing a deadlock.</p><h2>Choosing a solution</h2><p>To be honest, <strong>none</strong> of the three solutions is ideal. If I had to choose one of them, I&#8217;d go for the lock-free one for personal reasons, but the spin-lock one might have fewer bugs in the threading code.</p><p>But that&#8217;s not what I want. 
What I really want is that <tt>forkfd</tt> be implemented in the kernel, so that no signal handler is involved, eliminating the unsolvable problems that those introduce.</p><p>If there are any kernel hackers listening in, do you think there&#8217;s a chance?</p> ]]></content:encoded> <wfw:commentRss>http://www.macieira.org/blog/2012/07/forkfd-part-4-proposed-solutions/feed/</wfw:commentRss> <slash:comments>12</slash:comments> </item> <item><title>forkfd part 3: QProcess&#8217;s requirements and current solution</title><link>http://www.macieira.org/blog/2012/07/forkfd-part-3-qprocesss-requirements-and-current-solution/</link> <comments>http://www.macieira.org/blog/2012/07/forkfd-part-3-qprocesss-requirements-and-current-solution/#comments</comments> <pubDate>Fri, 13 Jul 2012 22:46:31 +0000</pubDate> <dc:creator>Thiago Macieira</dc:creator> <category><![CDATA[Algorithms]]></category> <category><![CDATA[C++]]></category> <category><![CDATA[Programming]]></category> <category><![CDATA[Qt]]></category> <guid
isPermaLink="false">http://www.macieira.org/blog/?p=428</guid> <description><![CDATA[In the previous posts in my series of blogs about starting and managing sub-processes on Unix, I talked about how it&#8217;s implemented and how the current solutions have limitations. In this post, I&#8217;ll show how QtCore has solved the problem (to the extent that it can be solved) and what requirements a new solution must fulfill. &#8230;</p><p><a
class="more-link block-button" href="http://www.macieira.org/blog/2012/07/forkfd-part-3-qprocesss-requirements-and-current-solution/">Continue reading &#187;</a>]]></description> <content:encoded><![CDATA[<p>In the previous posts in my series of blogs about starting and managing sub-processes on Unix, I talked about how it&#8217;s implemented and how the current solutions have limitations. In this post, I&#8217;ll show how QtCore has solved the problem (to the extent that it can be solved) and what requirements a new solution must fulfill.</p><p>Links to the previous posts:</p><ul><li><a
href="http://www.macieira.org/blog/2012/07/forkfd-part-1-launching-processes-on-unix/">Part 1: Launching processes on Unix</a></li><li><a
href="http://www.macieira.org/blog/2012/07/forkfd-part-2-finding-out-that-a-child-process-exited-on-unix/">Part 2: Finding out that a child process exited on Unix</a></li></ul><p><span
id="more-428"></span></p><h2><tt>QProcess</tt>’s API</h2><p>The <a
href="http://qt-project.org/doc/qprocess.html"><tt>QProcess</tt></a> class is a very powerful and flexible class. Its API in Qt 4 and Qt 5 is the same, an evolution from the Qt 3 API. Like other Qt I/O classes, QProcess&#8217;s API is entirely asynchronous, with functions to make it work under synchronous circumstances.</p><p>For those not familiar with Qt&#8217;s API design, Qt classes have named callbacks called &#8220;signals&#8221; (that have nothing to do with Unix signals, except the name) that identify a particular change or event that has happened. QProcess has four important signals:</p><ul><li><a
href="http://qt-project.org/doc/qprocess.html#started">started</a></li><li><a
href="http://qt-project.org/doc/qiodevice.html#readyRead">readyRead</a> (inherited from <tt>QIODevice</tt>)</li><li><a
href="http://qt-project.org/doc/qiodevice.html#bytesWritten">bytesWritten</a> (inherited too)</li><li><a
href="http://qt-project.org/doc/qprocess.html#finished">finished</a></li></ul><p>Each of these represents one activity or state change in the sub-process. Those signals are emitted by the <tt>QProcess</tt> object from inside the main application&#8217;s event loop, allowing the application code to be entirely asynchronous and simply react to the changes as needed.</p><p>In addition, each of those four signals is paired with a synchronous function whose name is <tt>waitFor</tt> followed by the name of the signal. As the name indicates, the function blocks until either a timeout expires or until that particular signal is emitted. The <tt>waitFor</tt> functions allow the application code to be synchronous (blocking) where it needs to.</p><p>The first of those four signals, <tt>started</tt>, is quite simple. It may seem weird at first glance to have a signal indicating that the process has started. After all, shouldn&#8217;t the <a
href="http://qt-project.org/doc/qprocess.html#start"><tt>start</tt></a> function return an error condition instead? In fact, it exists because we didn&#8217;t want to put any requirements on the OS scheduler: the parent process could continue executing for some time before the scheduler decided to let the child execute and, eventually, <tt>execve</tt>.</p><p>I&#8217;m not going to go into the details of how to determine whether the sub-process successfully <tt>execve'd</tt>. It&#8217;s not relevant to our story: in the first blog, I explained how starting a process is a done deal and works just fine.</p><p>In turn, the fourth of those signals, the <tt>finished</tt> signal, is the interesting one for us. The other two are relevant, though, for one particular reason: the parent process does not know what the child process will do next. The child process could write something to its <tt>stdout</tt>, it could consume its <tt>stdin</tt>, or it could exit, crash, core dump, or otherwise disappear. That means we need to monitor at all times up to three pipes (<tt>stdin</tt>, <tt>stdout</tt> and <tt>stderr</tt>) as well as whatever mechanism we&#8217;re using to be notified that the process has exited. That&#8217;s the first of our requirements.</p><h2>The requirements</h2><p>The first requirement, as I&#8217;ve just explained, is that the main application event loop be able to monitor up to three pipes and the child process&#8217;s exit mechanism (whichever we make it). Not only that, the three <tt>waitFor</tt> functions that pair with runtime signals &#8212; that is, <a
href="http://qt-project.org/doc/qiodevice.html#waitForReadyRead"><tt>waitForReadyRead</tt></a>, <a
href="http://qt-project.org/doc/qiodevice.html#waitForBytesWritten"><tt>waitForBytesWritten</tt></a>, and <a
href="http://qt-project.org/doc/process.html#waitForFinished"><tt>waitForFinished</tt></a> &#8212; need to do the same. In both cases, the requirement boils down to the same: whatever the exit notification mechanism is, it needs to be accessible from a <tt>select</tt> or <tt>poll</tt> or it needs to interrupt such a call.</p><p>The second requirement is that this mechanism scale for multiple child processes simultaneously. One event loop needs to be able to monitor multiple pipes and the exit notification from multiple processes. Though by the design of the API, this requirement does not apply to the <tt>waitFor</tt> functions.</p><p>The third requirement is that this needs to work in a multithreaded environment. That is, multiple event loops or multiple <tt>waitFor</tt> functions might be running simultaneously.</p><p>And finally, the fourth requirement is that, if we write a UNIX signal handler, we obey the requirements for signal handlers.</p><h2>The Qt solution</h2><p>Let&#8217;s build it step by step. As I&#8217;ve shown in the previous blogs, the current state of the art solution is to install a signal handler. Yes, it has an unfixed and non-workaroundable race condition during the installation, and it has an unfixable problem with uninstallation of the handler. Those issues are out of our control, though of course the point of this series of blogs is to try and solve them, or propose a solution that avoids them completely.</p><p>The current solution is also something that exists today and has existed for over 8 years. That means it does not use Linux&#8217;s <tt>signalfd</tt> solution. I&#8217;ll explore that possibility later, when we discuss how to improve the current solution.</p><p>The signal handler that Qt installs is very similar to the code block I had in part two: it &#8220;does something&#8221;, then it chains to the previous handler. 
What is that &#8220;something&#8221;?</p><p>When our <tt>SIGCHLD</tt> handler is called, we don&#8217;t know which child process has exited because we didn&#8217;t set the SA_SIGINFO flag (again, more on that later). Since we don&#8217;t have the PID of the child process, the only way of getting it is by doing a <tt>wait</tt> or <tt>waitpid</tt> call. But as I discussed in the first part, we can&#8217;t do a &#8220;wait for any child&#8221; in Qt, because it could interfere with the operations of other libraries like GLib. Since we can&#8217;t be told which child has exited, the only solution we have is to check all of our children to see if any has exited.</p><p>Enter the fourth requirement: we can only use functions that are explicitly listed in the list of <a
href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03">POSIX signal-safe functions</a>. That excludes locking any mutex: remember that the signal handler could be called in any thread, including a thread that has a particular mutex currently locked.</p><p>The Qt solution is to write a single byte to a pipe that ends in a process manager. That way, we exit from the signal handler and can continue operating from a regular context, where we can lock mutexes and get a list of all children currently managed by <tt>QProcess</tt>. What happens next is that Qt launches a storm of <tt>waitpid</tt> calls, one for each child process, with the <tt>WNOHANG</tt> flag set. We&#8217;re hoping that this will yield one child having exited, but it could be more than one and it could also be zero (e.g., a child process that was managed by GLib).</p><p>But wait, there&#8217;s the third requirement to consider: all the user threads, including the main thread, could be blocked doing something or other. Therefore, this process manager needs to be run in either a thread that is dedicated, is not blocked, or is currently in one of the <tt>QProcess::waitFor</tt> functions. The Qt solution is the first one: there&#8217;s a dedicated thread. This solution makes sense if you consider that, otherwise, all Qt-based event loops would need to add the reading end of the SIGCHLD handler&#8217;s pipe to their <tt>select</tt> or <tt>poll</tt> list, in which case all of those threads might be woken up by the activity, even if only one can do the work.</p><p>If this process manager finds a child process that has exited, it needs to notify the thread that will emit the <tt>finished</tt> signal. It does that by writing to a pipe, whose reading end is in the source list for <tt>select</tt> or <tt>poll</tt>.</p><p>Actually, I embellished the Qt solution a little here. The process manager doesn&#8217;t do the <tt>waitpid</tt> call. 
It simply writes a byte to all QProcess&#8217;s pipes and then lets each <tt>QProcess</tt> object call <tt>waitpid</tt>. There&#8217;s an obvious improvement opportunity here &#8212; though of course, that requires that the process manager communicate what the exit status of the child process was.</p><h2>Improving by using <tt>SA_SIGINFO</tt></h2><p>Turns out that this is exactly what set me upon this path. A few days ago, I was talking to some colleagues on our internal IRC channel when the subject of signals and <tt>SA_SIGINFO</tt> came up. It occurred to me that the OS <strong>does</strong> tell us which child process exited, as part of an extra parameter to the signal handler.</p><p>With that information in hand, we don&#8217;t have to call <tt>waitpid</tt> on each and every one of our child processes to figure out which one has exited. In fact, we know from the start which one it is, even if it&#8217;s not one of ours. I actually wrote a patch to do this and you can see it in <a
href="https://codereview.qt-project.org/#change,30614">Qt&#8217;s code review tool</a> (it&#8217;s not approved yet at the time of this writing).</p><p>I modified the existing code by writing not a simple byte from the <tt>SIGCHLD</tt> handler to the process manager, but the 4 bytes of the <tt>pid_t</tt> type containing the PID of the child that exited. Then the process manager simply needs to search for that PID and notify the exact <tt>QProcess</tt> object whose child exited.</p><p>The improvement is clear: as a response to a <tt>SIGCHLD</tt> being delivered, the application executes exactly <strong>one</strong> <tt>waitpid</tt> call.</p><p>What&#8217;s the problem with this? Well, two of them. First, what happens if the actual signal handler for <tt>SIGCHLD</tt> was not ours, but something like I wrote on the second blog:</p><pre class="brush: cpp; title: ; notranslate">
static struct sigaction old_sigaction;
static void sigchld_handler(int signum)
{
    /* my code goes here */
 
    if (old_sigaction.sa_handler != SIG_IGN
            &amp;&amp; old_sigaction.sa_handler != SIG_DFL)
        old_sigaction.sa_handler(signum);
}
</pre><p>Do you see what will happen when our signal handler tries to dereference the second parameter, of type <tt>siginfo_t *</tt>? Crash.</p><p>The other problem is more serious: what happens if a second child process exits while our signal handler is running? POSIX requires that this second signal be queued, so our handler is executed again. And what happens if a <strong>third</strong> child exits too? Well&#8230; in that case we&#8217;re toast: the third <tt>SIGCHLD</tt> simply gets dropped.</p><p>That&#8217;s pretty much a showstopper. Coupled with the comments I&#8217;ve received in the review tool and on IRC, which pointed me to an issue with the pipe buffer being full causing either an unhandled <tt>EWOULDBLOCK</tt> error or a possible deadlock, it indicates to me that I must abandon this solution.</p><h2>Improving by using <tt>signalfd</tt></h2><p>Since we&#8217;re talking about Linux only since the beginning of the second blog, we could use Linux&#8217;s <tt>signalfd</tt> mechanism and avoid a signal handler altogether. Here are some implementation possibilities and the problems with them:</p><ol><li>We could have one distinct <tt>signalfd</tt> on SIGCHLD per thread, but I don&#8217;t think it&#8217;s determined which of the <tt>signalfd</tt> get woken up by the delivery of the signal. If the answer isn&#8217;t &#8220;all of them&#8221;, this won&#8217;t work for us and I&#8217;m sure the answer isn&#8217;t that.</li><li>We could have the same <tt>signalfd</tt> in all threads, but then we run into the problem of &#8220;which <tt>select</tt> gets woken up&#8221; by the activity. What&#8217;s more, we don&#8217;t know if GLib doesn&#8217;t have its own <tt>signalfd</tt> installed, causing the problems from #1 above.</li><li>We could have a dedicated &#8220;process manager&#8221; thread that is the only one listening for the signal. However, this solution is hardly different from the time-tried signal handler. 
Moreover, it also falls short in the multiple-library-criterion as #2.</li></ol><p>Add to that the fact that a <tt>signalfd</tt> is only delivered if <strong>all</strong> threads in the process have blocked the signal. If even one of them has it unblocked, the signal will be delivered to a signal handler in that thread, bypassing the <tt>signalfd</tt> completely. We can&#8217;t be sure that the user hasn&#8217;t created a thread and cleared the signal blocking mask.</p><p>No matter what we do, any solution in userspace will require a signal handler and will run into the unsolved problems from the second blog. In the next issue, I&#8217;ll explore what solutions I&#8217;m proposing, both in userspace and in kernel space.</p> ]]></content:encoded> <wfw:commentRss>http://www.macieira.org/blog/2012/07/forkfd-part-3-qprocesss-requirements-and-current-solution/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>forkfd part 2: Finding out that a child process exited on Unix</title><link>http://www.macieira.org/blog/2012/07/forkfd-part-2-finding-out-that-a-child-process-exited-on-unix/</link> <comments>http://www.macieira.org/blog/2012/07/forkfd-part-2-finding-out-that-a-child-process-exited-on-unix/#comments</comments> <pubDate>Fri, 13 Jul 2012 12:56:38 +0000</pubDate> <dc:creator>Thiago Macieira</dc:creator> <category><![CDATA[Algorithms]]></category> <category><![CDATA[Linux]]></category> <category><![CDATA[Programming]]></category> <category><![CDATA[Qt]]></category> <category><![CDATA[linux]]></category> <category><![CDATA[low-level]]></category> <category><![CDATA[optimisation]]></category> <category><![CDATA[posix]]></category> <category><![CDATA[unix]]></category> <guid
isPermaLink="false">http://www.macieira.org/blog/?p=416</guid> <description><![CDATA[On my previous blog, I said that the solutions we&#8217;ve got implemented on Linux are a good start, but not the full solution. We can start a child process properly, but we still can&#8217;t properly find out when it exited. Linux or nothing First of all, let me get one thing straight: from this point &#8230;</p><p><a
class="more-link block-button" href="http://www.macieira.org/blog/2012/07/forkfd-part-2-finding-out-that-a-child-process-exited-on-unix/">Continue reading &#187;</a>]]></description> <content:encoded><![CDATA[<p>On my <a
href="../forkfd-part-1-launching-processes-on-unix/">previous blog</a>, I said that the solutions we&#8217;ve got implemented on Linux are a good start, but not the full solution. We can start a child process properly, but we still can&#8217;t properly find out when it exited.</p><p><span
id="more-416"></span></p><h2>Linux or nothing</h2><p>First of all, let me get one thing straight: from this point on, I&#8217;ll only be thinking about Linux. If you&#8217;re running anything else, I don&#8217;t care about you. But you may continue reading anyway.</p><p>I have a couple of reasons for doing that, one of them being that it&#8217;s the easiest to get the new API I&#8217;m proposing accepted. It&#8217;s probably just as easy to get the other open source BSDs to do the same, but not so much on the commercial Unixes, which would involve a long lead time, product management, NDAs, etc.</p><p>But the most important reason is that any of those other Unixes are <strong>years</strong> behind Linux on the state of the art. If your OS hasn&#8217;t done its homework for the past 4 years and introduced the API I mentioned in the previous blog, then I don&#8217;t care about that OS.</p><p>More to the point: the problem I am trying to solve is related to multithreading and the race conditions that are involved in such a scenario. Without those APIs, there&#8217;s no possibility of thread-safety <em>anyway</em>.</p><h2>How to be notified of a child process exiting</h2><p>There are two ways of being notified that a child process has exited: a blocking (synchronous) and a non-blocking (asynchronous) method. The blocking one is fairly simple: a call to <tt>waitpid</tt> without the <tt>WNOHANG</tt> flag (i.e., &#8220;do hang&#8221;). On Linux, the waitpid call is backed by the kernel system call of the same name, so we know the kernel is doing the right thing.</p><p>The problem with the blocking API is that, as it turns out, it&#8217;s <strong>blocking</strong>. It&#8217;s unsuitable to be run in the same thread that is handling user interaction and painting in a GUI application. If you want to use it, you need to start a thread for it. 
To make matters worse, to implement this in a general-purpose library like Qt or Glib, you&#8217;d need to start <em>one thread per child process</em>.</p><p>The <tt>waitpid</tt> function can be used to wait in one of three modes: for one specific child process given by its PID, for one specific process group given by its PGID, or for all child processes. The process group idea looks interesting at first, since each library could create one such group and move all the child processes it cares about into it. However, process groups have other purposes and side-effects, including the fact that the child process can change its session ID and process group, which excludes them from a generic solution. And clearly waiting for any and all child processes is not acceptable for a generic library, for it cannot know whether there are processes started outside of its control, like when both Qt and Glib are being used.</p><h2>Problem two: chaining signal handlers</h2><p>That leaves us with the asynchronous method of being notified of a child process exiting, which is done via POSIX signals. More specifically, by the delivery of the <tt>SIGCHLD</tt> signal (also spelt <tt>SIGCLD</tt>). And here we run into a series of problems that aren&#8217;t solved today.</p><p>The first of them is that there&#8217;s no thread-safe way of installing a signal handler in a generic library. The system call <a
href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/signal.html"><tt>signal</tt></a> can be used to install a signal handler. This function is fine if the handler is installed by the application developer, like the case of handling <tt>SIGINT</tt> (the signal that Ctrl+C sends) or <tt>SIGTERM</tt> and performing some clean-ups.</p><p>But it&#8217;s not acceptable for a generic library. Again let&#8217;s take the case of an application using both Qt and Glib, either directly or indirectly. Since both libraries need to install a signal handler, it stands to reason that one handler must somehow call the other to let it do its work. That means <tt>signal</tt> is out.</p><p>Fortunately, there&#8217;s <a
href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/sigaction.html"><tt>sigaction</tt></a>: this system call not only installs a new handler, it also returns the old one, so one handler can call the other. There&#8217;s a multithreading problem here, but let&#8217;s look into that later. The code to install the handler would look something like this:</p><pre class="brush: cpp; title: ; notranslate">
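static struct sigaction old_sigaction;
static void sigchld_handler(int signum)
{
    /* my code goes here */
 
    if (old_sigaction.sa_handler != SIG_IGN
            &amp;&amp; old_sigaction.sa_handler != SIG_DFL)
        old_sigaction.sa_handler(signum);
}
 
/* installation sketch; install_sigchld_handler is an illustrative name */
static void install_sigchld_handler(void)
{
    struct sigaction action = {};
    action.sa_handler = sigchld_handler;
    sigemptyset(&amp;action.sa_mask);
    sigaction(SIGCHLD, &amp;action, &amp;old_sigaction); /* also saves the old handler */
}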
</pre><p>Installation can be achieved with this API, but how about uninstallation? That&#8217;s where the unsolved problem lies. The first solution that comes to mind, a gross one and the simplest possible, is to ignore uninstallation and simply decree that the handler will remain there until the process exits completely.</p><p>That brute-force solution breaks down the moment we introduce unloading of libraries. Now, you may remember my saying that library and plugin unloading are a bad idea and should be avoided, because they create a whole number of problems. Yes, that&#8217;s true, and this is one of them. So I do recommend that you avoid unloading libraries and plugins in your applications. However, we&#8217;re looking for a generic solution here, so we must take unloading into account.</p><p>During the library unloading process, the signal handler must be uninstalled. The way to uninstall it is to install something else. But what? The options that come to mind are:</p><ol><li>install the default signal handler, <tt>SIG_DFL</tt>; or</li><li>ignore the signal, by installing <tt>SIG_IGN</tt>; or</li><li>install the previous signal handler, the one we saved from the <tt>sigaction</tt> call.</li></ol><p>Unless you&#8217;re trying to be funny or you see some other problem that I don&#8217;t, you&#8217;ll suggest the third option, right? Therefore, we uninstall our handler like this:</p><pre class="brush: cpp; title: ; notranslate">
    sigaction(SIGCHLD, &amp;old_sigaction, NULL);
</pre><p>Can you see the problem?</p><p>What happens if our handler is not the currently-installed handler? That could happen if the de-initialisation order is different from the initialisation one. As a concrete example, imagine an application that uses Qt, so QtCore got loaded at process start and will not be unloaded until the process exit. Then the application does this, in order:</p><ol><li>it loads a plugin that uses Glib;</li><li>the plugin uses the <a
href="http://developer.gnome.org/glib/2.32/glib-Spawning-Processes.html">g_spawn_async</a> function, causing Glib to install its <tt>SIGCHLD</tt> handler;</li><li>the application uses <a
href="http://qt-project.org/doc/qprocess.html"><tt>QProcess</tt></a>, causing Qt to install its <tt>SIGCHLD</tt> handler;</li><li>the Glib-using plugin is unloaded and Glib tries to uninstall its handler.</li></ol><p>At this point, Glib will uninstall Qt&#8217;s handler too, rendering <tt>QProcess</tt> unusable. The current code in <tt>qprocess_unix.cpp</tt> tries to work around this problem by doing:</p><pre class="brush: cpp; title: ; notranslate">
    struct sigaction currentAction;
    ::sigaction(SIGCHLD, 0, &amp;currentAction);
    if (currentAction.sa_handler == qt_sa_sigchld_handler) {
        ::sigaction(SIGCHLD, &amp;qt_sa_old_sigchld_handler, 0);
    }
</pre><p>Let&#8217;s ignore for a moment the fact that the above code is neither thread-safe nor async signal-safe. Another thread could be trying to install a handler at the same time, and a SIGCHLD could be delivered in-between the two calls to <tt>sigaction</tt>. Let&#8217;s ignore it because we have a bigger problem: if Qt&#8217;s handler isn&#8217;t the topmost handler installed, Qt is forced to leave its handler installed. And if QtCore is about to be unloaded, there&#8217;s a very big chance that the next <tt>SIGCHLD</tt> delivery will crash the application!</p><h2>Problem three: thread-safety in <tt>sigaction</tt></h2><p>This problem exists not because of the API, but because of the implementation. I am assuming that the kernel side of <tt>sigaction</tt> is correctly implemented and it will do its proper locks in case two threads of the same process try to install signal handlers for the same signal at the same time. Let&#8217;s also assume that the kernel does not allow a signal to be delivered to the process while it is modifying its own structures of the signal handlers.</p><p>The problem exists in the userland because glibc&#8217;s <tt>struct sigaction</tt> is different from the kernel&#8217;s ABI. That forces glibc to allocate a local (stack-based) structure so its address can be passed to the system call. Upon return, it needs to copy the contents into our <tt>old_sigaction</tt> variable.</p><p>Do you see the problem? There&#8217;s a race condition there.</p><p>A signal could be delivered to the process after the kernel returned to user-space, but before the glibc code could copy the contents to our variable. That means our newly-installed signal handler could be called before the <tt>old_sigaction</tt> was filled in. That would mean the old handler would not get called as it should be. 
And chances are that the signal being delivered wasn&#8217;t meant for our handler anyway &#8212; after all, you&#8217;d have problems in your code if you could receive your own SIGCHLD before your handler was ready.</p><p>In reality, it is actually worse than the above description: since the <tt>struct sigaction</tt> structure is not filled in atomically, our signal handler could see a partially-filled structure. And in any case, since glibc&#8217;s code allocates it on the stack and does not pre-fill it with zeroes before the system call, if the handler is called with the chain link not completely filled, there&#8217;s a good chance that the chain call will end up in a garbage address.</p><h2>Proposed solution for problem three</h2><p>The solution for this problem is simple: glibc must not <tt>memcpy</tt>. That means the userspace <tt>struct sigaction</tt> must be identical to the kernel&#8217;s. And therein lies another problem: to change that structure now, we&#8217;d have to break the C library ABI. It can be done with ELF versioning, but that does not eliminate the ABI change, which could turn up again if there were code that shared <tt>struct sigaction</tt> across libraries.</p><p>However, there&#8217;s no solution that I can see for problem two and it&#8217;s a serious issue. 
On the next blog, I&#8217;ll explore the solution that Qt has used for a few years and the problems with it.</p> ]]></content:encoded> <wfw:commentRss>http://www.macieira.org/blog/2012/07/forkfd-part-2-finding-out-that-a-child-process-exited-on-unix/feed/</wfw:commentRss> <slash:comments>5</slash:comments> </item> <item><title>forkfd part 1: Launching processes on Unix</title><link>http://www.macieira.org/blog/2012/07/forkfd-part-1-launching-processes-on-unix/</link> <comments>http://www.macieira.org/blog/2012/07/forkfd-part-1-launching-processes-on-unix/#comments</comments> <pubDate>Fri, 13 Jul 2012 11:05:24 +0000</pubDate> <dc:creator>Thiago Macieira</dc:creator> <category><![CDATA[Algorithms]]></category> <category><![CDATA[Linux]]></category> <category><![CDATA[Programming]]></category> <category><![CDATA[Qt]]></category> <category><![CDATA[linux]]></category> <category><![CDATA[low-level]]></category> <category><![CDATA[optimisation]]></category> <category><![CDATA[posix]]></category> <category><![CDATA[unix]]></category> <guid
isPermaLink="false">http://www.macieira.org/blog/?p=411</guid> <description><![CDATA[Have you ever tried to launch a sub-process on Unix? POSIX.1 has several APIs for doing that, including: fork+execve and posix_spawn. Starting a child process is not difficult, but ensuring that they behave properly and you get notified when the child dies, that is difficult. First, posix_spawn I want to concentrate only on fork(2)+execve(2), so &#8230;</p><p><a
class="more-link block-button" href="http://www.macieira.org/blog/2012/07/forkfd-part-1-launching-processes-on-unix/">Continue reading &#187;</a>]]></description> <content:encoded><![CDATA[<p>Have you ever tried to launch a sub-process on Unix? <a
href="http://pubs.opengroup.org/onlinepubs/9699919799/">POSIX.1</a> has several APIs for doing that, including: <a
href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html">fork</a>+<a
href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/execve.html">execve</a> and <a
href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/posix_spawn.html">posix_spawn</a>. Starting a child process is not difficult, but ensuring that it behaves properly and that you get notified when it dies, that is difficult.</p><p><span
id="more-411"></span></p><h2>First, <tt>posix_spawn</tt></h2><p>I want to concentrate only on <tt>fork(2)</tt>+<tt>execve(2)</tt>, so let&#8217;s talk about the new API first. The <tt>posix_spawn</tt> API was first introduced in <a
href="http://en.wikipedia.org/wiki/POSIX#POSIX.1-2001">POSIX.1-2001</a>, derived from the earlier &#8220;1d&#8221; specification. If your Unix system is not POSIX.1-2001-compliant, you won&#8217;t have it. And even if you have a 2001- or 2008-compliant system, the specification still says today:</p><blockquote><p><strong><em>APPLICATION USAGE</em></strong></p><p>These functions are part of the Spawn option and need not be provided on all implementations.</p></blockquote><p>So if you want to write cross-platform code, you&#8217;re not going to depend on it. (Unless you need to use it for systems that don&#8217;t have <tt>fork()</tt>, like QNX Neutrino).</p><p>In any case, for my interest here &#8212; Linux &#8212; two other factors come into play:</p><ol><li>The kernel has no <tt>posix_spawn</tt> API, so the C library would need to implement it in userspace, using <tt>fork</tt> and <tt>execve</tt> anyway.</li><li>Glibc does not implement it today, it simply returns ENOSYS.</li></ol><p>With that in mind, let&#8217;s focus on the traditional API.</p><h2><tt>fork</tt> and <tt>execve</tt></h2><p>The traditional API for launching a process on Unix systems is to first fork your process and then replace it with another. This two-step process is extremely flexible and has allowed for many uses and abuses over the years. For example, before we had proper thread support, forking and communicating with the child forks was a common way of operating. In fact, even today the extremely popular <a
href="http://httpd.apache.org">Apache web server</a> continues to offer a module that handles requests on the time-proven <a
href="http://httpd.apache.org/docs/2.4/mod/prefork.html">non-threaded fork-based</a> implementation.</p><p>When the call to <tt>fork</tt> succeeds, execution will continue in two different processes: the parent and the child. The parent process receives the child process&#8217;s identifier (the PID) for later use, such as calling <tt>kill(2)</tt> or distinguishing between notifications from multiple children.</p><p>Usually, the child process will perform some cleanup and preparation before later calling <tt>execve</tt>. The POSIX API offers several variations in the <tt>exec</tt> family, but they all boil down to <tt>execve</tt>: the path to the executable is absolute, the arguments are in a vector and the environment to be passed down is known.</p><p>As I said in the introduction, so far so good. This is easy, flexible and proven by time. Yet it has some problems we&#8217;ll explore.</p><h2>First problem: inheriting file descriptors</h2><p>The POSIX specification for <a
href="http://pubs.opengroup.org/onlinepubs/9699919799/functions/execve.html"><tt>execve</tt></a> declares:</p><blockquote><p>File descriptors open in the calling process image shall remain open in the new process image, except for those whose close-on-exec flag FD_CLOEXEC is set.</p></blockquote><p>This allows the child process to inherit the standard streams of the C library &#8212; stdin, stdout and stderr. When you launch a process from a shell, for example, the process will inherit the connection to the terminal so you&#8217;ll get the output on your screen.</p><p>This feature is also what allows parent and child processes to communicate and redirections to happen. When you type in the terminal something like:</p><p><code>$ process &gt; output.log</code></p><p>What the shell is doing is making sure that the stdout stream is connected to the file <tt>output.log</tt> before it calls <tt>execve</tt>.</p><p>Yet the major flaw in this API is that the <tt>FD_CLOEXEC</tt> flag is not set by default. That means that at every point where you call a function that opens a file descriptor, you must remember to also make the file descriptor close-on-exec if you don&#8217;t want to leak it to the child process.</p><p>It was a major flaw in the 1970s when this API was designed, but not catastrophic. With proper care, one could make it work. And if a particular function was going to close the file descriptor anyway before any chance of forking, it did not have to bother.</p><p>It became a showstopper at the end of the 1990s when we got threads. That means that even careful code that sets the flag immediately upon opening the file descriptor, like the following, is not safe:</p><pre class="brush: cpp; title: ; notranslate">
    int fd = open(&quot;/dev/null&quot;, O_WRONLY);
    fcntl(fd, F_SETFD, FD_CLOEXEC);
</pre><p>Why isn&#8217;t it safe? Because another thread could call <tt>fork</tt> in-between the opening of the file descriptor and the setting of the <tt>FD_CLOEXEC</tt> attribute. That means that, despite the care taken in ensuring that the file descriptor doesn&#8217;t leak, it can still leak.</p><h2>First solution: add new APIs</h2><p>Recently, through the efforts of the former glibc maintainer Ulrich Drepper, we&#8217;ve got a few new system calls or modifications to the existing ones on Linux and on glibc that solve the problem above. All of the system calls on the Linux kernel that can create a file descriptor take an extra parameter that indicates whether the <tt>FD_CLOEXEC</tt> flag should be set upon creation. On a relatively recent glibc, the above source code becomes:</p><pre class="brush: cpp; title: ; notranslate">
    int fd = open(&quot;/dev/null&quot;, O_WRONLY | O_CLOEXEC);
</pre><p>And similarly for a call to <tt>socket</tt>:</p><pre class="brush: cpp; title: ; notranslate">
    int server_fd = socket(AF_INET, SOCK_STREAM | SOCK_CLOEXEC, IPPROTO_TCP);
</pre><p>For the system calls that created file descriptors but had no way of passing extra flags, a new system call was created, such as:</p><pre class="brush: cpp; title: ; notranslate">
    int pipe_fd[2];
    pipe2(pipe_fd, O_CLOEXEC);
    dup3(pipe_fd[0], STDIN_FILENO, O_CLOEXEC);
    accept4(server_fd, &amp;addr, &amp;addrlen, SOCK_CLOEXEC);
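 
    /* Sketch, not from the original post: the same flags argument also takes
       O_NONBLOCK; nb_pipe_fd is an illustrative name. */
    int nb_pipe_fd[2];
    pipe2(nb_pipe_fd, O_CLOEXEC | O_NONBLOCK);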
</pre><p>This also allows us to pass <tt>O_NONBLOCK</tt> or <tt>SOCK_NONBLOCK</tt>, which saves us another pair of system calls to set that flag.</p><p>The solution that Ulrich Drepper and the kernel community came up with is elegant and solves the race condition problem. I also made Qt use those system calls automatically a couple of releases ago and contributed a patch to Glib to do the same.</p><p>That part of the problem is solved, on Linux at least, and using a modern glibc or eglibc. Still, it&#8217;s not enough, as I&#8217;ll explore in my <a
href="http://www.macieira.org/blog/2012/07/forkfd-part-2-finding-out-that-a-child-process-exited-on-unix/">next blog</a>.</p> ]]></content:encoded> <wfw:commentRss>http://www.macieira.org/blog/2012/07/forkfd-part-1-launching-processes-on-unix/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>Continue using QPointer</title><link>http://www.macieira.org/blog/2012/07/continue-using-qpointer/</link> <comments>http://www.macieira.org/blog/2012/07/continue-using-qpointer/#comments</comments> <pubDate>Tue, 10 Jul 2012 16:13:36 +0000</pubDate> <dc:creator>Thiago Macieira</dc:creator> <category><![CDATA[C++]]></category> <category><![CDATA[Qt]]></category> <category><![CDATA[Uncategorized]]></category> <category><![CDATA[optimisation]]></category> <category><![CDATA[qt]]></category> <category><![CDATA[qt5]]></category> <category><![CDATA[smart pointers]]></category> <guid
isPermaLink="false">http://www.macieira.org/blog/?p=403</guid> <description><![CDATA[Early in the Qt 5 development cycle, we had made the decision to deprecate QPointer and replace it with the more modern QWeakPointer. That decision is now reversed, so please continue using QPointer where you were using them. Moreover, don&#8217;t use QWeakPointer except in conjunction with QSharedPointer. To understand the reason behind this back and &#8230;</p><p><a
class="more-link block-button" href="http://www.macieira.org/blog/2012/07/continue-using-qpointer/">Continue reading &#187;</a>]]></description> <content:encoded><![CDATA[<p>Early in the Qt 5 development cycle, we had made the decision to deprecate <a
href="http://qt-project.org/doc/qpointer.html"><tt>QPointer</tt></a> and replace it with the more modern <a
href="http://qt-project.org/doc/qweakpointer.html"><tt>QWeakPointer</tt></a>. That decision is now reversed, so please continue using <tt>QPointer</tt> where you were using it. Moreover, don&#8217;t use <tt>QWeakPointer</tt> except in conjunction with <a
href="http://qt-project.org/doc/qsharedpointer.html"><tt>QSharedPointer</tt></a>.</p><p>To understand the reason behind this back and forth, we need to go back a little in history.</p><p><span
id="more-403"></span></p><h2>Understanding <tt>QPointer</tt></h2><p>Almost 3 years ago, I wrote a <a
href="http://labs.qt.nokia.com/2009/08/25/count-with-me-how-many-smart-pointer-classes-does-qt-have/">blog about the smart pointer classes in Qt</a>. In that blog, I talked about how QPointer had some broken semantics and how it should be deprecated. I even mentioned that it was slow.</p><p>The reason it is slow in Qt 4 is its implementation. It&#8217;s a direct descendant of the Qt 3 <a
href="http://doc.trolltech.com/3.3/qguardedptr.html"><tt>QGuardedPtr</tt></a>. Whenever you created a <tt>QGuardedPtr</tt> for a given object, Qt would simply add the pointer and the guard to a global hash table. In Qt 4, the implementation is the same, except that now it requires locking a global mutex. (For those who don&#8217;t remember Qt 3, QObject could not be used outside the main thread there).</p><p>That means that creating a <tt>QPointer</tt> in Qt 4 requires locking a global mutex and inserting an item into a global hash, which may involve rehashing depending on how many items it holds. And when that <tt>QObject</tt> is destroyed, Qt needs to iterate over the hash and notify all the watchers that the object has died.</p><h2>Enter <tt>QWeakPointer</tt></h2><p>With Qt 4.5, I introduced <tt>QSharedPointer</tt> and <tt>QWeakPointer</tt>, which are a lot more efficient. The way that those two communicate is by way of a <em>&#8220;reference-counted reference-counter&#8221;</em>. That is, the private data common to those two classes contains basically two reference counters: the strong and the weak one. The strong counter tracks the lifetime of the pointer, where a value of 1 or higher indicates that the object is still alive, whereas a zero indicates that it&#8217;s deleted. The weak counter controls the lifetime of the private data itself.</p><p>In other words, the strong counter counts how many <tt>QSharedPointer</tt> objects are attached to this particular pointer, whereas the weak counter counts both <tt>QSharedPointer</tt> and <tt>QWeakPointer</tt> instances. So you see how the last <tt>QSharedPointer</tt> instance being destroyed causes the pointer to be deleted too.</p><p>With Qt 4.6, I added a feature that allowed one to use <tt>QWeakPointer</tt> to track <tt>QObject</tt> instances directly, without going through <tt>QSharedPointer</tt>. 
When trying to figure out how to optimise <tt>QPointer</tt>, I realised that the &#8220;reference-counted reference counter&#8221; is the best solution &#8212; actually, for tracking without <tt>QSharedPointer</tt>, a &#8220;reference-counted boolean&#8221; would be enough, but since we didn&#8217;t have QAtomicBool, a regular reference counter would do just fine.</p><p>The solution is simple: when you create the first instance of the guard, it allocates this private block and sets the strong counter to indicate that the object is alive. When the object is deleted, it simply sets the strong counter to zero. The use of the weak counter allows us to share the ownership of this private block, so it&#8217;s quite fast.</p><h2>And in Qt 5&#8230;</h2><p>When we started working on Qt 5, it was clear that the previous implementation of <tt>QPointer</tt> had to go. We don&#8217;t want to keep old cruft for another 3-6 years, or however long it&#8217;s going to take until we reach Qt 6. But we also promised to maintain source compatibility as much as possible with Qt 4, so we couldn&#8217;t just remove the class.</p><p>Instead, last November, one developer simply <a
href="https://codereview.qt-project.org/#change,9936">rewrote <tt>QPointer</tt> on top of <tt>QWeakPointer</tt></a> and deprecated it. By keeping the old API and coupling it with the new implementation, we suddenly had a fast <tt>QPointer</tt>. But the deprecation stayed, because we thought we had &#8220;too many smart pointer classes&#8221; (see the title of my blog).</p><p>In March, when we started to try and clean up the Qt code base of our own deprecated classes, we realised that <tt>QPointer</tt> is used just about <strong>everywhere</strong>. Since we had a properly fast implementation, there was actually no good reason to keep generating compiler warnings about the use of that class. That led to <a
href="https://codereview.qt-project.org/#change,20203"><tt>QPointer</tt> being un-deprecated</a>.</p><p>Finally, a month and a half ago, the other shoe dropped. The argument of &#8220;too many classes&#8221; is not valid if it trumps having a proper API and classes that work correctly. What we realised is that overloading <tt>QWeakPointer</tt>&#8217;s purpose was making its API worse instead, leading to doubts about when to use some of its member functions. Therefore, we instead deprecated <a
href="https://codereview.qt-project.org/#change,20203"><tt>QWeakPointer</tt> use without <tt>QSharedPointer</tt></a>.</p><h2>Conclusion</h2><p>That means you should continue using <tt>QPointer</tt> in your code. Unfortunately, if you&#8217;re still targeting Qt 4, it means your code will not be as fast as it could be, but at least it will be clean.</p><p>It also means the <a
href="http://qt-project.org/doc/qt-4.8/qweakpointer.html#data"><tt>QWeakPointer::data()</tt></a> member is deprecated and you should not use it.</p><p>Qt&#8217;s smart pointer classes are neatly grouped now as follows:</p><table
border="1"><tr><th>Group</th><th>Classes</th><th>Description</th></tr><tr><td><strong>Shared data</strong></td><td><a
href="http://qt-project.org/doc/qshareddatapointer.html"><tt>QSharedDataPointer</tt></a>, <a
href="http://qt-project.org/doc/qexplicitlyshareddatapointer.html"><tt>QExplicitlySharedDataPointer</tt></a></td><td>Sharing of <em>data</em> (not of <em>pointers</em>), implicitly and explicitly. Also known as &#8220;intrusive pointers&#8221;.</td></tr><tr><td><strong>Shared pointers</strong></td><td><a
href="http://qt-project.org/doc/qsharedpointer.html"><tt>QSharedPointer</tt></a>,<br
/> <a
href="http://qt-project.org/doc/qweakpointer.html"><tt>QWeakPointer</tt></a></td><td>Thread-safe sharing of <em>pointers</em>, like C++11&#8217;s <tt>std::shared_ptr</tt>, only with a nice Qt API</td></tr><tr><td><strong>Scoped pointers</strong></td><td><a
href="http://qt-project.org/doc/qscopedpointer.html"><tt>QScopedPointer</tt></a>,<br
/> <a
href="http://qt-project.org/doc/qscopedpointerarray.html"><tt>QScopedArrayPointer</tt></a></td><td>For <a
href="http://en.wikipedia.org/wiki/RAII">RAII</a> usage: takes ownership of a pointer and ensures it is properly deleted at the end of the scope. No sharing.</td></tr><tr><td><strong>Tracking <tt>QObjects</tt></strong></td><td><a
href="http://qt-project.org/doc/qpointer.html"><tt>QPointer</tt></a></td><td>Tracks the lifetime of a given <tt>QObject</tt> instance</td></tr></table> ]]></content:encoded> <wfw:commentRss>http://www.macieira.org/blog/2012/07/continue-using-qpointer/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>AVX-optimised raster painting for Windows too</title><link>http://www.macieira.org/blog/2012/06/avx-optimised-raster-painting-for-windows-too/</link> <comments>http://www.macieira.org/blog/2012/06/avx-optimised-raster-painting-for-windows-too/#comments</comments> <pubDate>Wed, 13 Jun 2012 07:19:49 +0000</pubDate> <dc:creator>Thiago Macieira</dc:creator> <category><![CDATA[C++]]></category> <category><![CDATA[Qt]]></category> <category><![CDATA[buildsystem]]></category> <category><![CDATA[optimisation]]></category> <category><![CDATA[qt]]></category> <category><![CDATA[qt5]]></category> <category><![CDATA[simd]]></category> <category><![CDATA[windows]]></category> <guid
isPermaLink="false">http://www.macieira.org/blog/?p=390</guid> <description><![CDATA[Yesterday, one of my contributions to Qt was merged which finally adds better support for optimised raster painting on Windows, with SSE2 and AVX instructions. This feature has long been present on the Unix systems, but it was somewhat lacking on Windows. If you&#8217;ve read my past blogs, you know I often talk about and &#8230;</p><p><a
class="more-link block-button" href="http://www.macieira.org/blog/2012/06/avx-optimised-raster-painting-for-windows-too/">Continue reading &#187;</a>]]></description> <content:encoded><![CDATA[<p>Yesterday, one of my <a
href="https://codereview.qt-project.org/#change,27720">contributions to Qt</a> was merged which finally adds better support for optimised raster painting on Windows, with SSE2 and AVX instructions. This feature has long been present on the Unix systems, but it was somewhat lacking on Windows.</p><p>If you&#8217;ve read my past blogs, you know I often talk about and work on <a
href="http://en.wikipedia.org/wiki/SIMD">Single Instruction Multiple Data</a> (SIMD) improvements. The idea is quite simple: if you have a lot of identical operations to do and your source data is independent from one another, you can execute all of those operations in parallel, improving the throughput (processors are optimised for loading chunks of memory of a certain size, so if we only use small quantities, we&#8217;ve wasted resources). In the past, I&#8217;ve mostly worked on SIMD for string operations, like comparison, searching, and conversion to and from <a
href="http://en.wikipedia.org/wiki/Latin1">Latin-1</a>. That&#8217;s sometimes unrewarding because strings are quite small, so we don&#8217;t get the full gain of the improved throughput.</p><p>But you might not know that SIMD in Qt actually started in the QtGui library, in the raster drawing code. There, the data sizes are often on the order of several kilobytes to multiple megabytes &#8212; a tiny 16&#215;16 icon has 256 pixels, each of which is 4 bytes wide, which adds up to 1&nbsp;kB; you reach 1&nbsp;MB at 512&#215;512. As you might gather, even copying such data blocks is a somewhat expensive operation. So it&#8217;s no wonder that the more common operations, such as compositing and alpha blending, needed optimisation. And I cannot claim credit for them: those were done by very talented hackers working at Trolltech back in the day.</p><p>My history with the drawables started about 6 months ago, during the last <em><a
href="http://en.wiktionary.org/wiki/romjul">romjul</a></em>, when I realised that the optimisations applied to the raster painting code could use some love. Back then, we were still mixing <a
href="http://en.wikipedia.org/wiki/MMX_(instruction_set)">MMX</a> code into the painting code, even when we reported we were using SSE. In fact, when Qt said it was enabling <a
href="http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions">SSE</a> (not <a
href="http://en.wikipedia.org/wiki/SSE2">SSE2</a>), it was actually just using some new instructions that came with SSE, but on the old MMX technology registers. My first action in that area for Qt 5.0 was to finally remove support for the old MMX-era optimisations, all of which only increased the code size in a Qt build but weren&#8217;t used anywhere. The next level of optimisations (SSE2 and above) overrode the older ones &#8212; remember that all 64-bit capable processors have SSE2 support.</p><p>Another thing I noticed back then was that we weren&#8217;t using the full extent of the optimisations possible. With GCC, we were forced to pass some extra compiler options so GCC would allow us to use some <a
href="http://en.wikipedia.org/wiki/Intrinsic_function">intrinsic functions</a> to execute SSE2 and <a
href="http://en.wikipedia.org/wiki/SSSE3">SSSE3</a>, but that was not the case for the <a
href="http://en.wikipedia.org/wiki/Visual_C++">Microsoft compiler</a>. In addition, the Windows configuration did not try to use the intrinsics to verify whether they were really available; it simply checked for the presence of the header that usually declares them. What&#8217;s more, those checks had not been updated for the SSSE3 optimisations that were done in 2010 in cooperation with <a
href="http://intel.com">Intel</a>, which meant that those optimisations were disabled on Windows.</p><p>On Unix, right after removing the old MMX-era code, I proceeded to a very quick and easy gain: add <a
href="http://en.wikipedia.org/wiki/Advanced_Vector_Extensions">AVX</a> support, the new generation of SIMD instructions from Intel. It was easy because I barely had to write code: if you compile SSE2-era code with GCC&#8217;s <tt>-msse2avx</tt> option (which is automatically enabled by <tt>-mavx</tt>), it will generate the code using the new AVX instructions. The advantage lies in the fact that the AVX instructions use a new encoding mechanism (called the <a
href="http://en.wikipedia.org/wiki/VEX_prefix">VEX prefix</a>) which specifies an additional register, allowing the compiler to use fewer instructions to accomplish the same goal. Using the expanded 256-bit registers will have to wait for <a
href="http://en.wikipedia.org/wiki/AVX2">AVX2</a>, coming next year.</p><p>Except that even this easy improvement had never come to Windows either. Until now.</p><p>To enable Windows support, I had to update the way that the configuration detected the capabilities of the compiler, which is what took most of my time: dealing with building on Windows and with the binary configure.exe is not exactly my <a
href="http://en.wiktionary.org/wiki/forte#Etymology_1">forte</a>. Now, like on Unix, the Windows configuration will ask the compiler to try and compile some code. The checks are now shared with Unix, so we have the full range of checks available: SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, and AVX2. Previously, the only one that remained after I removed the MMX-era checks was SSE2.</p><p>Another update I made was to tell the Microsoft compiler to improve the code it generates. Since it did not require special compiler options to enable its support for SSE2, no one had thought until now to pass it the <tt><a
href="http://msdn.microsoft.com/en-us/library/7t5yh4fd(v=vs.100).aspx">/arch:SSE2</a></tt> option. Like on Unix, we now pass this option to the compiler whenever we&#8217;re compiling code that uses SSE2 anyway, making the compiler use the extended instruction set for generic code, not just what we wrote with intrinsics. From there, adding support for <tt>/arch:AVX</tt> was trivial: if you have Microsoft Visual C++ 10.0 or higher (it comes with <a
href="http://en.wikipedia.org/wiki/Microsoft_Visual_Studio">Visual Studio 2010</a>), you also now get the AVX-era instructions and Qt will enable them at runtime if it detects that your processor has them.</p><p>I&#8217;m not done. I also have a couple of other quick wins in terms of performance, all by improving code generation. Those changes are a bit more complex than the previous ones and I haven&#8217;t cleaned them up properly after 6 months of rebasing. I hope to add them to Qt 5.1 soon after its branch opens.</p> ]]></content:encoded> <wfw:commentRss>http://www.macieira.org/blog/2012/06/avx-optimised-raster-painting-for-windows-too/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> <item><title>&#8220;Doesn&#8217;t work&#8221; doesn&#8217;t work</title><link>http://www.macieira.org/blog/2012/05/doesnt-work-doesnt-work/</link> <comments>http://www.macieira.org/blog/2012/05/doesnt-work-doesnt-work/#comments</comments> <pubDate>Mon, 28 May 2012 11:20:54 +0000</pubDate> <dc:creator>Thiago Macieira</dc:creator> <category><![CDATA[Programming]]></category> <category><![CDATA[irc]]></category> <category><![CDATA[open advice]]></category> <category><![CDATA[problem solving]]></category> <category><![CDATA[support]]></category> <guid
isPermaLink="false">http://www.macieira.org/blog/?p=385</guid> <description><![CDATA[Every now and again, someone posts on IRC or to a mailing list about an issue they&#8217;ve had and their description of the problem is that &#8220;it doesn&#8217;t work&#8221;. There&#8217;s nothing more annoying to the person giving help than to see that description&#8230; That happened to me twice again today, which is what prompted me &#8230;</p><p><a
class="more-link block-button" href="http://www.macieira.org/blog/2012/05/doesnt-work-doesnt-work/">Continue reading &#187;</a>]]></description> <content:encoded><![CDATA[<p>Every now and again, someone posts on IRC or to a mailing list about an issue they&#8217;ve had and their description of the problem is that &#8220;it doesn&#8217;t work&#8221;. There&#8217;s nothing more annoying to the person giving help than to see that description&#8230;</p><p>That happened to me twice again today, which is what prompted me to write this blog. At this point, I&#8217;d like to shamelessly plug my work on Lydia Pintscher&#8217;s <a
href="http://open-advice.org/">Open Advice book</a>: I wrote chapter 10 in that book, called &#8220;<em>The Art of Problem Solving</em>&#8221;. And if you haven&#8217;t read the book yet, or even skimmed through it, I recommend you do it. It&#8217;s full of great advice from experienced people, in many areas related to open source development, contribution, advocacy, or other forms of participation.</p><p>The first section of my chapter is called <em>Phrasing the question correctly</em>, where I wrote:</p><blockquote><p>The most useless problem statement that one can face is “it doesn’t work”, yet we seem to get it far too often. It is a true statement, as evidently something is off. Nevertheless, the phrasing does not provide any clue as to where to start looking for answers.</p></blockquote><p>The question is where we start off. In the context of asking for assistance on IRC, a mailing list, or a forum, it&#8217;s supposed to give the help-giver hints as to what is askew, so that they can begin forming theories as to the root causes of the problem and applying problem-solving techniques to confirm or deny them. But note how all the techniques rest on knowing what exactly is wrong.</p><p>I&#8217;m not saying I expect a full analysis of the situation by the original poster, just as I don&#8217;t expect a person who is not a health professional to do the same when going to the doctor&#8217;s. But a minimum of information is necessary. Imagine you were to go to the doctor and tell him or her that &#8220;I&#8217;m feeling ill&#8221;. What do you think the doctor will do with that information? So why do people think that &#8220;it doesn&#8217;t work&#8221; is enough information for an engineering help-giver?</p><p>At this point, you might say that &#8220;it&#8217;s just a conversation starter,&#8221; a way to break the ice and begin the discussion. 
And while I might be inclined to agree with you in a social context, in a live face-to-face discussion, I do not when it comes to interaction via the Internet. A proper problem statement definitely matters when the communication is not in real time: if it takes six hours for an answer to come, then the first usable theory won&#8217;t reach the poster until 12 hours after the first post.</p><p>But even in real-time communications it&#8217;s necessary, as more often than not, it&#8217;s a matter of attracting the attention of the help-givers. If I&#8217;m somewhat busy, you cannot expect me to spend precious minutes asking, &#8220;so, what exactly happened? what did you expect to happen?&#8221;</p><p>How would you know what to say in the original post, then? Here are some suggestions, some of which are, I hope, obvious:</p><ul><li>the description of what actually happened;</li><li>the description of what was supposed to happen;</li><li>the actions that you took that led up to the event;</li><li>a description of the environment, such as versions of the relevant programs and settings you changed;</li><li>the logs of any programs involved that might include relevant information;</li><li>if you&#8217;ve tested other conditions and whether they&#8217;ve failed or succeeded;</li><li>if it&#8217;s a recent issue, when it started happening and when the last time you noticed it not to happen was;</li><li>any theories you might have about the issue;</li><li>what you have already done, so far, to fix the issue;</li><li>what sources of information you&#8217;ve used to diagnose the problem.</li></ul><p>Try to provide as much information as possible, in a concise manner, appropriate to the medium. 
For example, on IRC, you cannot write a 20-line description of all the theories you may have, but it&#8217;s certainly doable to describe the event that led you to think, &#8220;hang on, this isn&#8217;t right,&#8221; and provide a link to further information such as a pastebin of the logs.</p><p>Some other quick advice:</p><ul><li>Use your brain! Exercise it: that&#8217;s how it develops. Read the logs that you&#8217;ve got, especially compiler error logs, and interpret them. Form your own theories and test them if you can, disproving some and proving others.</li><li>Don&#8217;t argue with the evidence. If the compiler tells you there&#8217;s an error, then you&#8217;ve got an error (the exception is when you suspect a compiler bug).</li><li>Do your homework: use Google and other search tools to find out more information. For example, if you&#8217;ve got an error message, search for that specific message. If you don&#8217;t do this, you may get the answer in the form of an <a
href="http://lmgtfy.com/">LMGTFY</a> link.</li><li>Use the appropriate channels: asking the wrong audience will not get you closer to the answer, but might raise your frustration and that of your audience, and may delay the process.</li><li>Know your tools, know how to use them. It might be acceptable for a newbie, student or hobbyist not to know them, but it&#8217;s not for a professional. Not only should you know how to use them, but also a little of how they work.</li><li>The first error is usually the most important one.</li><li>A warning is (often) not an error, but warnings weren&#8217;t meant to be ignored.</li></ul><p>I&#8217;m sure there is more advice that my readers can give. What else would you suggest as advice or as information a help-giver could use?</p> ]]></content:encoded> <wfw:commentRss>http://www.macieira.org/blog/2012/05/doesnt-work-doesnt-work/feed/</wfw:commentRss> <slash:comments>14</slash:comments> </item> <item><title>I&#8217;m going to Akademy and the Qt Contributor Summit</title><link>http://www.macieira.org/blog/2012/05/im-going-to-akademy-and-the-qt-contributor-summit/</link> <comments>http://www.macieira.org/blog/2012/05/im-going-to-akademy-and-the-qt-contributor-summit/#comments</comments> <pubDate>Fri, 18 May 2012 10:42:42 +0000</pubDate> <dc:creator>Thiago Macieira</dc:creator> <category><![CDATA[KDE]]></category> <category><![CDATA[Qt]]></category> <category><![CDATA[akademy]]></category> <category><![CDATA[berlin]]></category> <category><![CDATA[conferences]]></category> <category><![CDATA[qt]]></category> <category><![CDATA[qtcs]]></category> <category><![CDATA[tallinn]]></category> <guid
isPermaLink="false">http://www.macieira.org/blog/?p=379</guid> <description><![CDATA[Just a quick post so I can say I&#8217;m going to both events: Akademy 2012 and the Qt Contributors Summit 2012. I hope to see many of you there, and we have a lot to discuss and work on.]]></description> <content:encoded><![CDATA[<p>Just a quick post so I can say I&#8217;m going to both events: <a
href="http://akademy2012.kde.org">Akademy 2012</a> and the <a
href="http://qt-project.org/groups/qt-contributors-summit-2012/wiki">Qt Contributors Summit 2012</a>. I hope to see many of you there, and we have a lot to discuss and work on.</p><p><img
src="http://community.kde.org/images.community/0/03/Ak2012_imgoing2.png" alt="'I'm going to Akademy 2012'" /> <img
src="http://i.imgur.com/LYiEH.png" alt="" /></p> ]]></content:encoded> <wfw:commentRss>http://www.macieira.org/blog/2012/05/im-going-to-akademy-and-the-qt-contributor-summit/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item> </channel> </rss>