The two most common Solr performance blunders, and a rant about the dumbification of computer programming

Recently I’ve seen friends at work fall into a couple of well-worn traps,
and I wondered – why do these same simple but devastating problems keep
turning up again and again? The answer lies in deceptive software APIs,
and the solution I think may come from video games. But before explaining
all that, I want to just describe these two problems in a little detail.

Many programmers learn to deal with databases. Fewer work with full text
search indexes like Lucene. I think newcomers to Lucene often bring with
them the mental models acquired from databases, since they share many
similarities. Both implement an “atomic” transactional model in which
writes don’t become visible to readers until after a commit point is
passed. This is critical in a transaction processing environment to ensure
that two things succeed or fail together: for example, the concert ticket
doesn’t get allocated to you unless Ticketmaster confirms your credit card,
and your card doesn’t get charged unless you get the ticket.

Lucene implements this model, but it isn’t designed to support transaction
processing of that sort: it’s mostly optimized for batch updates and
superfast querying across large numbers of documents (Yes, Lucene experts,
I know about soft commits and near real-time, but that’s a story for
another evening). This design criterion led to tradeoffs that tend to make
committing expensive, very much more expensive than it is in a typical
database.

Programmers writing their “Hello World” Lucene program don’t particularly
need to be worrying about performance problems, but as soon as they start
using Lucene for its intended purpose – indexing and querying large amounts
of text – they do need to worry. Too often though it seems we fall into the
trap of committing after every insert, causing a dramatic fall-off in
indexing (and querying) performance, even to the extent of making a search
service non-responsive. A very common version of this problem manifests
itself in Solr, where you can see the dreaded “PERFORMANCE WARNING:
Overlapping onDeckSearchers” message in the logs.
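
To make the difference concrete, here is a minimal Java sketch of batching commits instead of committing after every insert. `IndexClient`, `CountingIndexClient` and `BatchingIndexer` are hypothetical names invented for this illustration; they are not part of Lucene, Solr or SolrJ, and the counting client only stands in for a real index so the number of commits can be seen.

```java
/** Hypothetical stand-in for a search index client; not a real Solr or Lucene API. */
interface IndexClient {
    void add(String doc);
    void commit();
}

/** Counts calls so the effect of a commit policy is visible. */
class CountingIndexClient implements IndexClient {
    int adds = 0;
    int commits = 0;
    public void add(String doc) { adds++; }
    public void commit() { commits++; }
}

/** Issues one commit per batchSize adds, plus a final one in finish(). */
class BatchingIndexer {
    private final IndexClient client;
    private final int batchSize;
    private int pending = 0;

    BatchingIndexer(IndexClient client, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.client = client;
        this.batchSize = batchSize;
    }

    void add(String doc) {
        client.add(doc);
        if (++pending >= batchSize) {
            client.commit();
            pending = 0;
        }
    }

    /** Commits whatever is left over after the last full batch. */
    void finish() {
        if (pending > 0) {
            client.commit();
            pending = 0;
        }
    }
}

class BatchCommitDemo {
    public static void main(String[] args) {
        CountingIndexClient client = new CountingIndexClient();
        BatchingIndexer indexer = new BatchingIndexer(client, 100);
        for (int i = 0; i < 1050; i++) {
            indexer.add("doc-" + i);
        }
        indexer.finish();
        // prints "1050 adds, 11 commits"
        System.out.println(client.adds + " adds, " + client.commits + " commits");
    }
}
```

Committing after every insert would have issued 1050 commits here instead of 11. In a real Solr deployment the same effect is usually achieved with server-side settings such as autoCommit, or the commitWithin parameter on updates, rather than with client-side counting like this.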

There are pounds and pounds of curative blog posts, wiki pages and Stack
Overflow answers explaining this problem, why it arises, and what to do
about it. And thanks to search technology, it’s pretty easy to find them if
you go looking. But an ounce of prevention would save a lot of headaches
here.

My number 2 most common Solr performance blunder is `fl=*`. Solr search
results include the values of fields stored in the index associated with
the result document. Typical search applications show a title, with a link
to a full version of the document, and possibly an associated contextual
‘snippet’. A few other fields like date, publisher or book title might be
included too. Such applications must store the full text of every document
in the index to make it available to the snippeting component (the
highlighter, as it’s called). However, if that text is *retrieved* as part
of the search result, and the documents are not tiny, this practice vastly
increases the amount of data that needs to be transferred, often leading to
a 10-100x slowdown in search performance. Programmers tend to do this
because it’s just easier to retrieve all the fields (that’s what `fl=*`
does) than to list explicitly only the fields required to display the
result.
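
As a rough illustration, the sketch below builds a Solr select URL that names the fields it wants rather than relying on `fl=*`. The `q` and `fl` parameters are standard Solr query parameters; the host, the core name `books`, the field names and the `SolrQueryUrl` helper class are assumptions made up for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper for this example; not part of Solr or SolrJ. */
class SolrQueryUrl {

    /** Builds a select URL that asks only for the named stored fields. */
    static String build(String baseUrl, String query, String... fields) {
        StringBuilder url = new StringBuilder(baseUrl);
        url.append("/select?q=").append(encode(query));
        if (fields.length > 0) {
            // fl takes a comma-separated list of field names
            url.append("&fl=").append(encode(String.join(",", fields)));
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // Assumed host, core and field names, for illustration only
        System.out.println(
            build("http://localhost:8983/solr/books", "title:lucene", "id", "title", "date"));
    }
}
```

The large full-text field is simply left out of the list, so it stays in the index for the highlighter but is not shipped back with every result.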

No decent Solr tutorial will lead a programmer to do this, and again, there
is plenty of good information explaining how to select fields, but the Solr
default is to select all fields, so this is a very easy trap to fall into.
And it becomes even easier not to notice that this is happening when your
queries are mediated by a middle tier.

I maintain an API for searching across multiple different backend data
storage and indexing systems, and in that API we once defaulted to
returning full document results. I believe our thinking was that beginners
would have everything they need and not stumble over having to learn too
much of a complex API even to get started. But I got tired of leaning over
people’s chairs with a knowing grin and pointing out their n00b mistake.
It really wasn’t their fault, anyway (it was mine.) They were just doing
the natural thing, following the most straightforward path the API made
available. So I changed the default, and I think the people I work with
must be really smart since they seemed to be able to figure out how to get
the missing field values when they really did need them. Even if they had
to ask about that, it was much better to field a question like “how do I
get the full text of a document in the search result?” than a question like
“why do my search results sometimes take 10 seconds to come back?” because
often a situation like that would be intermittent, only arising when really
large documents made it into the search results, and could persist for a
long time with merely mediocre performance before anybody really took note
of the egregious outliers.

Apparently I’m not the only one making this
mistake. [Safari Flow](http://safariflow.com) uses
[Haystack](http://haystacksearch.org) as its internal search API. Its
welcoming words – I just stumbled on this while writing this sentence –
are “Search doesn’t have to be hard.” We recently found out that Haystack, by
way of its defaults, encourages users to make exactly these mistakes. The
Solr connector automatically commits after every insert, and I couldn’t
even find any way to limit the set of stored fields returned. In both
cases we had to fix these serious performance problems by editing
Haystack’s Solr “backend” connector source code (in spite of its promise
that “\[Haystack\] plays nicely with third-party apps without needing to
modify the source…”).

OK I’m a bit peeved about Haystack right now, but I truly hope the
maintainers will read this and take it as constructive criticism, because
their library actually does provide a lot of convenience to Django
framework programmers grappling with search. Here’s my advice.

There is a notion gaining currency that programming computers is becoming
easier. Sites like Codecademy teach JavaScript using a glossy, game-like
spoon-fed interface. “Learn Python The Hard Way” presents Python (and other
languages, in spinoff titles) using a baby-step scaffolded teaching
approach (the only hard thing about it for me was sticking with it – to be
fair, it acknowledges on page 1 that it wasn’t designed for impatient
smartypants). There are many other “learn to code in 5 minutes” websites
and courses that offer an easy path to software mastery.

This conceit that programming can be easy is partially fueled by the
development of software languages and tools. It *is* easier to incorporate
other people’s code now using readily available libraries and frameworks,
and to make use of existing systems, so not every program has to start as a
*tabula rasa*. It is *not* necessary to understand computer architecture in
a deep way in order to write much useful and/or entertaining code now. In
some ways, things have gotten easier.

There is also a cultural component to this new easy-going attitude: it’s a
deliberate effort to be more inclusive, to shed the high-priest hacker
snobbery that has been the stock-in-trade of software gurus for thirty
years and more. “RTFM” with its veiled obscenity was always a little rude,
even when uttered in jest, where its moral successor, “lmgtfy,” is simply
peevish, but they reflect the same unpleasant underlying attitude of
condescension. I’m glad to see some reflection on that negative side of
hacker culture and the corresponding openness to newcomers.

The positive side of the “learn to program” movement is that there are
numerous ways to contribute without being a master. More than ever it is
possible to go very far with very little. Silicon Valley startups no longer
sweat hardware: they just rent space with Amazon. This is healthy: it means
that the culture as a whole is able to learn and grow, to stand on the
shoulders of the previous generation rather than their faces.

I’m sure you saw this coming: yes, Virginia, there is a dark side. The thing
is, the obnoxious attitude grows out of a hard reality. Expert programming
requires knowledge. Mental nimbleness and a problem-solving bent count for
a lot, but true mastery of any craft, including programming, is only
available to those willing and able to devote years of study, trial, error
and correction. And there are still problems to solve that demand mastery,
where beginners should be cautioned to tread lightly.

So let’s stop saying that search can be easy. We do learners a real
disservice by pretending that things are going to be easier than they
are. There are complex problems in search, getting them wrong can kill
performance (*i.e.* your web site), and our role as guides should be to
offer paths to learning that have the right degree of steepness, and to
offer warnings about potential pitfalls. If we take you on a
mountain-climbing journey, and just tell you everything will be taken care
of and there’s nothing to worry about, we’re leading you into a potentially
dangerous situation without any preparation: in that setting, this kind of
attitude would be criminal. Tell people to bring their helmets, and teach
them how to self-arrest! Wat? Metaphor getting out of hand …

At the same time, we don’t want to scare people away. There’s no call to be
going all high-priest-in-the-inner-sanctum with acolytes getting in only
after years of fasting and prayer. Here’s where I think we can take a cue
from video games. I read [this post](http://robotinvader.com/blog/?p=402)
about luring gamers into playing the video game Devil May Cry. It talks
about a game that is notoriously difficult, but also offers an easy way
out. The interesting thing is that it challenges the player to try the hard
way first, and warns them that there will be no way back if the easy path
is chosen. Nethack, an insanely arcane video game, does a similar thing
with its wizard mode: players can use it to try out all kinds of stuff,
without dying, but it comes with a caveat: this is not for real, and your
scores won’t be reported.

Posted in All.

Zigzag wrapup

Phew – the polyurethane was drying on my last Zigzag chair while I wrote this post. It’s been a fascinating process, and I’ve learned a lot. Some questions got answered: the chair does not collapse under you when you sit on it, as numerous testers have confirmed, although it does bounce in a way that can be disconcerting if you are expecting it to give way at any moment. I still wondered how the idea for this form arose, and why it had to wait until the Twentieth Century.

IMG_0452

My initial thought was that something about the need for bolts to sustain the chair’s joints made it difficult to achieve in older times. However, screws are ancient, were used in furniture production as early as the fifteenth century, and began to be mass produced in the nineteenth century in much the same form as they are today (see http://cool.conservation-us.org/coolaic/sg/wag/Am_Wood_Screws.pdf). Nuts and bolts developed along similar lines, driven by the carriage industry, and were widely used in furniture, especially beds, as early as the eighteenth century (when they were hand-forged). In fact nothing obvious about the construction of the chair itself would have made it impossible to achieve in an earlier time. However it certainly would have been more difficult. Highly-precise joinery that is more or less routine today and achievable by workers with a modicum of experience using power tools would have presented a daunting challenge to the master craftsmen of the past working only with hand tools.

The Zigzag exemplified the sleek design aesthetic of the machine age (in particular the De Stijl movement) with its minimal form, simple materials and lack of ornamentation. In previous generations, one could have achieved something similar, but there was simply no reason to: one imagines the idea would have been rejected as a bizarre freakish malformed thing. Acutely angled weight-bearing joints would have been difficult to achieve without machine tools; there was no call for them; they were simply not a part of the design vocabulary.

The chair is the most humane piece of furniture. Its job is to support our bodies, but in its simplest form a chair is a rigid unchanging object, while we remain free to fold and unfold, and we come in many sizes and shapes. Every chair embodies a series of compromises: the height of the seat, the angle of the back, the flexibility of the materials. These, and even carven ornamentations and surface finishes, influence the fitness of a chair for a certain person or activity. The demands placed on a chair by their very nature fuse its appearance and its function into a single inseparable shape, which is why a designer’s chairs often exemplify their aesthetic more than any other work.

I find the Zigzag attractively arresting, and I think it will make an excellent chair for working at a desk and for dining, if not for napping. My only regret is not to have made at least a few of them higher, since I like to work at a high drafting chair. A slightly odd feature of the chair that takes some getting used to is its tendency to flex *forwards*, but as my father says, nobody has been pitched across the table yet.

OK I finished the chairs before I finished the blog post; here you see them installed in my dining room.

set of 6 in situ

For comparison, here are some photos of an example from Rietveld’s studio produced in 1938:

1938 example from Rietveld’s studio, from the collection of the Carnegie-Mellon Museum of Art, photo: Lauren Hammer

zzorig2

Note the bolt-holes in the seat/back joint. I didn’t find it necessary to bolt this joint since the dovetails seem to be strong enough. On the other hand it took me many weekends over the course of four months to make eight chairs.

zzbolts

Also note that the screws are simply exposed.

zzbolts2

And some outtakes from the production. I ended up spending an inordinate amount of time finishing off the plywood edges, which is the process depicted below. Every chair required 8 pieces of trim that had to be fitted and glued. Because I made these from a standard 3/4-inch fir plank, the thickness matched the plywood’s exactly. Unfortunately this meant a very small tolerance for error during the gluing-up, and when it went wrong, the risk of sanding through the very thin plywood veneer was exacerbated. It would save a lot of time and effort to use solid wood. Still, the nicely-veneered fine-grained plywood surface is attractive and, to my eye, enhances the modernist feel. Then again, blue paint might be a nice way to go too!

IMG_0416

IMG_0420

IMG_0423

IMG_0424

IMG_0426

IMG_0430

IMG_0432

Posted in All, wood.

ZigZag fanny test #1

It’s been a while since my last update; I’ve been traveling, and I hit some snags in the production line. It turned out that the clever joint I created just wasn’t going to work, and I decided I really did need to bolt the acutely-angled zigzag joints, as shown in the first fully assembled chair below.

the first chair assembled dry

In this view the pieces have been trimmed to their final dimensions. Here we can see the chair’s graceful proportions. It spreads out towards the knee and gathers itself in at the waist and heel.

fully assembled and glued, but still unfinished

Unfortunately, there is just a bit too much spring in the joints, and they creak ominously. I’ll need to brace the zigzags a little more stiffly by increasing the size of the triangle in the joints.

Posted in All.

All seated

Finished up the joints for the remaining seats this weekend. Ellen didn’t think I was giving an honest portrayal of the working conditions in my shop, so here’s a full view, showing all eight seats in situ.

what a mess

I’m relieved to have finished these joints, probably one of the more exacting tasks in this project. The joints came out pretty well, with only some small gaps (< 1/32″) visible on the exterior. I should be able to close most of those up with a little more filing and chiseling, but for now I want to move on to something else. Next up: I have a mind to try a pair of sliding dovetails inside the acute bends.

Posted in All.

first few joints

I forgot to post these awesome photos of the zigzag bevels from a few weeks back. It was a joy cutting them with a new supersmooth sawblade with lots of teeth.

tablesaw setup for cutting zigzag bevels

The clamps hold the piece tight to a board that slides along the saw’s fence. Without this only a tiny edge would be riding on the surface of the saw. The orange plastic thing holds the piece being cut securely against that arrangement, helping to make sure the cuts come out straight and even.

all 22º zigzag bevels cut

The setup I made for cutting the dovetail coves wasn’t enough to cut the needed space in a single cut. Now I need to do another pass over them in order to get them deep enough and to cut the insides at the 8 degree angle needed to lean the backs.

IMG_0333

Notice how the board clamped to the front side is angled. The router base will rest on it and cut at the same angle inside the joint.

router with fence

finishing up the dovetail coves

The router is an awesome, powerful tool that can make cove cuts that can’t be done with a saw. In the old days, you would use some combination of saw & drill (auger, gimlet) to get rid of as much stock as possible and then finish up the edges with a chisel. The router gets much closer to the edges of the joint, but it still leaves rounded corners. For the final cleanup and fitting, we need the chisel and file.

hand tools

The first of the tails, marked up and ready to be cut. I tried doing this free-hand with the router, but discovered it was too hard to cut a clean, straight line. For the rest, I’ll use the saw to cut along the marked lines before cleaning out the waste with the router. This will also help prevent me from ripping the veneer facing off a large section of the plywood.

marking the tails

Here you can see the surface torn out from one of the tails cut before I outlined the tails with the handsaw. Thankfully it’ll be under the seat.

dovetail back side showing grain tearout (oops!)

This one’s neater.

dovetail back side

Our first seats! It’s so satisfying when the joints come together.

first seats joined

Posted in All, wood.

half-blind compound dovetail

Just a quick update on my progress; I bought myself a new 1/2″ dovetail router bit ($20), and got to work cutting the joints. Almost half the time was spent planning, setting up, and cutting tests. The first picture below shows the setup:

Setup for cutting dovetails

In the next picture you can see the results of a few hours at the router table cutting the half-blind dovetail coves in the backs. Notice how the coves don’t pass all the way through the boards: that’s the half-blind part. Initially I had hoped to place the cuts in identical locations on all the backs so that the tail half of the joints could be cut all in one pass, across all the seats at once. Alas, I don’t think that’s to be. There was just too much variability in my setup, even with the nifty jig.

stack of chair backs with dovetail coves cut

Next I’ll need to clean up the backs of the coves to make them square with the surface, and chisel out the rounded corners. Then bevel the ends (8 degrees, remember?), cut the tails, and of course the joints will slide together neatly on the first try!

Posted in All.

zigzag chairs

Zig-Zag chair

example of a Zig-Zag chair

Some friends (well, one friend) expressed some interest in the chair project I crowed about on facebook recently. So I decided to document the project here. The idea of making a Rietveld Zig-Zag chair was my father’s: he conceived a desire for one, and bought some books and plans. I think he was attracted by the idea that he could make it himself, but eventually abandoned that plan, and sent it to me. In fact the design is deceptively simple: it was described by Dutch designer Gerrit Rietveld in 1934 as a “plane zigzagging through space.” My father wanted a single chair. I looked at the plans and photos, and I realized this would be more than an afternoon’s work. An interesting project, but I had a lot of interesting projects at the time, so I put it away.

When I finally came back to the project a few weeks ago, and looked more seriously at the design, I realized that making just a single chair of this type would involve a great deal of wasted effort. The chair’s construction involves several tricky features that belie its simple appearance. Setting up the cutting jigs for these angled cuts and joinery takes time, exacting measurement and careful planning, but the cutting itself is mere rote work: satisfying, easy work on a modern machine.

Furthermore, it was clear from looking at them that Rietveld’s designs were always intended to be mass-produced, and a little reading lends some credibility to this as an accepted historical idea. It is tempting to read the De Stijl movement’s obsession with right angles and rectangles as an embrace of industrial methods. Of course it’s difficult to know: there was a natural reaction against the earlier prevailing Impressionist style, and it may simply be that Rietveld saw an opportunity to capitalize on the style’s suitability for factory manufacturing techniques.

At any rate, I determined to make a set of eight. In this way I could achieve economy of scale by reusing the various table saw and other cutting jig setups I would need to make.

Chief among the difficulties with this chair are the two acutely-angled joints connecting the diagonal member, the “leg” if you will, with the seat and with the base. Usually in finer woodworking one joins pieces without nails, bolts or other hardware, ideally holding them together without even glue. But there really is no classical all-wood joint for the zigzag chair’s acute angles, at least not anything that would support a person’s weight. Classic zigzag chairs use bolts to hold together the two acutely-angled joints. Bolts! I never dreamed of holding together furniture with bolts. I considered using bolts and hiding them, but I don’t particularly like the idea of this kind of subterfuge. At any rate it would be impossible to truly hide the necessary plugs since their grain wouldn’t match. Still we might come up with some other solution.

The other challenging piece of joinery comes between the seat and back. Dovetail joints are called for here: these are the strongest joint for a right angle between boards. Cutting dovetails requires a certain finesse and careful craft with hand tools, or specialized cutting elements for machine tools. I resolved to acquire a dovetail bit for my router that would make it easier to cut the 40-odd tails in these eight chairs. But the back is not at a right angle to the seat. It reclines by 8°, accommodating the human form in a necessary compromise with the purity of the De Stijl movement’s Cubist-informed aesthetic. The resulting joints will have to be cut at an angle 8° from square, forming a mind-bending three dimensional puzzle that I struggle to visualize even with the aid of drawings.

Finally, I complicated the picture a bit further by the use of plywood. Plywood is not often used in fine furniture, but I felt that it was an attractive choice for this project for several reasons. For one, fine plywoods are available (not at all like the construction-grade kind you see at home improvement megastores), and these offer a surface with a large swath of continuous grain. To achieve a similar effect with solid wood would require very wide boards that are not easy to find. Also, I wanted a fine-grained appearance that would require quarter-sawn lumber, a wasteful luxurious product that is almost vanished in this time of scarcer forest resources. I simply felt the idea of using an economical material like plywood would fit better with the industrial aesthetic of the chair design. And finally manufactured boards (plywood) are greener: they make the most efficient use of trees since they are able to use junkier wood on the inside, and save the good pieces for the surface.

The problem with plywood is that its end grain is really unattractive, and tends to get uglier still as it loses little chips during the construction process. The usual technique for dealing with this is to cover the ends with thin strips of solid wood, or veneer, but there was one place on the chair where this technique would not work: the back/seat joint. So I’ve resolved to further complicate the construction of that joint by making it half-blind: the ends of the seat tails will be buried in the interior of the joint, allowing the exterior surface of the back to descend uninterrupted to the outside of the angle with the seat. We’ll see how I manage that as the work unfolds!

Next time: half-blind compound dovetail

Posted in All.

Dumping register.com

I recently registered luxdb.org with register.com. Why? It was cheap. Lesson learned (again) – there’s always a reason for cheap. I eventually decided to consolidate this domain with some others I have registered at dyn.com (an excellent provider). When I tried to transfer, I found out you need an authorization code to effect the transfer. Makes sense – keeps people from stealing your valuable domain. The thing is, register.com has instituted a 4-5 day delay before they will send you the code, and even then, they may decide not to send it to you (if they believe there has been “suspicious” activity related to your domain). On the face of it, there might be some idea that they are acting on the customer’s behalf, but based on a cursory read of various angry posters to online forums, and on my experience, it seems pretty clear they are abusing this policy in order to try to prevent customers from leaving.

When I requested the auth code and saw the notice of the 4-5 day delay, I called customer service. I had to hold for 10 minutes, and then for another 5 minutes after I demanded the auth code from customer service (I was a bit peeved at this point), but eventually he did come through with the code, no questions asked. OK, fair enough – it was a little annoying, but I got what I wanted, gave the code to dyn.com and initiated the transfer. That completed after 3 days or so. The funny part is that the very next day I received this e-mail from register.com:

Dear Michael Sokolov,

You recently requested an auth code to transfer your “luxdb.org” domain name.
Your request has been processed and at this time it has been declined due to recent suspicious activity in your account.

Register.com is committed to providing the most secure and reliable domain services for our customers.
We have implemented specific security measures to help prevent unauthorized transfer of domains to another registrar.
The type of suspicious activity that could have caused your request to be declined includes:
- Multiple failed attempts to login to customer’s account
- Recent changes to the account holder’s name, email address, or login ID
- Attempts to access the account over the phone without authorization
- Recent changes to the accounts password
- Domain name lock not removed
- Recent changes to billing or credit card information

To receive your auth code, please call one of our customer service consultants at 1.888.734.4783. They will confirm you to the account and then fulfill your request.

Thank You,

Sandy Ross
EVP, Customer Service
Register.com

Here is my reply:

I hate to tell you this (actually it gives me a dirty pleasure), but you guys are morons. Your policy of delay and obfuscation is insulting and is probably losing you many more customers than it retains. Or maybe you’d rather be the company that serves the feeble and ill-informed. Anyway the transfer is already complete, so not only is this message an insulting lie, but it is completely null and ineffectual.

Very sincerely yours, a happily departed customer,

Mike Sokolov

Posted in All.

Multithreaded testing with JUnit

This post demonstrates an easy way to test the effect of multiple concurrent threads running your software, using your existing unit tests.

Have you ever wondered whether your software is really thread safe? Whether it will stand up to the punishment of thousands of concurrent users when your site (or app) goes ballistic? Multi-threaded programming is notoriously difficult, and running software in an environment (like a web application server) that spawns multiple threads can often expose architectural problems that don’t arise in typical test scenarios. Running a set of tests in parallel is a good way to gain confidence that your code is thread safe.

Another benefit of running multiple tests at once is: they run faster! All modern computers have multiple cores, and many have multiple CPUs: when we run our tests single-threaded, we aren’t making use of all of that latent power we have just lying around.

Let’s say you have a test class called MyTestClass, and it defines a number of tests. Using the test runner we provide, you can run all of its tests in parallel by adding a single annotation (the standard org.junit.runner.RunWith annotation that comes with JUnit) to your class:

@RunWith (MultiThreadedRunner.class)

This runner plugs in to the JUnit framework by subclassing BlockJUnit4ClassRunner; this is the class that usually runs all tests from a test class. The runChild() method is called for each test that is run; we take that over and arrange for each test to run in its own thread. We also want to ensure that not too many threads run at once: each thread consumes memory, and, depending on the size of the test class, we may end up running hundreds of threads at once if we’re not careful, and run out of memory. Here is the code for runChild, which simply waits until there are fewer than maxThreads tests running, and then creates a Runnable called Test which actually runs the test:

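    @Override
    protected void runChild(final FrameworkMethod method, final RunNotifier notifier) {
        // Wait for a free slot so that at most maxThreads tests run at once
        while (numThreads.get() >= maxThreads) {
            try {
                Thread.sleep(25);
            } catch (InterruptedException e) {
                System.err.println ("Interrupted: " + method.getName());
                e.printStackTrace();
                Thread.currentThread().interrupt(); // keep the interrupt visible to callers
                return; // The user may have interrupted us; this won't happen normally
            }
        }
        // Each test gets its own thread; Test does the thread-count bookkeeping
        new Thread (new Test(method, notifier)).start();
    }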
 

Note that we keep track of the number of threads in a variable called numThreads. That is an AtomicInteger, which is a thread-safe primitive built into the standard JRE. We use it to ensure that the thread count isn’t updated simultaneously in two threads. Here is the core of the code for the Test class:

        public void run () {
            numThreads.incrementAndGet();
            try {
                MultiThreadedRunner.super.runChild(method, notifier);
            } finally {
                // always release the slot, even if running the test throws
                numThreads.decrementAndGet();
            }
        }

All this does is keep track of the number of running threads. I haven’t shown the constructor and members used to track the test method and notifier, but as you can imagine, that is just straightforward copying of variables.

The only slight complication with using the code as shown so far is that JUnit will finish before all the tests do. It’s necessary to make the “outer loop” in the test runner wait until the last test has completed before it exits. Usually this happens implicitly, because tests are all run in the same thread, but now, when the runner starts a test, it returns immediately, while the test is still running. To solve this problem, we need to override the childrenInvoker method.


    protected Statement childrenInvoker(final RunNotifier notifier) {
        return new Statement() {
            public void evaluate() throws Throwable {
                MultiThreadedRunner.super.childrenInvoker(notifier).evaluate();
                // wait for all child threads (tests) to complete
                while (numThreads.get() > 0) {
                    Thread.sleep(25);
                }
            }
        };
    }

Couldn’t be simpler: just call the super method to do all the real work, and then wait until there are no more test threads running before returning. Does anybody see the potential race condition here? It’s possible that when the last test is run, childrenInvoker will return and test numThreads before that last test has a chance to increment it. In practice this doesn’t seem to happen since there will generally be several threads running already when the last test is started, but just to be safe, it is better to increment numThreads in the main runner thread, just before calling Thread.start(), and then to decrement it in the child thread, just before exiting. The attached file has that change.
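
For readers without the attachment to hand, here is a stand-alone sketch of that counting scheme. It is not the attached file and does not depend on JUnit: `BoundedLauncher`, `launch`, `awaitAll` and `peakThreads` are names invented for this illustration, and plain Runnables stand in for tests. It assumes, as in the runner, that a single thread does all the launching.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: bounded thread launching with a race-free running count. */
class BoundedLauncher {
    private final AtomicInteger numThreads = new AtomicInteger(0);
    private final AtomicInteger peak = new AtomicInteger(0);
    private final int maxThreads;

    BoundedLauncher(int maxThreads) {
        this.maxThreads = maxThreads;
    }

    /** Waits for a free slot, then runs the task in a new thread. */
    void launch(final Runnable task) {
        while (numThreads.get() >= maxThreads) {
            pause();
        }
        // Increment in the launching thread, before start(), so awaitAll()
        // can never see zero while a task is still about to begin.
        int running = numThreads.incrementAndGet();
        peak.accumulateAndGet(running, Math::max);
        new Thread(new Runnable() {
            public void run() {
                try {
                    task.run();
                } finally {
                    numThreads.decrementAndGet(); // runs even if the task throws
                }
            }
        }).start();
    }

    /** Blocks until every launched task has finished. */
    void awaitAll() {
        while (numThreads.get() > 0) {
            pause();
        }
    }

    /** Highest number of tasks that were counted as running at once. */
    int peakThreads() {
        return peak.get();
    }

    private static void pause() {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }
}

class BoundedLauncherDemo {
    public static void main(String[] args) {
        final AtomicInteger done = new AtomicInteger(0);
        BoundedLauncher launcher = new BoundedLauncher(4);
        for (int i = 0; i < 20; i++) {
            launcher.launch(new Runnable() {
                public void run() {
                    done.incrementAndGet();
                }
            });
        }
        launcher.awaitAll();
        System.out.println(done.get() + " tasks finished, peak " + launcher.peakThreads());
    }
}
```

Because only the launching thread ever increments the count, the check against maxThreads and the increment that follows cannot be interleaved with another increment, so the limit holds without any extra locking.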

Download the source code here:
MultiThreadedRunner

Note that this code is distributed under the Mozilla Public License (2.0), which basically says you can use this code freely, embed it in your software, and even redistribute it, as long as it retains its license and attribution (includes the comments it has now saying who wrote it), and as long as any changes you make to the software are distributed under the same terms: also I encourage you to post any changes you make here so I can incorporate them.

Posted in All.

linux firefox ipv6 snafu: FIXED!

I post this in the hope that it will save someone the headaches I suffered:

I found that certain web sites were unusable (super slow – minutes to load) in firefox on linux, although retrieving pages from the command line w/say wget worked just fine – python.org, for example, and the firefox addons site. It turns out that DNS records for these sites include ipv6 addresses, and I guess we can’t route ipv6 packets properly? I was able to fix the problem by disabling ipv6 on my box:

in /etc/modprobe.d/blacklist I added
blacklist ipv6

Posted in All.