Do you see the cat?
Author Archives: Mike Sokolov
My Long Trail
I began to realize how excited I was only a couple of days before I left. I could hardly sleep for thinking about my upcoming trip. Meal plans and other logistical details kept spinning through my brain; the last I checked the clock was 3:55 AM. This is actually good I think.
I’d been casting around for some meaningful way to spend the 5-week early summer sabbatical time I had blocked out months before thanks to a new company policy (thank you Safari!). I considered launching an effort to climb state high points, and did a warm-up climb of Mt Greylock, the highest point in Massachusetts, but something about the plan never quite sat right with me. It would provide an excuse to go to Hawaii, sure, but would also require me to go to some only mildly diverting places, like someone’s back yard in Rhode Island, and Hoosier Hill in Indiana. Even the more dramatic high points are often not among the more interesting hiking trips in their area, and are usually beset with crowds seeking the obvious. I knew this from previous climbs of Washington in New Hampshire, and Whitney in California. And then there was Denali: I wondered just how much time, effort and resources I would want to devote to this program, which couldn’t possibly be completed in five weeks.
Then some plans I had made for the early part of my time off fell through, and I realized I might just be able to squeeze in an end-to-end Long Trail trip. I spent four happy summers at camp in Southern Vermont, and some of the most memorable times I had were backpacking trips on and around the LT, a trail that stretches the length of Vermont, passing over the main ridge of the Green Mountains on the way. The idea of hiking it end to end was always in the air on these trips, and it has stayed with me for more than 30 years now.
The LT is the spiritual ancestor of the more well-known Appalachian Trail, and coincides with it for 100 miles or so. The distance, about 270 miles, is a great deal more manageable than the AT’s 2200-odd, but the walks both seem to promise a similar kind of communitarian ascetic experience. A big part of the draw for me was to pare down my life to bare essentials; to focus my attention on basic issues of transport, shelter and survival. I hoped to gain some perspective on my everyday life through a period of detachment, and I felt that an extended hiking trip would challenge me while maintaining my attention.
This is the story of my end-to-end Long Trail journey.
The two most common Solr performance blunders, and a rant about the dumbification of computer programming
Recently I’ve seen friends at work fall into a couple of well-worn traps, and I wondered – why do these same simple but devastating problems keep turning up again and again? The answer lies in deceptive software APIs, and the solution, I think, may come from video games. But before explaining all that, I want to just describe these two problems in a little detail.
Many programmers learn to deal with databases. Fewer work with full text search indexes like Lucene. I think newcomers to Lucene often bring with them the mental models acquired from databases, since the two share many similarities. Both implement an “atomic” transactional model in which writes don’t become visible to readers until after a commit point is passed. This is critical in a transaction processing environment to ensure that two things succeed or fail together: for example, the concert ticket doesn’t get allocated to you unless Ticketmaster confirms your credit card, and your card doesn’t get charged unless you get the ticket.
Lucene implements this model, but it isn’t designed to support transaction processing of that sort: it’s mostly optimized for batch updates and superfast querying across large numbers of documents (Yes, Lucene experts, I know about soft commits and near real-time, but that’s a story for another evening). This design criterion led to tradeoffs that tend to make committing expensive, very much more expensive than it is in a typical database.
Programmers writing their “Hello World” Lucene program don’t particularly need to be worrying about performance problems, but as soon as they start using Lucene for its intended purpose – indexing and querying large amounts of text – they do need to worry. Too often though it seems we fall into the trap of committing after every insert, causing a dramatic fall-off in indexing (and querying) performance, even to the extent of making a search service non-responsive. A very common version of this problem manifests itself in Solr, where you can see the dreaded “PERFORMANCE WARNING: Overlapping onDeckSearchers” message in the logs.
There are pounds and pounds of curative blog posts, wiki pages and Stack Overflow answers explaining this problem, why it arises, and what to do about it. And thanks to search technology, it’s pretty easy to find them if you go looking. But an ounce of prevention would save a lot of headaches here.
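To make the ounce of prevention concrete, here is a minimal Python sketch of the pattern: add documents in batches and commit once at the end, rather than after every insert. It is not tied to any real client library. `CountingIndex` is a hypothetical stand-in that only counts calls, and `batches` and `index_all` are names invented for illustration.

```python
from typing import Iterable, Iterator, List


class CountingIndex:
    """Hypothetical stand-in for a search index client; it only counts calls."""

    def __init__(self) -> None:
        self.docs: List[dict] = []
        self.commits = 0

    def add(self, batch: List[dict]) -> None:
        # A real client would send the batch to the index here.
        self.docs.extend(batch)

    def commit(self) -> None:
        # A real commit is the expensive step, so we want few of these.
        self.commits += 1


def batches(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield lists of at most `size` documents."""
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def index_all(index: CountingIndex, docs: Iterable[dict], batch_size: int = 1000) -> None:
    """Add documents in batches and commit once, after the last batch."""
    for batch in batches(docs, batch_size):
        index.add(batch)
    index.commit()


index = CountingIndex()
index_all(index, ({"id": i} for i in range(2500)), batch_size=1000)
print(index.commits, len(index.docs))
```

With the commit-per-insert habit, the same run would have issued 2500 commits instead of one. In a long-running feed you would probably commit on a timer or every so many batches rather than only at the very end, but the principle is the same.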
My number 2 most common Solr performance blunder is `fl=*`. Solr search results include the values of fields stored in the index associated with the result document. Typical search applications show a title, with a link to a full version of the document, and possibly an associated contextual ‘snippet’. A few other fields like date, publisher or book title might be included too. Such applications must store the full text of every document in the index to make it available to the snippeting component (the highlighter, as it’s called). However, if that text is *retrieved* as part of the search result, and the documents are not tiny, this practice vastly increases the amount of data that needs to be transferred, often leading to a 10-100x slowdown in search performance. Programmers tend to do this because it’s just easier to retrieve all the fields (that’s what `fl=*` does) than to list explicitly only the fields required to display the result.
No decent Solr tutorial will lead a programmer to do this, and again, there is plenty of good information explaining how to select fields, but the Solr default is to select all fields, so this is a very easy trap to fall into. And it becomes even easier not to notice that this is happening when your queries are mediated by a middle tier.
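As a rough illustration of the alternative, this Python sketch builds the query string for a search request that names its fields explicitly and asks the highlighter for snippets instead of pulling back the full text. The `q`, `fl`, `hl` and `hl.fl` parameters are standard Solr request parameters. The field names `id`, `title` and `text` and the helper `select_params` are assumptions made up for the example.

```python
from urllib.parse import parse_qs, urlencode


def select_params(query, fields=("id", "title"), snippet_field="text"):
    """Build a Solr query string that lists only the fields to return.

    The field names are hypothetical; substitute the ones in your schema.
    """
    return urlencode({
        "q": query,
        "fl": ",".join(fields),   # only these stored fields come back
        "hl": "true",             # ask the highlighter for snippets...
        "hl.fl": snippet_field,   # ...drawn from the full-text field
    })


qs = select_params("zigzag chair")
print(parse_qs(qs)["fl"])
```

The full text still has to be stored for the highlighter to work, but it never travels over the wire with the results.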
I maintain an API for searching across multiple different backend data storage and indexing systems, and in that API we once defaulted to returning full document results. I believe our thinking was that beginners would have everything they need and not stumble over having to learn too much of a complex API even to get started. But I got tired of leaning over people’s chairs with a knowing grin and pointing out their n00b mistake. It really wasn’t their fault, anyway (it was mine.) They were just doing the natural thing, following the most straightforward path the API made available. So I changed the default, and I think the people I work with must be really smart since they seemed to be able to figure out how to get the missing field values when they really did need them. Even if they had to ask about that, it was much better to field a question like “how do I get the full text of a document in the search result?” than a question like “why do my search results sometimes take 10 seconds to come back?” because often a situation like that would be intermittent, only arising when really large documents made it into the search results, and could persist for a long time with merely mediocre performance before anybody really took note of the egregious outliers.
Apparently I’m not the only one making this mistake. [Safari Flow](http://safariflow.com) uses [Haystack](http://haystacksearch.org) as its internal search API. Its welcoming words – I just stumbled on this while writing this sentence – are “Search doesn’t have to be hard.” We recently found out that Haystack, by way of its defaults, encourages users to make exactly these mistakes. The Solr connector automatically commits after every insert, and I couldn’t even find any way to limit the set of stored fields returned. In both cases we had to fix these serious performance problems by editing Haystack’s Solr “backend” connector source code (in spite of its promise that “\[Haystack\] plays nicely with third-party apps without needing to modify the source…”).
OK I’m a bit peeved about Haystack right now, but I truly hope the maintainers will read this and take it as constructive criticism, because their library actually does provide a lot of convenience to Django framework programmers grappling with search. Here’s my advice.
This conceit that programming can be easy is partially fueled by the development of software languages and tools. It *is* easier to incorporate other people’s code now using readily-available libraries and frameworks, and to make use of existing systems, so not every program has to start as a *tabula rasa*. It is *not* necessary to understand computer architecture in a deep way in order to write much useful and/or entertaining code now. In some ways, things have gotten easier.
There is also a cultural component to this new easy-going attitude: it’s a deliberate effort to be more inclusive, to shed the high-priest hacker snobbery that has been the stock-in-trade of software gurus for thirty years and more. “RTFM” with its veiled obscenity was always a little rude, even when uttered in jest, where its moral successor, “lmgtfy,” is simply peevish, but they reflect the same unpleasant underlying attitude of condescension. I’m glad to see some reflection on that negative side of hacker culture and the corresponding openness to newcomers.
The positive side of the “learn to program” movement is that there are numerous ways to contribute without being a master. More than ever it is possible to go very far with very little. Silicon Valley startups no longer sweat hardware: they just rent space with Amazon. This is healthy: it means that the culture as a whole is able to learn and grow, to stand on the shoulders of the previous generation rather than their faces.
I’m sure you saw this coming: yes, Virginia, there is a dark side. The thing is, the obnoxious attitude grows out of a hard reality. Expert programming requires knowledge. Mental nimbleness and a problem-solving bent count for a lot, but true mastery of any craft, including programming, is only available to those willing and able to devote years of study, trial, error and correction. And there are still problems to solve that demand mastery, where beginners should be cautioned to tread lightly.
So let’s stop saying that search can be easy. We do learners a real disservice by pretending that things are going to be easier than they are. There are complex problems in search, getting them wrong can kill performance (*i.e.* your web site), and our role as guides should be to offer paths to learning that have the right degree of steepness, and to offer warnings about potential pitfalls. If we take you on a mountain-climbing journey, and just tell you everything will be taken care of and there’s nothing to worry about, we’re leading you into a potentially dangerous situation without any preparation: in that setting, this kind of attitude would be criminal. Tell people to bring their helmets, and teach them how to self-arrest! Wat? Metaphor getting out of hand …
At the same time, we don’t want to scare people away. There’s no call to be going all high-priest-in-the-inner-sanctum with acolytes getting in only after years of fasting and prayer. Here’s where I think we can take a cue from video games. I read [this post](http://robotinvader.com/blog/?p=402) about luring gamers into playing the video game Devil May Cry. It talks about a game that is notoriously difficult, but also offers an easy way out. The interesting thing is that it challenges the player to try the hard way first, and warns them that there will be no way back if the easy path is chosen. Nethack, an insanely arcane video game, does a similar thing with its wizard mode: players can use it to try out all kinds of stuff, without dying, but it comes with a caveat: this is not for real, and your scores won’t be reported.
Phew – the polyurethane was drying on my last Zigzag chair while I wrote this post. It’s been a fascinating process, and I’ve learned a lot. Some questions got answered: the chair does not collapse under you when you sit on it, as numerous testers have confirmed, although it does bounce in a way that can be disconcerting if you are expecting it to give way at any moment. I still wondered how the idea for this form arose, and why it had to wait until the Twentieth Century.
My initial thought was that something about the need for bolts to sustain the chair’s joints made it difficult to achieve in older times. However, screws are ancient, were used in furniture production as early as the fifteenth century, and began to be mass produced in the nineteenth century in much the same form as they are today (see http://cool.conservation-us.org/coolaic/sg/wag/Am_Wood_Screws.pdf). Nuts and bolts developed along similar lines, driven by the carriage industry, and were widely used in furniture, especially beds, as early as the eighteenth century (when they were hand-forged). In fact nothing obvious about the construction of the chair itself would have made it impossible to achieve in an earlier time. However, it certainly would have been more difficult. Highly-precise joinery that is more or less routine today and achievable by workers with a modicum of experience using power tools would have presented a daunting challenge to the master craftsmen of the past working only with hand tools.
The Zigzag exemplified the sleek design aesthetic of the machine age (in particular the De Stijl movement) with its minimal form, simple materials and lack of ornamentation. In previous generations, one could have achieved something similar, but there was simply no reason to: one imagines the idea would have been rejected as a bizarre freakish malformed thing. Acutely angled weight-bearing joints would have been difficult to achieve without machine tools; there was no call for them; they were simply not a part of the design vocabulary.
The chair is the most humane piece of furniture. Its job is to support our bodies, but in its simplest form a chair is a rigid unchanging object, while we remain free to fold and unfold, and we come in many sizes and shapes. Every chair embodies a series of compromises: the height of the seat, the angle of the back, the flexibility of the materials. These, and even carven ornamentations and surface finishes, influence the fitness of a chair for a certain person or activity. The demands placed on a chair by their very nature fuse its appearance and its function into a single inseparable shape, which is why a designer’s chairs often exemplify their aesthetic more than any other work.
I find the Zigzag attractively arresting, and I think it will make an excellent chair for working at a desk and for dining, if not for napping. My only regret is not to have made at least a few of them higher, since I like to work at a high drafting chair. A slightly odd feature of the chair that takes some getting used to is its tendency to flex *forwards*, but as my father says, nobody has been pitched across the table yet.
OK I finished the chairs before I finished the blog post; here you see them installed in my dining room.
For comparison, here are some photos of an example from Rietveld’s studio produced in 1938:
Note the bolt-holes in the seat/back joint. I didn’t find it necessary to bolt this joint since the dovetails seem to be strong enough. On the other hand it took me many weekends over the course of four months to make eight chairs.
Also note that the screws are simply exposed.
And some outtakes from the production. I ended up spending an inordinate amount of time finishing off the plywood edges, which is the process being depicted below. Every chair required 8 pieces of trim that had to be fitted and glued. Because I made these from a standard 3/4-inch fir plank, the thickness matched the plywood’s exactly. Unfortunately this meant a very small tolerance for error during the gluing-up, and when it went wrong, the risk of sanding through the very thin plywood veneer was exacerbated. It would save a lot of time and effort to use solid wood. Still, the nicely-veneered fine-grained plywood surface is attractive and, to my eye, enhances the modernist feel. Then again, blue paint might be a nice way to go too!
ZigZag fanny test #1
It’s been a while since my last update; I’ve been traveling, and I hit some snags in the production line. It turned out that the clever joint I created just wasn’t going to work, and I decided I really did need to bolt the acutely-angled zigzag joints, as shown in the first fully assembled chair below.
In this view the pieces have been trimmed to their final dimensions. Here we can see the chair’s graceful proportions. It spreads out towards the knee and gathers itself in at the waist and heel.
Unfortunately, there is just a bit too much spring in the joints, and they creak ominously. I’ll need to brace the zigzags a little more stiffly by increasing the size of the triangle in the joints.
Finished up the joints for the remaining seats this weekend. Ellen didn’t think I was giving an honest portrayal of the working conditions in my shop, so here’s a full view, showing all eight seats in situ.
I’m relieved to have finished these joints, probably one of the more exacting tasks in this project. The joints came out pretty well, with only some small gaps (< 1/32") visible on the exterior. I should be able to close most of those up with a little more filing and chiseling, but for now I want to move on to something else. Next up: I have a mind to try a pair of sliding dovetails inside the acute bends.
first few joints
I forgot to post these awesome photos of the zigzag bevels from a few weeks back. It was a joy cutting them with a new supersmooth sawblade with lots of teeth.
The clamps hold the piece tight to a board that slides along the saw’s fence. Without this only a tiny edge would be riding on the surface of the saw. The orange plastic thing holds the piece being cut securely against that arrangement, helping to make sure the cuts come out straight and even.
The setup I made for cutting the dovetail coves wasn’t enough to cut the needed space in a single cut. Now I need to do another pass over them in order to get them deep enough and to cut the insides at the 8 degree angle needed to lean the backs.
Notice how the board clamped to the front side is angled. The router base will rest on it and cut at the same angle inside the joint.
The router is an awesome, powerful tool that can make cove cuts that can’t be done with a saw. In the old days, you would use some combination of saw & drill (auger, gimlet) to get rid of as much stock as possible and then finish up the edges with a chisel. The router gets much closer to the edges of the joint, but it still leaves rounded corners. For the final cleanup and fitting, we need the chisel and file.
The first of the tails, marked up and ready to be cut. I tried doing this free-hand with the router, but discovered it was too hard to cut a clean, straight line. For the rest, I’ll use the saw to cut along the marked lines before cleaning out the waste with the router. This will also help prevent me from ripping the veneer facing off a large section of the plywood.
Here you can see the surface torn out from one of the tails cut before I outlined the tails with the handsaw. Thankfully it’ll be under the seat.
This one’s neater.
Our first seats! It’s so satisfying when the joints come together.
half-blind compound dovetail
Just a quick update on my progress; I bought myself a new 1/2″ dovetail router bit ($20), and got to work cutting the joints. Almost half the time was spent planning, setting up, and cutting tests. The first picture below shows the setup:
In the next picture you can see the results of a few hours at the router table cutting the half-blind dovetail coves in the backs. Notice how the coves don’t pass all the way through the boards: that’s the half-blind part. Initially I had hoped to place the cuts in identical locations on all the backs so that the tail half of the joints could be cut all in one pass, across all the seats at once. Alas, I don’t think that’s to be. There was just too much variability in my setup, even with the nifty jig.
Next I’ll need to clean up the backs of the coves to make them square with the surface, and chisel out the rounded corners. Then bevel the ends (8 degrees, remember?), cut the tails, and of course the joints will slide together neatly on the first try!
Some friends (well, one friend) expressed some interest in the chair project I crowed about on facebook recently. So I decided to document the project here. The idea of making a Rietveld Zig-Zag chair was my father’s: he conceived a desire for one, and bought some books and plans. I think he was attracted by the idea that he could make it himself, but eventually abandoned that plan, and sent it to me. In fact the design is deceptively simple: it was described by Dutch designer Gerrit Rietveld in 1934 as a “plane zigzagging through space.” My father wanted a single chair. I looked at the plans and photos, and I realized this would be more than an afternoon’s work. An interesting project, but I had a lot of interesting projects at the time, so I put it away.
When I finally came back to the project a few weeks ago, and looked more seriously at the design, I realized that making just a single chair of this type would involve a great deal of wasted effort. The chair’s construction involves several tricky features that belie its simple appearance. Setting up the cutting jigs for these angled cuts and joinery takes time, exacting measurement and careful planning, but the cutting itself is mere rote work: satisfying, easy work on a modern machine.
Furthermore, it was clear from looking at them that Rietveld’s designs were always intended to be mass-produced, and a little reading lends some credibility to this as an accepted historical idea. It is tempting to read the De Stijl movement’s obsession with right angles and rectangles as an embrace of industrial methods. Of course it’s difficult to know: there was a natural reaction against the earlier prevailing Impressionist style, and it may simply be that Rietveld saw an opportunity to capitalize on the style’s suitability for factory manufacturing techniques.
At any rate, I determined to make a set of eight. In this way I could achieve economy of scale by reusing the various table saw and other cutting jig setups I would need to make.
Chief among the difficulties with this chair are the two acutely-angled joints connecting the diagonal member, the “leg” if you will, with the seat and with the base. Usually in finer woodworking one joins pieces without nails, bolts or other hardware, ideally holding them together without even glue. But there really is no classical all-wood joint for the zigzag chair’s acute angles, at least not anything that would support a person’s weight. Classic zigzag chairs use bolts to hold together the two acutely-angled joints. Bolts! I never dreamed of holding together furniture with bolts. I considered using bolts and hiding them, but I don’t particularly like the idea of this kind of subterfuge. At any rate it would be impossible to truly hide the necessary plugs since their grain wouldn’t match. Still we might come up with some other solution.
The other challenging piece of joinery comes between the seat and back. Dovetail joints are called for here: these are the strongest joint for a right angle between boards. Cutting dovetails requires a certain finesse and careful craft with hand tools, or specialized cutting elements for machine tools. I resolved to acquire a dovetail bit for my router that would make it easier to cut the 40-odd tails in these eight chairs. But the back is not at a right angle to the seat. It reclines by 8°, accommodating the human form in a necessary compromise with the purity of the De Stijl movement’s Cubist-informed aesthetic. The resulting joints will have to be cut at an angle 8° from square, forming a mind-bending three dimensional puzzle that I struggle to visualize even with the aid of drawings.
Finally, I complicated the picture a bit further by the use of plywood. Plywood is not often used in fine furniture, but I felt that it was an attractive choice for this project for several reasons. For one, fine plywoods are available (not at all like the construction-grade kind you see at home improvement megastores), and these offer a surface with a large swath of continuous grain. To achieve a similar effect with solid wood would require very wide boards that are not easy to find. Also, I wanted a fine-grained appearance that would require quarter-sawn lumber, a wasteful, luxurious product that has almost vanished in this time of scarcer forest resources. I simply felt the idea of using an economical material like plywood would fit better with the industrial aesthetic of the chair design. And finally, manufactured boards (plywood) are greener: they make the most efficient use of trees since they are able to use junkier wood on the inside, and save the good pieces for the surface.
The problem with plywood is that its end grain is really unattractive, and tends to get uglier still as it loses little chips during the construction process. The usual technique for dealing with this is to cover the ends with thin strips of solid wood, or veneer, but there was one place on the chair where this technique would not work: the back/seat joint. So I’ve resolved to further complicate the construction of that joint by making it half-blind: the ends of the seat tails will be buried in the interior of the joint, allowing the exterior surface of the back to descend uninterrupted to the outside of the angle with the seat. We’ll see how I manage that as the work unfolds!
Next time: half-blind compound dovetail
I recently registered luxdb.org with register.com. Why? It was cheap. Lesson learned (again) – there’s always a reason for cheap. I eventually decided to consolidate this domain with some others I have registered at dyn.com (an excellent provider). When I tried to transfer, I found out you need an authorization code to effect the transfer. Makes sense – keeps people from stealing your valuable domain. The thing is, register.com has instituted a 4-5 day delay before they will send you the code, and even then, they may decide not to send it to you (if they believe there has been “suspicious” activity related to your domain). On the face of it, there might be some idea that they are acting on the customer’s behalf, but based on a cursory read of various angry posters to online forums, and on my experience, it seems pretty clear they are abusing this policy in order to try to prevent customers from leaving.
When I requested the auth code and saw the notice of the 4-5 day delay, I called customer service. I had to hold for 10 minutes, and then for another 5 minutes after I demanded the auth code from customer service (I was a bit peeved at this point), but eventually he did come through with the code, no questions asked. OK, fair enough – it was a little annoying, but I got what I wanted, gave the code to dyn.com and initiated the transfer. That completed after 3 days or so. The funny part is that the very next day I received this e-mail from register.com:
Dear Michael Sokolov,
You recently requested an auth code to transfer your “luxdb.org” domain name.
Your request has been processed and at this time it has been declined due to recent suspicious activity in your account.
Register.com is committed to providing the most secure and reliable domain services for our customers.
We have implemented specific security measures to help prevent unauthorized transfer of domains to another registrar.
The type of suspicious activity that could have caused your request to be declined includes:
– Multiple failed attempts to login to customer’s account
– Recent changes to the account holder’s name, email address, or login ID
– Attempts to access the account over the phone without authorization
– Recent changes to the accounts password
– Domain name lock not removed
– Recent changes to billing or credit card information
To receive your auth code, please call one of our customer service consultants at 1.888.734.4783. They will confirm you to the account and then fulfill your request.
EVP, Customer Service
Here is my reply:
I hate to tell you this (actually it gives me a dirty pleasure), but you guys are morons. Your policy of delay and obfuscation is insulting and is probably losing you many more customers than it retains. Or maybe you’d rather be the company that serves the feeble and ill-informed. Anyway the transfer is already complete, so not only is this message an insulting lie, but it is completely null and ineffectual.
Very sincerely yours, a happily departed customer,