Archive for the ‘Stuff’ Category

Median of 2 sorted arrays

Sunday, May 4th, 2008

A nice young man wrote about working on an algorithm problem after being inspired to think thusly by his new copy of The Algorithm Design Manual. Coincidentally, I recently received my copy of the book as well. I was less inspired than Mr. Torreborre, probably because I’m very lazy. I thank him for presenting his interview question, because that got me to thinking yesterday morning. How would I find the median element of the merger of two sorted lists without actually doing a (linear) merge?

I’m now typing in what’s been going through my head, mostly while cooking breakfast for my kids. I contemplated trying to explain it to them, but I decided against it. Over the course of the explanation they would certainly concoct a variety of theories about why I was trying to punish them in such a strange and tedious way. I haven’t gone to search for a result in the book or on the web; maybe that’ll be obvious after you read this. I’ll check later.

First, I’ll say the median element of a sorted list afirst … alast is afloor((last-first+1)/2). In other words, it’s the element right in the middle (rounding down - arbitrarily - if the list has an even number of elements). So for this problem, I’ve got two lists, a and b, and I want to know what element would be smack in the middle of the list resulting from their merger.

I went around in confused circles for a while before hitting upon (what I think to be) a good way to view the problem. I know what the size of the merged list would be - it’s the sum of the sizes of a and b. Thus I know exactly where the median value will be in the result. However, all I know about the result list is that it’s a muddle of values from the two original lists. For any index i in the merged list, all I can say is that the value will be from either a or b.

Now, there are a few cheap things I can do to examine my lists. Looking at an element at a given position is cheap - well, at least, I’m going to declare it to be cheap. So, for example, I can look at the median of either source list, or at the first element, or the last. Another thing I can do, less cheaply, is to find where a number would go in one of the source lists. That’s an O(log2 n) operation (with n being the list length).

If I think about a couple of ordered lists of numbers (with no known bounds on the values in the lists), it’s clear that one list may have values larger than any value in the other, or smaller than any value in the other. If I take at the smaller of the two last values in my source lists (that is, alast and blast), and then find where it would go in the other list, I now know something really interesting: I know exactly what values occupy the positions at the end of the merged list! That is, if alast is smaller than blast, and I find that it would 100 numbers down from the last element of b, then I know that the last 100 elements of the merged array have to be the last 100 elements of b. Of course I can make the same discovery at the bottom ends of the lists.

In the degenerate case, one list might contain nothing but values completely beyond the end of the other list. In that case, I can immediately find the median of the combined lists because the merge is easy. If it’s not, then with 2 log n operations I can snip off some of both lists. Now, some portion of the low and high ends of my hypothetical merged lists are no longer muddled - I know that those portions contain values from one of the two lists. In fact, I can now see what my goal is: I need to get to the point where the median slot - the position smack in the middle of the hypothetical merged list - is not muddled.

So now I have two sublists of the original lists, which represent the “muddled middle” of the hypothetical merged result. Hmm. I’m not liking this approach, because I’m not lopping off sufficiently big chunks of the problem. Well, they might be big, but I’m not forcing any bigness; I’m just whittling the ends of the lists down, and the rapidity with which the lists get smaller is entirely dependent on the values in the lists. I need to get a little more radical. (I do still like checking to see whether one list lies completely beyond the other, however, at least for now. It’s cheap to do.)

Another thing I can do cheaply is look at the median of one of my lists, and then I can see where it’d go in the other list. If I do that, then I will definitely lop off half of one list. I’ll still have a muddled result, except now I know a little bit more: I now have four lists, not two, and by looking at the combined sizes I’ll know that two of those can be forgotten. Now that’s looking good, because on every iteration I’m throwing away half of one list and at least some of the other. I’ll always pick the larger list to be the one I cut in half of course. Eventually, I’ll get to the point where one of the two lists lies completely beyond the other, and then I’m home free.

Coding up something like this makes me feel uncomfortable. I fear “off by one” errors the way some people fear spiders. Sitting here now I can conjure up a vision of working through the day on this stupid thing. I know that I really should try. One thing that’s clear is that it’d be silly to do it in a fun language like Erlang, because it all depends on it being O(1) to look at values in the lists. (Well I guess I could merge two Erlang binaries, treating them as arrays of integers, but compounding my off-by-one fears with the need to essentially code up my own array indexing routines really freaks me out.) I’ll try not to be lazy and do this in Java or something boring like that, at least.

Later…
Well I just checked this blog by a smart person and he does this a totally ‘nuther way. I don’t have enough glucose left in my blood right now to figure out which is better, but I bet his is because he doesn’t have to do any searches. He has a weird definition of the median though in the case of lists with an even number of elements. I’ve always thought that the median value has to actually be in the data set, because otherwise there can be an arbitrary number of median values. I’m no statistician however. It probably makes no difference at all for this problem.

Later still…
I’ve been putting together a model cannon kit whose instructions are basically “glue it together”. My eyes are sore. Anyway, to the extent that I can get a lame approximation to the stuff I wrote above working, I think it’s still interesting. In the case of a significant disparity in list sizes, mine converges really quickly on the median because it chews up the shorter list very quickly. When the lists are pretty dense and about the same size, it takes about log n iterations. I’m not sure how to figure the bounding function - maybe it’s log n but maybe not (I suspect the later but I’m dumb).

Roadtrip Day 4: A Hike

Sunday, March 30th, 2008

We got up moderately early, packed the kids up with breakfast junk from the hotel buffet, and headed out to the park. It was in the low 50’s but it didn’t feel cold; we’re at 3000+ feet up here, and that really seems to affect the perception of air temperature. I had running shorts and a running t-shirt, anticipating (correctly) that I’d be running back from a finished-up crew to the van.

We opted for the “Juniper Cliffside Trail.” The trails on the parks department map are not very clearly described or marked. In this case, we started on one side of the river (the Prairie Dog Fork of the Red), and found we needed to ford the washed-over road crossing to continue. Wet feet. The trail is a multi-use foot-bike-horse trail, and a few squads of bikers - and one of runners - passed us along the way.

The lower reaches of the canyon are “Quartermaster” formation Permian sediments. The white bands are gypsum, in the form of selenite or alabaster (and something else I can’t remember). Up close the white bands look like masses of brilliant thin crystals all lined up vertically next to each other. The gypsum precipitated out over layers of sediment during dry periods of the late Permian, when the world climate was going bonkers due to the conglomeration of continents into Pangea. As the soil dried, got somewhat re-wetted, and dried again - a lot - the gypsum sort-of filtered out of the sediments and was left as those distinctive white stripes.

Above the Quartermaster layers are two groups of late Triassic sediments, the Tecovas group and the Trujillo group. Those consist of random shales, conglomerates, and sandstones in a weird variety of colors. In particular, a pale purple and a greenish-yellow layer can be seen all over the place. On the trail (which is almost completely in Quartermaster territory), occasional arroyos have washed down the grey-green, yellow-green, and purple-grey sands and pebbles to mix with the red and white Permian sediments. It’s really cool to see. The water erosion on the cliffsides give a thoroughly Georgia O’Keefe look.

I can’t resist cheap compositions with gnarly dead trees. The place was easy pickings for that sort of stuff today.


The kids were pretty good about the whole thing. When they’d exhausted their energy, I left the camera bag with Elaine and headed back down the road. About two or three hundred yards out, I heard a couple of panicked-sounding screams of “Mike! Mike!”. I stopped to verify, and then realized I had no choice but to run back. An annoying minivan showed up to mask any further sound as I ran back up the park road as fast as I could (not fast). When the fam came finally into view, they were just strolling along. Pat saw me and came back to tell me that Elaine wondered whether I had the car keys. I expressed dissatisfaction with that motive for a distress call in a way that apparently made Elaine a little upset. I felt bad about that after I found out the effects. It turned out we hadn’t gone that far, because even with my slow running pace it took no time at all to get back to the van, though it had taken two hours to get where we got via the trail.

I sat around the pool watching Allie and Pat “swim” in the pool while Elaine and Christopher returned to the museum. Christopher came back in a couple hours to construct a “replica” of a historic firearm he’d found at the museum. It’s good that he’s satisfied with the most vague topological similarity between his realizations and the actual originals.

We went back to the park in the late afternoon one more time, just to have a look. It’s a really nice park, and with the museum as icing I think it was a nice trip. It’s really far, true, but Canyon is a surprisingly nice little town (compared especially to Lubbock, which had us wondering whether things would keep getting worse as we went north). We ate at a Thai-Laotian (yes, really) place last night, and at the little “Something Different” place today we had a good lunch and a flawless free wireless connection.

Roadtrip Day 3: Cool Museum, Make-Believe Hikes

Saturday, March 29th, 2008

Local temperature was 34 degrees this morning, and the plain outside the hotel was loosely fogged. Our day started relatively late, due to this enormous multi-room suite and its black-out curtains. We headed out to the Panhandle-Plains Historical Society Museum to make it in before the crowds. We were successful.

This museum is large. It’s billed, in fact, as the largest history museum in Texas. It’s reminiscent of the randomness of the Witte Museum in San Antonio or the Texas Memorial Museum on the UT campus (in the latter case, its former randomness - it’s been rationalized in the recent past). The collections include paleontology, geology, Plains Indian anthropology, petroleum production, sea shells, stuffed birds, stuffed plains animals, Indian art, Western art, firearms, cars, old gas pumps, WWII memorabilia, “pioneer” living, local ranching history, and the Texas Revolution. If you’re ever in Canyon TX, it’d be ridiculous not to visit.

After a diffuse lunch period we headed back to the canyon. With our fresh new annual pass we breezed in and drove down to the canyon floor. We had no explicit goals. The day had cleared up and gotten sunny and warm.

The lower reaches of the canyon are walled and floored with orange-red Permian sediments, highlighted by white bands of gypsum. Above that are multi-colored Triassic sandstones and shales. The top is capped by relatively recent “caprock” limestone and caliche. The scrubby vegetation was in various stages of dormancy, but the stark twiggy look was dramatic and “classic.”

The dark twisty trees in this image:

reminded me of the witches from Macbeth. What would Shakespeare have thought of landscapes like this? The harsh light beaming down on our baby hikes over forty or fifty yards of wild terrain wouldn’t have done for those witches, but the terror thorns, the twisted, scraggling shrubs, and the abrasive polychromed crags would possibly have seemed too fantastic to believe. I have to wonder what it’s like on moonlit nights.



The Triassic strata include (in the Trujillo group) layers of “banded” sandstone. Up close it has a shiny gray look, but from a distance it looks dull gray to blue-gray to an almost malachite green. These rocks are pieces that have tumbled down from an original altitude a hundred feet or more up the slope.

The plan is that we’ll head back in the morning for a real hike. We’ll see how that goes.

Roadtrip Day 2: Gliders and a Canyon

Friday, March 28th, 2008

After the long day on Thursday, we went to bed pretty early. Everybody woke up at about 5:30. By 6:00 nobody could maintain the charade of trying to sleep, so Elaine got the kids out the door for the dee-luxe breakfast buffet at the Days Inn. The breakfast area smelled strongly of petrochemicals, as it was primarily inhabited by oilfield service workers. Several semi-toxic packaged sweet rolls and bowls of cereal later we were ready to head out. The sun wasn’t yet up.

The Caprock Escarpment comes into view a few miles south of Post, TX. After Post, highway 84 rises up onto the plateau that is the High Plains of North America. The top-level rock strata is the Ogallala Formation, a fairly recent deposit of more-or-less rocky sediments formed from the broad downwash from the rising Rocky Mountains, stretching south to north across much of the country. The topmost layer is a hard quasi-limestone caliche that’s what was originally called the “caprock”, as it sits atop the typical bluffs that appear up and down the edge of the High Plains.

An interesting thing about this day is that it began about 40 degrees colder than it was Thursday afternoon. Thus upon reaching Lubbock we stopped at a Starbucks (with idiotic for-pay wireless) and then sought out a WalMart so that we could pick up more clothes for the kids. Lubbock is - partisans, please pardon me - a horrid place. For some reason we attracted beggars consistently. After a trip to a cold, windblown Prairie Dog Town (pop. 1)

we headed out, hoping to figure out a way to avoid the place altogether on the way back. However just outside the pale we notice a sign for a museum at the Lubbock airport. We spun around to go there after seeing a pretty good-condition C-47 parked in front. The museum is the “Silent Wings” museum, and it preserves the story of USAAF pilots who flew the various transport gliders employed in WW2 for airborne assaults. Inside the museum was, among other very well-done exhibits, a complete CG-4A WACO glider.

We soon found that the glider was set up such that we could climb inside and look around - somewhat astounding, given the fact that the things required delicate care when new.


The museum was a great find overall. We headed out and got to Canyon at lunchtime. After some pizza for the kids we drove out to the park. The van was low on gas so we didn’t want to drive down into the low part of the park, but we stopped at the point you can take this picture:

Palo Duro Canyon looks to be a pretty spectacular place. It’s a hard thing to capture with photography. I hope to go back over the next couple days, and I want to try and capture small stuff that evokes the visual impression of the big stuff. We’ll see.

On the way out of the parking lot we saw this tricked-out extreme Land Cruiser, apparently ferried here from Switzerland.

Roadtrip Day 1: Fossils, Bombers, Windmills

Thursday, March 27th, 2008

San Saba is a ways away from places that are a ways away from places whose locations you know. Specifically, it’s about 20 miles west of Lometa. It’s a nice town, the “Pecan Capital of the World.” There’s a park there around a millpond. I saw no mill, but there was a bridge.

West of San Saba to Richland Springs, then north across Wilbarger Creek, then west about 12 miles, there’s a roadcut through some Pennsylvanian sediments. We stopped and looked around. Within about a minute Allie had found a slab of fossilized stuff about 8 inches square.


Little chunks of fossilized crinoids were all over the place, sometimes separately and sometimes in little concretions like that. About 3 vehicles passed us during the half hour we rooted around.

We headed out and onwards to Abilene, which we skirted on its southwest side. We stopped after joining I20 at an exit labeled “Tye”. The Conoco truck stop sported a gift shop filled with weird dolls. After gassing and washing up, and after the inevitable purchase of some Silly Putty, we walked out to the sound of something in the sky. There flew a B1B, back to its home at Dyess AFB. Three or four more flew in one after the other, while a C-130 flew over, a couple thousand feet higher, in the opposite direction. It was pretty cool.

There are a lot of windmills - big ones - on hills southwest of Abiliene and out basically all over the place along highway 84 northwest out of Sweetwater. They looked a lot bigger than the ones I remember from northern California, but maybe I’ve shrunk since then.

“We Are Not Shed People”

Saturday, March 1st, 2008

Backyard sheds are an important part of suburban American life. With a shed comes obligation, however, and some are not up to the demands of shed ownership. In particular, I am not a worthy shed owner.

Shed

Our shed was, as sheds go, a pretty nice shed. It had been added to the home (we think) by its original owner, perhaps at the time the house was finished by the builder. The outside walls were finished with similar wood siding to that on the house; the inside was left without real walls. We used it to store a shedful of stuff that we didn’t want, didn’t know what else to do with.

Shed

One thing our shed did for the local ecosystem was provide a home for small rodents. This I discovered surprisingly recently. As a grossly unqualified shed owner, I hadn’t been in the shed for at least two years. A large rose bush with gigantic, murderous thorns had grown completely over the shed door. Before that the last thing I’d done was replace the shed doorknob with a new one. Since that time, mice or rats, or both working as a team, had gnawed a classic cartoon-like access portal at the bottom of the bush-hidden side of the shed. Our dog - a “rat terrier” as unworthy of her title as we are of “shed people” - had been showing a lot of interest in those bushes and that area, but had never come anywhere close to actually getting a rat. One afternoon, as I walked out with Gypsy, she and I both heard a brief rustle, and I turned to see a rat on a low holly branch. Gypsy saw it too, and sprang to the attack by cleverly running the opposite direction towards the area where she really suspected the rats to be.

No Shed

At that point I wondered, “Where is that rat going?” Only then did I think to check whether there was another way into the shed besides the inaccessible well-locked door. The next day we cut the rose bush back and checked inside. There amidst the unwanted detritus steaming away in the shed was what must have been some of the most valuable prime rat real estate in the area. We didn’t see any rats at that point, and the dog couldn’t find any either, but it was clear that they’d made much better use of the shed than we ever had.

Rat Check

About $800 later, we’re shedless. It was somewhat embarrassing taking the pictures. There I was, an affluent yuppie unable to maintain a shed in his own back yard, taking pictures of guys forced to wear breathing masks to protect themselves from the rodent filth I’d allowed to accumulate. It made me feel contemptible, but as I was depriving myself of a shed I contented myself that that was appropriate punishment. I don’t deserve a shed.

We’ll put flowerpots on the slab, or something. Flowers and herbs I can take care of, usually. The rose bush will be a lot happier anyway. I don’t know where the rats will go. They were of course seriously traumatized, and to some extent I feel kind-of bad about that too.

The Gaff

Saturday, February 23rd, 2008

I’m writing this from the main dining hall of The Gaff, the #1 cyber pub in Port Aransas, TX. A well-known local just walked in to a shout of acclaim. The owner then told us about the Thursday pirate raid pub crawls they do from here.

I don’t have much to say - or, actually, I could probably stay in this place and write a few bad novels over the course of a season. I did however feel compelled to blog from here.

The pizzas just came, as did Allie’s pirate-sized meatball sandwich, so I sort-of have to wrap this up.

Later…

The Gaff is everything you could want from a run-down beach hangout. The menu has like four things on it. The weirdly efficient bartendress puts the pizzas together and they were approx. of BB’s quality. Pyramid Wheat on tap. Belt sanders used in periodic belt sander races decorate the grimy windows. There’s an enormous bigscreen tv that doesn’t seem to work properly. They have free wireless, and this paean is good evidence of what a fine idea that is.

We walked in, and for some reason the kids immediately started exploring all 1000 square feet of the place, with many declamations of “Classic!” as the piratical denizens paused (briefly) from their beer-cigarette cocktails.

After sitting down, Pat loudly announced, “This is just like Las Vegas!”

Crappy Job Ads on craigslist

Friday, January 18th, 2008

Opinions on how we should and shouldn’t compose our resumes are a dime a dozen in the blogosphere. Well, in fact they’re free, though some may be worth a little more. The topic is an old one, and though the Internet and modern practices (like using craigslist or monster) may have brought new rules into play, it’s still about making yourself look smart, experienced, and sane.

The flip side of resume writing is the art of advertising open positions when seeking to hire people. I can’t keep track of whether we’re in a “buyer” or “seller” market in the software development world, but it’s clearly true that individual companies in individual markets frequently find themselves fairly desperate to hire developers. My own experience is that it takes quite a while to find candidates, and even longer to hit upon a good one.

Thus it seems to me that writing good copy for craigslist “help wanted” adds would be something employers do as a matter of course. Sadly, that’s not the case. They’re generally:

  • ugly;
  • rife with spelling and grammar errors;
  • composed of weird or stilted language;
  • jammed with buzzwords and product/technology names.

I know that some of the problems stem from the route the ads take from the hiring manager through the HR group (and possibly through the recruiters), but that’s hardly an excuse. The end result is that I read the ads, you read the ads, and we form opinions about the personality and culture of the organization. I caught myself the other day rolling through craigslist ads, one after the other, shaking my head after almost every single one. Ad after ad leaves me with impressions of people in cheap suits, dingy offices, and wildly disorganized (or just crazy) product development environments. You know what I’m talking about: little businesses where the guy in charge tells you that lots of the work was done by “the smartest guy I know, he’s a wiz”. Dreary failure is the vague mental picture I form.

What’s going on? Is it really the case that most businesses trolling craigslist are seedy or dysfunctional? I know that the ads are dirt cheap (free here), but the page view rate is so high that even businesses with money to blow on recruiters would be foolish not to use the service when they’re in need. (The place I work now has used craigslist with no success; I never looked at our ads.)

Indulge my laughable fantasy that somebody in a position to be influenced by my advice might read these words: take care with your craigslist ads. Think about them carefully, about the tone, the language, and the image they create. Correct the misspellings and grammar errors. Clean up the random capitalization of Important Words, and NOTHING SHOULD BE ALL-CAPS. Get somebody who actually knows what the buzzwords mean to review the way you’ve piled them on top of each other, and think about whether the things you insist are essential skills are really essential. (Did your top developers on your key products have all those skills back a year or two ago?)

Steve Yegge wrote about “weasel words” in resumes. In my opinion there’s a parallel species of words that crop up in job ads. I’m talking about stuff like “dynamic work environment”. What does that mean? “Should be an excellent team player.” Again, what does that mean? Who do they expect to screen out with that? I don’t want to hear about your corporate “energy”; it’s vaguely creepy and I’ll get the real story (or close to it) in an interview anyway. Just be clear and friendly, and describe the work.

Sigma 105mm EX DG macro

Friday, October 26th, 2007




The UPS man delivered my new Sigma 105mm macro lens this afternoon, and I immediately set up a simple facility to try it out. I put an old Weston light meter down on a counter, and next to it I arranged a cheap Quantaray slave and my old Oly T32 on an optical slave hot shoe around it, both pointed up at the ceiling. I set the E-500 flash to 1/64th power and held the camera by hand.



For the coin, I set up a tripod and focused (manually) as close as I could get. The camera was not directly over the coin so it’s not in focus all the way across; I need to rig up a “ring box” so that I can do real close-up planar macro shots.

Danaus plexippus

Sunday, October 21st, 2007

Plant some milkweed, and you’ll get some monarchs.

Caterpillar photo

Several caterpillars hatched out pretty late in the year, almost certainly past the time at which they should have left for Mexico. There’s plenty to eat here however so maybe they won’t mind.

Chrysallis photo

The kids found this chrysallis on the fence, quite some distance from where any milkweed is planted.

Butterfly photo

Two adults, including this big one, were all over the garden this afternoon.