Life and code.
  • Living 64-bit: Search Filters for Windows

    Posted on November 19th, 2009 Brian No comments

    One of the greatest features in Windows Vista that carries forward to Windows 7 is the Windows Search-In-The-Start-Menu.  Just hit the Windows key and start typing, and voila! you are instantly graced with search results.  Suddenly desktop search is useful!

    Unfortunately, the utility of the search is greatly limited by whether or not an appropriate filter exists for a particular file type.  Windows ships with filters for various barebones formats, such as text files and web pages, as well as Microsoft Office documents (of course).  Though filters for some formats can be found on the web, normally it is the job of the installer to properly configure filters to handle the application’s file types.

    And herein lies the problem.

    You see, when you’re running a 64-bit OS, most application programs you have are actually running in 32-bit mode.  Why?  Well, from an end-user’s perspective of the application, there is usually no difference between 32-bit mode and 64-bit mode.  There are virtually no performance differences, no look-and-feel differences, and no functional differences.

    But from an application vendor’s perspective, 64-bit support often requires drastic API changes, as well as compiling, testing, and releasing a 64-bit version.  It’s a lot of work to support something that your customer probably won’t even notice, and that’s not to mention having to explain to a confused grandmother that she downloaded the 64-bit version for her 32-bit machine and could she please try again.  So for most application vendors, 64-bit is something only done when absolutely necessary, and thus most applications get released in 32-bit versions only.

    So back to search filters:  One of the gotchas of 64-bit is that you cannot load 32-bit libraries into a 64-bit process, and on a 64-bit machine, the Windows Indexing Engine is a 64-bit process.  Thus most 32-bit applications will be unable to properly install their search filters on 64-bit Windows unless they go out of their way to do so.  OpenOffice currently suffers from this problem, as does Adobe’s PDF Reader.
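
    To see the mismatch concretely: a process has one pointer width, and any library loaded into it has to match.  The sketch below is only an illustration in Python; `can_host` is a made-up helper name, not a Windows API, and it models the rule rather than actually loading a filter DLL.

```python
import struct


def process_bits() -> int:
    """Return the pointer width of the current process in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def can_host(library_bits: int, host_bits: int) -> bool:
    """A native library can only be loaded into a process of the same bitness."""
    return library_bits == host_bits


if __name__ == "__main__":
    print(f"This interpreter is a {process_bits()}-bit process")
    # The situation from the post: a 32-bit filter and a 64-bit indexer.
    print("32-bit filter in 64-bit indexer:", can_host(32, 64))
```

    The same check explains why a vendor has to ship a separate 64-bit filter even when the application itself stays 32-bit.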

    Fortunately, it has been recognized as a problem, and applications are fixing it.  OpenOffice is supposed to have it fixed in version 3.2, and Adobe offers a free 64-bit version of their PDF filter.  And in the meantime, you can often find good filters for free on IFilter.org, or some for free and for sale on IFilterShop.com.

  • Just Because You Can Doesn’t Mean You Should

    Posted on November 11th, 2009 Brian No comments

    End Of Universe Warning - This operation is estimated to take 154,146,901,011 quadrillion years to complete.  Are you sure you want to continue? Yes | No

    How many is a quadrillion, again?

  • Fun Little Bug In Windows 7: Control Panel Back Button

    Posted on November 2nd, 2009 Brian 1 comment

    Here’s a fun little bug I stumbled across in Windows 7: It appears that the back button in the control panel does not properly update the quick navigation links on the left-hand side bar.

    1. Open Control Panel.
    2. Click on Appearance and Personalization.
    3. Click on Preview, delete, or show and hide fonts (under Fonts).
    4. Click the Back button (far upper left).
    5. Click on Network and Internet in the left-hand side bar.
    6. WTF?!

    Here’s a video of the bug in operation.  This is using the fancy new HTML 5 video tag, so if you can’t see it, here’s a lame Flash version instead.

  • Chronicles of Windows 7 Part 3: From Release Candidate to Final Version

    Posted on October 31st, 2009 Brian 1 comment

    I was running the Windows 7 Release Candidate for many months prior to the October 22 public release.  I had pre-ordered the new version, and it conveniently arrived on the release day.  Anxious to see what was changed, I promptly set about upgrading.

    Unfortunately, there is no easy upgrade path from the RC.  The process forces a complete re-install (although there are some work-arounds).  I’m okay with that, though, since I had beaten my RC install to a pulp experimenting with different drivers and hacks to get my Qualcomm Gobi 3G card working.  (I never really did.)

    My Upgrade Process

    The upgrade process I took was simple: Plug in my external hard drive, back up my machine using Windows Backup – including a system image – and then wipe the drive and start from scratch.  I had used similar processes in the past, although usually using a Linux Live CD and dd.  However, the Windows 7 Backup creates system images in a VHD format, and Windows 7 can also mount VHD images natively, making this a much simpler solution.  Also, it neatly sidestepped any issues I might have had with my encrypted BitLocker hard drive.

    I’m pleased to report that the re-install process was a cakewalk, and the recovery of my data was virtually flawless.  The only hiccups were caused by my own stupidity.  I limited the files I had backed up in order to speed up the process, and found out later I wanted them.  Fortunately, they were still on the system image, and the VHD mount worked as expected.

    Though my technique may not be for everyone, it works for the tech-savvy control-freak like me.

    Stuff That’s Fixed

    The good news is that HP’s new drivers for the Qualcomm Gobi 3G modem work flawlessly in the final version of Windows 7.  Hopefully they’ll eventually switch to use the new broadband driver stack built in to the new OS, but I’m not holding my breath.  They do work, though, and that’s enough.

    The VMware NAT issue was actually cleared up by an update to VMware while I was still running the RC.  I am mentioning it here to close the loop on the earlier post.

    And that’s it, really.  It’s not that there aren’t any more fixes, but that the RC was so solid for me that I had no gripes worth mentioning.  For those who suffered through Vista’s growing pains, this is a huge step up for Microsoft.  I suspect the large beta program and massive release candidate program helped immensely in this area.

    Wish List

    Here is my one gripe: Windows sizes the desktop background based on which monitor is designated as your “Primary”.  I dual monitor using my wide screen laptop display and a 4:3 stand-alone monitor, and I prefer the stand-alone screen as my primary.  Thus, I often get stupid black bars surrounding the background on my laptop display, because the image has been sized for the non-wide screen.

    I am hard-pressed to think of a situation where this makes sense.  Hopefully Microsoft will make the “Fill” desktop background option actually fill on differently sized screens.  But in the meantime, this is a minor, minor thing.

    And it should tell you something that such a ridiculously minor thing is all that I can find to complain about.

  • How To Load Bundled Images In A Prism Webapp

    Posted on October 27th, 2009 Brian No comments

    One of the features in the Prism Webapp Bundle for Google Wave is a toaster pop-up notification of unread waves using the window.platform.showNotification() method.  The third parameter is named aImageURI, and is described by the nsIPlatformGlue IDL as, “The URI of an image to use in alert. Can be null for no image.”

    Which is great, except … what URI scheme and path does one use?  Every example I could find always passed null for the image, so after giving up on the web I joined the Prism mailing list and posted a question.  The first response was to use the inline data scheme, with a base64-encoded image.  It was ugly, but it worked.

    A later response, however, yielded the proper way:

    resource://webapp/path/to/image.png

  • A Prism Webapp Bundle For Google Wave

    Posted on October 26th, 2009 Brian 1 comment

    Though Prism and Google Wave go great together when you simply create a web app with the Prism Firefox add-on, Prism also supports some script extensions that allow for more desktop-like integration of the apps running inside it.  For example, you can call the window.platform.showNotification() method to pop up a little toaster notification with the number of unread Waves.

    I’ve created a webapp bundle that does just that.  Unfortunately, such bundles at present only work with the stand-alone version of Prism.  The Firefox add-on is really a better way to run Prism, but if you’re using it you’ll need to do a little manual mucking in your webapp profile to use this bundle.

    Stand-Alone Bundle

    So, if you just want the bundle, here you go.  Note that I haven’t really tested it on the stand-alone version, so please let me know if something is broken.

    Hack Your Webapp

    As I said, if you’re using the add-on version, you’ll need to do a little manual hacking.  After you create the webapp, as described in my earlier post, open up Explorer and navigate to your Prism webapp bundle cache.  On Windows, this is in %APPDATA%\WebApps (something like C:\Users\Brian\AppData\Roaming\WebApps); on Linux, it is ~/.webapps.  You should see your Google Wave webapp in that directory.  Add the webapp.js script to that directory, and also add in images/google-wave-52x32.png.  Now you should get a toaster pop-up and task bar notification when there are new waves.
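
    If you would rather script the copy than drag files around, here is a minimal sketch in Python.  The function names are hypothetical, and it assumes you have unpacked the bundle somewhere so that `webapp.js` and an `images/` folder sit side by side; the name of your Wave webapp folder varies, so you pass it in yourself.

```python
import os
import shutil
from pathlib import Path
from typing import List


def webapps_root() -> Path:
    """Return the Prism webapp cache: %APPDATA%\\WebApps on Windows, ~/.webapps elsewhere."""
    if os.name == "nt":
        return Path(os.environ["APPDATA"]) / "WebApps"
    return Path.home() / ".webapps"


def install_bundle(bundle_dir: Path, webapp_dir: Path) -> List[Path]:
    """Copy webapp.js and everything in images/ from bundle_dir into webapp_dir.

    Returns the list of files written.
    """
    written = []
    target = webapp_dir / "webapp.js"
    shutil.copy2(bundle_dir / "webapp.js", target)
    written.append(target)

    images_src = bundle_dir / "images"
    if images_src.is_dir():
        images_dst = webapp_dir / "images"
        images_dst.mkdir(exist_ok=True)
        for image in sorted(images_src.iterdir()):
            if image.is_file():
                dest = images_dst / image.name
                shutil.copy2(image, dest)
                written.append(dest)
    return written


if __name__ == "__main__":
    # Only prints where to look; pick your Wave webapp folder from this directory.
    print("Webapp cache:", webapps_root())
```

    Call `install_bundle(Path("path/to/unpacked/bundle"), webapps_root() / "<your Wave webapp folder>")` once you have found the right folder.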

    It would be nice if Google were to add a <link rel="webapp"> to Wave, referencing an appropriate bundle.  If anybody there sees this and cares to use my code as a crude starting point, I am releasing this code under an MIT license.

  • Google Wave and Prism: A Match Made In Heaven

    Posted on October 21st, 2009 Brian 4 comments

    I received a Wave invite from Tim this morning. (Thanks, Tim!)  I’m still not sure of Wave’s usefulness as a tool, although I had quite a positive experience doing a little collaborative feedback and editing.  However, after about five minutes of using it, I was sure of one thing:

    This thing screams for its own window.

    That’s where Prism comes in.  Prism allows web applications to be run in a separate browser process, complete with a separate profile, their own window, and a unique taskbar icon.  For long-lived applications like a calendar or a chat tool, this is far more useful and stable than opening yet-another tab.  Furthermore, I like to read web pages in a tall window (roughly the same proportions as an 8.5×11 piece of paper), but I prefer my communications tools in a wide window.  Prism lets me easily size Wave however I’d like.

    After you install the add-on and restart Firefox, just navigate to Wave, click on the Tools menu in Firefox, and click Convert Website To Application.  You’ll want to cut out the cruft from the end of the URL, leaving just https://wave.google.com/wave/.  And it’s usually helpful to leave the status bar in place.  If you’d prefer to have Wave in the system tray, you can check that box here, too.

    You’ll also want to pick a different icon – especially if you’re on Windows 7.  The default favicon.ico that Prism auto-downloads is very small, and scales up really poorly.  Here’s a 256×256 one that I used as a PNG and as an ICO.  It looks great in my task bar.

    I also use Prism with Google Calendar and Toodledo, and love it.  And I’m about thiiiis close to pulling Google Docs into it as well.

    [Image: Google Wave in Prism on Windows 7]

  • How To Make Java Ignore IPv6

    Posted on October 10th, 2009 Brian 1 comment

    Sure, IPv6 is going to save us all from the apocalypse, defeat communism, cure the swine flu, and bake you the most delicious brownies you’ve ever tasted.  Someday.  But in the meantime, for real people trying to do real work, it’s a fucking nuisance.

    As more systems have started shipping with the technology, little compatibility issues continue to crop up.  One of the more recurrent problems I’ve encountered is incompatibilities between Java and IPv6 on Linux – specifically Ubuntu.  Up until recently, it was quite easy to eliminate the problem by merely blacklisting the appropriate kernel modules, thusly:

    # echo 'blacklist net-pf-10' >> /etc/modprobe.d/blacklist.conf
    # echo 'blacklist ipv6' >> /etc/modprobe.d/blacklist.conf

    However, as of Ubuntu 9.04 (Jaunty), IPv6 support is no longer a module – it’s hard-compiled into the shipping kernels.  No big deal, though, because there’s a system control variable that allows you to remove IPv6 support from the kernel.

    # echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6

    Except that doesn’t work.  It seems there was a kernel bug where that setting was just plain broken.  And the fix hasn’t shipped in the normal Ubuntu kernels yet.  So, what is one to do, short of re-compiling their own kernel?
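
    If you want to check what the kernel currently reports for that setting, a small Python sketch like the following will do.  The helper name is made up for illustration, and note the catch: it only shows the stored value, not whether the kernel actually honors it, which is exactly what the bug was about.

```python
from pathlib import Path

# The same path written to by the echo command above.
DISABLE_IPV6 = Path("/proc/sys/net/ipv6/conf/all/disable_ipv6")


def ipv6_disabled(setting: Path = DISABLE_IPV6):
    """Return True or False for the kernel's reported value, or None if the file is absent."""
    try:
        return setting.read_text().strip() == "1"
    except OSError:
        # No such file: not Linux, or this kernel has no IPv6 support at all.
        return None


if __name__ == "__main__":
    print("disable_ipv6 reported as:", ipv6_disabled())
```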

    Here is a copy-paste from an IM exchange I had with Java earlier:

    # Java has entered the chat.

    Java: Hey dude, what’s up?

    Ardvaark: hey, i’m having a problem getting you to listen to an ipv4 socket when ipv6 is installed on my ubuntu box

    Java: Yeah! I totally support IPv6 now! You didn’t even have to do anything because I abstract you from the OS details! Isn’t that great?!

    Ardvaark: awesome, i guess, except that it doesn’t work.

    Ardvaark: i really need you to just listen on ipv4, because the tivo just doesn’t like galleon on ipv6

    Ardvaark: so sit the hell down, shut the hell up, and use ipv4

    Ardvaark: pretty please

    Java: Okay, geez, no need to get all pissy about it.

    Ardvaark: and while you’re at it, could you please stop using like half a gig RAM just for a silly hello world program?

    Java: Don’t push your luck.

    # Java has left the chat.

    And now that we’re back in reality, the magic word is -Djava.net.preferIPv4Stack=true.
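
    In practice the flag goes on the java command line, before -jar.  Here is a minimal launcher sketch in Python; `java_command` is a hypothetical helper and "galleon.jar" is a placeholder path, not Galleon’s real launcher, so adjust both to however you actually start your app.

```python
IPV4_FLAG = "-Djava.net.preferIPv4Stack=true"


def java_command(jar_path, *app_args):
    """Build an argv list that starts a jar with the IPv4-only flag.

    JVM options must come before -jar; anything after the jar path is
    passed to the application instead of the JVM.
    """
    return ["java", IPV4_FLAG, "-jar", jar_path, *app_args]


if __name__ == "__main__":
    # Placeholder jar name; this only prints the command, it does not run it.
    print(" ".join(java_command("galleon.jar")))
```

    Hand the resulting list to `subprocess.run` (or paste the printed line into your start-up script) once the jar path is right.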

  • Hadoop World 2009

    Posted on October 5th, 2009 Brian No comments

    I had the privilege of attending Hadoop World 2009 on Friday.  It was amazing to meet, listen to, and pick the brains of so many smart people.  The quantity of good work being done on this project is simply stunning, but it is equally stunning how much farther there remains to go.  Some interesting points for me include:

    Yahoo’s Enormous Clusters

    Eric Baldeschwieler from Yahoo gave an impressive talk about what they’re doing with Hadoop.  Yahoo is running clusters at a simply amazing scale.  They have several different clusters, totaling some 86 PB of disk space, but their largest is a 4000-node cluster with 16 PB of disk, 64 TB of RAM, and 32,000 CPU cores.  One of the most compelling points they made was that Yahoo’s experiences prove that Hadoop really does scale as designed.  If you start with a small grid now, you can be sure that it will scale up – way up.
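
    To put the largest cluster in per-machine terms, here is the back-of-the-envelope arithmetic as a short Python snippet.  It assumes decimal prefixes (1 PB = 1000 TB, 1 TB = 1000 GB) and an even spread across nodes, neither of which the talk spelled out.

```python
NODES = 4000
DISK_PB = 16
RAM_TB = 64
CORES = 32_000

# Assumption: decimal prefixes and hardware spread evenly across all nodes.
disk_tb_per_node = DISK_PB * 1000 / NODES
ram_gb_per_node = RAM_TB * 1000 / NODES
cores_per_node = CORES / NODES

print(f"{disk_tb_per_node:g} TB disk, {ram_gb_per_node:g} GB RAM, "
      f"{cores_per_node:g} cores per node")
```

    That works out to 4 TB of disk, 16 GB of RAM, and 8 cores per node, which squares with the point that the hardware itself is commodity.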

    Eric made it clear that Yahoo uses Hadoop because it so vastly improves the productivity of their engineers.  He noted that, though the hardware is commodity, the grid isn’t necessarily a cheaper solution; however, it easily pays for itself through the faster turnaround on problems.  In the old days, it was difficult for engineers to try out new ideas, but now you can try out a Big Data idea in a few hours, and see how it goes.

    A great example is the search suggestion on the front page.  Using Hadoop, they cut the time to generate the search suggestions on the front page from 26 days to 20 minutes.  Wow!  For the icing on the cake, the code was converted from C++ to Python, and development time went from 2-3 weeks to 2-3 days.

    HDFS For Archiving

    HDFS hasn’t been used much as an archival system yet, especially not with the time horizons of someplace like my employer.  When I asked him about it, Eric told me that the oldest data on Yahoo’s clusters is not much more than a year old.  Ironically, they tend to be concerned more about removing data from the servers due to legal mandates and privacy requirements, rather than keeping it around for a Very Long Time.  But he sees the need to hold some data for longer periods coming soon, and has promised he’ll be thinking about it.

    Facebook, though, is already making moves in this area.  They currently “back up” their production HDFS grid using Hive replication to a secondary grid, but they are working on (or already have – it wasn’t quite clear how far along this all was) an “archival cluster” solution.  A daemon would scan for least-recently used files and opportunistically move them to a cluster built with more storage-heavy nodes, leaving a symlink stub in place of the file.  When a request for that stub file comes in, the daemon intercepts it and begins pulling the data back off the archive grid.  This is quite similar to how SAM-QFS works today.  I had a chance to speak with Dhruba Borthakur for a bit afterwards, and he had some interesting ideas about modifying the HDFS block scheduler to make it friendly for something like MAID.
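
    The move-and-leave-a-stub idea is easy to picture on an ordinary local filesystem.  The Python sketch below is a toy under stated assumptions, not Facebook’s implementation and not HDFS code: the function name and parameters are invented, it uses last-access time as the “least-recently used” signal, and the symlink is simply followed by the OS instead of a daemon intercepting the request and pulling data back.

```python
import os
import shutil
import time
from pathlib import Path


def archive_cold_files(hot_dir: Path, archive_dir: Path, max_idle_seconds: float):
    """Move files not accessed within max_idle_seconds into archive_dir,
    leaving a symlink stub at each original path. Returns the stubbed paths."""
    cutoff = time.time() - max_idle_seconds
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(hot_dir.iterdir()):
        # Skip stubs left by earlier runs and anything that is not a plain file.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_atime < cutoff:
            target = (archive_dir / path.name).resolve()
            shutil.move(str(path), str(target))
            # The stub keeps the old path readable while the data lives elsewhere.
            os.symlink(target, path)
            moved.append(path)
    return moved
```

    A real archival tier would also need to handle name collisions, concurrent readers, and migrating hot files back, all of which this toy ignores.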

    Jaesun Han from NexR gave a talk on Terapot, a system for long-term storage and discovery of emails due to legal requirements and litigation.  I asked him about whether they were relying on HDFS as their primary storage mechanism, or if they “backed up” to some external solution.  He laughed, and said that they weren’t using one now, but would probably get some sort of tape solution in the near future.  He also said that he believed HDFS was quite capable of being the sole archival solution, and I believe he was implying that it was fear from the legal and/or management folks that was driving a “back up” solution.  At this point, the Cloudera CTO noted that both Yahoo and Facebook had no “back up” solution for HDFS, except for other HDFS clusters.  It certainly seems like at least a couple multi-million dollar companies are willing to put their data where their mouth is on the reliability of HDFS.

    What’s Coming

    There is a tremendous sense that Hadoop has really matured in the last year or so.  But it’s also been noted that the APIs are still thrashing a bit, and it’s still awfully Java-centric.  Now that the underlying core is pretty solid, it seems like a lot of the work is moving towards making your Hadoop grid accessible to the rest of the company – not just the uber-geek Java coders.

    Doug Cutting talked about how they’re working on building some solid, future-proof APIs for 0.21.  Included in this is switching the RPC format to Avro, which is intended to solve some of the underlying issues with Thrift and Protocol Buffers while opening up the RPC and data format to a broader class of languages.  It’s worth noting that Avro and JSON are pretty easily transcoded to one another.  Also, they’ll finally be putting some serious thought into a real authentication and authorization scheme.  Yahoo (I think) mentioned Kerberos – let’s hope we get some OpenID up in that joint, too.

    There is a sudden push towards making Hadoop accessible via various UIs.  Cloudera introduced their Hadoop Desktop, Karmasphere gave a whirlwind tour of their NetBeans-based IDE, and IBM was showing off a spreadsheet metaphor on top of Hadoop called M2 (I can’t find any good links for it).  I hadn’t thought about that before, and it seemed so simple it was brilliant; Doug Cutting mentioned the idea, too, so it has some cachet.

    Final Thoughts

    It is worth noting that Facebook seems to be driving a lot of the really cool backend stuff, and people are noticing.  That’s not to say other organizations aren’t doing cool things, but during the opening presentations, Facebook got all the questions.  I mean, Dhruba recently posted a patch adding error-correcting codes on top of HDFS.  How cool is that?!

  • Death of My Kurobox

    Posted on October 4th, 2009 Brian No comments

    After serving faithfully for over three years, and a year-and-a-half after getting a hard drive upgrade, my Kurobox Lulu died this past week.  I suspect a power surge caused a stroke.

    You see, I went to turn on the printer, and it started cursing at me with long beeps and flashing lights.  Google told me that I should try unplugging it, holding down some keys, and then plugging it back in.  Unfortunately, because I didn’t have the plugs labeled, I accidentally unplugged Lulu.  She wouldn’t start up when I turned her back on.

    Connecting the net console to the bootloader, I discovered that it would hang shortly after decompressing the kernel.  Worse, it failed the exact same way when trying to boot into EM mode.  I could poke around on the hard drive, thanks to U-Boot’s features, so I was left to conclude that the motherboard had been fried – probably the memory.  The coincidence with the printer’s death leads me to believe it was a power surge.

    So now I had to decide with what to replace the venerable Lulu.  Despite her minuscule memory and pathetic processor, she had served admirably in several roles, including a Galleon TiVo server, a Samba file server, a print server, a scan server, running a TwitterBot in IRC, and regularly syncing data to Amazon S3.  Though three years had passed, the Kurobox was still selling new for $150, or more if you got their fancy ARM version.  Skeptical that they were still worth that much, I started shopping around.

    So, what can you buy today for the $150 you could spend on a Kurobox?  How about a 1.6 GHz Atom 230 CPU/motherboard combo ($64.99), 2GB of Kingston memory ($35.49), and a Mini ITX Computer Case with 200W Power Supply ($39.99) in which to house it all?  So for about the same price as the Kuro, you can get a modern, mainstream CPU and a generous quantity of memory.  I was prepared to spend more – not having to deal with the headaches of U-Boot and a PPC Gentoo build would have been worth an extra hundred easy – but at the same price it was just a no-brainer.
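
    For the record, here is the parts tally as a quick Python snippet, using only the three prices quoted above.  It ignores shipping, tax, and the hard drive, so treat it as a rough comparison.

```python
from decimal import Decimal

# Prices as quoted in the post; Decimal avoids float rounding on cents.
parts = {
    "Atom 230 CPU/motherboard combo": Decimal("64.99"),
    "2GB Kingston memory": Decimal("35.49"),
    "Mini ITX case with 200W power supply": Decimal("39.99"),
}
kurobox_price = Decimal("150")

total = sum(parts.values(), Decimal("0"))
print(f"Parts total: ${total}")
print(f"Under the Kurobox price by: ${kurobox_price - total}")
```

    The three parts come to $140.47, or $9.53 under the Kurobox.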

    So I ordered it all, and it should all be arriving this week.  Hopefully I’ll be able to get new Lulu on her feet quickly; Hedda and I are both really missing her.