pigeonhole the fool: code

I've been backposting stuff originally posted on tumblr, SVrider¹, and ADVrider² onto my blog here. To make the older posts more accessible, I decided the blog needed a lazyloader.

I'd recently read a post by Alexander Micek on infinite scrolling, and I appreciated his aversion to hashbangs and poor user experience. After a day of adapting his code to this blog's purposes, I'd managed to make it very generic.

If you're interested in adding infinite scrolling to your own site, you're welcome to use it as inspiration or implementation. ~~I've added a whole bunch of documentation and put it on gist~~.

edited on friday january 13th, 2012 at 14:06:

Henry took an interest in the code and we discovered that you can't easily pull others' changes back into a gist. The infinite scrolling code is now part of jslib, as infScr-x.y.z.js and infScr-x.y.z.min.js.

¹ the death valley trip (original)
² cross country trip parts: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13/14, 15, 16 (original)

posted on monday december 5th, 2011 at 1:22

A while ago I was trying to get descriptors from devices asynchronously. In some situations, such as when you catch a Mass Storage device while it's busy servicing a CDB, the device can drop control pipe requests on the floor, causing DeviceIoControl to block until Windows loses patience with the device and resets it (about five seconds).

The naïve approach is to use an OVERLAPPED structure with DeviceIoControl, since that's how you do async IO on Windows, but this doesn't work. It's up to the device driver to determine whether the call will be completed synchronously or not, and the Windows USB drivers complete all calls synchronously unless the pipe gets stalled by the device (normally between a URB submission and completion the device just NAKs until the result is ready). This is extremely uncommon, and impossible for the control pipe (which is where descriptor requests go) because the control pipe is used to clear stalls. Stall the control pipe and you wedge the device, leaving a device reset as the only option. In five seconds.

The solution I wound up using is not one I'm particularly proud of, but it does have the advantage of working: I didn't mind a synchronous call (in fact, it made things easier), but I didn't want to deal with getting wedged when a device went out to lunch, so I spawned a thread¹ and used WaitForSingleObject to set my timeout. Plan on waiting 80ms or so per quarter-kilobyte expected. Measured very unscientifically, a config descriptor without interfaces (nine bytes) takes less than 10ms, average 4.7ms, standard deviation 0.00831.
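The timeout rule of thumb above is simple arithmetic. A minimal sketch, assuming you round up to whole quarter-kilobytes and never wait less than one unit (both choices are mine, as is the name descriptorTimeoutMs):

```javascript
// Rule of thumb from the text: roughly 80ms per quarter-kilobyte expected.
const MS_PER_CHUNK = 80;
const CHUNK_BYTES = 256;

// Hypothetical helper: how long to wait before giving up on the worker thread.
// Rounding up to a whole chunk is an assumption, not something I measured.
function descriptorTimeoutMs(expectedBytes) {
  const chunks = Math.max(1, Math.ceil(expectedBytes / CHUNK_BYTES));
  return chunks * MS_PER_CHUNK;
}

console.log(descriptorTimeoutMs(9));    // 80
console.log(descriptorTimeoutMs(1024)); // 320
```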

Note that you want to open the HANDLE you're using in the DeviceIoControl call in the thread too—if there's another request in flight, opening the file handle will block until it's completed, which is exactly what we're trying so hard to avoid. You shouldn't really be calling for another descriptor from the same device right after it's failed though, because it's likely about to be reset by Windows.

Bonus pitfall: For configuration and string descriptors, you don't know how big the descriptor is, so you can grab a portion of it, get the bLength (wTotalLength for configuration descriptors) and use that to get the rest of the descriptor, or you can just request UINT16_MAX and cross your fingers.

It turns out that neither of these approaches works for all devices, whatever size n you pick for the initial partial request. To save some space, but keep the number of round trips to the device low, I always used a 256 byte buffer in my requests, and failed over to a larger buffer if the descriptor was too big. Unfortunately, in the classic "just bang on it until it starts working" approach of USB vendors, some devices will just not respond if you make a request with a buffer size that is not the exact size of the descriptor or sizeof UsbConfigurationDescriptor (nine bytes). The reason for this appears to be one of convenience: Windows itself only requests either nine bytes or the entirety of descriptors, so some device firmwares were written with only these two cases in mind.
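A minimal sketch of the nine-bytes-then-exact-size sequence, which are the only two request sizes such firmware expects. It assumes a hypothetical requestDescriptor(length) callback that stands in for the real DeviceIoControl round trip and returns the bytes the device sent. The fake device below is also made up for illustration. The one hard fact used is the layout of the configuration descriptor header: wTotalLength lives at byte offsets 2 and 3, little-endian.

```javascript
// Size of the fixed configuration descriptor header.
const CONFIG_HEADER_SIZE = 9;

// requestDescriptor(length) is a hypothetical callback: it asks the device for
// `length` bytes of the configuration descriptor and returns a Uint8Array.
function readConfigDescriptor(requestDescriptor) {
  // First round trip: just the nine byte header.
  const header = requestDescriptor(CONFIG_HEADER_SIZE);
  if (header.length < CONFIG_HEADER_SIZE) {
    throw new Error('short configuration descriptor header');
  }

  // wTotalLength is a little-endian 16 bit value at offsets 2 and 3.
  const totalLength = header[2] | (header[3] << 8);
  if (totalLength <= CONFIG_HEADER_SIZE) {
    return header; // no interfaces, nothing more to fetch
  }

  // Second round trip: exactly the size the device told us, no more, no less.
  return requestDescriptor(totalLength);
}

// A made-up picky device that ignores any request size it doesn't expect.
const fakeDescriptor = new Uint8Array(34);
fakeDescriptor.set([9, 2, 34, 0]); // bLength, bDescriptorType, wTotalLength
function pickyDevice(length) {
  if (length !== 9 && length !== fakeDescriptor.length) {
    throw new Error('device did not respond');
  }
  return fakeDescriptor.slice(0, length);
}

console.log(readConfigDescriptor(pickyDevice).length); // 34
```

This costs two round trips for every descriptor with interfaces, which is the price of never sending a size the firmware wasn't written for.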

¹ more accurately: reused a vthread, but the internal implementation of our threading library is not germane to this article

posted on friday march 4th, 2011 at 15:57

A little writeup about how GCD works, and how sometimes it can make your performance *worse*.

(via boredzo)

edited on sunday november 20th, 2011 at 5:57:

dispatch_io, a volume-aware addition to GCD, was shipped with Lion and iOS5.

posted on tuesday january 11th, 2011 at 12:01

composing stick stores posts as xml snippets that, after being sent through a simple markup translator, can be dropped on a page with any content (especially other posts). care is taken to make sure that posts do not interfere with each other or change any global state.

but lately I've been using a fair amount of javascript in posts, and I wanted a way to prevent loading libraries multiple times. if a library is not written well, reloading its source file could destroy its internal state. some sort of include guard is needed for posts to continue to be able to stand alone.

I think I've found a solution. put the following into a file called include_once.js:

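// this file runs once per guarded include, so only define the helpers if
// nothing else has; that way a native or earlier definition isn't clobbered
if(!String.prototype.trim) {
  String.prototype.trim = function () {
    return this.replace(/^\s*/, "").replace(/\s*$/, "");
  };
}

if(!String.prototype.basename) {
  String.prototype.basename = function() {
    return this.replace(/^.*\//, '');
  };
}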

Node.prototype.insertAfter = function(newNode, refNode) {
  if(refNode.nextSibling) {
    return this.insertBefore(newNode, refNode.nextSibling);
  } else {
    return this.appendChild(newNode);
  }
}

var scripts = document.getElementsByTagName('script');
var tracked_files = tracked_files || {};
// unfortunately, Javascript doesn't do __FILE__
var this_file = "include_once.js";

for(var i = 0, ii = scripts.length; i < ii; i++) {
  if(!scripts[i].src || scripts[i].iterated) continue;
  scripts[i].iterated = "yes";

  /* if there's a # in the filename and the preceding portion is this file's name,
   * then include the string after the # as the script to guard.
   */
  var files = scripts[i].src.split('#', 2);
  if(files.length == 2 && files[0].basename().trim() == this_file) {
    //console.log("called to include: "+files[1]);

    // only load each script once.
    if(tracked_files[files[1]]) continue;
    tracked_files[files[1]] = files[1];

    // first time!
    var newcontent = document.createElement('script'); 
    newcontent.src = files[1];
    newcontent.type = "text/javascript";
    newcontent.charset = "utf-8";
    newcontent.iterated = "yes";
    scripts[i].parentNode.insertAfter(newcontent, scripts[i]);
    //console.log("included file: "+files[1]);
  }
}

use it like this:

<!-- comments show DOM state after the preceding include_once.js has been run -->
<script src="include_once.js#a.js" type="text/javascript" charset="utf-8"></script>
<!-- <script src="a.js" type="text/javascript" charset="utf-8"></script> -->
<script src="include_once.js#a.js" type="text/javascript" charset="utf-8"></script>
<script src="include_once.js#b.js" type="text/javascript" charset="utf-8"></script>
<!-- <script src="b.js" type="text/javascript" charset="utf-8"></script> -->
<script src="include_once.js#b.js" type="text/javascript" charset="utf-8"></script>
<script src="include_once.js#b.js" type="text/javascript" charset="utf-8"></script>
<script src="include_once.js#a.js" type="text/javascript" charset="utf-8"></script>

this code works reliably in Firefox, but it doesn't work consistently under Safari. haven't tested with other browsers.
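for anyone who wants to poke at the guard logic without a browser, here's a rough DOM-free model of it. the names makeIncludeGuard and shouldLoad are made up for this sketch, and it only models the bookkeeping (split on #, check the file name, remember what's been seen), not the script tag insertion:

```javascript
// same idea as String.prototype.basename, as a plain function
function basename(path) {
  return path.replace(/^.*\//, '');
}

// hypothetical helper: returns a function that, given a script src, answers
// with the file to load, or null if the src isn't a guard request or the
// file has already been loaded.
function makeIncludeGuard(thisFile) {
  const tracked = new Set();
  return function shouldLoad(src) {
    const parts = src.split('#', 2);
    if (parts.length !== 2 || basename(parts[0]).trim() !== thisFile) {
      return null; // not an include_once.js#something request
    }
    if (tracked.has(parts[1])) {
      return null; // only load each script once
    }
    tracked.add(parts[1]); // first time!
    return parts[1];
  };
}

// the same sequence of includes as the usage example
const guard = makeIncludeGuard('include_once.js');
const requested = ['a.js', 'a.js', 'b.js', 'b.js', 'b.js', 'a.js'];
const loaded = requested
  .map(function (file) { return guard('include_once.js#' + file); })
  .filter(Boolean);
console.log(loaded); // [ 'a.js', 'b.js' ]
```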

this post continues a trend of doing stupid things with Javascript.

posted on friday april 9th, 2010 at 16:02

this snippet implements "string".trim() in Javascript, as found in many other languages.

if(!String.prototype.trim) {
  String.prototype.trim = function () {
    return this.replace(/^\s*/, "").replace(/\s*$/, "");
  }
}

another for basename():

if(!String.prototype.basename) {
  String.prototype.basename = function() {
    return this.replace(/^.*\//, '');
  }
}

posted on friday april 9th, 2010 at 14:09

a quick example demonstrating how to get data from the system using AppleScript within Java:

import javax.script.*;

class Test {
  public static void main(String[] args) throws Throwable {
    // before running, select a contact with an image in Address Book.app
    String script = "tell application \"Address Book\"\n"
                  + "   set contacts to selection\n"
                  + "   set contact to item 1 of contacts\n"
                  + "   set photothing to image of contact\n"
                  + "end tell";

    ScriptEngineManager mgr = new ScriptEngineManager();
    ScriptEngine engine = mgr.getEngineByName("AppleScript");
    Object retval = engine.eval(script);

    // prints: java.awt.image.BufferedImage
    System.out.println(retval.getClass().getName());
  }
}

while the shipping product was largely written by Mike Swingler, this was my intern project in 2007.

posted on thursday april 8th, 2010 at 18:17

I've been pretty far behind on feeds lately, which means catching up in binges, which means missing out on things that take some time to process and follow up on. luckily jauricchio always seems to be looking out for me.

take for example the Academia vs. Business comic from xkcd, clearly written with an academic's bias (as pointed out by Wil Shipley). I left it at that, completely skipping the tooltip, which referred to the value 0x5f375a86 as being special. luckily jauricchio caught it and looked it up: it's a magic number from the fast computation of inverse square roots (used a lot in 3-D graphics), a later refinement of the better-known original constant, 0x5f3759df. the code, courtesy Wikipedia:

float InvSqrt (float x)
{
    float xhalf = 0.5f*x;
    int i = *(int*)&x;
    i = 0x5f3759df - (i>>1);
    x = *(float*)&i;
    return x*(1.5f - xhalf*x*x);
}

the Wikipedia page also contains the math that isolated the correct magic number.
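the same trick can be tried out in Javascript, which has no pointer casts but can get at the bits of a float by pointing two typed arrays at one buffer. this is just a sketch for playing with the numbers (the name invSqrt and the test values are mine), not something you'd want over 1 / Math.sqrt(x):

```javascript
// one 4 byte buffer, viewed both as a 32 bit float and as a 32 bit int
const bits = new ArrayBuffer(4);
const asFloat = new Float32Array(bits);
const asInt = new Int32Array(bits);

function invSqrt(x) {
  const xhalf = 0.5 * x;
  asFloat[0] = x;                            // store x as a float...
  asInt[0] = 0x5f3759df - (asInt[0] >> 1);   // ...and do integer math on its bits
  const guess = asFloat[0];                  // read the result back as a float
  return guess * (1.5 - xhalf * guess * guess); // one Newton-Raphson step
}

// compare against the real thing
[0.25, 2, 4, 100].forEach(function (x) {
  console.log(x, invSqrt(x), 1 / Math.sqrt(x));
});
```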

I love hacks like this.

posted on thursday december 3rd, 2009 at 0:25

this blog generates its pages by shell script, and it's been a bit of a challenge to make the engine portable, so generated content and the code itself can be run anywhere.

one way this is made easier is by using <base href="<?= $blogroot ?>" />, which tells the user agent to resolve every relative source and reference against $blogroot. as I developed on Safari, everything worked.

then I asked some friends to look at it, and it turned out that the base tag doesn't quite work like I'd hoped: you can't use a relative path as your base href, and Firefox enforces this strictly.

there are two options for fixing this: the blog engine itself has to know where it lives on the server, so it can populate the base href correctly, making generated content less portable -or- the output generation could be smarter and prepend the relative base href to every source and reference it sees, making the output code messier.

luckily, if you're ok with requiring that your clients support javascript (which I am), there's a third option. behold:

<script type="text/javascript" id="base_href">
  Node.prototype.insertAfter = function(newNode, refNode) {
    if(refNode.nextSibling) {
      return this.insertBefore(newNode, refNode.nextSibling);
    } else {
      return this.appendChild(newNode);
    }
  }

  var newcontent = document.createElement('base'); 
  newcontent.href = document.baseURI.substring(0, 
    document.baseURI.lastIndexOf('/')) + '<?= $blogroot ?>';

  var here = document.getElementById('base_href');
  here.parentNode.insertAfter(newcontent, here);
</script>

put that in the beginning of your <head> and it will emit the correct absolute base href at runtime.

thanks to Christian Hammond for the idea.
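the hack above boils down to turning a relative blog root into an absolute URL using the address of the current page. on runtimes that have the standard URL constructor (which the browsers of the day did not) the same computation can be sketched like this. note this swaps the substring-and-concatenate approach for URL resolution; absoluteBaseHref and the example addresses are made up:

```javascript
// hypothetical helper: resolve a relative blog root against the page's own
// address to get an absolute href suitable for a <base> tag.
function absoluteBaseHref(documentURI, blogroot) {
  return new URL(blogroot, documentURI).href;
}

// a post two directories below the blog root
console.log(absoluteBaseHref('http://example.com/blog/posts/2009/entry.html', '../../'));
// prints: http://example.com/blog/
```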

edited on friday february 19th, 2010 at 21:38:

unfortunately this blog no longer uses this hack because the base href was interfering with document bookmarks.

edited on friday april 9th, 2010 at 15:26:

this post used to feature a version using document.write() that didn't work in Opera, IE, and Konqueror. while I haven't extensively tested this new version, I expect it to work better.

posted on thursday september 10th, 2009 at 15:18

the PHP library makes me sad sometimes, like earlier tonight.

I've been working on the blog engine, trying to figure out how to get various bits of markup working that I don't want to have to write by hand every post¹, and it requires doing XML manipulation in PHP.

if you take the time to look around on the internet (and I did), you'll see a lot of people who want to manipulate HTML or XML in PHP, and all the replies recommend things like SimpleXML, which can't remove or change tags, or DOM, which also didn't work for my needs². exhausting these options, every thread ends with "use str_replace" or "use preg_replace". sigh. what's the point in having these libraries if they're not actually useful?

I want to do this right, damnit, and I'm going to use a tool that parses my pseudo-HTML fragments into a tree and allows me to add, change, or remove tags at will!

luckily, as I was resigning myself to writing a library from scratch, I discovered simplehtmldom. unlike other libraries I've tried using recently, simplehtmldom worked right out of the box with no issues whatsoever. code example follows.

¹ especially footnotes
² DOM only accepts properly formed XML or HTML with an all-containing root node and will only create output with a doctype and a single root node. I wanted something that would create output as close as possible to the input.

more…

posted on wednesday september 9th, 2009 at 1:58

cute snippet for checking if a number is a power of 2:

(n & (n - 1)) == 0 && n != 0
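why it works: a power of 2 has exactly one bit set, and subtracting 1 clears that bit and sets every bit below it, so the two values share no bits. a sketch as a Javascript function (isPowerOfTwo is my name for it). the parentheses matter, since == binds tighter than & in C-style languages, and the extra range checks are my own additions because Javascript's bitwise operators only look at the low 32 bits:

```javascript
function isPowerOfTwo(n) {
  // bitwise operators truncate to 32 bits, so stay inside that range
  if (!Number.isInteger(n) || n <= 0 || n > 0xFFFFFFFF) {
    return false;
  }
  // one bit set means n and n - 1 have no bits in common
  return (n & (n - 1)) === 0;
}

console.log(isPowerOfTwo(1024)); // true  (10000000000 & 01111111111 is 0)
console.log(isPowerOfTwo(12));   // false (1100 & 1011 is 1000)
console.log(isPowerOfTwo(0));    // false
```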

edited on friday march 19th, 2010 at 13:37:

this was used fairly extensively in the typewriter project

posted on monday august 3rd, 2009 at 15:25