Keep Working, Worker Bee!


It's Worker Bee! PLDI Roundup, Day 2! My attendance today was less than perfect, so my report here isn't going to be all that complete, but on the plus side they served soft pretzels so that's something.

Session 5: Register Allocation

Register Allocation for Software Pipelined Multi-Dimensional Loops
Differential Register Allocation
Demystifying On-the-fly Spill Code

I missed this session, unfortunately: I had a flash of insight on how to solve a particular theory problem and spent over an hour writing on the backs of conference call-for-papers flyers, proving a theorem over and over. So, sorry, but if you're trying to allocate registers you're gonna have to look to another weblog.

Session 6: Instrumentation and Testing

Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation
Those wacky Intel folks were out in force today. This was a tool that allows you to write other tools that monitor programs. I have to admit, most of the talks today were very technical in areas I don't really care about all that much, so I didn't really follow them as well as I did yesterday. A lot of people during the Q&A said they really liked Pin, so I guess it's probably awesome.

TraceBack: First Fault Diagnosis by Reconstruction of Distributed Control Flow
This was kind of a complement to the statistical bug finding talk from yesterday. These people annotate production code with enough information to reconstruct just the control flow that led to the program's current state when it crashes (though not the values present at each step of computation up to that point). They have a fancy graphical debugger and so on to go with it, and they say it works well. During Q&A the speaker indicated that people tend to think not having the values along with the control flow will make the reported information useless, but in practice people tend not to need it.

DART: Directed Automated Random Testing
So you want to unit-test some code, but that's a pain. Can the computer do it for you? Yes it can. Say some function takes only "flat" values as inputs (i.e. values that have no particular structure and that you can just pull out of a bag without too many problems). Pick some of them at random and run the function, also watching how the program examines them (did the function test whether the number you gave it was greater than 0? That it was less than some global? Et cetera). For every constraint you gather this way, try to find inputs that violate the constraint and see what the program does for those. In that manner you can (at least try to) get complete code coverage rapidly, and if there are bugs in any branch you'll find them.

This talk gave me the eeriest feeling that I'd heard it before somewhere, but I couldn't figure out where.

During Q&A the speaker clarified that the program only really works for functions whose inputs are easy to generate randomly. In particular, it doesn't do very well with function pointers because it's hard to pluck those out of the ether.

Session 7: Network Processing

Shangri-La: Achieving High Performance from Compiled Network Applications while Enabling Ease of Programming
Intel again. They introduced a DSL for programming network processors and talked about how to compile that DSL really well.

Automatically Partitioning Packet Processing Applications for Pipelined Architectures
Another talk about parallel processors. This one was kind of the opposite of the first: they took programs with an explicit processing loop written in standard C and figured out how to chop that loop into something more parallelizable.

Both these talks are the sort of thing I find it hard to get into. Sorry if these reports are a bit cursory, but that's the reason.

Programming Ad-Hoc Networks of Mobile and Resource-Constrained Devices
Okay, there are bunches of networkable devices these days: cell phones, PDAs, Game Boys, whatever. You'd like for them all to work together to solve problems, but to do that you need to be able to write programs that will automatically distribute themselves to the nodes on the network that have the resources, even in the face of geographic and power constraints. This work gives some linguistic tools to do that. The coolest part was the example applications: a game of Pac-Man where you and your friends play Pac-Man and the ghosts, running around an actual physical board while your Palm Pilots track which dots you've eaten and whether you've been captured; and finding your friend at the airport by having every camera-phone take a picture simultaneously and then having some powerful processor run face recognition on each image. Totally awesome!


Threads Cannot be Implemented as a Library
Somebody set up us the Boehm! Hans Boehm talked about how you can't design a thread system if the underlying language specification doesn't support it. As an example he "picked on" the pthreads library and concluded that "it is impossible to write portable pthreads code" due to the fact that memory locations are integral to the thread model but undefined by the C standard and various related reasons. Convincing, interesting talk. Read the paper.

Mitosis Compiler: an Infrastructure for Speculative Threading Based on Pre-Computation Slices
This looked like a compiler for chopping up sequential programs into parallel threadable chunks. I skipped out on this talk, unfortunately, so I can't tell you how it went.


  • This talk gave me the eeriest feeling that I'd heard it before somewhere, but I couldn't figure out where.


    By Blogger Dave Herman, at 09:24  

  • Well, this does something a little different from QuickCheck unless I'm mistaken. If I understand it right, QuickCheck dumps tons of automatically-generated test cases onto your function in the hopes that they'll execute all the interesting cases; DART actually keeps track of why it hit certain branches and tries to pick values that will hit other branches instead. They gave examples where they found really complicated flaws in security protocols that required very specific, complicated attacks to get at, and others they found in like 4 steps that would require you to win a 1-in-2^64 bet to get with just random guessing.

    By Blogger Jacob, at 16:57  

  • Oh, I also remember seeing a talk about Eclat at NEPLS this February. Might be relevant.

    By Blogger Dave Herman, at 23:04  

  • Another tool related to DART is Check-n-Crash. It analyses code using ESC/Java coupled with a constraint solver to find possible inputs that lead to crashes, and then generates unit tests automatically.

    By Anonymous Zayenz, at 02:17  

  • I've used Haskell QuickCheck a fair bit, and I've just skimmed the DART paper.

    It looks like DART is designed to force tests over all execution paths. Hmmm, that's at least exponential in the number of branches, and perhaps worse if you're concerned about order-of-calls contexts.

    The cite to QuickCheck in the DART paper says you can control its tests by choosing probabilities. It's better than that, really: QuickCheck can give you very tight control over the kinds of tests it generates. You can control the number of tests and their exact contents, if you want, and you can generate a combination of random values and specific values for corner cases. The downside is that it's often very painful/tedious to write the test generators.

    So DART has the advantage of being automatic, but you need a special tool. QuickCheck is just a Haskell library.

    -- Paul

    By Blogger Paul Steckler, at 15:29  

  • This talk gave me the eeriest feeling that I'd heard it before somewhere, but I couldn't figure out where.

    This paper seems pretty close.

    A Randomized Satisfiability Procedure for Arithmetic and Uninterpreted Function Symbols, Sumit Gulwani and George Necula; CADE (Conference on Automated Deduction) 2003, to appear in a special issue of Information and Computation, Elsevier.

    By Anonymous Anonymous, at 01:26  
