Tuesday, July 24, 2007

OSCON: Higher-order Perl

My morning tutorial today is Higher-order Perl, the tutorial of the book, given by Mark-Jason Dominus.



Mark-Jason Dominus talking about Higher-order Perl

After rolling in five minutes late, in typical Mark-Jason style, he booted the laptop, stood up at the front and said...

Thanks, any questions...

Suffice it to say he got a laugh, and got the audience on his side. I've been to a bunch of Mark-Jason's talks before; for instance, back at OSCON in 2005 I went to his Making Programs Faster, which was excellent. While he isn't Damian Conway, and you won't get him proving the 2nd law of thermodynamics using the Game of Life, he's a really good speaker.

The point of this tutorial, and the book, is that a lot of people write C programs in Perl, which is a bit of a waste of time. Mark-Jason is trying to persuade them to write Perl programs in Perl.

Update: He's started talking about caching and the Memoize module:

use Memoize;
memoize 'date_to_key';

What Memoize does is replace the true function with a wrapper function. With people looking a bit confused, he's dropped back a fair way to talk about closures, e.g.

sub make_function {
    my $val = shift;
    # The returned anonymous sub keeps its own private copy of $val
    return sub { print "Value is $val.\n"; ++$val; };
}

The anonymous subroutine captures the lexical variable $val at the time it is created, which of course is why these are called closures. Interestingly there are still people looking confused, oh boy, they're so in the wrong tutorial...
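For anyone who was one of the confused faces, here's a minimal sketch of why the captured variable matters. This is my own example, not one from the slides: make_counter is a hypothetical variant of make_function that returns the value instead of printing it, so you can see that each closure gets its own private copy.

```perl
use strict;
use warnings;

# Hypothetical variant of make_function: returns the current value
# (then increments it) rather than printing it.
sub make_counter {
    my $val = shift;
    return sub { return $val++; };
}

# Two closures, each with its own private $val.
my $from_ten = make_counter(10);
my $from_one = make_counter(1);

# Calling one counter does not disturb the other.
my @seen = ($from_ten->(), $from_ten->(), $from_one->(), $from_ten->());
print "@seen\n";    # prints "10 11 1 12"
```

The two counters never interfere with each other, because every call to make_counter creates a fresh $val for its closure to hang on to.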

Update: Back to Memoize. We pass it a function name, and it constructs a new closure with a private reference to that function. Each memoized function has a real and a cached version, and Memoize uses a glob assignment to install a reference to the wrapper function into the symbol table. Then, when the wrapper function is invoked, we grab the identity of the function and peek in the cache. If the result isn't in the cache yet, we call the real function through the saved reference, stick the result in the cache for next time, and then return it. If it is in the cache, we just return the cached value. Of course, it's a bit more complicated than that; you have to take account of scalar and list context, for instance.
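Roughly, the mechanism looks something like the sketch below. To be clear, this is not the real Memoize source: my_memoize and slow_square are names I've made up, it assumes the function lives in package main, and it ignores the scalar versus list context problem entirely.

```perl
use strict;
use warnings;

my $calls = 0;

# A stand-in for an expensive function; counts how often it really runs.
sub slow_square {
    my $n = shift;
    $calls++;
    return $n * $n;
}

# Hypothetical, much simplified stand-in for Memoize's memoize().
sub my_memoize {
    my $name = shift;
    my $full = "main::$name";        # assumption: function is in package main
    no strict 'refs';
    my $real = \&{$full};            # private reference to the real function
    my %cache;                       # private cache, captured by the closure
    my $wrapper = sub {
        my $key = join chr(28), @_;  # build a cache key from the arguments
        $cache{$key} = $real->(@_) unless exists $cache{$key};
        return $cache{$key};
    };
    no warnings 'redefine';
    *{$full} = $wrapper;             # glob assignment installs the wrapper
    return $wrapper;
}

my_memoize('slow_square');

my @results = (slow_square(4), slow_square(4), slow_square(5));
print "@results after $calls real calls\n";   # prints "16 16 25 after 2 real calls"
```

The second slow_square(4) never reaches the real function; it is answered straight out of %cache, which only the wrapper closure can see.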

Update: After more Memoize goodness he has moved on to talk about Iterators. For the non-Java people in the audience, an iterator is an object interface to a list. It supports a next method that generates the next item when it is needed. So why do you want them as a Perl person? Well, the list might be large, or it might take a long time to come up with list elements, or you might not know in advance how many items you want. Of course Perl file handles are iterators, with the <...> operator as the next method, and iterators turn up everywhere in Perl even though we don't call them that explicitly. Interestingly Python doesn't use this concept for reading directory listings, and Perl does...
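Here's a minimal sketch of the idea using nothing but a closure. The helper name upto is my own invention for illustration; the point is just that each call produces the next item on demand, the same way <$fh> hands you the next line.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an iterator over the integers $from..$to.
# The whole list is never built; each call computes the next item.
sub upto {
    my ($from, $to) = @_;
    return sub {
        return if $from > $to;   # exhausted: undef in scalar context
        return $from++;
    };
}

my $it = upto(3, 5);
my @got;

# Same loop shape as: while (defined(my $line = <$fh>)) { ... }
while (defined(my $n = $it->())) {
    push @got, $n;
}
print "@got\n";   # prints "3 4 5"
```

Using undef as the "no more items" signal works here because the items are never undef themselves; an iterator over arbitrary values would need a different convention.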

Update: Hurrah, we have an "octopus" reference when talking about File::Find, we've all been waiting...

Update: Mark-Jason is making his way through an Iterator example re-implementing File::Find, and is talking about classes built by bless'ing subroutines rather than hashes. There are a few looks of pain in the audience, mostly from the people that didn't seem to follow closures.
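To give a flavour of the blessed-subroutine trick, here's a rough sketch of a directory-walking iterator. This is my own reconstruction rather than the code from the slides: the DirWalker class and its breadth-first queue are assumptions, and the temporary directory is only there so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

package DirWalker;   # hypothetical class name

# The object is a blessed code reference; all of its state
# (the queue of paths still to visit) lives inside the closure.
sub new {
    my ($class, $top) = @_;
    my @queue = ($top);
    my $self = sub {
        return unless @queue;
        my $path = shift @queue;
        if (-d $path && opendir(my $dh, $path)) {
            my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
            closedir $dh;
            push @queue, map { File::Spec->catfile($path, $_) } sort @entries;
        }
        return $path;
    };
    return bless $self, $class;
}

# The next method just invokes the blessed subroutine.
sub next { my $self = shift; return $self->() }

package main;

# Build a tiny throwaway tree to walk: a.txt and sub/b.txt.
my $top = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($top, 'sub') or die "mkdir: $!";
for my $f (File::Spec->catfile($top, 'a.txt'),
           File::Spec->catfile($top, 'sub', 'b.txt')) {
    open my $fh, '>', $f or die "open $f: $!";
    close $fh;
}

my $walker = DirWalker->new($top);
my @found;
while (defined(my $path = $walker->next)) {
    push @found, $path;
}
print scalar(@found), " paths found\n";   # prints "4 paths found"
```

Unlike File::Find, which calls your callback for every file, the caller here is in charge and can stop asking for paths whenever it has seen enough.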

Update: Time for coffee, back after these messages...

Update: So unlike yesterday [1, 2] I'm actually having to work to keep up here, which is good because I've got a sneaking suspicion that this afternoon's tutorial on data mining isn't really going to stretch me that much.

Update: ...and we're back, still talking about Iterators and infinite lists.

Update: He's talking about linked lists, which he claims aren't particularly useful in Perl because the array data type handles pretty much everything you'd want to do with linked lists. That's true in general, although I do have linked lists buried inside the two hundred thousand lines of object-oriented Perl sitting in my project's CVS archive.

Perl loves you, that's why we're all here...

Update: We're moving on to parsing, taking a big pile of unstructured input and turning it into a data structure. Perl actually hides a lot of this from you, for instance,

while (<$fh>) {
    # do something with $_
}

is actually a parser after all. But at some point your ad-hoc parser won't be enough, and you'll need to go to something like Parse::RecDescent. Of course Parse::RecDescent is a closed system, and you really need to have an open architecture to cover your bases, and it looks like he's going to build one in front of our eyes using recursive-descent parsing which, funnily enough, is the same algorithm used by Parse::RecDescent.
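If you've never seen recursive descent, here's a hand-rolled sketch of the general technique. It's nothing like the parser toolkit Mark-Jason builds; the little arithmetic grammar and every function name here are my own choices for illustration. Each grammar rule becomes one subroutine, and the result is a nested array data structure.

```perl
use strict;
use warnings;

# Illustrative grammar (my choice, not the one from the tutorial):
#   expr   := term   (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'

# Split the input into number and operator tokens.
sub tokenize {
    my $input = shift;
    my @tokens = $input =~ m{\d+(?:\.\d+)?|[-+*/()]}g;
    return \@tokens;
}

sub parse_expr {
    my $t = shift;
    my $node = parse_term($t);
    while (@$t && ($t->[0] eq '+' || $t->[0] eq '-')) {
        my $op = shift @$t;
        $node = [$op, $node, parse_term($t)];
    }
    return $node;
}

sub parse_term {
    my $t = shift;
    my $node = parse_factor($t);
    while (@$t && ($t->[0] eq '*' || $t->[0] eq '/')) {
        my $op = shift @$t;
        $node = [$op, $node, parse_factor($t)];
    }
    return $node;
}

sub parse_factor {
    my $t = shift;
    die "Unexpected end of input\n" unless @$t;
    my $tok = shift @$t;
    return $tok if $tok =~ /^\d/;          # a number is a leaf
    if ($tok eq '(') {
        my $node  = parse_expr($t);        # recurse for the bracketed part
        my $close = shift @$t;
        die "Expected ')'\n" unless defined $close && $close eq ')';
        return $node;
    }
    die "Unexpected token '$tok'\n";
}

# Parse a string into a tree of [operator, left, right] nodes.
sub parse {
    my $t    = tokenize(shift);
    my $tree = parse_expr($t);
    die "Trailing input: @$t\n" if @$t;
    return $tree;
}

# Walk the tree to compute its value.
sub evaluate {
    my $node = shift;
    return $node unless ref $node;
    my ($op, $l, $r) = ($node->[0], evaluate($node->[1]), evaluate($node->[2]));
    return $op eq '+' ? $l + $r
         : $op eq '-' ? $l - $r
         : $op eq '*' ? $l * $r
         :              $l / $r;
}

my $tree = parse('2 + 3 * (4 - 1)');
print evaluate($tree), "\n";   # prints "11"
```

The tree for that input is ['+', '2', ['*', '3', ['-', '4', '1']]], so operator precedence falls out of which rule calls which.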

Update: Okay, that was pretty cool. Moving on he's talking about the book and how Perl is really Lisp...

Update: ...and we're done. Lunch time!