
The Lost Continent of

You've found a bug on my site!

The Large-print Giveth,
and the Small-print Taketh away...

Tom Waits

Welcome...

...to The Lost Continent of. My name is Leon Matthews. Programmer, father, New Zealander, and business owner. This is my personal website, and has been for some time.

Join me at this year's Python Conference

posted on: October 19th 2010

Kiwi Pycon November 20-21, 2010

If you're in New Zealand, that is...

Last year's conference was great, and I'm looking forward to this one. Hearing about all the fantastic things being done with Python locally is good fun, and very inspiring. Best of all, it's not a dry for-experts-by-experts sort of event: the talks range from strategic to technical to hey-look-what-I-can-do.

I'll be giving an introductory/intermediate talk on Unicode strings from a Pythonic perspective. I've always felt that Unicode is one of those technologies that seem hard, but are built on simple concepts. Once those are understood, the details fall into place, and you'll never want to go back to plain strings ever again. In Python 3 all strings are Unicode, so it's rather timely to talk about it now.
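As a small taste of those simple concepts, here is a minimal sketch in Python 3. It is not taken from the talk; the sample word and the variable names are my own picks for illustration. A string is a sequence of code points, and an encoding such as UTF-8 turns those code points into bytes.

```python
# A Python 3 str is a sequence of Unicode code points.
text = "M\u0101ori"                  # U+0101 is "a with macron"

# Encoding maps code points to bytes; decoding reverses it.
encoded = text.encode("utf-8")

print(len(text))                     # 5 code points
print(len(encoded))                  # 6 bytes: U+0101 needs two in UTF-8
print(encoded.decode("utf-8") == text)   # True: the round trip is lossless
```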

We're Married at Last!

posted on: September 29th 2010

Alyson, Blake, and Leon Matthews

I'm thrilled to announce that my wonderful wife Alyson and I finally tied the knot in beautiful Fiji on the 11th of September 2010, with our rascally little two-year-old son, Blake.

Take me to the Photos!

We've been together ten years now, so have done things in rather the 'wrong' order: House, Baby-carriage, then our marriage. A huge thanks to my new wife Alyson for doing all the organising, and not calling it off at the last minute. To all our family and friends, both those who could make it and those who couldn't, and to everyone else who made it such a wonderful day for us: thank you.

Decimal Time — Storing Date/Time using Epoch Timestamps

posted on: May 6th 2010

Computers process only ones and zeros — or more generally, numbers. Processing some other type of data requires that you find a way to represent, or encode, that type as a number, or a series of numbers. Colours, music, pictures, even Hollywood movies are all represented as various, often extremely creative, sequences of numbers.

I've always been fascinated by the various encoding schemes that we humans have used to shoe-horn our analog world into the digital one of our computers. Some schemes are obvious (ASCII), others surprisingly deep (IEEE 754, UTF-8). Some are horribly complicated because they have to be (video files), while others are that way to maintain a commercial advantage (some office and graphics file formats are distressingly guilty of this). Those are all interesting, but best of all is an elegant encoding scheme.

In my mind, the most elegant scheme of all is POSIX Epoch time: the representation of a date and time by a single large integer. It uses the count of seconds that have elapsed since a given point in time, the epoch, which is midnight UTC on the 1st of January 1970. For example, as I write this the POSIX timestamp is 1,273,107,528. What makes this scheme elegant is that it is actually easier to work with than the original representation.
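To make "easier to work with" concrete, here is a minimal sketch using only Python's standard library. The timestamp is the one quoted above; the names STAMP, to_utc and one_week_later are my own, chosen for illustration.

```python
from datetime import datetime, timezone

STAMP = 1273107528  # the timestamp quoted above


def to_utc(stamp):
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


# Date arithmetic is plain integer addition: one week is 7 * 24 * 60 * 60 seconds.
one_week_later = STAMP + 7 * 24 * 60 * 60

print(to_utc(STAMP).isoformat())           # 2010-05-06T00:58:48+00:00
print(to_utc(one_week_later).isoformat())  # 2010-05-13T00:58:48+00:00
print(one_week_later - STAMP)              # elapsed seconds: 604800
print(STAMP < one_week_later)              # ordering is integer comparison: True
```

Sorting, comparing and differencing all fall out of ordinary integer operations; the calendar only comes back into play when a human needs to read the result.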

Last month I gave a presentation about what makes it so easy to work with at a meeting of my local Python Users Group, and now I've finally gotten around to updating my site with the contents of the talk.

Browser Word Breaking Considered Harmful?

posted on: November 23rd 2009

I like justified text. I still use LaTeX (via LyX usually) whenever I can, despite the cruftiness, because the output always looks so great. For the same reasons, I never use 'text-align: justify' on the web. It sounds like a good idea, but always ends up looking seven kinds of ugly. Why? Because browsers, even modern ones, don't split words in order to maintain sane interword spacing.

I did an experiment this week to try and force the behaviour that I desired. You can see the results below. The left column is standard 'web justified' text, the right is the same text but using the TeX hyphenation algorithm to split words properly.

Good hyphenation at last, but the price is too high...

I ran a little throw-away Python script to insert HTML 'soft hyphens', using the &shy; entity, at the appropriate points in every word. Browsers are then able to use that information to break words and then justify the text passage properly.
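A rough sketch of how such a script could work follows. This is not the script itself: it implements the core of the TeX (Liang) pattern-matching idea, but the three patterns are made up for illustration, where a real pattern file holds thousands. The function names and the minimum fragment lengths are also assumptions. It expects plain text, not markup.

```python
import re

SOFT_HYPHEN = "&shy;"

# Three made-up patterns in TeX's notation, for illustration only.
# A digit scores the gap it sits in: odd allows a break, even forbids one.
PATTERNS = ["y1n", "1ing", "r2ing"]


def parse_pattern(pattern):
    """Split 'r2ing' into ('ring', [0, 2, 0, 0, 0])."""
    letters, points = "", [0]
    for ch in pattern:
        if ch.isdigit():
            points[-1] = int(ch)
        else:
            letters += ch
            points.append(0)
    return letters, points


PARSED = dict(parse_pattern(p) for p in PATTERNS)


def break_points(word, left_min=2, right_min=3):
    """Return the indexes in word where a soft hyphen may be inserted."""
    padded = "." + word.lower() + "."       # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)        # scores[i] is the gap before padded[i]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            points = PARSED.get(padded[start:end])
            if points:
                # Where patterns overlap, the highest score wins.
                for offset, value in enumerate(points):
                    scores[start + offset] = max(scores[start + offset], value)
    # The gap before word[k] is scores[k + 1]; keep short fragments intact.
    return [k for k in range(left_min, len(word) - right_min + 1)
            if scores[k + 1] % 2 == 1]


def soften(text):
    """Insert soft hyphens into every alphabetic word of plain text."""
    def fix(match):
        word = match.group(0)
        pieces, last = [], 0
        for k in break_points(word):
            pieces.append(word[last:k])
            last = k
        pieces.append(word[last:])
        return SOFT_HYPHEN.join(pieces)
    return re.sub(r"[A-Za-z]+", fix, text)


print(soften("Shyness and feeling bring"))
# Shy&shy;ness and feel&shy;ing bring
```

Note how "bring" is left alone: the pattern "1ing" would allow "br-ing", but "r2ing" scores the same gap higher with an even number, which forbids the break.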

But...

The problem is that all those &shy; entities all over the place absolutely kill the readability of the source code, and that's not a price I'm willing to pay. Compare:

<p>
Shyness is most likely to occur during unfamiliar situations,
though in severe cases it may hinder an individual in his or
her most familiar situations and relationships as well.
Admitting feelings may become difficult for the individual.
Shy persons avoid the objects of their apprehension in order
to keep from feeling uncomfortable and inept; thus, the
situations remain unfamiliar and the shyness perpetuates
itself.
</p>
<p>
Shy&shy;ness is most like&shy;ly to oc&shy;cur dur&shy;ing
un&shy;fa&shy;mil&shy;iar sit&shy;u&shy;a&shy;tion&shy;s,
though in se&shy;vere cas&shy;es it may hin&shy;der
an in&shy;di&shy;vid&shy;ual in his or her most
fa&shy;mil&shy;iar sit&shy;u&shy;a&shy;tions and
re&shy;la&shy;tion&shy;ships as well. Ad&shy;mit&shy;ting
feel&shy;ings may be&shy;come dif&shy;fi&shy;cult for the
in&shy;di&shy;vid&shy;ual. Shy per&shy;sons avoid the
ob&shy;jects of their ap&shy;pre&shy;hen&shy;sion in
or&shy;der to keep from feel&shy;ing
un&shy;com&shy;fort&shy;able and in&shy;ep&shy;t;
thus, the sit&shy;u&shy;a&shy;tions re&shy;main
un&shy;fa&shy;mil&shy;iar and the shy&shy;ness
per&shy;pet&shy;u&shy;ates it&shy;self.
</p>

So, server-side text manipulation is out of the question. What about client-side? Once I actually looked I found a couple of JavaScript implementations of the same idea, but a 20-30 KiB download to implement word breaking seems... a tad overkill.

I've come to the conclusion, having come this far, that the proper place to do decent word breaking, and hence good justified text, is in the web browser itself. Anything else is just a work-around (at best). How about it, browser makers? A 30 KiB language-specific hyphenation dictionary won't bloat your installs too much...

The Baby's Got Moves!

posted on: July 29th 2009

Our little boy is almost walking, but has decided that dancing is easier, and far more fun! I've posted lots more videos of Blake on YouTube for maximum cuteness overload!

What is Maintainable Code?

posted on: July 21st 2009

I've finally gotten my teeth into Diomidis D. Spinellis' book Code Quality. It's refreshingly complete and precise. The chapter on Maintainability opens with four attributes of a maintainable system (from ISO/IEC 9126-1:2001) that really struck a chord with me.

Code Quality by Diomidis D. Spinellis
Analysability: Finding the location of an error or the part of the software that must be analysed
Changeability: Implementing the maintenance change on the system's code
Stability: Not breaking anything through the change
Testability: Validating the software after the change

I know maintainable code when I see it — it has a certain feel... Up until now I've often struggled to express that feeling to non-programmers.

Overall, the book's been a very worthwhile read. The author doesn't shy away from explaining difficult or intricate concepts, where necessary, and each point is illustrated with example code from real systems. I'm very much looking forward to reading the first book in this series, 'Code Reading'.