
The Lost Continent of


Any person can invent a security system so clever that she or he can't think of how to break it.

Schneier's Law

Recent Updates

Using the POSIX Epoch

Stop, it's UNIX time!

Commodore VIC-20

My first computer. Eight bits of microprocessor fury.

Linux

Your favourite operating system and mine.

The Command Line

Do not underestimate the power of the dark prompt.

God-Man!

Chronicling the further adventures of God-Man.

Prime numbers

Introducing the 41st Mersenne prime, the biggest prime ever found.

The Dvorak Keyboard

A keyboard alternative to speed your typing, and give your metacarpals a break.

Welcome...

...to The Lost Continent of. My name is Leon Matthews. Programmer, father, New Zealander, and business owner. This is my personal website, and has been for some time.

Browser Word Breaking Considered Harmful?

posted on: November 23rd 2009

I like justified text. I still use LaTeX (via LyX usually) whenever I can, despite the cruftiness, because the output always looks so great. For the same reasons, I never use 'text-align: justify' on the web. It sounds like a good idea, but always ends up looking seven kinds of ugly. Why? Because browsers, even modern ones, don't split words in order to maintain sane interword spacing.

I did an experiment this week to try to force the behaviour that I wanted. You can see the results below. The left column is standard 'web justified' text; the right is the same text, but using the TeX hyphenation algorithm to split words properly.

Good hyphenation at last, but the price is too high...

I ran a little throw-away Python script to insert HTML 'soft hyphens', using the &shy; entity, at the appropriate points in every word. Browsers are then able to use that information to break words and justify the text passage properly.
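
The original script isn't shown, so what follows is only a minimal sketch of the idea, under some loud assumptions. It uses the pattern-matching scheme behind TeX hyphenation (a digit in a pattern scores a gap between letters, and an odd score allows a break), but the three patterns are made-up toys for the demo, not entries from TeX's real pattern files, and the names break_points and soften are hypothetical.

```python
import re

# Toy patterns in TeX-style notation. These are invented for illustration
# and are NOT taken from TeX's real hyphenation pattern files.
TOY_PATTERNS = ["y1n", "r1s", "1ing"]


def parse_pattern(pattern):
    """Split 'y1n' into its letters ('yn') and per-gap scores ([0, 1, 0])."""
    letters = re.sub(r"\d", "", pattern)
    scores = [0] * (len(letters) + 1)
    gap = 0
    for ch in pattern:
        if ch.isdigit():
            scores[gap] = int(ch)
        else:
            gap += 1
    return letters, scores


PATTERNS = dict(parse_pattern(p) for p in TOY_PATTERNS)


def break_points(word, min_left=2, min_right=3):
    """Return the indexes in `word` where a soft hyphen may be inserted."""
    padded = "." + word.lower() + "."
    scores = [0] * (len(padded) + 1)
    # Try every substring against the pattern table, keeping the highest
    # score seen for each gap.
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            gaps = PATTERNS.get(padded[start:end])
            if gaps:
                for offset, score in enumerate(gaps):
                    pos = start + offset
                    scores[pos] = max(scores[pos], score)
    # scores[i] is the gap before padded[i], which is word[i - 1].
    return [
        i - 1
        for i in range(1, len(padded))
        if scores[i] % 2 == 1 and min_left <= i - 1 <= len(word) - min_right
    ]


def soften(text):
    """Insert &shy; at every break point of every alphabetic word."""
    def fix(match):
        word = match.group(0)
        pieces, last = [], 0
        for point in break_points(word):
            pieces.append(word[last:point])
            last = point
        pieces.append(word[last:])
        return "&shy;".join(pieces)

    return re.sub(r"[A-Za-z]+", fix, text)


print(soften("Shy persons avoid shyness."))
# Shy per&shy;sons avoid shy&shy;ness.
```

The min_left and min_right defaults mirror TeX's usual limits (at least two letters before a break and three after), which is why short words come through untouched. A real run would swap the toy list for a full language-specific pattern set.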

But...

The problem is that those &shy; entities scattered all over the place absolutely kill the readability of the source code, and that's not a price I'm willing to pay. Compare:

<p>
Shyness is most likely to occur during unfamiliar situations,
though in severe cases it may hinder an individual in his or
her most familiar situations and relationships as well.
Admitting feelings may become difficult for the individual.
Shy persons avoid the objects of their apprehension in order
to keep from feeling uncomfortable and inept; thus, the
situations remain unfamiliar and the shyness perpetuates
itself.
</p>
<p>
Shy&shy;ness is most like&shy;ly to oc&shy;cur dur&shy;ing
un&shy;fa&shy;mil&shy;iar sit&shy;u&shy;a&shy;tion&shy;s,
though in se&shy;vere cas&shy;es it may hin&shy;der
an in&shy;di&shy;vid&shy;ual in his or her most
fa&shy;mil&shy;iar sit&shy;u&shy;a&shy;tions and
re&shy;la&shy;tion&shy;ships as well. Ad&shy;mit&shy;ting
feel&shy;ings may be&shy;come dif&shy;fi&shy;cult for the
in&shy;di&shy;vid&shy;ual. Shy per&shy;sons avoid the
ob&shy;jects of their ap&shy;pre&shy;hen&shy;sion in
or&shy;der to keep from feel&shy;ing
un&shy;com&shy;fort&shy;able and in&shy;ep&shy;t;
thus, the sit&shy;u&shy;a&shy;tions re&shy;main
un&shy;fa&shy;mil&shy;iar and the shy&shy;ness
per&shy;pet&shy;u&shy;ates it&shy;self.
</p>

So, server-side text manipulation is out of the question. What about client-side? Once I actually looked, I found a couple of JavaScript implementations of the same idea, but a 20-30 KiB download just to implement word breaking seems... a tad overkill.

I've come to the conclusion, having come this far, that the proper place to do decent word breaking, and hence good justified text, is in the web browser itself. Anything else is just a work-around (at best). How about it, browser makers? A 30 KiB language-specific hyphenation dictionary won't bloat your installs too much...

The Baby's Got Moves!

posted on: July 29th 2009

Our little boy is almost walking, but has decided that dancing is easier, and far more fun! I've posted lots more videos of Blake on YouTube for maximum cuteness overload!

What is Maintainable Code?

posted on: July 21st 2009

I've finally gotten my teeth into Diomidis D. Spinellis' book Code Quality. It's refreshingly complete and precise. The chapter on Maintainability opens with four attributes of a maintainable system (from ISO/IEC 9126-1:2001) that really struck a chord with me.

Code Quality by Diomidis D. Spinellis
Analysability: Finding the location of an error or the part of the software that must be analysed
Changeability: Implementing the maintenance change on the system's code
Stability: Not breaking anything through the change
Testability: Validating the software after the change

I know maintainable code when I see it — it has a certain feel... Up until now I've often struggled to express that feeling to non-programmers.

Overall, the book's been a very worthwhile read. The author doesn't shy away from explaining difficult or intricate concepts, where necessary, and each point is illustrated with example code from real systems. I'm very much looking forward to reading the first book in this series, 'Code Reading'.

Where do we want you to go today?

posted on: May 29th 2009

Not quite what I was going for, but I have to admire the incredible optimism (and lack of bias) of the auto-completion system in Microsoft's live.com search engine!

Microsoft Sucks

They did 'let' me run the search I really wanted, although their results did kinda... blow.

(Before anyone accuses me [probably rightfully] of rampant anti-fan-boy-ism, the reason I was there in the first place was to check out the recently announced re-vamp of their search engine. Turns out it was just vaporware...)

Arduino Duemilanove!

posted on: April 6th 2009

I've just placed an order for an Arduino microcontroller. It's a bridge between your computer and the physical world, a real computer that fits into the palm of your hand. The basic board has 14 digital input/output pins and 6 analog inputs, and connects to your PC via a USB cable, which also powers the device. Best of all, the support software and the hardware itself are both open source.

You write programs on your computer, then compile and upload them to run on the board, which can read and write values from and to the outside world — light LEDs, run LCD displays, power motors, read temperature, acceleration, range, and even GPS sensors. Pretty exciting stuff when you've spent your career just shuffling imaginary ones and zeros around!

Arduino Duemilanove Microcontroller

Make Magazine has a nice video tutorial that shows how easy it is to get started, while ladyada.net has a more in-depth series of lessons available. A YouTube search for Arduino shows the sorts of cool projects that people are making. I'd love to make a little robot that my 10-month-old son Blake can chase around the house (or vice-versa!). I have a board on order; now if I could just find that soldering iron...

It's interesting to note that at only USD $35 it'll be the cheapest computer I've ever bought, but even running at just 16 MHz with just 32 KB of memory, it's only the second slowest.