The task was to find the longest palindrome—words or
sentences that read the same forwards and backwards—in Abraham
Lincoln's Gettysburg Address.
Of course, the code had to be as fast and elegant as possible. (Correct
submissions lead to recruitment talks. I did talk with them, but as a
foreigner I did not have any US worker's visa, and they were too small to
have any time and money to spend on getting me one.)
Anyway, there are many algorithms for doing this. Mine is just based on a brute
force scan of the text. I see I claim that the running time is linear on the
average input and quadratic in the worst case. Update: Today I would have used
the word in practice instead of on average, because there really is nothing
fancy about this code (except that it's a bit cute).
The output is
So the longest palindrome in The Gettysburg Address is ranynar.
The above code is quite fast as well. It runs through the complete works of
Shakespeare in about 0.03 seconds of wall clock time, or a processing rate
of 128 Mb/second, and grows linearly with the input. It has a worst case
running time which is quadratic, though; if you feed it a string of same
characters, for instance.
To find the palindromes in Shakespeare's collected works, I first modified
the code to load the text from disk. I changed print_best to print all
palindromes equal to or longer than the current best, so we get a longer
list of palindromes. I also had to prepare Shakespeare's collected works
into a format suitable for processing: First I removed the name of which
character is speaking (done by sed), then I converted all text to lowercase
and deleted all non-alphanumeric characters.
Since Shakespeare wrote his plays more than 300 years before copyright law
was invented, you can download and use it freely. I got mine off a site with
all the stuff in different directories. Here's what I did to prepare them,
all in one jolly line of UNIX goodness:
I compiled my code and ran it on the file:
The longest palindromes appear in the lines:
So will it madam till I lie with you.
from the play Richard III. The preparsing of the collected works
above includes the character's name. Without it, the output is a bit