Back to the labs, boys

My new job involves quite a bit of work with natural language tools which try to group, summarise or classify text fragments. I am aware that sometimes these tools produce odd results, because in the end they’re not really intelligent – they’re driven by statistics.

Still, it made me laugh when I had a quick play with Google Sets, a tool which tries to predict additional entries for a set based on a few entries that the user supplies. I decided I would go easy on it the first time, so I gave it the following innocuous set: “hat”, “handbag”, “wallet”, “keys”.

What could possibly go wrong?

Predicted Items
US History book
coke can empty
tits hairy
History notebook
cell phone
lipgloss watermelon

“tits hairy”?

lipgloss watermelon??

Sometimes a transcript doesn’t convey the appropriate sense of gravity

Today, Australian journalist Richard Carleton died while reporting on the Beaconsfield Mine disaster, which claimed the life of one miner while two others are still trapped almost a kilometer underground. A transcript of ABC Radio’s reporting of the story is available online, and contains what may be the silliest line in a serious piece of journalism that I’ve ever read.

And while the wait continues for the people of Beaconsfield, the death of veteran journalist Richard Carleton has added to the grief-stricken atmosphere in the town.

Already the death of miner Larry Knight has devastated many locals. Now journalists, many of whom witnessed Mr Carleton’s collapse, are sharing their difficulties with the people of Beaconsfield.

Tim Jeanes reports from the town.

PAT VEEVERS (phonetic): Here we are. I hope you enjoy that, that’s beautiful cabbage rolls.

(wipes a tear from one eye)

Cracks in the Google

Update: Since writing this, my remaining problems have been smoothed out, and the people from Google who have contacted me have been very professional and attentive. The user experience continues to improve, and of course it’s worth remembering that both Google Base and Froogle are still Beta products. I’m quite excited about what the future holds for structured data uploads to search engines.

Google is releasing a lot of products. After all, when you’re a company with that much cash in the bank and hordes of stockholders holding their breath, you have to appear to be doing something with their good faith. Some might say Google’s headlong rush to the forefront of everything has been a little too quick.

Just a few days ago, I would have been the first to leap to Google’s defense in the face of criticisms that they’re spreading themselves too thin. But after a week of trying to get my store’s products online using either Google Base or Froogle, I have to say the experience has completely sucked.

To be fair, Google has actually responded to my emails (and they must get a lot of emails), but the whole thing smacks of being flung at the ceiling to see how long it sticks.

Here’s a rough chronology:

– I open a Google Base account, just to see what it’s like.
– I select a product feed, using the drop-down box to select “Australian Dollars”, since that’s my store’s native (but not only) currency.
– I create a simple script to upload the products via FTP from my server.
– I get some very non-specific error, like “feed disapproved”, with no more information. So I email using the “Contact Us” link.
– 24 hours later, I get a form letter back, saying the feed was rejected because of “- Unsupported currency”. Which begs the question:
(a) Why couldn’t they just have displayed that error in the web interface, if they knew what it was?
and (b) Why offer other currencies in a drop down box on sign-up if nothing but USD is supported?
– I fix the problem, and re-upload.
– 24 hours later, my products appear! Huzzah! Sans-images, though. I email Google.
– 24 hours later, my images appear! Huzzah!
– 24 hours later, my feed is ripped off the site! Boo! I ask them what the problem is. Uh-oh: “- Unsupported currency”. Didn’t I already fix that? I email Google.
– 24 hours later, another form letter. A new error! “- Wrong prices”. What? This time, they helpfully included a note:

Wrong prices: Please make sure the prices that appear in your bulk upload match the prices that appear on your item pages. For example, for your item named “Lantern Tree T-shirt,” you included $42.00 as the price. However, the price is listed as $60.00 on the item page. Recheck your bulk upload and make sure that all prices match the prices that appear on your item pages.

Right. So I’m guessing they didn’t see the drop-down box in the corner with “Australian Dollars” selected, and the ability to select USD for comparison. But no matter! I added some extra CGI parameters to my uploaded links so that it would display USD on the site to anyone coming from Google Base, and emailed Google back to let them know about the changes.

– A couple of days later, Google emails me to let me know that “Congratulations! You products are appearing!”. I don’t want to be congratulated by this stage. I just need a hug. What’s more, only 16 of the 120 products are appearing, and many of those don’t have their associated images (the feed image URLs are absolutely correct).
– Still waiting on a response from Google as to when the rest of my products will appear…
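For anyone facing the same currency mismatch, here is a minimal sketch of the kind of pre-upload step that would have saved me a few round trips: convert each feed price to USD and tag each link so the item page shows the same currency. Everything named here is an assumption for illustration: the `currency` query parameter, the row keys `price` and `link`, and the 0.70 exchange rate (chosen only because it reproduces the $60.00 and $42.00 pair from Google's note). It is not Google's feed format or my store's actual code.

```python
from decimal import Decimal, ROUND_HALF_UP
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_usd(price_aud: Decimal, aud_to_usd: Decimal) -> Decimal:
    """Convert an AUD price to USD, rounded to whole cents."""
    return (price_aud * aud_to_usd).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


def with_currency_param(url: str, currency: str = "USD") -> str:
    """Add (or overwrite) a hypothetical `currency` query parameter on a URL."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["currency"] = currency
    return urlunsplit(parts._replace(query=urlencode(query)))


def prepare_row(row: dict, aud_to_usd: Decimal) -> dict:
    """Return a copy of a feed row with a USD price and a USD-tagged link.

    Assumes the row has a `price` string in AUD and a `link` URL;
    both key names are made up for this example.
    """
    usd = to_usd(Decimal(row["price"]), aud_to_usd)
    return {
        **row,
        "price": str(usd),
        "link": with_currency_param(row["link"]),
    }


if __name__ == "__main__":
    row = {
        "title": "Lantern Tree T-shirt",
        "price": "60.00",
        "link": "http://example.com/item?id=7",
    }
    print(prepare_row(row, Decimal("0.70")))
```

The point of doing both steps together is that the price in the feed and the price a reviewer sees on the item page come from the same conversion, so they cannot drift apart.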

AND SO. I have been trying to upload using Froogle Merchant in parallel. Again, I was offered Australian Dollars as an option. Again, I foolishly accepted it. Again, rejected.
– When I tried to re-upload with correct prices, they wouldn’t let me delete the (erroneously created) original file. Emails back and forth. They recreated the file with the correct settings.
– My file was personally reviewed by John of “The Google Team” after upload – nice!
– Uh-oh, my feed was rejected again – “- Wrong prices – Unsupported Currency”. Of course, when we’d changed the feed settings to USD, I’d forgotten to change back the prices. Fixed.
– When I tried to re-upload the data, I got a permissions error. Guess what? John, or someone else from the Google Team, had written my newly-renamed feed as ROOT in their FTP server, and now there was no way I could update the file. Nice one!

ftp> ls
200 PORT command successful.
150 Opening ASCII mode data connection for directory listing.
total 176
-r--r----- 1 ftp ? root ? 83071 May 3 10:10 georgielove_products_us.txt
226 Transfer complete.

The saga continues, but as of right now, more than a week after I began this process, I still don’t have any satisfactory products displaying on Google Base or Froogle. I am not an idiot – sure, I made a couple of mistakes, but a lot of people will, and unless Google can smooth this process out, a lot of people’s time is going to be wasted.

I’m still a Google Believer, but I’m no longer wearing rose-tinted glasses.

Stephen Colbert skewers Bush with greatest sustained sarcasm in living memory

Update: Newshounds has a rundown of O’Reilly’s limp response.

This feels like a turning point. In a speech at the White House Correspondents’ Association dinner last Saturday, Stephen Colbert (of Comedy Central’s “The Colbert Report”) served the president and Washington Press their linguistic asses and forced them to lick the plate clean.

In a speech almost unanimously underplayed in the mainstream media since the event (the New York Times completely ignored it, preferring instead to focus on George W. Bush’s artless lampooning of his own speech patterns), Colbert praised Bush for thinking with his gut, rather than his brain. He sarcastically attacked the “factinista” and dismissed Bush’s 32% approval rating as being based on “reality”: “And reality has a well-known liberal bias”.

It was a fun routine, and would have been pretty edgy had it been screened on TV as part of the Colbert Report. But the fact that this performance was delivered to Mr. Bush’s face with complete sincerity makes it a slam-dunk to everyone frustrated by the inability of the Washington Press to hold the US administration to account for its repeated lies, failures and hypocrisy.

The speech must present something of a conundrum for media commentators (particularly the conservative ones), given that Colbert has pulled the rug from under their schtick. Do they draw attention to the speech and attempt to rebut it, or try to ignore it and so reduce the fallout? After all, it makes them look like fools. Much will depend on the media penetration of the news and footage of the speech – if the mainstream media can keep it marginalised or buried, then the Hannitys of the world may get away with a passing dismissal or nothing at all.

The alternative is a higher profile debate. But to have that, we have to watch the video. So, enough from me: Go watch it.

BitTorrent links:
Part 1
Part 2

Analysis from and Editor & Publisher. Dismissal from Michelle Malkin, Ace. Complete ignorance from the New York Times.