The fundamentals of structured data

Still wondering how to cram all these cool new Web-based tools and toys into your newspaper’s content management system? What, you mean it didn’t come with a database to manage all those user-submitted photos you’re getting through your MySpace page?

Even if you’re not quite that friendly with the social-networking set yet, chances are you’ve got some data sitting around in your stories, just waiting to be structured.

Adrian Holovaty lays out the basics for you, running down all the information you might want to build into something other than a “big blob of text,” as he calls it.

“For example, say a newspaper has written a story about a local fire. Being able to read that story on a cell phone is fine and dandy. Hooray, technology! But what I really want to be able to do is explore the raw facts of that story, one by one, with layers of attribution, and an infrastructure for comparing the details of the fire — date, time, place, victims, fire station number, distance from fire department, names and years experience of firemen on the scene, time it took for firemen to arrive — with the details of previous fires. And subsequent fires, whenever they happen.”
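To make that list of details concrete, here is a minimal sketch of how one fire story's facts might be stored as a structured record instead of a blob of text. It is only an illustration under stated assumptions, not how Adrian or any particular CMS does it: the class names, field names, units, and the `compare_to_previous` helper are all hypothetical, chosen to mirror the details in the quote.

```python
from dataclasses import dataclass, field
from datetime import datetime
from statistics import mean
from typing import List, Optional


@dataclass
class Firefighter:
    # Hypothetical record for one firefighter on the scene.
    name: str
    years_experience: int


@dataclass
class FireIncident:
    # Hypothetical fields mirroring the details listed in the quote.
    reported_at: datetime            # date and time of the fire
    address: str                     # place
    station_number: int              # responding fire station
    distance_from_station_miles: float
    response_minutes: float          # time it took firefighters to arrive
    victims: int = 0
    firefighters: List[Firefighter] = field(default_factory=list)
    source: Optional[str] = None     # attribution for the facts above


def average_response_minutes(incidents: List[FireIncident]) -> float:
    """Average arrival time across a list of incidents."""
    return mean(i.response_minutes for i in incidents)


def compare_to_previous(incident: FireIncident,
                        previous: List[FireIncident]) -> float:
    """Minutes slower (positive) or faster (negative) than the
    average response time of the previous incidents."""
    return incident.response_minutes - average_response_minutes(previous)
```

Once every fire is stored this way, comparing a new fire with previous ones becomes a query rather than a rereading of old stories. The same function works for subsequent fires as they are added.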

The problem, of course, is that you need to hire a journalistically minded database geek or a wonky journalist skilled at coding in PHP-type languages in order to pull this off.

What if there were a content management system that builds this into its DNA?

I’m not sure one is widely available yet, but Adrian recommends Ellington, which, to be fair, he developed. I haven’t tried it myself, and I believe a commercial license costs quite a bit of money. Still, using a CMS that allows you to implement new ideas can give you a respectable head start at becoming the sort of online news outlet you talk about becoming when you give speeches at industry conventions.

2 Replies to “The fundamentals of structured data”

  1. Hey Ryan,
    Ellington is $15,000 for the news site version, $10,000 for the entertainment version.

    For an individual to test it out, that’s a lot of money. But for news website content management, it’s INSANELY cheap.

    Most corporate CMSes cost millions and don’t do 1/20th of what Ellington can do.

    Unfortunately, most news websites are cobbled together Frankenstein-style from antiquated homegrown or patched-together systems, and throwing all that out to restart with Ellington takes some massive kool-aid drinking. 🙂

    Cheers,
    Will

  2. You hit the nail on the head, Ryan. Having a powerful data extraction system for news content opens up a wonderful slew of possibilities for repurposing the info.

    I don’t know if this kind of approach would be comprehensive enough for things like crime reports, but it would be great for looking up every time so-and-so was quoted in the paper or creating things that resemble Times Topics.
