I ran the text of Guardian articles categorised as 'science' (full text), New York Times articles categorised as 'science and technology' (short sections) and Nature News articles (full text) through OpenCalais to see what entities it identified.
Comments
Was this just an experiment or what was the motivation, if you're free to say?
That list of media sources made me think about whether it would be possible to associate Guardian and NYT stories with the science journal articles that ... prompted them? that they're about? Not sure.
...in case it wasn't clear, adding: maybe entity extraction could help with that associating.
Mostly it was just to see how well OpenCalais' entity extraction worked, but yes, matching up news stories with journal articles is one thing I have in mind (following the lead of thesciencebehindit.net).
Hard to tell whether this list represents a pass or a fail. What do you think?
Without having a gold standard of manually annotated articles to compare it to, I'd say it was pretty good (with a few exceptions that feedback would probably solve). This doesn't show how many entities it's missing, of course.
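For anyone wanting to do that comparison, here is a minimal sketch of how precision and recall could be computed against a manually annotated set. The entity names are made up for illustration and are not real OpenCalais output, and exact string matching is an assumption; a real evaluation would have to decide how to treat partial matches.

```python
def precision_recall(extracted, gold):
    """Compare extracted entities with a hand-annotated gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: share of extracted entities that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold entities that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical entities for one article (illustrative only).
extracted = {"George W. Bush", "NASA", "Mars"}
gold = {"George W. Bush", "NASA", "Mars", "European Space Agency"}
precision, recall = precision_recall(extracted, gold)
```

Recall is the number that the list in this post cannot show, since it depends on knowing which entities were missed.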
The only disappointing thing was that it had different URIs for 'Bush', 'George Bush' and 'George W. Bush', for example. I guess it couldn't be sure whether they were all referring to the same person, but multiple URIs per entity, with confidence scores, might be one solution.
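A rough sketch of the "multiple URIs with confidence scores" idea, assuming a small lookup table of known people. The example.org URIs and the token-overlap scoring are placeholders of my own, not anything OpenCalais does or returns.

```python
def candidate_scores(mention, known_people):
    """Return candidate URIs for a name mention, with normalised scores."""
    mention_tokens = set(mention.lower().replace(".", "").split())
    scores = {}
    for uri, name in known_people.items():
        name_tokens = set(name.lower().replace(".", "").split())
        # Keep a candidate only if every token of the mention is in the name.
        if mention_tokens <= name_tokens:
            scores[uri] = len(mention_tokens) / len(name_tokens)
    total = sum(scores.values())
    # Normalise so the scores for one mention sum to 1.
    return {uri: s / total for uri, s in scores.items()} if total else {}


# Hypothetical lookup table (URIs are invented for illustration).
known_people = {
    "http://example.org/person/george-w-bush": "George W. Bush",
    "http://example.org/person/george-h-w-bush": "George H. W. Bush",
}
bush_candidates = candidate_scores("Bush", known_people)
```

Here a bare 'Bush' keeps both candidates rather than getting a URI of its own, and whoever consumes the results can decide how much ambiguity to tolerate.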
Tom Tague from OpenCalais here.
First - thanks for the experiment. Always great to see efforts beyond the "tag a single document" stage.
A couple of points.
Is it a pass / fail? There's no simple yes / no answer. I think you'll find the accuracy is very good and the recall is quite high. Compared to the alternative of doing this manually it's clearly a big win.
We work very hard to disambiguate and normalize *some* things - like companies, geographies, etc. We don't do people at this point - but we may tackle that in the future. It's not a trivial exercise.
It's also great that you exposed the Linked Data URIs in the results list. I'd encourage people to explore where they can get from a company or country or one of the other entities we link. Just as an example - from Company to Board of Directors to DBpedia to Geography to Products to Revenue are all just HTTP calls and parsing - once you've done the entity extraction you've opened up a whole world of zero cost additional content you can work with.
Might want to start experimenting with relevance scores next - how important were these entities within a given article?
Thanks again.
I don't get this at all, can someone explain? I'm looking for a story which involves science but I can't seem to find one. Can someone help?
thanks lealou x x