<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/"
    xmlns:foaf="http://xmlns.com/foaf/0.1/"
    xmlns:vcard="http://www.w3.org/2001/vcard-rdf/3.0#"
>
<channel rdf:about="http://hublog.hubmed.org/archives/001345">
    <dc:date>2006-04-07T12:33:50Z</dc:date>
    <link>http://hublog.hubmed.org/archives/001345.html</link>
    <title>HubLog: Open Text Mining Interface (OTMI)</title>
    <description>RDF feed for the individual post titled Open Text Mining Interface (OTMI), part of HubLog</description>
    <items>
        <rdf:Seq>
            <rdf:li rdf:resource="http://hublog.hubmed.org/archives/001345.html" />
            <rdf:li rdf:resource="http://hublog.hubmed.org/archives/001345.html#comment-034914" />
            <rdf:li rdf:resource="http://hublog.hubmed.org/archives/001345.html#comment-034994" />
        </rdf:Seq>
    </items>
</channel>
 
<item rdf:about="http://hublog.hubmed.org/archives/001345.html">
    <dc:date>2006-04-07T12:07:55Z</dc:date>
    <title>Open Text Mining Interface (OTMI)</title>
    <link>http://hublog.hubmed.org/archives/001345.html</link>
    <content:encoded>&lt;p&gt;The Open Text Mining Interface (OTMI) is a proposed method for making available the text of journal articles for indexing and analysis, while preserving any subscription model that funds the journals. This approach, presented in a &lt;a href=&quot;http://blogs.nature.com/wp/nascent/2006/04/web_20_in_science.html&quot;&gt;Web 2.0 session at the Bio-IT World conference&lt;/a&gt; earlier this week, uses an Atom XML version of each article, with OTMI namespaced extensions, to provide all the sentences of the article in alphabetical order. Some extra information such as word frequency is also presented, but this could presumably be derived from the sentence text anyway.&lt;/p&gt;

&lt;p&gt;All the articles in &lt;a href=&quot;http://www.nature.com/nature/journal/v440/n7083/index.html&quot;&gt;the 2020 Computing issue of Nature&lt;/a&gt; have OTMI files linked using &lt;code&gt;&amp;lt;link rel=&quot;OTMI&quot; type=&quot;application/atom+xml&quot; href=&quot;&quot;/&amp;gt;&lt;/code&gt; - &lt;a href=&quot;http://www.nature.com/nature/journal/v440/n7083/otmi/otmi-440413a.xml&quot;&gt;here&apos;s an example file&lt;/a&gt;.&lt;/p&gt;</content:encoded>
    <dc:creator>
        <foaf:Person>
            <rdf:value>Alf Eaton</rdf:value>
            <foaf:nick>alf</foaf:nick>
            <foaf:name>Alf Eaton</foaf:name>
            <vcard:Given>Alf</vcard:Given>
            <vcard:Family>Eaton</vcard:Family>
            <foaf:mbox rdf:resource="mailto:alf@hubmed.org" />
        </foaf:Person>
    </dc:creator>
    <prism:isPartOf rdf:resource="http://hublog.hubmed.org/"/>
    <prism:publicationName>HubLog</prism:publicationName>
</item>

<item rdf:about="http://hublog.hubmed.org/archives/001345.html#comment-034914">
    <title>Comment from Glen Newton</title>
    <dc:date>2006-04-22T13:15:15Z</dc:date>
    <content:encoded>&lt;p&gt;In its present form, OTMI is flawed: it makes assumptions about the nature of the text mining that will be applied. By listing the sentences out-of-order (in alphabetical order and not in article order) and not including paragraph and other document structure, techniques which take advantage of the information clustering and flow that the structure of the article provides - which represent newer and more effective analysis techniques - cannot be applied. Even fairly traditional things like proximity search will not work using an OTMI source if the two words of interest are not in the same sentence.&lt;/p&gt;

&lt;p&gt;The stopwords and term frequency are completely redundant, and the latter suggests a vector-space model view of text mining.&lt;/p&gt;

&lt;p&gt;I do realize that publishers would be more reluctant to release this information if the sentences were in article order, but feel that OTMI as it stands is too limiting for the real world.&lt;/p&gt;

&lt;p&gt;I also understand that this is a proposal, open to input for changes/improvements.&lt;/p&gt;

&lt;p&gt;I must confess that I have not been able to find a primary source of information on OTMI, only blog posts.&lt;/p&gt;</content:encoded>
    <link>http://hublog.hubmed.org/archives/001345.html#comment-034914</link>
    <dc:contributor>Glen Newton</dc:contributor>
</item>
<item rdf:about="http://hublog.hubmed.org/archives/001345.html#comment-034994">
    <title>Comment from Timo Hannay</title>
    <dc:date>2006-04-25T06:59:25Z</dc:date>
    <content:encoded>&lt;p&gt;I&apos;ve posted some more details of OTMI here:&lt;br /&gt;
&lt;a href=&quot;http://blogs.nature.com/wp/nascent/2006/04/open_text_mining_interface_1.html&quot;&gt;http://blogs.nature.com/wp/nascent/2006/04/open_text_mining_interface_1.html&lt;/a&gt;&lt;/p&gt;</content:encoded>
    <link>http://hublog.hubmed.org/archives/001345.html#comment-034994</link>
    <dc:contributor>Timo Hannay</dc:contributor>
</item>

</rdf:RDF>
