[OT] Text analysis and author, anyone done it?

[OT] Text analysis and author, anyone done it?

Peter Alcibiades
Has anyone implemented anything in LC which takes a passage of text and
then does statistical analysis to see whether another passage was written
by the same author?

Or do you know of any implementation in any other language for that
matter...?

Peter

_______________________________________________
use-livecode mailing list
[hidden email]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: [OT] Text analysis and author, anyone done it?

Robert Brenstein
You mean like turnitin or crot?

http://turnitin.com/static/index.php
http://www.siberiasoft.com/

Robert

Re: [OT] Text analysis and author, anyone done it?

Peter Alcibiades
No, not quite.  Those test whether a given text derives from some others.  What I need to know is whether this text, which is likely original, was written by the same person as this other one.

It's like asking: did St Paul write the Epistle to the Hebrews, given that we know he wrote the one to the Romans?

Re: [OT] Text analysis and author, anyone done it?

slylabs13
It would take more than logic to determine that. If a program were the thing that made that decision, I would be very doubtful of its results.

Bob


On Jun 24, 2011, at 5:45 AM, Peter Alcibiades wrote:

> No, not quite.  Those test to see does a given text derive from some others.
> What I need to know is, whether this text, which is likely originally
> authored, was authored by the same person as this other.
>
> Its like, did St Paul write the Epistle to the Hebrews, given that we know
> he wrote the one to the Romans?


Re: [OT] Text analysis and author, anyone done it?

Peter Alcibiades
It can be done statistically. Various methods have been proposed and used.  One general kind of measure is the probability of a given word coming next, as a function of the previous n words.  Another is to measure the length of the gap between pairs of occurrences of a given word.  There is technical literature on it, and I guess LC would permit writing something to do it.  Not that it's the best thing to do it in (that seems to be R), but it's what I know.

But it would be nice if someone had already done it, in any language.  Save a huge lot of work.
Peter
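
[Editor's sketch] The first measure mentioned above, the probability of the next word given the previous n words, can be illustrated with plain counting. This is only a minimal sketch for n = 1 (a bigram table), not a method anyone in the thread has confirmed works for authorship. The names `tokenize` and `bigram_model` are hypothetical, and the simple regex tokenizer is an assumption. How two such tables should be compared is left open here, as it is in the thread.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (a crude assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def bigram_model(text):
    """Estimate P(next word | previous word) from raw counts (the n = 1 case)."""
    words = tokenize(text)
    follow = defaultdict(Counter)
    # Count how often each word follows each other word.
    for prev, nxt in zip(words, words[1:]):
        follow[prev][nxt] += 1
    # Turn the counts into relative frequencies per preceding word.
    model = {}
    for prev, counts in follow.items():
        total = sum(counts.values())
        model[prev] = {w: c / total for w, c in counts.items()}
    return model


if __name__ == "__main__":
    model = bigram_model("the cat sat on the mat")
    print(model["the"])
```

One table would be built from the known text and one from the disputed text. Real texts would also need smoothing for word pairs that never occur, which this sketch omits.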

Re: [OT] Text analysis and author, anyone done it?

Peter Alcibiades
In reply to this post by slylabs13
Well, just in case anyone ever does need to do it, here are two places to get started.  One is NLTK, the free Natural Language Toolkit, and its associated free online book, Natural Language Processing with Python, which appears to double as a Python tutorial, so it's two for one.

http://www.nltk.org/book

Then there is this

http://cran.ma.imperial.ac.uk/web/views/NaturalLanguageProcessing.html

which has a bunch of tools and material in R.  I guess we have all known that one day it was going to be our painful duty to learn R but, like St Augustine, hoped it would not be yet.

Re: [OT] Text analysis and author, anyone done it?

pmbrig
On Jun 25, 2011, at 3:48 AM, Peter Alcibiades wrote:

> Well just in case anyone ever does need to do it, here are two places to get
> started.  One is NLTK - the free Natural Language Toolkit and its associated
> free online book Natural Language Processing with Python.  Which appears to
> double as a Python tutorial, so its two for one.
>
> http://www.nltk.org/book
>
> Then there is this
>
> http://cran.ma.imperial.ac.uk/web/views/NaturalLanguageProcessing.html

This link gives me a 404 error.

> which has a bunch of tools and material in R.  i guess we have all known
> that one day it was going to be our painful duty to learn R, but, like St
> Augustine, hoped it would not be yet.

-- Peter

Peter M. Brigham
[hidden email]
http://home.comcast.net/~pmbrig




Re: [OT] Text analysis and author, anyone done it?

pmbrig
In reply to this post by Peter Alcibiades
On Jun 24, 2011, at 11:46 PM, Peter Alcibiades wrote:

> It can be done statistically. Various methods have been proposed and used.
> One general kind of measure is the probability of another word coming, as a
> function of the past n words.  Another is to measure the length of gap
> between occurrences of pairs of a given word.  There is technical literature
> on it, and I guess LC would permit writing something to do it.  Not that its
> the best thing to do it in, that seems to be R, but its what I know.
>
> But it would be nice if someone had already done it, in any language.  Save
> a huge lot of work.
> Peter

Don't know if anyone has already tackled this kind of thing in LC, but it should be fairly easy to do. (Whether the algorithms actually work to distinguish different authors is something I know nothing about.) The gap between pairs of a given word, in particular, is nearly trivial. The question would be speed, and since LC is blindingly fast at processing text strings, I'd be optimistic about that, unless you're talking really huge texts.
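
[Editor's sketch] To show how small the gap measure is, here is a minimal version in Python rather than LC. It assumes "gap" means the distance, counted in words, between consecutive occurrences of the same word. The function name `word_gaps` and the regex tokenizer are hypothetical choices for illustration. Whether the resulting numbers actually distinguish authors is not claimed here.

```python
import re


def word_gaps(text, word):
    """Return the distances, in words, between consecutive occurrences of `word`."""
    words = re.findall(r"[a-z']+", text.lower())
    target = word.lower()
    # Positions (word indices) at which the target word occurs.
    positions = [i for i, w in enumerate(words) if w == target]
    # Difference between each position and the one before it.
    return [b - a for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    print(word_gaps(sample, "the"))
```

The list of gaps for a handful of common words could then be summarised (mean, spread) for each text and the summaries compared.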

-- Peter

Peter M. Brigham
[hidden email]
http://home.comcast.net/~pmbrig




Re: [OT] Text analysis and author, anyone done it?

Jim Ault
In reply to this post by pmbrig
Worked fine for me.
---  excerpt ----
CRAN Task View: Natural Language Processing
Maintainer: Ingo Feinerer and Fridolin Wild
Contact: Fridolin.Wild at wu-wien.ac.at
Version: 2009-02-05
This CRAN Task View contains a list of packages useful for natural language processing.

------------------end-------


On Jun 25, 2011, at 8:13 AM, Peter Brigham MD wrote:

> On Jun 25, 2011, at 3:48 AM, Peter Alcibiades wrote:
>
>> Well just in case anyone ever does need to do it, here are two places to get
>> started.  One is NLTK - the free Natural Language Toolkit and its associated
>> free online book Natural Language Processing with Python.  Which appears to
>> double as a Python tutorial, so its two for one.
>>
>> http://www.nltk.org/book
>>
>> Then there is this
>>
>> http://cran.ma.imperial.ac.uk/web/views/NaturalLanguageProcessing.html
>
> This link gives me a 404 error.
>
>> which has a bunch of tools and material in R.  i guess we have all known
>> that one day it was going to be our painful duty to learn R, but, like St
>> Augustine, hoped it would not be yet.
>
> -- Peter
>
> Peter M. Brigham
> [hidden email]
> http://home.comcast.net/~pmbrig

Jim Ault
Las Vegas


