Regex (matchChunk) help...

Glen Bojsza via use-livecode
Hello,

I have a couple of hundred pages of text from which I need to extract
strings whose content varies from line to line.

Each string I need has the same ending: skyrider1

Each string has the same beginning: selkirkst

The middle of each string can be any text.

The problem is that within each line where such a string exists, there are
several strings that have the same beginning, selkirkst, but none of them
has the correct ending, skyrider1.

My thought is to find the ending of the string first and then work backwards
to the first beginning string I reach.

I created the following example, which is gibberish but should make this
clearer. The string I want to extract from the line given is:

*selkirkst is placed in the second skyrider1*


Use the *selkirkst* function to check whether a *string* contains a
specified pattern. If *selkirkst* includes a pair of parentheses, the
position of the substring matching the part of theregular expression inside
the parentheses is placed in the variables in the *positionVarsList*. The
number of the first character in the matching substring is placed in the
first variable in the positionVarsList, and the number of the last
*selkirkst is
placed in the second **skyrider1*. Additional starting and ending
positions, matching additional parenthetical expressions, are placed in
additional pairs of variables in thepositionVarsList. If the
*selkirkst* function
returns false, the values of the variables in the positionVarsListare not
changed. The string and regularExpression are always case-sensitive,
regardless of the setting of the caseSensitive property. (If you need to
make a case-insensitive comparison, use "(?i)" at the start of the
regularExpression to make the match case-insensitive.)

The next line will not have *is placed in the second*  but some other text
*selkirkst*  ????   *skyrider1*

I am not sure this explains it well enough, but I believe a regular
expression (or perhaps matchChunk) could be used to extract the correct
string from each line of text.

Any suggestions?

thanks,

Glen
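
For reference, the requirement described in this post (end at skyrider1, begin at the nearest selkirkst before it) can be written as one regular expression. The following is a minimal, untested sketch, not code from the thread: the function name extractSpans is hypothetical, the two tokens are hard-coded, and it assumes the PCRE engine behind LiveCode's matchText honours a negative lookahead. It returns at most one match per line.

```livecode
-- Hypothetical helper: returns one extracted string per matching line of pText.
function extractSpans pText
   local tRegex, tLine, tFound, tResult
   -- (?!selkirkst) keeps the middle part from running across another
   -- beginning token, so the match starts at the last selkirkst
   -- that precedes the first skyrider1 on the line.
   put "(selkirkst(?:(?!selkirkst).)*?skyrider1)" into tRegex
   repeat for each line tLine in pText
      if matchText(tLine, tRegex, tFound) then
         put tFound & return after tResult
      end if
   end repeat
   delete the last char of tResult -- drop the trailing return
   return tResult
end extractSpans
```

matchChunk accepts the same pattern and fills a pair of variables with the start and end character positions instead of the matched text.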
_______________________________________________
use-livecode mailing list
[hidden email]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: Regex (matchChunk) help...

Bob Sneidar via use-livecode
One way would be to populate a memory database, then query it with LIKE:

SELECT * FROM memorydb WHERE stringtext LIKE 'selkirkst%' OR stringtext LIKE '%skyrider1'

If you need lines that have both, use a single comparison: LIKE 'selkirkst%skyrider1'

Sometimes SQL is the best way to find things.

Bob S


> On Jun 15, 2018, at 08:45 , Glen Bojsza via use-livecode <[hidden email]> wrote:
>
> Hello,
>
> I have a couple of hundred pages of text where I need to extract out a
> different string.

Re: Regex (matchChunk) help...

Jerry Jensen via use-livecode
Will this do what you want? (untested)

put empty into tExtract
repeat for each line L in bigText
  -- keep only lines that end with skyrider1 and begin with selkirkst
  if char -9 to -1 of L is "skyrider1" then
    if char 1 to 9 of L is "selkirkst" then
      put L & return after tExtract
    end if
  end if
end repeat
-- remove the trailing return
if char -1 of tExtract is return then delete char -1 of tExtract

> On Jun 15, 2018, at 8:45 AM, Glen Bojsza via use-livecode <[hidden email]> wrote:
>
> Hello,
>
> I have a couple of hundred pages of text where I need to extract out a
> different string.

Re: Regex (matchChunk) help...

Mike Bonner via use-livecode
If I understand correctly: if you find an occurrence of the beginning string
and then find another beginning string, you want to ignore the first one and
only take strings where the beginning and end have no intermediate beginnings?
Like this, I mean:

beginning blah blah blah blah blah *beginning blah blah blah blah ending*

Where you would only want the bold part to match?

If so, it might be easiest to do a repeat for each trueword loop.

put 1 into tCounter
repeat for each trueword tWord in tText
/* pseudocode
Check the word. If it's the beginning word, keep track of it using tCounter
so you know which word it was.
If you're tracking a beginning and you hit a beginning again, track that
one instead.
If it's the ending word and you have a beginning word being tracked, place
the pair of word numbers into a list of found strings and reset the begin
tracker to empty.
If it's neither, do nothing.
Increment the counter.
Next loop.
*/
end repeat

Of course if I've misunderstood what you need, kindly ignore this.

On Fri, Jun 15, 2018 at 10:27 AM Jerry Jensen via use-livecode <
[hidden email]> wrote:

> Will this do what you want? (untested)

Re: Regex (matchChunk) help...

Tore Nilsen via use-livecode
I’m a little confused about the requirements. Can the desired text span lines? If not, it changes things slightly.

You can use offset() for this. Find the location of the end token, then the location of the first start token, then the location of the next start token. If that next start token is still before the end token, keep going until the next one falls after the end token or there are none left. Then you have the bounds of the first match. I’ll try to write up code to test later this evening if no one provides a good example before then. offset() has a parameter for the number of chars to skip.
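
The steps just described could look roughly like the sketch below. It is untested and only an illustration: the function name innermostSpan and its parameter names are hypothetical. It relies on offset() returning a position relative to the skipped characters, so the skip count is added back to get an absolute position.

```livecode
-- Hypothetical helper: returns the text from the last start token that
-- precedes the first end token, through that end token. Returns empty
-- if either token is missing.
function innermostSpan pText, pStartToken, pEndToken
   local tEndPos, tStartPos, tSkip, tNext
   put offset(pEndToken, pText) into tEndPos
   if tEndPos is 0 then return empty
   put 0 into tStartPos
   put 0 into tSkip
   repeat forever
      -- the result is relative to the chars skipped
      put offset(pStartToken, pText, tSkip) into tNext
      if tNext is 0 then exit repeat -- no further start token
      if tSkip + tNext > tEndPos then exit repeat -- past the end token
      put tSkip + tNext into tStartPos
      put tStartPos into tSkip -- next search begins after this start
   end repeat
   if tStartPos is 0 then return empty
   return char tStartPos to (tEndPos + the length of pEndToken - 1) of pText
end innermostSpan
```

Something like `put innermostSpan(tLine, "selkirkst", "skyrider1")` would be the call for one line of the text. The sketch only looks at the first end token, so a line whose first skyrider1 has no selkirkst before it yields empty.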
On Jun 15, 2018, 10:46 AM -0500, Glen Bojsza via use-livecode wrote:

> Hello,
>
> I have a couple of hundred pages of text where I need to extract out a
> different string.

Re: Regex (matchChunk) help...

Glen Bojsza via use-livecode
Bob, this is an interesting approach using SQL. I will try to set up a
simple test with SQLite.

thanks

On Fri, Jun 15, 2018 at 11:53 AM, Bob Sneidar via use-livecode <
[hidden email]> wrote:

> one way would be to populate a memory database, then query it with LIKE:
>
> SELECT * FROM memorydb WHERE stringtext LIKE 'selkirkst%' OR stringtext
> LIKE '%skyrider1'
>
> If you need lines that have both use a single comparison
> 'selkirkst%skyrider1'
>
> Sometimes SQL is the best way to find things.
>
> Bob S

Re: Regex (matchChunk) help...

Glen Bojsza via use-livecode
Hi Jerry,

I may be wrong, but it looks like your solution assumes that the line has
the beginning and ending I am looking for in the first and last
positions. In the example text I gave, it was all one line, and the
string I am looking for is somewhere in the middle. I may not have been
clear: the example text looks like a paragraph with multiple lines, but
it is actually a single line, and its formatting may be deceiving.

On Fri, Jun 15, 2018 at 12:27 PM, Jerry Jensen via use-livecode <
[hidden email]> wrote:

> *Start of line *
> > Use the *selkirkst* function to check whether a *string* contains a
> > specified pattern. If *selkirkst* includes a pair of parentheses, the
> > position of the substring matching the part of theregular expression
> inside
> > the parentheses is placed in the variables in the *positionVarsList*. The
> > number of the first character in the matching substring is placed in the
> > first variable in the positionVarsList, and the number of the last
> > *selkirkst is
> > placed in the second **skyrider1*. Additional starting and ending
> > positions, matching additional parenthetical expressions, are placed in
> > additional pairs of variables in thepositionVarsList. If the
> > *selkirkst* function
> > returns false, the values of the variables in the positionVarsListare not
> > changed. The string and regularExpression are always case-sensitive,
> > regardless of the setting of the caseSensitive property. (If you need to
> > make a case-insensitive comparison, use "(?i)" at the start of the
> > regularExpression to make the match case-insensitive.)
> *End of line*

Re: Regex (matchChunk) help...

Glen Bojsza via use-livecode
Mike, I believe that you are correct in understanding what I am trying to
extract.

I will need a bit more time to work through your solution.

regards,

Glen

Re: Regex (matchChunk) help...

Mark Wieder via use-livecode
On 06/15/2018 08:45 AM, Glen Bojsza via use-livecode wrote:
> Any suggestions?

filter lotsOfText with "*selkirkst*skyrider1*"

--
  Mark Wieder
  [hidden email]


Re: Regex (matchChunk) help...

Mike Bonner via use-livecode
Try this..

on mouseUp
   local tText, tStartWord, tEndWord, tCounter, tPair, tPairs, tWord, tLine
   put the text of field 1 into tText
   put "beginning" into tStartWord -- string begin
   put "ending" into tEndWord -- string end
   put 1 into tCounter -- tracks current word
   repeat for each trueWord tWord in tText
      switch tWord
         case tStartWord
            -- remember the most recent beginning; a later one replaces it
            put tCounter into tPair
            break
         case tEndWord
            if tPair is not empty then
               put tPair & comma & tCounter & cr after tPairs
            end if
            break
      end switch
      add 1 to tCounter
   end repeat
   delete the last char of tPairs
   -- each line of tPairs is "firstWordNumber,lastWordNumber"
   repeat for each line tLine in tPairs
      set the textColor of trueWord (item 1 of tLine) to (item 2 of tLine) \
            of field 1 to "blue"
   end repeat
end mouseUp

On Fri, Jun 15, 2018 at 11:03 AM Glen Bojsza via use-livecode <
[hidden email]> wrote:

> Mike, I believe that you are correct in understanding what I am trying to
> extract.
>
> I will need a bit more time to work through your solution.
>
> regards,
>
> Glen

Re: Regex (matchChunk) help...

Mike Bonner via use-livecode
Slightly cleaned up, and adjusted to empty tPair after a match (on the
off chance there is a second ending without a matching new beginning word).



on mouseUp
   put the text of field 1 into tText
   put "beginning" into tStartWord -- string begin
   put "ending" into tEndWord -- string end
   put 1 into tCounter -- tracks current word
   repeat for each trueWord tWord in tText
      switch tWord
         case tStartWord
            put tCounter into tPair
            break
         case tEndWord
            if tPair is not empty then
               put tPair & comma & tCounter & cr after tPairs
               put empty into tPair -- a later ending now needs a new beginning
            end if
            break
      end switch
      add 1 to tCounter
   end repeat
   delete the last char of tPairs
   repeat for each line tLine in tPairs
      set the textColor of trueWord (item 1 of tLine) to (item 2 of tLine) \
            of field 1 to "blue"
   end repeat
end mouseUp

Re: Regex (matchChunk) help...

Glen Bojsza via use-livecode
Hi Mike,

Yes, this works. I had never used or known about trueword.

thanks!

Glen

On Fri, Jun 15, 2018 at 1:21 PM, Mike Bonner via use-livecode <
[hidden email]> wrote:

> Try this..

Re: Regex (matchChunk) help...

Mike Bonner via use-livecode
Glad it worked. If it turns out there is a reason not to use trueword and
a repeat for each loop, the same basic algorithm can be used with offset(),
but it would be a bit more convoluted.
Basically, find the first offset of the beginning string, then use that as
the number of chars to skip when looking for both the next beginning string
and the ending string. If both are found and the new beginning comes before
the ending, replace the initial beginning offset with the new one, use that
as the next number of chars to skip, and do it again.

There would be several more things to watch for, such as holding a current
beginning offset, doing your check, and making sure you still get a match
when there is only an ending left and no further beginning (and various
other combinations, I'm sure). But it shouldn't be too hard to figure out if
the other solution is ruled out for some reason.
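
A rough, untested sketch of that offset() variant follows, collecting every beginning/ending pair rather than just the first. The function name allSpans and its parameter names are hypothetical, and it assumes the two tokens never overlap each other.

```livecode
-- Hypothetical helper: scans forward through pText and returns every span
-- that runs from the most recent start token to the next end token,
-- one span per line of the result.
function allSpans pText, pStartToken, pEndToken
   local tSkip, tStartPos, tEndPos, tNextStart, tNextEnd, tResult
   put 0 into tSkip
   put 0 into tStartPos -- 0 means no beginning is being tracked
   repeat forever
      -- both results are relative to the chars skipped
      put offset(pStartToken, pText, tSkip) into tNextStart
      put offset(pEndToken, pText, tSkip) into tNextEnd
      if tNextEnd is 0 then exit repeat -- no ending left, so no more matches
      if tNextStart > 0 and tNextStart < tNextEnd then
         -- a newer beginning comes first: track it instead
         put tSkip + tNextStart into tStartPos
         put tStartPos into tSkip
      else
         -- the ending comes first: record a span if a beginning is tracked
         put tSkip + tNextEnd into tEndPos
         if tStartPos > 0 then
            put char tStartPos to (tEndPos + the length of pEndToken - 1) \
                  of pText & return after tResult
            put 0 into tStartPos
         end if
         put tEndPos into tSkip
      end if
   end repeat
   delete the last char of tResult -- drop the trailing return
   return tResult
end allSpans
```

Resetting tStartPos after each recorded span covers the case of an ending that has no new beginning before it. If pText itself contains returns, a span that crosses one will occupy more than one line of the result.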

On Fri, Jun 15, 2018 at 12:32 PM Glen Bojsza via use-livecode <
[hidden email]> wrote:

> Hi Mike,
>
> Yes this works....never used or knew about trueword.

Re: Regex (matchChunk) help...

Bob Sneidar via use-livecode
I understood that he wants lines with either one or the other.

Bob S


> On Jun 15, 2018, at 10:05 , Mark Wieder via use-livecode <[hidden email]> wrote:
>
> On 06/15/2018 08:45 AM, Glen Bojsza via use-livecode wrote:
>> Any suggestions?
>
> filter lotsOfText with "*selkirkst*skyrider1*"
>
> --
> Mark Wieder
> [hidden email]



Re: Regex (matchChunk) help...

Bob Sneidar via use-livecode
If the first search string can be anywhere before the second search string, and the second search string can be anywhere (or nowhere) after the first string, use stringtext LIKE '%selkirkst%' OR stringtext LIKE '%skyrider1%'

I posted some code, and also a sample stack, for converting an array to a memory database and back again. You could easily copy arrayToMemoryDB and create a TextToMemoryDB (and its counterpart) that works with delimited text. The conversion process can take some time (milliseconds), so it's not going to overload the matrix! But the utility is great, because queries can be quite complex and can return other values, such as if/then evaluations. The database can sort and return just the columns you are interested in. And in situations where you need to query the data multiple times, the performance will exceed LiveCode string searches, because you only need to iterate through the data once for any given dataset.
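
As a rough idea of what a text-to-memory-database step might look like, here is an untested sketch. It is not the arrayToMemoryDB code mentioned above: the function name linesMatchingLike is hypothetical, the table and column names are borrowed from the earlier query, and it assumes the SQLite driver accepts ":memory:" as the database path.

```livecode
-- Hypothetical helper: loads each line of pText into an in-memory SQLite
-- table and returns the lines that match pLikePattern.
function linesMatchingLike pText, pLikePattern
   local tDB, tLine, tResult
   put revOpenDatabase("sqlite", ":memory:") into tDB
   if tDB is not a number then return empty -- open failed; tDB holds an error string
   revExecuteSQL tDB, "CREATE TABLE memorydb (stringtext TEXT)"
   repeat for each line tLine in pText
      -- :1 is bound to the contents of the variable named "tLine"
      revExecuteSQL tDB, "INSERT INTO memorydb (stringtext) VALUES (:1)", "tLine"
   end repeat
   put revDataFromQuery(tab, return, tDB, \
         "SELECT stringtext FROM memorydb WHERE stringtext LIKE :1", \
         "pLikePattern") into tResult
   revCloseDatabase tDB
   return tResult
end linesMatchingLike
```

A call such as `put linesMatchingLike(tText, "%selkirkst%skyrider1%")` would return the whole lines that contain both tokens in that order; LIKE selects rows and does not extract the substring between the tokens. SQLite's LIKE is case-insensitive for ASCII letters by default.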

Bob S


> On Jun 15, 2018, at 09:50 , Glen Bojsza via use-livecode <[hidden email]> wrote:
>
> Bob, this is an interesting approach using SQL. I will try and setup a
> simple test with SQLite.
>
> thanks



Re: Regex (matchChunk) help...

Jim Lambert via use-livecode
Building on what Mark Wieder elegantly wrote:

> MarkW wrote:
>
>  filter lotsOfText with "*selkirkst*skyrider1*"

function extractStrings lotsOfText, startWord, endWord
        replace cr with space in lotsOfText -- Makes sure lotsOfText is just a single line
        replace startWord with cr & startWord in lotsOfText -- Makes sure a line starts with the startWord
        replace endWord with endWord & cr in lotsOfText -- Makes sure a line ends with the endWord
        filter lotsOfText with "*" & startWord & "*" & endWord -- Mark's suggestion
        return lotsOfText
end extractStrings


Try it with your original gibberish. I've added a second instance of the string you want to extract to show that the function will return all instances.


Use the *selkirkst* function to check whether a *string* contains a
specified pattern. If *selkirkst* includes a pair of parentheses, the
position of the substring matching the part of theregular expression inside
the parentheses is placed in the variables in the *positionVarsList*. The
number of the first character in the matching substring is placed in the
first variable in the positionVarsList, and the number of the last
*selkirkst is
placed in the second **skyrider1*. Additional starting and ending
positions, matching additional parenthetical expressions, are placed in
additional pairs of variables in thepositionVarsList. If the
*selkirkst* function
returns false, the values of the variables in the positionVarsListare not
*selkirkst is
placed in the third **skyrider1*. changed. The string and regularExpression are always case-sensitive,
regardless of the setting of the caseSensitive property. (If you need to
make a case-insensitive comparison, use "(?i)" at the start of the
regularExpression to make the match case-insensitive.)


Jim Lambert


Re: Regex (matchChunk) help...

Mike Bonner via use-livecode
Hey, cool! I'd use Jim's (and Mark's) method. MUCH simpler.

On Fri, Jun 15, 2018 at 7:06 PM Jim Lambert via use-livecode <
[hidden email]> wrote:

> Building on what Mark Wieder elegantly wrote: