com.stevesoft.pat
Class Regex

java.lang.Object
  |
  +--com.stevesoft.pat.RegRes
        |
        +--com.stevesoft.pat.Regex
All Implemented Interfaces:
java.lang.Cloneable, java.io.FilenameFilter
Direct Known Subclasses:
FileRegex

public class Regex
extends RegRes
implements java.io.FilenameFilter

Regex provides the parser which constructs the linked list of Pattern classes from a String.

For the purpose of this documentation, the fact that Java itself interprets the backslash inside string literals will be ignored. In practice, however, you will need a double backslash in source code to obtain a string that contains a single backslash character. Thus, the example pattern "\b" should really be typed as "\\b" inside Java code.
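
As a small illustration of the double-backslash rule, the sketch below uses only plain Java strings. The class and method names are made up for this example and are not part of com.stevesoft.pat. It shows that the source text "\\b" produces a two-character string, while "\b" is Java's own escape for the single backspace character.

```java
// Hypothetical demo class; not part of com.stevesoft.pat.
final class BackslashDemo {
    // Source text "\\b" yields the two characters '\' and 'b',
    // which is what a regular expression parser needs to see.
    static String wordBoundary() {
        return "\\b";
    }

    // Source text "\b" is a Java escape for the backspace character (U+0008).
    static String backspace() {
        return "\b";
    }

    public static void main(String[] args) {
        System.out.println(wordBoundary().length()); // prints 2
        System.out.println(backspace().length());    // prints 1
    }
}
```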

Note that Regex is part of package "com.stevesoft.pat". To use it, simply import com.stevesoft.pat.Regex at the top of your file.

A Regex is created with a constructor that takes a String defining the regular expression. Thus, for example,

Regex r = new Regex("[a-c]*");
matches any number of characters so long as they are 'a', 'b', or 'c'.

To attempt to match the Pattern against a given string, you can use either the search(String) member function or the matchAt(String,int position) member function. Both return a boolean that tells you whether the match succeeded, and both update the values returned by charsMatched() and matchedFrom() in the Regex object accordingly.

The portion of the string before the match can be obtained by the left() member, and the portion after the match can be obtained by the right() member.
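
The relationship between the match position, the match length, and the text on either side can be sketched with the standard java.util.regex classes. This is a stand-in, not com.stevesoft.pat itself: the class MatchPartsDemo and its split method are hypothetical, and the comments only point out which accessor described above each value is analogous to.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, NOT com.stevesoft.pat.
final class MatchPartsDemo {
    // Returns {before, matched, after} for the first match, or null if there is none.
    static String[] split(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (!m.find()) {
            return null;
        }
        // m.start() plays the role of matchedFrom();
        // m.end() - m.start() plays the role of charsMatched().
        return new String[] {
            text.substring(0, m.start()), // analogous to left()
            m.group(),                    // analogous to stringMatched()
            text.substring(m.end())       // analogous to right()
        };
    }

    public static void main(String[] args) {
        String[] parts = split("[a-c]+", "12abc34");
        System.out.println(String.join("|", parts)); // prints 12|abc|34
    }
}
```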

Essentially, this package implements a syntax that is very much like the Perl 5 regular expression syntax. A longer example:

Regex r = new Regex("x(a|b)y");
r.matchAt("xay",0);
System.out.println("sub = "+r.stringMatched(1));
The above would print "sub = a".
r.left() // would return "x"
r.right() // would return "y"

Differences between this package and Perl 5:
The extended pattern for setting flags is now supported, but the flags are different. "(?i)" tells the pattern to ignore case, "(?Q)" sets the "dontMatchInQuotes" flag, and "(?iQ)" sets them both. You can also change the escape character. The pattern

(?e=#)#d+
is the same as
\d+
but note that the sequence
(?e=#)
must occur at the very beginning of the pattern. There may be other small differences as well. I will either make my package conform or note them as I become aware of them.
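
To make the escape-character rule concrete, here is a rough sketch of how a pattern with a leading (?e=X) prefix maps onto the ordinary backslash form. The helper below is hypothetical and is not how the package implements the feature. It also assumes that a doubled escape character stands for a literal one, which the text above does not state.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of com.stevesoft.pat.
final class EscapePrefixDemo {
    // Rewrites "(?e=X)..." into the equivalent backslash-escaped pattern.
    static String toBackslashForm(String pattern) {
        // The prefix must be the very first thing in the pattern.
        if (pattern.length() < 6 || !pattern.startsWith("(?e=") || pattern.charAt(5) != ')') {
            return pattern; // no custom escape declared
        }
        char esc = pattern.charAt(4);
        StringBuilder out = new StringBuilder();
        for (int i = 6; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == esc && i + 1 < pattern.length()) {
                char next = pattern.charAt(++i);
                if (next == esc) {
                    // Assumption: a doubled escape character is a literal.
                    out.append(Pattern.quote(String.valueOf(esc)));
                } else {
                    out.append('\\').append(next);
                }
            } else if (c == '\\') {
                out.append("\\\\"); // backslash is now an ordinary character
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBackslashForm("(?e=#)#d+")); // prints \d+
    }
}
```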

This package supports additional patterns not in perl5:

(?@()) - Group: This matches all characters between the '(' character and the balancing ')' character. Thus, it will match "()" as well as "(())". The balancing characters are arbitrary; thus (?@{}) matches "{}" and "{{}}".
(?<1) - Backup: Moves the pointer backwards within the text. This allows you to make a "look behind." It fails if it attempts to move to a position before the beginning of the string. "x(?<1)" is equivalent to "(?=x)". The number, 1 in this example, is the number of characters to move backwards.
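
What the Group pattern accepts can be sketched as a simple depth counter. The helper below is a hypothetical illustration in plain Java, not the package's implementation: it returns the position just past the delimiter that balances the one at the starting position.

```java
// Hypothetical helper illustrating what a balanced-group pattern such as (?@()) accepts.
final class BalancedDemo {
    // Returns the index just past the delimiter that balances the one at 'start',
    // or -1 if text.charAt(start) is not 'open' or no balancing 'close' exists.
    static int balancedEnd(String text, int start, char open, char close) {
        if (start < 0 || start >= text.length() || text.charAt(start) != open) {
            return -1;
        }
        int depth = 0;
        for (int i = start; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == open) {
                depth++;
            } else if (c == close && --depth == 0) {
                return i + 1;
            }
        }
        return -1; // ran out of text before the delimiters balanced
    }

    public static void main(String[] args) {
        System.out.println(balancedEnd("(())", 0, '(', ')'));  // prints 4
        System.out.println(balancedEnd("{{}}x", 0, '{', '}')); // prints 4
    }
}
```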

See Also:
Pattern

Field Summary
static boolean dotDoesntMatchCR
          Set this to change the default behavior of the "." pattern.
 char esc
          By default, the escape character is the backslash, but you can make it anything you want by setting this variable.
 
Fields inherited from class com.stevesoft.pat.RegRes
charsMatched_, didMatch_, marks, matchFrom_, numSubs_, src
 
Constructor Summary
Regex()
          Initializes the object without a Pattern.
Regex(Regex r)
          Essentially clones the Regex object
Regex(java.lang.String s)
          Create and compile a Regex, but do not throw any exceptions.
Regex(java.lang.String s, ReplaceRule rp)
          Create and compile a Regex, but give it the ReplaceRule specified.
Regex(java.lang.String s, java.lang.String rp)
          Create and compile both a Regex and a ReplaceRule.
 
Method Summary
 boolean accept(java.io.File dir, java.lang.String s)
          This method implements FilenameFilter, allowing one to use a Regex to search through a directory using File.list.
protected  void add(Pattern p2)
          Only needed for creating your own extensions of Regex.
 java.lang.Object clone()
          A clone by any other name would smell as sweet.
 void compile(java.lang.String prepat)
          This method compiles a regular expression, making it possible to call the search or matchAt methods.
protected  void compile1(StrPos sp, Rthings mk)
          You only need to use this method if you are creating your own extensions to Regex.
 patInt countMaxChars()
          You only need to know about this if you are inventing your own pattern elements.
 patInt countMinChars()
          You only need to know about this if you are inventing your own pattern elements.
static void define(java.lang.String nm, java.lang.String pat)
          Defines a shorthand for a pattern.
static void define(java.lang.String nm, java.lang.String pat, Validator v)
          Defines a method to create a new rule.
 boolean equals(java.lang.Object o)
           
static boolean getDefaultMFlag()
          Get the default value of the m flag.
 boolean getDontMatchInQuotes()
          Find out if the dontMatchInQuotes flag is enabled.
 boolean getGFlag()
          Get the state of the 'g' flag.
 boolean getIgnoreCase()
          Get the state of the ignoreCase flag.
 boolean getMFlag()
          Get the state of the mFlag
 Replacer getReplacer()
           
 ReplaceRule getReplaceRule()
          Get the current ReplaceRule.
 boolean getSFlag()
          Get the state of the sFlag
static boolean isDefined(java.lang.String nm)
          Test to see if a custom defined rule exists.
 boolean isLiteral()
          Checks to see if there are only literal and no special pattern elements in this Regex.
 boolean matchAt(java.lang.String s, int start_pos)
          Attempt to match a Pattern beginning at a specified location within the string.
 boolean matchAt(StringLike s, int start_pos)
          Attempt to match a Pattern beginning at a specified location within the StringLike.
 void optimize()
          Once this method is called, the state of variables ignoreCase and dontMatchInQuotes should not be changed as the results will be unpredictable.
 boolean optimized()
          This function returns true if the optimize method has been called.
static Regex perlCode(java.lang.String s)
          A bit of syntactic sugar for those who want to make their code look more perl-like.
 java.lang.String replaceAll(java.lang.String s)
          Replace all occurrences of this pattern in String s according to the ReplaceRule.
 StringLike replaceAll(StringLike s)
           
 java.lang.String replaceAllFrom(java.lang.String s, int pos)
          Replace all occurrences of this pattern in String s beginning with position pos according to the ReplaceRule.
 java.lang.String replaceAllRegion(java.lang.String s, int start, int end)
          Replace all occurrences of this pattern in String s beginning with position start and ending with end according to the ReplaceRule.
 java.lang.String replaceFirst(java.lang.String s)
          Replace the first occurrence of this pattern in String s according to the ReplaceRule.
 java.lang.String replaceFirstFrom(java.lang.String s, int pos)
          Replace the first occurrence of this pattern in String s beginning with position pos according to the ReplaceRule.
 java.lang.String replaceFirstRegion(java.lang.String s, int start, int end)
          Replace the first occurrence of this pattern in String s beginning with position start and ending with end according to the ReplaceRule.
 RegRes result()
          Return a clone of the underlying RegRes object.
 boolean reverseSearch(java.lang.String s)
           
 boolean reverseSearch(StringLike sl)
           
 boolean search(java.lang.String s)
          Search through a String for the first occurrence of a match.
 boolean search(StringLike sl)
           
 boolean searchFrom(java.lang.String s, int start)
          Search through a String for the first occurrence of a match, but start at position start.
 boolean searchFrom(StringLike s, int start)
           
 boolean searchRegion(java.lang.String s, int start, int end)
          Search through a region of a String for the first occurrence of a match.
static void setDefaultMFlag(boolean mFlag)
          Set the default value of the m flag.
 void setDontMatchInQuotes(boolean b)
          Set the dontMatchInQuotes flag.
 void setGFlag(boolean b)
          Set the 'g' flag
 void setIgnoreCase(boolean b)
          Set the state of the ignoreCase flag.
 void setReplaceRule(ReplaceRule rp)
          Change the ReplaceRule of this Regex to rp.
 void setReplaceRule(java.lang.String rp)
          Change the ReplaceRule of this Regex by compiling a new one using String rp.
 java.lang.String toString()
          Converts the stored Pattern to a String -- this is a decompile.
static void undefine(java.lang.String nm)
          Removes a custom defined rule.
static java.lang.String version()
          The version of this package
 
Methods inherited from class com.stevesoft.pat.RegRes
charsMatched, charsMatched, copyOutOf, didMatch, equals, getString, getStringLike, left, left, matchedFrom, matchedFrom, matchedTo, matchedTo, matchFrom, matchFrom, numSubs, right, right, stringMatched, stringMatched, substring, substring
 
Methods inherited from class java.lang.Object
finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

esc

public char esc
By default, the escape character is the backslash, but you can make it anything you want by setting this variable.

dotDoesntMatchCR

public static boolean dotDoesntMatchCR
Set this to change the default behavior of the "." pattern. By default it now matches perl's behavior and fails to match the '\n' character.
Constructor Detail

Regex

public Regex()
Initializes the object without a Pattern. To supply a Pattern use compile(String s).
See Also:
compile(java.lang.String)

Regex

public Regex(java.lang.String s)
Create and compile a Regex, but do not throw any exceptions. If you wish to have exceptions thrown for syntax errors, you must use the no-argument Regex() constructor to create the Regex object, and then call the compile method. Therefore, you should only call this constructor when you know your pattern is right. I will probably become more like
See Also:
search(java.lang.String), compile(java.lang.String)

Regex

public Regex(java.lang.String s,
             java.lang.String rp)
Create and compile both a Regex and a ReplaceRule.
See Also:
ReplaceRule, compile(java.lang.String)

Regex

public Regex(java.lang.String s,
             ReplaceRule rp)
Create and compile a Regex, but give it the ReplaceRule specified. This allows the user finer control of the Replacement process, if that is desired.
See Also:
ReplaceRule, compile(java.lang.String)

Regex

public Regex(Regex r)
Essentially clones the Regex object
Method Detail

setDontMatchInQuotes

public void setDontMatchInQuotes(boolean b)
Set the dontMatchInQuotes flag.

getDontMatchInQuotes

public boolean getDontMatchInQuotes()
Find out if the dontMatchInQuotes flag is enabled.

setIgnoreCase

public void setIgnoreCase(boolean b)
Set the state of the ignoreCase flag. If set to true, then the pattern matcher will ignore case when searching for a match.

getIgnoreCase

public boolean getIgnoreCase()
Get the state of the ignoreCase flag. Returns true if we are ignoring the case of the pattern, false otherwise.

setDefaultMFlag

public static void setDefaultMFlag(boolean mFlag)
Set the default value of the m flag. If it is set to true, then the MFlag will be on for any regex search executed.

getDefaultMFlag

public static boolean getDefaultMFlag()
Get the default value of the m flag. If it is set to true, then the MFlag will be on for any regex search executed.

setReplaceRule

public void setReplaceRule(java.lang.String rp)
Change the ReplaceRule of this Regex by compiling a new one using String rp.

setReplaceRule

public void setReplaceRule(ReplaceRule rp)
Change the ReplaceRule of this Regex to rp.

isDefined

public static boolean isDefined(java.lang.String nm)
Test to see if a custom defined rule exists.
See Also:
com.stevesoft.pat

undefine

public static void undefine(java.lang.String nm)
Removes a custom defined rule.
See Also:
com.stevesoft.pat

define

public static void define(java.lang.String nm,
                          java.lang.String pat,
                          Validator v)
Defines a method to create a new rule. See test/deriv2.java and test/deriv3.java for examples of how to use it.

define

public static void define(java.lang.String nm,
                          java.lang.String pat)
Defines a shorthand for a pattern. The pattern will be invoked by a string that has the form "(??"+nm+")".

getReplaceRule

public ReplaceRule getReplaceRule()
Get the current ReplaceRule.

getReplacer

public Replacer getReplacer()

replaceFirst

public java.lang.String replaceFirst(java.lang.String s)
Replace the first occurrence of this pattern in String s according to the ReplaceRule.
See Also:
ReplaceRule, getReplaceRule()

replaceFirstFrom

public java.lang.String replaceFirstFrom(java.lang.String s,
                                         int pos)
Replace the first occurrence of this pattern in String s beginning with position pos according to the ReplaceRule.
See Also:
ReplaceRule, getReplaceRule()

replaceFirstRegion

public java.lang.String replaceFirstRegion(java.lang.String s,
                                           int start,
                                           int end)
Replace the first occurrence of this pattern in String s beginning with position start and ending with end according to the ReplaceRule.
See Also:
ReplaceRule, getReplaceRule()

replaceAll

public java.lang.String replaceAll(java.lang.String s)
Replace all occurrences of this pattern in String s according to the ReplaceRule.
See Also:
ReplaceRule, getReplaceRule()

replaceAll

public StringLike replaceAll(StringLike s)

replaceAllFrom

public java.lang.String replaceAllFrom(java.lang.String s,
                                       int pos)
Replace all occurrences of this pattern in String s beginning with position pos according to the ReplaceRule.
See Also:
ReplaceRule, getReplaceRule()

replaceAllRegion

public java.lang.String replaceAllRegion(java.lang.String s,
                                         int start,
                                         int end)
Replace all occurrences of this pattern in String s beginning with position start and ending with end according to the ReplaceRule.
See Also:
ReplaceRule, getReplaceRule()

compile

public void compile(java.lang.String prepat)
             throws RegSyntax
This method compiles a regular expression, making it possible to call the search or matchAt methods.
Throws:
RegSyntax - is thrown if a syntax error is encountered in the pattern. For example, "x{3,1}" or "*a" are not valid patterns.
See Also:
search(java.lang.String), matchAt(java.lang.String, int)
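
The compile-then-catch idiom described above can be sketched with the standard java.util.regex classes, which happen to reject the same two example patterns. This is a stand-in only: PatternSyntaxException takes the place of RegSyntax here, and the class SyntaxCheckDemo is hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Stand-in sketch using java.util.regex, NOT com.stevesoft.pat.
final class SyntaxCheckDemo {
    // Returns true if the pattern compiles, false if it has a syntax error.
    static boolean isValid(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("x{1,3}")); // prints true
        System.out.println(isValid("x{3,1}")); // prints false
        System.out.println(isValid("*a"));     // prints false
    }
}
```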

equals

public boolean equals(java.lang.Object o)
Overrides:
equals in class java.lang.Object

clone

public java.lang.Object clone()
A clone by any other name would smell as sweet.
Overrides:
clone in class RegRes

result

public RegRes result()
Return a clone of the underlying RegRes object.

matchAt

public boolean matchAt(java.lang.String s,
                       int start_pos)
Attempt to match a Pattern beginning at a specified location within the string.
See Also:
search(java.lang.String)

matchAt

public boolean matchAt(StringLike s,
                       int start_pos)
Attempt to match a Pattern beginning at a specified location within the StringLike.
See Also:
search(java.lang.String)

search

public boolean search(java.lang.String s)
Search through a String for the first occurrence of a match.
See Also:
searchFrom(java.lang.String, int), matchAt(java.lang.String, int)

search

public boolean search(StringLike sl)

reverseSearch

public boolean reverseSearch(java.lang.String s)

reverseSearch

public boolean reverseSearch(StringLike sl)

searchFrom

public boolean searchFrom(java.lang.String s,
                          int start)
Search through a String for the first occurrence of a match, but start at position start.

searchFrom

public boolean searchFrom(StringLike s,
                          int start)

searchRegion

public boolean searchRegion(java.lang.String s,
                            int start,
                            int end)
Search through a region of a String for the first occurrence of a match.

setGFlag

public void setGFlag(boolean b)
Set the 'g' flag

getGFlag

public boolean getGFlag()
Get the state of the 'g' flag.

getSFlag

public boolean getSFlag()
Get the state of the sFlag

getMFlag

public boolean getMFlag()
Get the state of the mFlag

add

protected void add(Pattern p2)
Only needed for creating your own extensions of Regex. This method adds the next Pattern in the chain of patterns or sets the Pattern if it is the first call.

compile1

protected void compile1(StrPos sp,
                        Rthings mk)
                 throws RegSyntax
You only need to use this method if you are creating your own extensions to Regex. compile1 compiles one Pattern element; it can be overridden to allow the Regex compiler to understand new syntax. See deriv.java for an example. This routine is the heart of class Regex. Rthings has one integer member called intValue, which is used to keep track of the number of ()'s in the Pattern.
Throws:
RegSyntax - is thrown when a nonsensical pattern is supplied. For example, a pattern beginning with *.

toString

public java.lang.String toString()
Converts the stored Pattern to a String -- this is a decompile. Note that \t and \n will really print out here, not just the two-character representations. Also be prepared to see some strange output if your characters are not printable.
Overrides:
toString in class RegRes

accept

public boolean accept(java.io.File dir,
                      java.lang.String s)
This method implements FilenameFilter, allowing one to use a Regex to search through a directory using File.list. There is a FileRegex now that does this better.
Specified by:
accept in interface java.io.FilenameFilter
See Also:
FileRegex
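
For readers unfamiliar with the FilenameFilter contract, the following sketch shows the general shape of a regex-backed filter. It is written against java.util.regex rather than com.stevesoft.pat, and the class RegexFilenameFilter is hypothetical. It also assumes that a name is accepted when the pattern is found anywhere in it, which may not be what Regex.accept does.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, NOT com.stevesoft.pat.
final class RegexFilenameFilter implements FilenameFilter {
    private final Pattern pattern;

    RegexFilenameFilter(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Assumption: accept a name if the pattern occurs anywhere in it.
    @Override
    public boolean accept(File dir, String name) {
        return pattern.matcher(name).find();
    }

    public static void main(String[] args) {
        // File.list returns null if the path is not a readable directory.
        String[] names = new File(".").list(new RegexFilenameFilter("\\.java$"));
        if (names != null) {
            for (String n : names) {
                System.out.println(n);
            }
        }
    }
}
```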

version

public static final java.lang.String version()
The version of this package

optimize

public void optimize()
Once this method is called, the state of variables ignoreCase and dontMatchInQuotes should not be changed as the results will be unpredictable. However, search and matchAt will run more quickly. Note that you can check to see if the pattern has been optimized by calling the optimized() method.

This method will attempt to rewrite your pattern in a way that makes it faster (not all patterns execute at the same speed). In general, "(?: ... )" will be faster than "( ... )" so if you don't need the backreference, you should group using the former pattern.

It will also introduce new pattern elements that you can't get to otherwise. For example, if you have a large table of strings, such as the months of the year "(January|February|...)", optimize() will make a Hashtable that takes it to the next appropriate pattern element, eliminating the need for a linear search.

See Also:
optimized(), com.stevesoft.pat.Regex#ignoreCase, com.stevesoft.pat.Regex#dontMatchInQuotes, matchAt(java.lang.String, int), search(java.lang.String)
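
The idea behind replacing a long alternation with a hash lookup can be sketched in plain Java. This is only an illustration of the general technique, with a made-up class and a shortened table. It says nothing about how optimize() is actually implemented internally.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of linear alternation versus a hashed table of literals.
final class AlternationLookupDemo {
    // Shortened table for the example.
    private static final String[] MONTHS = {"January", "February", "March", "April"};
    private static final Set<String> TABLE = new HashSet<>(Arrays.asList(MONTHS));

    // Linear search: compare against each alternative in turn.
    static boolean linearMatch(String word) {
        for (String m : MONTHS) {
            if (m.equals(word)) {
                return true;
            }
        }
        return false;
    }

    // Hashed lookup: a single probe regardless of the table size.
    static boolean hashedMatch(String word) {
        return TABLE.contains(word);
    }

    public static void main(String[] args) {
        System.out.println(linearMatch("March") + " " + hashedMatch("March")); // prints true true
    }
}
```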

optimized

public boolean optimized()
This function returns true if the optimize method has been called.

perlCode

public static Regex perlCode(java.lang.String s)
A bit of syntactic sugar for those who want to make their code look more Perl-like. To use this, initialize your Regex object by saying:
Regex r1 = Regex.perlCode("s/hello/goodbye/");
Regex r2 = Regex.perlCode("s'fish'frog'i");
Regex r3 = Regex.perlCode("m'hello'");
The i for ignoreCase is supported in this syntax, as well as m, s, and x. The g flag is a bit of a special case.

If you wish to replace all occurrences of a pattern, you do not put a 'g' in the perlCode, but call Regex's replaceAll method.

If you wish only to search for r2's pattern, you can do this by calling the searchFrom method repeatedly, or by calling search repeatedly if the g flag is set.

Note: Currently perlCode does not support the (?e=#) syntax for changing the escape character.
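
To show what a Perl-style substitution string breaks down into, here is a minimal sketch that parses "s", a delimiter, the pattern, the replacement, and an optional i flag, then applies the substitution with java.util.regex. The class PerlSubstDemo is hypothetical and is not how perlCode works internally. It does not handle escaped delimiters, and it always replaces every match, in the spirit of calling replaceAll.

```java
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, NOT com.stevesoft.pat.
final class PerlSubstDemo {
    // Applies a command of the form s<d>pattern<d>replacement<d>[i] to every match in input.
    static String substituteAll(String cmd, String input) {
        if (cmd.length() < 4 || cmd.charAt(0) != 's') {
            throw new IllegalArgumentException("expected s/pattern/replacement/");
        }
        char d = cmd.charAt(1); // the delimiter is whatever follows 's'
        int mid = cmd.indexOf(d, 2);
        int end = (mid < 0) ? -1 : cmd.indexOf(d, mid + 1);
        if (end < 0) {
            throw new IllegalArgumentException("missing delimiter");
        }
        String pattern = cmd.substring(2, mid);
        String replacement = cmd.substring(mid + 1, end);
        // Only the i flag is interpreted in this sketch.
        int flags = cmd.indexOf('i', end + 1) >= 0 ? Pattern.CASE_INSENSITIVE : 0;
        return Pattern.compile(pattern, flags).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(substituteAll("s/hello/goodbye/", "hello, hello")); // prints goodbye, goodbye
        System.out.println(substituteAll("s'fish'frog'i", "One FISH"));        // prints One frog
    }
}
```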


isLiteral

public boolean isLiteral()
Checks to see if there are only literal and no special pattern elements in this Regex.

countMinChars

public patInt countMinChars()
You only need to know about this if you are inventing your own pattern elements.

countMaxChars

public patInt countMaxChars()
You only need to know about this if you are inventing your own pattern elements.