Tuesday, March 3, 2009

Oracle Regular Expression

http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/rischert_regexp_pt1.html


operator REGEXP_LIKE

functions REGEXP_INSTR, REGEXP_SUBSTR, and REGEXP_REPLACE

Example:
select group_number, disk_number, path, mount_status, header_status from v$asm_disk order by to_number(regexp_replace(path, '[a-z/]', ''))

Table 1: Anchoring Metacharacters

MetacharacterDescription
^Anchor the expression to the start of a line
$Anchor the expression to the end of a line

Table 2: Quantifiers, or Repetition Operators

QuantifierDescription
*Match 0 or more times
?Match 0 or 1 time
+Match 1 or more times
{m}Match exactly m times
{m,}Match at least m times
{m, n}Match at least m times but no more than n times

Table 3: Predefined POSIX Character Classes

Character ClassDescription
[:alpha:]Alphabetic characters
[:lower:]Lowercase alphabetic characters
[:upper:]Uppercase alphabetic characters
[:digit:]Numeric digits
[:alnum:]Alphanumeric characters
[:space:]Space characters (nonprinting), such as carriage return, newline, vertical tab, and form feed
[:punct:]Punctuation characters
[:cntrl:]Control characters (nonprinting)
[:print:]Printable characters

Table 4: Alternate Matching and Grouping of Expressions

MetacharacterDescription
|AlternationSeparates alternates, often used with grouping operator ()
( )GroupGroups subexpression into a unit for alternations, for quantifiers, or for backreferencing (see "Backreferences" section)
[char]Character listIndicates a character list; most metacharacters inside a character list are understood as literals, with the exception of character classes, and the ^ and - metacharacters

Table 5: The REGEXP_LIKE Operator

SyntaxDescription
REGEXP_LIKE(source_string, pattern
[, match_parameter])
source_string supports character datatypes (CHAR, VARCHAR2, CLOB, NCHAR, NVARCHAR2, and NCLOB but not LONG). The pattern parameter is another name for the regular expression. match_parameter allows optional parameters such as handling the newline character, retaining multiline formatting, and providing control over case-sensitivity.

Table 6: The REGEXP_INSTR Function

SyntaxDescription
REGEXP_INSTR(source_string, pattern
[, start_position
[, occurrence
[, return_option
[, match_parameter]]]])
This function looks for a pattern and returns the first position of the pattern. Optionally, you can indicate the start_position you want to begin the search. The occurrence parameter defaults to 1 unless you indicate that you are looking for a subsequent occurrence. The default value of the return_option is 0, which returns the starting position of the pattern; a value of 1 returns the starting position of the next character following the match.

Table 7: Explanation of 5-digit + 4 Zip-Code Expression

SyntaxDescription
Empty space that must be matched
[:digit:] POSIX numeric digit class
] End of character list
{5} Repeat exactly five occurrences of the character list
( Start of subexpression
- A literal hyphen, because it is not a range metacharacter inside a character list
[ Start of character list
[:digit:] POSIX [:digit:] class
[ Start of character list
] End of character list
{4} Repeat exactly four occurrences of the character list
) Closing parenthesis, to end the subexpression
? The ? quantifier matches the grouped subexpression 0 or 1 time thus making the 4-digit code optional
$ Anchoring metacharacter, to indicate the end of the line

Table 8: The REGEXP_SUBSTR Function

SyntaxDescription
REGEXP_SUBSTR(source_string, pattern
[, position [, occurrence
[, match_parameter]]])
The REGEXP_SUBSTR function returns the substring that matches the pattern.

Table 9: The REGEXP_REPLACE Function

SyntaxDescription
REGEXP_REPLACE(source_string, pattern
[, replace_string [, position
[,occurrence, [match_parameter]]]])
This function replaces the matching pattern with a specified replace_string, allowing complex search-and-replace operations.

Table 10: Backreference Metacharacter

MetacharacterDescription
\digitBackslashFollowed by a digit between 1 and 9, the backslash matches the preceding digit-th parenthesized subexpression.
(Note: The backslash has another meaning in regular expressions; depending on the context it can also mean the Escape character

Table 11: Explanation of Pattern-Swap Regular Expression

Regular-Expression ItemDescription
(Start of first subexpression
. Match any single character except a newline
* Repetition operator, matches previous . metacharacter 0 to n times
) End of first subexpression; result of the match is captured in \1
(In this example, it's Ellen.)
Empty space that needs to be present
( Start of the second subexpression
. Match any single character except a newline
* Repetition operator matches the previous . metacharacter 0 to n times
) End of second subexpression; result of this match is captured in \2
(In this example, it stores Hildi.)
Empty space
( Start of third subexpression
. Match any single character except a newline
* Repetition operator matches . metacharacter 0 to n times
) End of third subexpression; result of this match is captured in \3
(In this example, it holds Smith.)

Table 12: Explanation of the Social Security Number Regular Expression

Regular-Expression ItemDescription
^ Start of line character (Regular expression cannot have any leading characters before the match.)
( Start subexpression and list alternates separated by the | metacharacter
[ Start of character list
[:digit:] POSIX numeric digit class
] End of character list
{3} Repeat exactly three occurrences of character list
-A hyphen
[ Start of character list
[:digit:] POSIX numeric digit class
] End of character list
{2} Repeat exactly two occurrences of character list
- Another hyphen
[ Start of character list
[:digit:] POSIX numeric digit class
] End of character list
{4} Repeat exactly four occurrences of character list
| Alternation metacharacter; ends the first choice and starts the next alternate expression
[ Start of character list
[:digit:] POSIX numeric digit class.
] End of character list
{9} Repeat exactly nine occurrences of character list
) Ending parenthesis, to close the subexpression group used for alternation
$ Anchoring metacharacter, to indicate the end of the line; no extra characters can follow the pattern

No comments: