Oracle正则表达式实战(一)

2015-01-23 21:53:19 · 作者: · 浏览: 32

介绍

Oracle 10g introduced support for regular expressions in SQL andPL/SQL with the following functions.

Oracle 10g开始支持在SQL和PLSQL中使用以下正则表达式:

REGEXP_INSTR - Similar to INSTR except it uses a regular expression rather than a literal as the search string. 类似INSTR函数REGEXP_LIKE - Similar to LIKE except it uses a regular expression as the search string. REGEXP_LIKE is really an operator, not a function. 类似LIKE条件 REGEXP_REPLACE - Similar to REPLACE except it uses a regular expression as the search string. 类似REPLACE函数REGEXP_SUBSTR - Returns the string matching the regular expression. Not really similar to SUBSTR. 返回匹配正则表达式的字符串,和SUBSTR有点类似

Oracle 11g introduced two new features related to regularexpressions.

11g开始引入2个新特性:

REGEXP_COUNT - Returns the number of occurrences of the regular expression in the string. 返回符合正则表达式的字符串出现的次数。 Sub-expression support was added to all regular expression functions by adding a parameter to each function to specify the sub-expression in the pattern match.子表达式在所有正则表达式函数都支持,可通过增加一个参数实现。

Learning to write regular expressions takes a little time. If youdon't do it regularly, it can be a voyage of discovery each time. The generalrules for writing regular expressions are available here.You can read the Oracle Regular Expression Support here.

Rather than trying to repeat the formal definitions, I'll presenta number of problems I've been asked to look at over the years, where asolution using a regular expression has been appropriate.

此处不重复正则表达式的定义,代之以一组问题导向的正则表达式应用实例:

Example 1 : REGEXP_SUBSTR

The data in a column is free text, but may include a 4 digit year.

数据在字段中以自由文本存放,但是可能包含4个精度的年份数据。

DROP TABLE t1;
CREATE TABLE t1 (
  data VARCHAR2(50)
);
 
INSERT INTO t1 VALUES ('FALL 2014');
INSERT INTO t1 VALUES ('2014 CODE-B');
INSERT INTO t1 VALUES ('CODE-A 2014 CODE-D');
INSERT INTO t1 VALUES ('ADSHLHSALK');
INSERT INTO t1 VALUES ('FALL 2004');
INSERT INTO t1 VALUES ('FALL 2015');
COMMIT;
 
SELECT * FROM t1;
 
DATA
---------------------------------------------------------------------
FALL 2014
2014 CODE-B
CODE-A 2014 CODE-D
ADSHLHSALK
FALL 2004
 
5 rows selected.
 
SQL>

If we needed to return rows containing a specific year we coulduse the LIKE operator (WHERE data LIKE '%2014%'),but how do we return rows using a comparison (<, <=, >, >=,<>)?

One way to approach this is to pull out the 4 figure year andconvert it to a number, so we don't accidentally do an ASCII comparison. That'spretty easy using regular expressions.

如果我们需要返回包含指定年份的数据我们可以使用LIKE操作符(…),但是如何通过不等操作符返回行?一条路是抽出4个数字的年份并转换为数字。通过正则表达式可以很容易实现。

We can identify digits using the "\d" or"[0-9]" operators. We want a group of four of them, which isrepresented by the "{4}" operator. So our regular expression will be"\d{4}" or "[0-9]{4}". The REGEXP_SUBSTR functionreturns the string matching the regular expression, so that can be used toextract the text of interest. We then just need to convert it to a number andperform our comparison.

我们使用\d或者[0-9]来识别数字。我们需要4个一组,可以使用{4}表示。至此,我们的正则表达式为:\d{4}或者[0-9]{4}。REGEXP_SUBSTR函数返回匹配指定正式表达式的字符串,所以可以用来提取我们感兴趣的文本。然后我们只需将其转换为数字并执行比较即可。

SELECT *
FROM   t1
WHERE  TO_NUMBER(REGEXP_SUBSTR(data, '\d{4}')) >= 2014;
 
DATA
---------------------------------------------------------------------
FALL 2014
2014 CODE-B
CODE-A 2014 CODE-D
FALL 2015
 
4 rows selected.
 
SQL>

Example 2 : REGEXP_SUBSTR

Give