Oracle正则表达式实战(四)

2015-01-23 21:53:19 · 作者: · 浏览: 30
parentheses we need to include asub-expression inside the literal parentheses "\((.*)\)". We can then REGEXP_SUBSTR using thefirst sub expression.

匹配括号内文本的模式基本写法为:“\(.*\)”。\是转义字符,使跟在其后的字符变为字面值。但是这个模式用在REGEXP_SUBSTR时会连括号一起返回。为了忽略括号我们需要在字面括号内部包含子表达式:"\((.*)\)".

COLUMN with_parentheses FORMAT A20
COLUMN without_parentheses FORMAT A20
 
SELECT data,
       REGEXP_SUBSTR(data, '\(.*\)') AS with_parentheses,
 
 
REGEXP_SUBSTR(data, '\((.*)\)', 1, 1, 'i', 1) AS without_parentheses
FROM   t1
WHERE  REGEXP_LIKE(data, '\(.*\)');
 
DATA                                               WITH_PARENTHESES     WITHOUT_PARENTHESES
-------------------------------------------------- -------------------- --------------------
This is some text (with parentheses) in it.        (with parentheses)   with parentheses
This text has (parentheses too).                   (parentheses too)    parentheses too
 
2 rows selected.
 
SQL>

?

注意:REGEXP_SUBSTR(data,'\((.*)\)', 1, 1, 'i', 1) 中最后的i代码不区分大小写,最后1个“1”代表返回哪个子表达式匹配的文本。(范围0-9)

Example 7 : REGEXP_COUNT

We need to know how many times a block of 4 digits appears intext. The data looks like this.

我们需要知道4个数字的块在字符串中出现的次数。看原始数据:

DROP TABLE t1;
CREATE TABLE t1 (
  data VARCHAR2(50)
);
 
INSERT INTO t1 VALUES ('1234');
INSERT INTO t1 VALUES ('1234 1234');
INSERT INTO t1 VALUES ('1234 1234 1234');
COMMIT;

We can identify digits using "\d" or "[0-9]"and the "{4}" operator signifies 4 of them, so using"\d{4}" or "[0-9]{4}" with the REGEXP_COUNT functionseems to be a valid option.

我们可以用表达式:\d 或[0-9]和{4}操作符识别4个数字的块。

SELECT REGEXP_COUNT(data, '[0-9]{4}') AS pattern_count_1,
       REGEXP_COUNT(data, '\d{4}') AS pattern_count_2
FROM   t1;
 
PATTERN_COUNT_1 PATTERN_COUNT_2
--------------- ---------------
              1               1
              2               2
              3               3
 
3 rows selected.
 
SQL>

Example 8 : REGEXP_LIKE

We need to identify invalid email addresses. The data looks likethis.

我们需要校验邮箱地址,原始数据如下:

DROP TABLE t1;
CREATE TABLE t1 (
  data VARCHAR2(50)
);
 
INSERT INTO t1 VALUES ('me@example.com');
INSERT INTO t1 VALUES ('me@example');
INSERT INTO t1 VALUES ('@example.com');
INSERT INTO t1 VALUES ('me.me@example.com');
INSERT INTO t1 VALUES ('me.me@ example.com');
INSERT INTO t1 VALUES ('me.me@example-example.com');
COMMIT;

The following test gives us email addresses that approximate toinvalid email address formats.

下列测试给我们近似不合法的邮箱。

SELECT data
FROM   t1
WHERE  NOT REGEXP_LIKE(data, '[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}', 'i');
 
DATA
--------------------------------------------------
me@example
@example.com
me.me@ example.com
 
3 rows selected.
 
SQL>

?

-----------------------------

Dylan Presents.