PHP 笔记 2

正则表达式

分隔符（可以是除了字母、数字、反斜线、空格以外的任何字符，常用 #）
表达式（特殊字符和非特殊字符串组成）
修饰符（用于开启某种模式）
常用模式

忽略大小写模式 (i)
多行模式 (m)
点号统配模式 (s)
懒惰模式 (U)
结尾限制 (D)
支持UTF-8转义表达 (u)
<?php

$reg = "#[aby\}]#";
$str = 'a\bc[]{}';
preg_match_all($reg, $str, $m);
var_dump($m);

/*
array(1) {
  [0]=>
  array(3) {
    [0]=>
    string(1) "a"
    [1]=>
    string(1) "b"
    [2]=>
    string(1) "}"
  }
}
*/

// 顺序肯定环视 (?=exp)
$reg = "#\b\w+(?=ing\b)#";
$str = "I'm singing while you're dancing.";
preg_match_all($reg, $str, $m);
var_dump($m);

/*
array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(4) "sing"
    [1]=>
    string(4) "danc"
  }
}
*/

// 逆序肯定环视 (?<=exp)
$reg = "#(?<=\bre)\w+\b#";
$str = "reading a book";
preg_match_all($reg, $str, $m);
var_dump($m);

/*
array(1) {
  [0]=>
  array(1) {
    [0]=>
    string(5) "ading"
  }
}
*/

$reg = "#((?<=\d)\d{3})+\b#";
$str = "12345678901234567890";
preg_match_all($reg, $str, $m);
var_dump($m);

/*
array(2) {
  [0]=>
  array(1) {
    [0]=>
    string(18) "345678901234567890"
  }
  [1]=>
  array(1) {
    [0]=>
    string(3) "890"
  }
}
*/


$reg = "#<a[^>]*>([^<>]*)<\/a>#";
$str = '<a href="http://baidu.com">baidu</a>some<a href="http://sohu.com">sohu</a>';
preg_match_all($reg, $str, $m);
var_dump($m);

/*
array(2) {
  [0]=>
  array(2) {
    [0]=>
    string(36) "<a href="http://baidu.com">baidu</a>"
    [1]=>
    string(34) "<a href="http://sohu.com">sohu</a>"
  }
  [1]=>
  array(2) {
    [0]=>
    string(5) "baidu"
    [1]=>
    string(4) "sohu"
  }
}
*/

// 忽略大小写模式 i
$reg = "%<div>gg<\/div>%i";
$str = "<div>gG</Div>";
preg_match_all($reg, $str, $m);
var_dump($m);
/*
array(1) {
  [0]=>
  array(1) {
    [0]=>
    string(13) "<div>gG</Div>"
  }
}
*/

// 多行模式 m 
$reg = "%.*reg$%mi";
$reg2 = "%^t.*%mi";
$str = "this is reg
reg
this is
regexp turtor,oh reg";
preg_match_all($reg, $str, $m);
preg_match_all($reg2, $str, $m2);
var_dump($m, $m2);
/*
array(1) {
  [0]=>
  array(3) {
    [0]=>
    string(11) "this is reg"
    [1]=>
    string(3) "reg"
    [2]=>
    string(20) "regexp turtor,oh reg"
  }
}
array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(11) "this is reg"
    [1]=>
    string(7) "this is"
  }
}
*/

// 点号统配模式 s （s修饰符包括换行符）
$str = "<body>
<div class='content'>
<div class='head'></div>
<div class='body'></div>
<div class='foot'></div>
</div>
</body>";
$reg = "|<body>(.*)<\/body>|";
$reg2 = "|<body>(.*)<\/body>|s";
preg_match_all($reg, $str, $m);
preg_match_all($reg2, $str, $m2);
var_dump($m, $m2);
/*
array(2) {
  [0]=>
  array(0) {
  }
  [1]=>
  array(0) {
  }
}
array(2) {
  [0]=>
  array(1) {
    [0]=>
    string(118) "<body>
<div class='content'>
<div class='head'></div>
<div class='body'></div>
<div class='foot'></div>
</div>
</body>"
  }
  [1]=>
  array(1) {
    [0]=>
    string(105) "
<div class='content'>
<div class='head'></div>
<div class='body'></div>
<div class='foot'></div>
</div>
"
  }
}
*/

// 懒惰匹配，reg和reg2两个表达式等价
$reg = "#\[url\](.*?)\[\/url\]#";
$reg2 = "#\[url\](.*)\[\/url\]#U";
$str = '[url]1.gif[/url][url]2.gif[/url][url]3.gif[/url]';
$s = preg_replace($reg, "<img src=http://image.ai.com/upload/$1>", $str);
$s2 = preg_replace($reg2, "<img src=http://image.ai.com/upload/$1>", $str);
var_dump($s, $s2);

/*
string(126) "<img src=http://image.ai.com/upload/1.gif><img src=http://image.ai.com/upload/2.gif><img src=http://image.ai.com/upload/3.gif>"
string(126) "<img src=http://image.ai.com/upload/1.gif><img src=http://image.ai.com/upload/2.gif><img src=http://image.ai.com/upload/3.gif>"
*/

// 支持 UTF-8 转义表达 u
$reg = "#^[\x{4e00}-\x{9fa5}]+$#u";
$str = "php编程";
var_dump("$str 全是中文? ", preg_match_all($reg, $str, $m));
$str = "编程";
var_dump("$str 全是中文? ", preg_match_all($reg, $str, $m));
/*
string(24) "php编程 全是中文? "
int(0)
string(21) "编程 全是中文? "
int(1)
*/
参考书籍: 《PHP核心技术与最佳实践》