Hash code: UTF-8 hashtags that contain Unicode, numeric values, and underscores
로빈
Body
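function ar_get_hashtags($string) {
    $arr = array();
    // Characters treated as separators; each one is replaced with a space before matching.
    $arr_tr = explode(" ", "? * @ $ % = + [ ] ' \" \\ : , | | / ~ { } & 【 】 [ ] \t");
    $string = trim(str_replace($arr_tr, ' ', $string));
    // Collapse runs of spaces left behind by the replacement above.
    $string = trim(preg_replace('/ {2,}/', ' ', $string));
    // Tag body: connector punctuation, numbers, letters, and non-spacing marks in any script.
    preg_match_all('/#([\p{Pc}\p{N}\p{L}\p{Mn}]+)/u', $string, $matches);
    //preg_match_all('/#(\S*\w)/i', $string, $matches);
    foreach ($matches[1] as $match) {
        // strlen() counts bytes, not characters, so these limits are byte limits for UTF-8 text.
        // cut_str() is not defined in this post; it is assumed to be a helper available elsewhere.
        if ($match && !in_array($match, $arr) && strlen($match) > 2 && strlen($match) < 30) {
            $arr[] = cut_str($match, 30, '');
        }
        if (count($arr) >= 30) break;
    }
    return $arr;
}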
-->#레토나#오프로드#경반분교#솔캠#차박캠핑#캠핑요리#
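The length check in the function uses strlen(), which counts bytes. A minimal sketch of what that means for Korean tags like the ones above, assuming the mbstring extension is available (the $tags and $lengths names are only for illustration):

```php
<?php
// Compare byte length and character length for a few sample tags.
// Each Hangul syllable takes 3 bytes in UTF-8.
$tags = ['솔캠', '차박캠핑', 'NYC'];
$lengths = [];
foreach ($tags as $tag) {
    $lengths[$tag] = [
        'bytes' => strlen($tag),             // what the function's length check sees
        'chars' => mb_strlen($tag, 'UTF-8'), // what a reader would count
    ];
}
print_r($lengths);
```

So a two-syllable tag such as 솔캠 passes the strlen() > 2 test because it is 6 bytes long, while a two-letter ASCII tag would be rejected. If the limits are meant to be in characters, mb_strlen() may be the better fit.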
Don't forget about hashtags that contain Unicode characters, numeric values, and underscores:
$tweet = "Valid hashtags include: #hashtag #NYC2016 #NYC_2016 #gøypålandet!";
preg_match_all('/#([\p{Pc}\p{N}\p{L}\p{Mn}]+)/u', $tweet, $matches);
print_r( $matches );
\p{Pc} - connector punctuation, which includes the underscore
\p{N} - numeric character in any script
\p{L} - letter from any language
\p{Mn} - non-spacing mark (combining accents, umlauts, etc.)
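To see what each class contributes, here is a small sketch that runs the same pattern over a few strings. The sample strings and the $samples and $found names are made up for illustration, and the "\u{0301}" escape assumes PHP 7 or later:

```php
<?php
// Sample inputs, one per behaviour worth checking.
$samples = [
    "#NYC_2016 rocks",     // underscore (\p{Pc}) and digits (\p{N})
    "#gøypålandet!",       // non-ASCII letters (\p{L}); "!" ends the tag
    "#cafe\u{0301} time",  // "e" followed by U+0301, a combining acute accent (\p{Mn})
    "#차박캠핑#캠핑요리#",    // adjacent tags; the trailing "#" has no body
];

$found = [];
foreach ($samples as $text) {
    // The /u modifier makes PCRE treat the pattern and subject as UTF-8.
    preg_match_all('/#([\p{Pc}\p{N}\p{L}\p{Mn}]+)/u', $text, $matches);
    $found[] = $matches[1]; // capture group 1 holds the tag without the "#"
}
print_r($found);
```

Without \p{Mn}, the third sample would be cut short at "cafe", because the combining accent is a separate code point and is not a letter.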