Regex for all alphabets

I need a regex for all alphabets. I have an input and a target text. Both of them can belong to different alphabets; I mean they can be Chinese, Latin, Cyrillic, or any other alphabet.

I need a regex for multi-language input and multi-language target text.

Does anybody have any idea about this? How can I write this regex?

I will use this with JavaScript, but I think there should be a common regex for both Java and JavaScript for this problem.

Answers:

Answer

If you are in Java (not in JavaScript before ES2018!), you can use Unicode properties, e.g.

\p{L} any kind of letter from any language.

See regular-expressions.info/unicode for more information.

For JavaScript:

There is a library, XRegExp, and its XRegExp Unicode plugins, which extend JavaScript's regex features with support for Unicode categories, scripts, and blocks.

With those libraries you can use \p{L} in JavaScript.

See my answer to this question for a small example
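As a rough illustration of \p{L}: newer JavaScript engines (ES2018 and later) accept Unicode property escapes natively when the regex carries the u flag, so a library is not needed there. The sketch below assumes such an engine; isAllLetters is a made-up helper name, not part of XRegExp or any other library.

```javascript
// Assumes an engine with ES2018 Unicode property escapes; the `u` flag is required.
const lettersOnly = /^\p{L}+$/u;

// isAllLetters is a hypothetical helper name used only for this sketch.
function isAllLetters(text) {
  return lettersOnly.test(text);
}

console.log(isAllLetters("hello"));   // Latin letters
console.log(isAllLetters("привет"));  // Cyrillic letters
console.log(isAllLetters("你好"));     // Chinese (Han) characters
console.log(isAllLetters("abc123"));  // digits are not in \p{L}
```

Scripts that rely on combining marks (Devanagari, for example) may also need \p{M} in the character class, since marks are not counted as letters.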

Answer

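Some regex engines support a special class that matches any Unicode letter:

\p{L}

Or you can use \w (letter, digit, underscore), but note that in JavaScript, and in Java by default, \w only matches the ASCII characters [A-Za-z0-9_].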
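To see the difference between the two options, here is a small comparison sketch. It assumes a JavaScript engine with ES2018 property escapes; the class [\p{L}\p{N}_] is just one possible Unicode-aware stand-in for \w, and the variable names are made up for the example.

```javascript
// \w in JavaScript covers only ASCII letters, digits, and underscore.
const asciiWord = /^\w+$/;
// One possible Unicode-aware counterpart: any letter, any number, or underscore.
const unicodeWord = /^[\p{L}\p{N}_]+$/u;

for (const sample of ["hello_1", "привет", "日本語"]) {
  // Prints the sample followed by the result of each pattern.
  console.log(sample, asciiWord.test(sample), unicodeWord.test(sample));
}
```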

Answer

I use the "|" character as a separator, so it is special for me. A key can contain any character except "|". This solved my problem; thanks for the answers. It can be used with JavaScript, Java, and Groovy. I tested it and it worked.

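// A "|" separator followed by any run of characters except "|" (U+007C).
var keyPrefix = "\\|[\u0000-\u007B\u007D-\uFFEF]*";
// Any run of characters except "|", followed by a "|" separator.
var keySuffix = "[\u0000-\u007B\u007D-\uFFEF]*\\|";
// Note: key is inserted as-is, so regex metacharacters in it are not escaped.
var searchkey = keyPrefix + key.toLowerCase() + keySuffix;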
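A possible variation on the same idea, as a sketch only: the negated class [^|] expresses "anything except the separator" more directly (it is slightly broader than the ranges above, since it also admits U+FFF0 to U+FFFF), and escaping the key keeps characters like "." from being read as regex syntax. escapeRegExp and buildSearchPattern are hypothetical helper names, and the sample target string is invented for the demonstration.

```javascript
// escapeRegExp and buildSearchPattern are hypothetical helper names for this sketch.
function escapeRegExp(text) {
  // Prefix every regex metacharacter with a backslash so it matches literally.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function buildSearchPattern(key) {
  // [^|]* matches any run of characters other than the "|" separator.
  return new RegExp("\\|[^|]*" + escapeRegExp(key.toLowerCase()) + "[^|]*\\|");
}

// Invented sample data: entries separated by "|".
const target = "|first entry|Привет мир|你好世界|";
console.log(buildSearchPattern("мир").test(target));   // key inside a Cyrillic entry
console.log(buildSearchPattern("世界").test(target));   // key inside a Chinese entry
console.log(buildSearchPattern("a.b").test("|axb|"));  // "." is matched literally
```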
