Regular expression for all printable characters in JavaScript

I'm looking for a regular expression that validates all printable characters. The regex needs to be used in JavaScript only. I have gone through this post, but it mostly talks about .NET, Java and C, not JavaScript.

Only these printable characters should be allowed:

a-z, A-Z, 0-9, and the thirty-two symbols: !"#$%&'()*+,-./:;<=>?@[] ^_`{|}~ and space

I need a JavaScript regex that validates that every input character is one of the above and discards the rest.

Answers:

Answer

If you want to match all printable characters in the UTF-8 set (as indicated by your comment on Aug 21), you're going to have a hard time doing this yourself. JavaScript's native regexes have abysmal Unicode support. But you can use XRegExp with the regex ^\P{C}*$.

If you only want to match those few ASCII letters you mentioned in the edit to your post from Aug 22, then the regex is trivial:

/^[a-z0-9!"#$%&'()*+,.\/:;<=>?@\[\] ^_`{|}~-]*$/i
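
The question also asks to discard whatever falls outside the allowed set. Here is a minimal sketch of both steps, reusing the character class from the regex above. The names `isAllowed` and `discardDisallowed` are made up for illustration. Note that this class, as written, does not contain the backslash, so a backslash is rejected.

```javascript
// Hypothetical helpers built on the character class from the regex above.
const ALLOWED_ONLY = /^[a-z0-9!"#$%&'()*+,.\/:;<=>?@\[\] ^_`{|}~-]*$/i;
const DISALLOWED = /[^a-z0-9!"#$%&'()*+,.\/:;<=>?@\[\] ^_`{|}~-]/gi;

// True when every character of `input` is in the allowed set
// (the empty string passes because of the * quantifier).
function isAllowed(input) {
  return ALLOWED_ONLY.test(input);
}

// Returns `input` with every disallowed character removed.
function discardDisallowed(input) {
  return input.replace(DISALLOWED, "");
}

console.log(isAllowed("Hello, World! #42"));   // true
console.log(isAllowed("tab\there"));           // false, contains a tab
console.log(discardDisallowed("caf\u00e9\n")); // "caf"
```

The validating regex deliberately has no g flag, so repeated calls to test() do not depend on lastIndex.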
Answer

For non-Unicode input, use the regex pattern ^[^\x00-\x1F\x80-\x9F]+$

If you want to work with Unicode, first read "Javascript + Unicode regexes".

I would then suggest the regex pattern ^[^\p{Cc}\p{Cf}\p{Zl}\p{Zp}]*$

  • \p{Cc} or \p{Control}: an ASCII 0x00..0x1F or Latin-1 0x80..0x9F control character.
  • \p{Cf} or \p{Format}: invisible formatting indicator.
  • \p{Zl} or \p{Line_Separator}: line separator character U+2028.
  • \p{Zp} or \p{Paragraph_Separator}: paragraph separator character U+2029.

For more information see http://www.regular-expressions.info/unicode.html
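
In engines that implement the Unicode property escapes added in ECMAScript 2018, the pattern above can be tried natively by adding the u flag, without a library. This is a minimal sketch that assumes such an engine (a current browser or a recent Node.js). The name `isPrintableUnicode` is only illustrative.

```javascript
// The `u` flag is required; without it, \p{...} is not a property escape.
const printableUnicode = /^[^\p{Cc}\p{Cf}\p{Zl}\p{Zp}]*$/u;

// Hypothetical helper: true when `input` contains no control, format,
// line-separator or paragraph-separator characters.
function isPrintableUnicode(input) {
  return printableUnicode.test(input);
}

console.log(isPrintableUnicode("Iñtërnâtiônàlizætiøn")); // true
console.log(isPrintableUnicode("zero\u200Cwidth"));      // false, U+200C is a format character (Cf)
console.log(isPrintableUnicode("line\u2028break"));      // false, U+2028 is a line separator (Zl)
console.log(isPrintableUnicode("tab\there"));            // false, U+0009 is a control character (Cc)
```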

Answer

Looks like JavaScript has changed to some degree since this question was posted?

I'm using this one:

var regex = /^[\u0020-\u007e\u00a0-\u00ff]*$/;
console.log( regex.test("!\"#$%&'()*+,-./:;<=>?@[] ^_`{|}~")); //should output "true" 
console.log( regex.test("Iñtërnâtiônàlizætiøn")); //should output "true"
console.log( regex.test("\u4e2d\u6587")); //should output "false" (characters outside the Latin-1 range)
Answer

To validate that a string consists only of printable ASCII characters, use a simple regex like

/^[ -~]+$/

It matches

  • ^ - the start of string anchor
  • [ -~]+ - one or more (due to the + quantifier) characters in the range from space (0x20) to tilde (0x7E) in the ASCII table
  • $ - the end of string anchor
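
To see which characters the [ -~] range covers, one can walk the 128 ASCII code points and keep those the class matches. This is a small sketch with illustrative variable names.

```javascript
// Matches exactly one character from space (0x20) to tilde (0x7E).
const printable = /^[ -~]$/;

// Collect every ASCII character that the class accepts.
const matched = [];
for (let code = 0; code < 128; code++) {
  const ch = String.fromCharCode(code);
  if (printable.test(ch)) matched.push(ch);
}

console.log(matched.length);              // 95
console.log(matched[0] === " ");          // true, the range starts at space
console.log(matched[matched.length - 1]); // ~
// The same range written with hex escapes accepts the same characters.
console.log(/^[\x20-\x7E]+$/.test(matched.join(""))); // true
```

The control characters 0x00 to 0x1F and DEL (0x7F) fall outside the range, which is why a string containing a tab or a form feed fails the check.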

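For Unicode printable chars, use the \PC Unicode category class from XRegExp, as has already been mentioned. It matches any char that is not in the "Other" category, which covers control, format, unassigned, private-use and surrogate code points: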

^\PC+$

See regex demos:

// ASCII only
var ascii_print_rx = /^[ -~]+$/;
console.log(ascii_print_rx.test("It's all right.")); // true
console.log(ascii_print_rx.test('\f ')); // false, \f is an ASCII form feed char
console.log(ascii_print_rx.test("demásiado tarde")); // false, no Unicode printable char support
// Unicode support
console.log(XRegExp.test('demásiado tarde', XRegExp("^\\PC+$"))); // true
console.log(XRegExp.test('\u200C ', XRegExp("^\\PC+$"))); // false, \u200C is a Unicode zero-width non-joiner
console.log(XRegExp.test('\f ', XRegExp("^\\PC+$"))); // false, \f is an ASCII form feed char
<script src="http://cdnjs.cloudflare.com/ajax/libs/xregexp/3.1.1/xregexp-all.min.js"></script>
