(You'll need Firefox or Safari to see the emoji in the code.)
I want to take a string of emoji and do something with the individual characters.
In JavaScript "?????????????????????????".length == 13
because "?" has length 1 and the rest have length 2 (length counts UTF-16 code units, not characters). So we can't do
s = string.split("");
c = [];
c[0] = s[0]+s[1];
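A minimal sketch of why this fails, using a hypothetical stand-in string written as escapes (not the string from the question): both .length and split("") work on UTF-16 code units, so a character outside the Basic Multilingual Plane comes out as two lone surrogate halves.

```javascript
// Hypothetical sample: U+1F600 needs two UTF-16 code units, U+2764 needs one.
var sample = "\uD83D\uDE00\u2764";
var units = sample.split("");

console.log(sample.length); // 3
console.log(units.length);  // 3: two lone surrogate halves, then U+2764
console.log(units[0] === "\uD83D", units[1] === "\uDE00"); // true true
```

Since the number of code units per character varies, pairing up fixed indexes such as s[0]+s[1] only works when you already know where each pair starts.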
The grapheme-splitter library does just that. It is fully compatible even with old browsers and works not just with emoji but with all sorts of exotic characters. You are likely to miss edge cases in any home-brew solution, whereas this one is actually based on the UAX-29 Unicode standard: https://github.com/orling/grapheme-splitter
Edit: see Orlin Georgiev's answer for a proper solution in a library: https://github.com/orling/grapheme-splitter
Thanks to this answer I made a function that takes a string and returns an array of emoji:
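var emojiStringToArray = function (str) {
  // Split on surrogate pairs; the capture group keeps each pair in the result.
  var split = str.split(/([\uD800-\uDBFF][\uDC00-\uDFFF])/);
  var arr = [];
  for (var i = 0; i < split.length; i++) {
    var char = split[i];
    // Skip the empty strings that split() leaves between adjacent pairs.
    if (char !== "") {
      arr.push(char);
    }
  }
  return arr;
};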
So
emojiStringToArray("?????????????????????????")
// => Array [ "????", "????", "????", "?", "????", "????", "????" ]
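One caveat worth checking, sketched here with a hypothetical sample string of my own: the regular expression only isolates surrogate pairs, so any run of characters between two pairs (or before the first one) comes back as a single chunk rather than one entry per character. The function below is a compact restatement of the one above so the snippet runs on its own.

```javascript
// Compact restatement of emojiStringToArray, for a self-contained sketch.
function emojiStringToArray(str) {
  return str
    .split(/([\uD800-\uDBFF][\uDC00-\uDFFF])/)
    .filter(function (part) { return part !== ""; });
}

// Hypothetical sample: "ab", then U+1F600, then U+2764 twice.
var parts = emojiStringToArray("ab\uD83D\uDE00\u2764\u2764");

console.log(parts.length);         // 3
console.log(parts[0]);             // "ab" (not split into "a" and "b")
console.log(parts[2].length);      // 2: both U+2764 stay in one chunk
```

It works for the string in the question because the only single-unit character there sits alone between two pairs.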
JavaScript ES6 has a solution for a real split:
[..."?????????????????????????"] // ["????", "????", "????", "?", "????", "????", "????"]
Yay? Except for the fact that when you run this through your transpiler, it might not work (see @brainkim's comment). It only works when run natively in an ES6-compliant browser. Luckily this covers most browsers (Safari, Chrome, FF), but if you're looking for high browser compatibility this is not the solution for you.
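If the spread syntax can't be relied on after transpiling, one option is an explicit ES5 loop that pairs surrogates by hand. This is only a sketch: splitCodePoints is a hypothetical helper name, not something from the answers here, and it splits by code point just like the spread does, so it is no replacement for a grapheme library.

```javascript
// Hypothetical ES5-only helper: split a string into code points.
function splitCodePoints(str) {
  var out = [];
  for (var i = 0; i < str.length; i++) {
    var code = str.charCodeAt(i);
    // A high surrogate followed by a low surrogate forms one code point.
    if (code >= 0xD800 && code <= 0xDBFF && i + 1 < str.length) {
      var next = str.charCodeAt(i + 1);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        out.push(str.charAt(i) + str.charAt(i + 1));
        i++; // skip the low surrogate we just consumed
        continue;
      }
    }
    out.push(str.charAt(i));
  }
  return out;
}

// Hypothetical sample: U+1F600, "a", U+2764.
var pieces = splitCodePoints("\uD83D\uDE00a\u2764");
console.log(pieces.length); // 3
```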
The modern / proper way to split a string by Unicode code points (JavaScript strings are UTF-16, not UTF-8) is using Array.from(str)
instead of str.split('')
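A small sketch of the difference, with hypothetical sample strings written as escapes. It also shows the limit of this approach: Array.from iterates code points, so a symbol built from several code points (here a flag made of two regional indicators) still comes apart.

```javascript
// Hypothetical samples, written as escapes.
var simple = "\uD83D\uDE00\u2764";     // U+1F600, U+2764
var flag = "\uD83C\uDDE7\uD83C\uDDEC"; // regional indicators B + G, shown as one flag

console.log(Array.from(simple).length); // 2
console.log(simple.split("").length);   // 3
console.log(Array.from(flag).length);   // 2, although it displays as one symbol
```

For input that may contain such sequences, a grapheme-aware library like the one linked above is the safer choice.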
It can be done using the u flag of a regular expression. The regular expression is:
/.*?/u
It lazily matches zero or more characters, which may or may not be emoji but cannot be line breaks, so split() cuts between every character. The pieces are:
. (any character except a line break)
* (zero or more of them)
? (lazy: match as few as possible, so split in zero chars)
/u (Unicode mode, so a surrogate pair counts as one character)
By using the question mark ? I am forcing a cut at every zero-length match; otherwise /.*/u would consume all characters up to the next line break.
var string = "?????????????????????????"
var c = string.split(/.*?/u)
console.log(c)
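To see what the u flag contributes, here is a minimal sketch with a hypothetical sample string written as escapes. The same pattern without the flag cuts between UTF-16 code units and breaks the surrogate pairs apart.

```javascript
// Hypothetical sample: U+1F600, U+2764, U+1F601.
var sample = "\uD83D\uDE00\u2764\uD83D\uDE01";

var withU = sample.split(/.*?/u);   // advances by code point
var withoutU = sample.split(/.*?/); // advances by code unit

console.log(withU.length);    // 3
console.log(withoutU.length); // 5
```

Like the spread and Array.from approaches, this splits by code point, so multi-code-point emoji are still separated.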