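Get the first letter of each word in a string, in JavaScript

How would you collect the first letter of each word in a string, for example to build an abbreviation?

Input: "Java Script Object Notation"

Output: "JSON"

104999 views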

I think what you're looking for is the acronym of a supplied string.

var str = "Java Script Object Notation";
var matches = str.match(/\b(\w)/g); // ['J','S','O','N']
var acronym = matches.join(''); // JSON


console.log(acronym)


Note: this will fail for hyphenated or apostrophized words: Help-me I'm Dieing becomes HmImD. If that's not what you want, splitting on whitespace and grabbing the first letter of each piece might be a better fit.

Here's a quick example of that:

let str = "Java Script Object Notation";
let acronym = str.split(/\s/).reduce((response, word) => response + word.slice(0, 1), '');


console.log(acronym);
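
To make the difference between the two approaches concrete, here is a small sketch that runs both on the hyphenated example from the note above. The helper names byWordBoundary and byWhitespace are made up for this illustration and do not come from the answer.

```javascript
// Hypothetical helpers for comparison; the names are not from the answer above.
function byWordBoundary(str) {
  // \b treats "-" and "'" as boundaries, so "Help-me" yields both "H" and "m".
  return (str.match(/\b\w/g) || []).join('');
}

function byWhitespace(str) {
  // Only whitespace separates words; filter(Boolean) drops empty pieces.
  return str.split(/\s+/).filter(Boolean).map(word => word[0]).join('');
}

const sample = "Help-me I'm Dieing";
console.log(byWordBoundary(sample)); // HmImD
console.log(byWhitespace(sample));   // HID
```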

I think you can do this with

'Aa Bb'.match(/\b\w/g).join('')

Explanation: find all (/g) word characters (\w) that occur right after a word boundary (\b), that is, after a non-word character or at the start of the string; .match() collects them into an array and .join('') concatenates them into a single string.


Depending on what you want to do you can also consider simply selecting all the uppercase characters:

'JavaScript Object Notation'.match(/[A-Z]/g).join('')
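
One thing to watch with these .match() one-liners: with the g flag, match() returns null rather than an empty array when nothing matches, so chaining .join('') directly throws a TypeError. A minimal guarded sketch follows; the function name upperCaseLetters is only illustrative.

```javascript
// Illustrative wrapper: collect uppercase ASCII letters, tolerating "no match".
function upperCaseLetters(str) {
  const matches = str.match(/[A-Z]/g); // null when the string has no capitals
  return matches === null ? '' : matches.join('');
}

console.log(upperCaseLetters('JavaScript Object Notation')); // JSON
console.log(upperCaseLetters('no capitals here') === '');    // true
```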

How about this:

var str = "Java Script Object Notation";
var words = str.split(' ');
var abbr = "";
for (var i = 0; i < words.length; i++) { // declare i so it is not an implicit global
  abbr += words[i].charAt(0);
}
alert(abbr);


This should do it.

var s = "Java Script Object Notation",
a = s.split(' '),
l = a.length,
i = 0,
n = "";


for (; i < l; ++i)
{
n += a[i].charAt(0);
}


console.log(n);

Try -

var text = '';
var arr = "Java Script Object Notation".split(' ');
for (var i = 0; i < arr.length; i++) { // declare i so it is not an implicit global
  text += arr[i].charAt(0);
}
alert(text);

Demo - http://jsfiddle.net/r2maQ/

Easiest way without regex

var abbr = "Java Script Object Notation".split(' ').map(function(item){return item[0]}).join('');

The regular-expression versions are not Unicode-compatible in JavaScript engines older than ECMAScript 6, so anyone who needs to support characters such as "å" will have to rely on the non-regex versions.

Even on ES6, you need to opt in to Unicode handling with the u flag.

More details: https://mathiasbynens.be/notes/es6-unicode-regex
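
As an illustration of the limitation: \w only covers ASCII letters, digits and underscore, so a leading "å" is skipped and the next ASCII letter is picked up instead. The sketch below writes the input with escape sequences so it does not depend on file encoding; the sample words are arbitrary.

```javascript
const input = "\u00e5se \u00f8rn"; // "åse ørn"

// \w is ASCII-only, so each match is the first ASCII letter after the boundary.
const viaRegex = (input.match(/\b\w/g) || []).join('');

// Splitting on spaces and taking the first character keeps the non-ASCII letters.
const viaSplit = input.split(' ').map(word => word.charAt(0)).join('');

console.log(viaRegex); // sr
console.log(viaSplit); // åø
```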

Using map (from functional programming)

'use strict';


function acronym(words)
{
if (!words) { return ''; }


var first_letter = function(x){ if (x) { return x[0]; } else { return ''; }};


return words.split(' ').map(first_letter).join('');
}

Yet another option using reduce function:

var value = "Java Script Object Notation";


var result = value.split(' ').reduce(function(previous, current){
return {v : previous.v + current[0]};
},{v:""});




$("#output").text(result.v);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<pre id="output"/>

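Regarding the flaw @BotNet pointed out: I think I solved it after three excruciating days of regular expression tutorials.

Test string: I'm a an animal

The \b version also catches the m of I'm, because the apostrophe creates a word boundary. Matching a letter that follows whitespace or the start of the string avoids that, and it seems to work for me: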

/(\s|^)([a-z])/gi
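
One caveat with this pattern: the whole match includes the leading whitespace, so the letter itself sits in the second capture group. Below is a sketch of reading that group with matchAll (available from ES2020); the function name initialsAfterSpace is made up here.

```javascript
// Hypothetical helper built around the pattern above.
function initialsAfterSpace(str) {
  const pattern = /(\s|^)([a-z])/gi;
  // m[1] is the preceding whitespace (empty at the start of the string),
  // m[2] is the letter we actually want.
  return Array.from(str.matchAll(pattern), m => m[2]).join('');
}

console.log(initialsAfterSpace("I'm a an animal")); // Iaaa
```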

To add to the great examples, you could do it like this in ES6

const x = "Java Script Object Notation".split(' ').map(x => x[0]).join('');
console.log(x);  // JSON

and this works too but please ignore it, I went a bit nuts here :-)

const [j,s,o,n] = "Java Script Object Notation".split(' ').map(x => x[0]);
console.log(`${j}${s}${o}${n}`);

Alternative 1:

You can also use this regex to return an array of the first letter of every word:

/(?<=(\s|^))[a-z]/gi

(?<=(\s|^)) is a positive lookbehind, which makes sure the character matched by our pattern is preceded by whitespace or the start of the string.


so, for your case:

// in case the input is lowercase & there's a word with apostrophe


const toAbbr = (str) => {
  return (str.match(/(?<=(\s|^))[a-z]/gi) || []) // match() returns null when nothing matches
    .join('')
    .toUpperCase();
};


toAbbr("java script object notation"); //result JSON

(by the way, there are also negative lookbehind, positive lookahead, negative lookahead, if you want to learn more)


Alternative 2:

Match every word and use the replace() method to replace it with its first letter, dropping the spaces (replace() does not mutate the original string):

// in case the input is lowercase & there's a word with apostrophe


const toAbbr = (str) => {
return str.replace(/(\S+)(\s*)/gi, (match, p1, p2) => p1[0].toUpperCase());
};


toAbbr("java script object notation"); //result JSON


// word = not space = \S+ = p1 (p1 is the first pattern)
// space = \s* = p2 (p2 is the second pattern)

This is made very simple with ES6:

string.split(' ').map(i => i.charAt(0)).join('')               // Inherit the case of each letter
string.split(' ').map(i => i.charAt(0)).join('').toUpperCase() // Uppercase each letter
string.split(' ').map(i => i.charAt(0)).join('').toLowerCase() // Lowercase each letter

This ONLY works with spaces, or whatever separator is passed to .split(), e.g. .split(', '), .split('; '), etc.

To get an array of uppercase initials instead of a string:

string.split(' ').map(i => i.charAt(0)).toString().toUpperCase().split(',')

This is similar to others, but (IMHO) a tad easier to read:

const getAcronym = title =>
title.split(' ')
.map(word => word[0])
.join('');

ES6 reduce way:

const initials = inputStr.split(' ').reduce((result, currentWord) =>
result + currentWord.charAt(0).toUpperCase(), '');
alert(initials);

If you came here looking for a way to do this that supports non-BMP characters (which use surrogate pairs):

const initials = str.split(' ')
  .filter(Boolean) // skip empty pieces; codePointAt(0) on '' is undefined
  .map(s => String.fromCodePoint(s.codePointAt(0)).toUpperCase())
  .join('');

Works in all modern browsers with no polyfills (not IE though)
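
To see why this matters, here is a short sketch comparing plain indexing with the code point approach on a word that starts with a character outside the BMP. The sample word is arbitrary and written with an escape so it does not depend on file encoding.

```javascript
const word = "\u{1F600}smile"; // starts with an emoji outside the BMP

// Indexing returns a single UTF-16 code unit: only the high surrogate.
const firstUnit = word[0];

// codePointAt(0) reads the full surrogate pair, fromCodePoint turns it back into a string.
const firstCodePoint = String.fromCodePoint(word.codePointAt(0));

console.log(firstUnit.length);               // 1
console.log(firstCodePoint.length);          // 2
console.log(firstCodePoint === "\u{1F600}"); // true
```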

Getting first letter of any Unicode word in JavaScript is now easy with the ECMAScript 2018 standard:

/(?<!\p{L}\p{M}*)\p{L}/gu

This regex finds any Unicode letter (see the last \p{L}) that is not preceded with any other letter that can optionally have diacritic symbols (see the (?<!\p{L}\p{M}*) negative lookbehind where \p{M}* matches 0 or more diacritic chars). Note that u flag is compulsory here for the Unicode property classes (like \p{L}) to work correctly.

To emulate a fully Unicode-aware \b, you'd need to add a digit matching pattern and connector punctuation:

/(?<!\p{L}\p{M}*|[\p{N}\p{Pc}])\p{L}/gu

It works in Chrome, Firefox (since June 30, 2020), Node.js, and the majority of other environments (see the compatibility matrix here), for any natural language including Arabic.

Quick test:

const regex = /(?<!\p{L}\p{M}*)\p{L}/gu;
const string = "Żerard Łyżwiński";
// Extracting
console.log(string.match(regex));                        // => [ "Ż", "Ł" ]
// Extracting and concatenating into string
console.log(string.match(regex).join(""))                // => ŻŁ
// Removing
console.log(string.replace(regex, ""))                   // => erard yżwiński
// Enclosing (wrapping) with a tag
console.log(string.replace(regex, "<span>$&</span>"))    // => <span>Ż</span>erard <span>Ł</span>yżwiński


console.log("_Łukasz 1Żukowski".match(/(?<!\p{L}\p{M}*|[\p{N}\p{Pc}])\p{L}/gu)); // => null
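
For reuse, the first regex could be wrapped in a small function that also tolerates input with no letters at all. This is only a sketch: the name unicodeInitials is hypothetical, and the inputs are written with escapes so the example does not depend on how the file is normalized.

```javascript
// Hypothetical wrapper around the ES2018 pattern shown above.
function unicodeInitials(str) {
  const firstLetters = /(?<!\p{L}\p{M}*)\p{L}/gu;
  // match() returns null when there is no letter, hence the fallback.
  return (str.match(firstLetters) || []).join('');
}

console.log(unicodeInitials("\u017Berard \u0141y\u017Cwi\u0144ski")); // ŻŁ
// "e" + combining acute accent: the base letter is matched, the mark is not,
// and the following "c" is not treated as a new word start.
console.log(unicodeInitials("e\u0301cole normale")); // en
console.log(unicodeInitials("123 456") === "");      // true
```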

In ES6:

function getFirstCharacters(str) {
  // Take the first character of every space-separated piece and
  // drop the empty strings produced by leading or repeated spaces.
  return str.split(' ')
    .map(word => word.charAt(0))
    .filter(char => char !== '');
}


const str1 = "Hello4 World65 123 !!";
const str2 = "123and 456 and 78-1";
const str3 = " Hello World    !!";


console.log(getFirstCharacters(str1));
console.log(getFirstCharacters(str2));
console.log(getFirstCharacters(str3));

Output:

[ 'H', 'W', '1', '!' ]

[ '1', '4', 'a', '7' ]

[ 'H', 'W', '!' ]

Try This Function

const createUserName = function (name) {
const username = name
.toLowerCase()
.split(' ')
.map((elem) => elem[0])
.join('');


return username;
};


console.log(createUserName('Anisul Haque Bhuiyan'));

It's important to trim the string before splitting it; otherwise leading or trailing spaces produce empty entries in the array.

const getWordInitials = (word: string): string => {
const bits = word.trim().split(' ');
return bits
.map((bit) => bit.charAt(0))
.join('')
.toUpperCase();
};

$ getWordInitials("Java Script Object Notation")

$ "JSON"
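
Trimming handles the ends of the string, but repeated spaces, tabs or newlines between words still produce empty pieces when splitting on a single space. A plain JavaScript sketch that splits on runs of whitespace instead is shown below; the function name initialsFromWords is an assumption for this example, not part of the answer above.

```javascript
// Hypothetical variant that tolerates any amount of whitespace between words.
function initialsFromWords(text) {
  return text
    .trim()
    .split(/\s+/)    // one or more spaces, tabs or newlines
    .filter(Boolean) // an empty or all-space input leaves [''] after split
    .map(word => word.charAt(0))
    .join('')
    .toUpperCase();
}

console.log(initialsFromWords("  java \t script   object\nnotation ")); // JSON
console.log(initialsFromWords("   ") === "");                           // true
```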