How to force JavaScript to deep copy a string?

I have some JavaScript code that looks like this:

var myClass = {
  ids: {},
  myFunc: function(huge_string) {
    var id = huge_string.substr(0, 2);
    this.ids[id] = true;
  }
};

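Later, the function is called with some large strings (100 MB+). I only want to save a short id found in each string. However, Google Chrome's substring function (actually a regex in my code) only returns a "sliced string" object that references the original. As a result, after a series of calls to myFunc, my Chrome tab runs out of memory because the temporary huge_string objects cannot be garbage collected.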

How can I make a copy of the string id so that no reference to huge_string is kept and huge_string can be garbage collected?


Implementations of ECMAScript vary from browser to browser. In Chrome, however, many string operations (substr, slice, regex, etc.) simply retain references to the original string rather than making copies of it. This is a known issue in Chrome (Bug #2869). To force a copy of the string, the following code works:

var string_copy = (' ' + original_string).slice(1);

This code works by prepending a space to the string. In Chrome's implementation, this concatenation results in a string copy. The substring after the space can then be referenced.
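As a minimal sketch, the idiom could be applied to the code from the question like this. The helper name forceCopy is hypothetical, and whether the result is really detached from the parent string is an engine implementation detail that cannot be observed from script; the snippet only shows that the value is preserved.

```javascript
// Hypothetical helper: wraps the concatenate-then-slice idiom shown above.
// Whether this detaches the result from the parent string depends on the
// engine's internal string representation.
function forceCopy(str) {
  return (' ' + str).slice(1);
}

var myClass = {
  ids: {},
  myFunc: function (huge_string) {
    // Copy the short id so that it need not keep huge_string alive.
    var id = forceCopy(huge_string.substr(0, 2));
    this.ids[id] = true;
  }
};

// Stand-in for a much larger input string.
myClass.myFunc('ab' + 'x'.repeat(1000));
console.log(Object.keys(myClass.ids)); // the only key is 'ab'
```

The only reliable way to confirm the memory effect is a heap snapshot in the browser's developer tools.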

The problem, together with the solution, has been reproduced here: http://jsfiddle.net/ouvv4kbs/1/

WARNING: takes a long time to load, open Chrome debug console to see a progress printout.

// We would expect this program to use ~1 MB of memory, however taking
// a Heap Snapshot will show that this program uses ~100 MB of memory.
// If the processed data size is increased to ~1 GB, the Chrome tab
// will crash due to running out of memory.


function randomString(length) {
  var alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  var result = '';
  for (var i = 0; i < length; i++) {
    // Math.floor gives every letter the same probability.
    result += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  return result;
}


var substrings = [];
var extractSubstring = function(huge_string) {
  var substring = huge_string.substr(0, 100 * 1000 /* 100 KB */);
  // Uncommenting this line will force a copy of the string and allow
  // the unused memory to be garbage collected
  // substring = (' ' + substring).slice(1);
  substrings.push(substring);
};


// Process 100 MB of data, but only keep 1 MB.
for (var i = 0; i < 10; i++) {
  console.log(10 * (i + 1) + 'MB processed');
  var huge_string = randomString(10 * 1000 * 1000 /* 10 MB */);
  extractSubstring(huge_string);
}


// Do something which will keep a reference to substrings around and
// prevent it from being garbage collected.
setInterval(function() {
  var i = Math.floor(Math.random() * substrings.length);
  document.body.innerHTML = substrings[i].substr(0, 10);
}, 2000);


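I use the Object.assign() method for strings, objects, arrays, etc. Note that with a string target it returns a String wrapper object whose primitive value is still the empty target string, not a primitive copy of myStr: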

const newStr = Object.assign("", myStr);
const newObj = Object.assign({}, myObj);
const newArr = Object.assign([], myArr);

Note that Object.assign only copies an object's own enumerable keys and their values (one level only). For deep cloning a nested object, refer to the following example:

let obj100 = { a:0, b:{ c:0 } };
let obj200 = JSON.parse(JSON.stringify(obj100));
obj100.a = 99; obj100.b.c = 99; // No effect on obj200
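For the string case in this answer, a quick check shows what Object.assign actually returns when the target is a string. This is only an illustrative sketch; the variable name viaAssign is made up for the example.

```javascript
var myStr = 'abc';
var viaAssign = Object.assign('', myStr);

// The target '' is converted to a String wrapper object, and the characters
// of myStr are copied onto it as indexed properties.
console.log(typeof viaAssign);  // "object"
console.log(String(viaAssign)); // "" (the wrapped primitive is the empty target)
console.log(viaAssign[0]);      // "a"
```

So this call does not yield a primitive copy of myStr, which is worth keeping in mind before relying on it for strings.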

I ran into an issue when pushing into an array. Every entry would end up as the same string because it was referencing a value on an object that changed as I iterated over the results via a .next() function. Here is what allowed me to copy the string and get unique values in my array:

while (results.next()) {
  var locationName = String(results.name);
  myArray.push(locationName);
}

You can use:

var string_copy = original_string.repeat(1);

It seems to work well. Refer to the MDN documentation on repeat.

I'm not sure how to test this, but does using string interpolation to create a new string variable work?

newString = `${oldString}`

I typically use strCopy = new String(originalStr); is this not recommended for some reason?
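One relevant difference, sketched below under the assumption that a primitive string is what is wanted: new String(...) creates a String wrapper object, whereas calling String(...) without new returns a primitive. The variable names are only for illustration.

```javascript
var originalStr = 'abc';
var wrapped = new String(originalStr); // String wrapper object
var converted = String(originalStr);   // primitive string

console.log(typeof wrapped);                    // "object"
console.log(typeof converted);                  // "string"
console.log(wrapped === originalStr);           // false (object vs. primitive)
console.log(wrapped.valueOf() === originalStr); // true
```

Because the wrapper is an object, strict equality checks and typeof tests behave differently from what code written for primitives usually expects.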

I have run into this problem and this was how I coped with it:

let copy_string = [];
copy_string.splice(0, 0, str);

I believe this would deep copy str to copy_string.

Using String.prototype.slice():

const str = 'The quick brown fox jumps over the lazy dog.';


// creates a new string without modifying the original string
const new_str = str.slice();


console.log( new_str );

Edit: These tests were run in Google Chrome back in September 2021 and not in NodeJS.

It's interesting to see some of the responses here. If you're not worried about legacy browser support (IE6+), skip on down to the interpolation method because it is extremely performant.

One of the most backwards-compatible (back to IE6) and still very performant ways to duplicate a string by value is to split it into a new array and immediately rejoin that array as a string:

let str = 'abc';
let copiedStr = str.split('').join('');
console.log('copiedStr', copiedStr);

Behind the scenes

The code above asks JavaScript to split the string using the empty string as a separator, which places each individual character into its own element of a newly created array. This means that, for a brief moment, the intermediate value looks like this:

['a', 'b', 'c']

Then, immediately, the array is rejoined using the empty string as a separator between elements, which means that each element of the newly created array is written into a brand new string, effectively copying the original.

At the end of the execution, copiedStr is its own variable, which outputs to the console:

abc

Performance

On average, this takes around 0.007 ms to 0.01 ms on my machine, but your mileage may vary. Tested on a string with 4,000 characters, this method produced a maximum of 0.2 ms and an average of about 0.14 ms to copy a string, so it still has solid performance.

Who cares about legacy support anyway? The interpolation method

If you're not worried about legacy browser support, however, the interpolation method offered in one of the answers here, by Pirijan, is a very performant and easy way to copy a string:

let str = 'abc';
let copiedStr = `${str}`;

Testing the performance of interpolation on the same 4,000 character length string, I saw an average of 0.004 ms, with a max of 0.1 ms and a min of an astonishing 0.001 ms (quite frequently).
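The answer does not show how these timings were taken. The following is one possible way to measure them, offered as a rough sketch: the helper averageCopyTime and the run count of 1000 are assumptions, and the numbers will differ between machines and engines.

```javascript
// Hypothetical timing helper: returns the average time in milliseconds
// that copyFn takes to process input, measured over `runs` iterations.
function averageCopyTime(copyFn, input, runs) {
  var total = 0;
  for (var i = 0; i < runs; i++) {
    var start = performance.now();
    copyFn(input);
    total += performance.now() - start;
  }
  return total / runs;
}

var sample = 'a'.repeat(4000); // 4,000-character test string

var splitJoinMs = averageCopyTime(function (s) {
  return s.split('').join('');
}, sample, 1000);

var interpolationMs = averageCopyTime(function (s) {
  return `${s}`;
}, sample, 1000);

console.log('split/join average (ms):', splitJoinMs);
console.log('interpolation average (ms):', interpolationMs);
```

A measurement like this only captures elapsed time; it says nothing about whether the engine allocated a new string.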

In my opinion this is the cleanest and the most self-documenting solution:

const strClone = String(strOrigin);

I would use string interpolation and check if undefined or empty.

`${huge_string || ''}`

Keep in mind that with this solution, you will have the following result.

'' => ''
undefined => ''
null => ''
'test' => 'test'
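The mapping above can be checked with a small sketch. The function name copyOrEmpty is hypothetical. Note that every falsy value, not only undefined and null, ends up as the empty string.

```javascript
// Hypothetical wrapper around the interpolation-with-fallback expression.
function copyOrEmpty(value) {
  return `${value || ''}`;
}

console.log(copyOrEmpty(''));        // ''
console.log(copyOrEmpty(undefined)); // ''
console.log(copyOrEmpty(null));      // ''
console.log(copyOrEmpty('test'));    // 'test'
console.log(copyOrEmpty(0));         // '' (0 is falsy as well)
```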