解释如何使用位向量来确定所有字符是否唯一

我对位向量是如何工作的感到困惑(不太熟悉位向量)。下面是给出的代码。有人能给我解释一下吗?

public static boolean isUniqueChars(String str) {
int checker = 0;
for (int i = 0; i < str.length(); ++i) {
int val = str.charAt(i) - 'a';
if ((checker & (1 << val)) > 0) return false;
checker |= (1 << val);
}
return true;
}

特别是 checker在做什么?

76715 次浏览

这里使用 int checker作为位的存储器。整数值中的每个位都可以看作是一个标志,所以最终 int是一个位数组(标志)。代码中的每个位都说明是否在字符串中找到了带有位索引的字符。出于同样的原因,您可以使用位向量代替 int。它们之间有两个不同之处:

  • 大小 int有固定的大小,通常是4字节,这意味着8 * 4 = 32位(标志)。位向量通常可以有不同的大小,或者应该在构造函数中指定大小。

  • 有了位向量,你将更容易读取代码,可能是这样的:

    vector.SetFlag(4, true); // set flag at index 4 as true

    对于 int,您将拥有较低级别的位逻辑代码:

    checker |= (1 << 5); // set flag at index 5 to true

也可能 int会稍微快一点,因为带有位的操作级别非常低,可以由 CPU 按原样执行。BitVector 允许编写不那么神秘的代码,而且可以存储更多的标志。

为了将来的参考: 位向量也被称为位集或位数组。以下是一些针对不同语言/平台的数据结构链接:

我怀疑你是从我正在读的同一本书中得到这段代码的... ... 这里的代码本身并不像操作符-| = ,& 和 < < 这些通常不被我们外行人使用的操作符那样神秘——作者没有花费额外的时间来解释这个过程,也没有解释这里涉及的实际机制是什么。开始的时候,我对这个帖子的前一个答案很满意,但只是在抽象的层面上。我回过头来看,因为我觉得需要一个更具体的解释——缺乏一个解释总是让我感到不安。

这个运算符 < < 是一个左位移器,它接受该数字或操作数的二进制表示形式,并将其移动到右边操作数或数字指定的多个位置上,比如只在二进制数中使用十进制数。我们正在乘以基数2-当我们向上移动许多位置不是基数10-所以右边的数字是指数,左边的数字是基数2的倍数。

这个操作符 | = (称为按位 OR 赋值)获取左边的操作数,或者是右边的操作数,并将结果赋给左边的操作数(x | = y 等价于 x = x | y)。类似地,操作符(’&’)将“和”操作符的左边和右边。它还有一个按位 AND 赋值(x & = y 等价于 x = x & y)。

所以我们这里有一个哈希表,每次检查器获取或者 d (checker |= (1 << val))时,它都存储在一个32位的二进制数中,并且指定了一个字母的二进制值,其对应的位被设置为 true。 字符的值使用检查器(checker & (1 << val)) > 0)进行检查(如果大于0,我们知道我们被骗了) ,因为将两个相同的位设置为 true 和’d 一起将返回 true 或’1”。

有26个二进制位置,每个位置对应一个小写字母——作者确实说过假设字符串只包含小写字母——这是因为我们只剩下6个(32位整数)位置可以使用——然后我们得到一个冲突

00000000000000000000000000000001 a 2^0


00000000000000000000000000000010 b 2^1


00000000000000000000000000000100 c 2^2


00000000000000000000000000001000 d 2^3


00000000000000000000000000010000 e 2^4


00000000000000000000000000100000 f 2^5


00000000000000000000000001000000 g 2^6


00000000000000000000000010000000 h 2^7


00000000000000000000000100000000 i 2^8


00000000000000000000001000000000 j 2^9


00000000000000000000010000000000 k 2^10


00000000000000000000100000000000 l 2^11


00000000000000000001000000000000 m 2^12


00000000000000000010000000000000 n 2^13


00000000000000000100000000000000 o 2^14


00000000000000001000000000000000 p 2^15


00000000000000010000000000000000 q 2^16


00000000000000100000000000000000 r 2^17


00000000000001000000000000000000 s 2^18


00000000000010000000000000000000 t 2^19


00000000000100000000000000000000 u 2^20


00000000001000000000000000000000 v 2^21


00000000010000000000000000000000 w 2^22


00000000100000000000000000000000 x 2^23


00000001000000000000000000000000 y 2^24


00000010000000000000000000000000 z 2^25

因此,对于输入字符串‘ azya’,当我们逐步移动时

字符串“ a”

a      =00000000000000000000000000000001
checker=00000000000000000000000000000000


checker='a' or checker;
// checker now becomes = 00000000000000000000000000000001
checker=00000000000000000000000000000001


a and checker=0 no dupes condition

字符串‘ az’

checker=00000000000000000000000000000001
z      =00000010000000000000000000000000


z and checker=0 no dupes


checker=z or checker;
// checker now becomes 00000010000000000000000000000001
      

字符串’azy’

checker= 00000010000000000000000000000001
y      = 00000001000000000000000000000000


checker and y=0 no dupes condition


checker= checker or y;
// checker now becomes = 00000011000000000000000000000001

字符串’阿齐亚’

checker= 00000011000000000000000000000001
a      = 00000000000000000000000000000001


a and checker=1 we have a dupe

现在,它声明一个 复制品

public static void main (String[] args)
{
//In order to understand this algorithm, it is necessary to understand the following:


//int checker = 0;
//Here we are using the primitive int almost like an array of size 32 where the only values can be 1 or 0
//Since in Java, we have 4 bytes per int, 8 bits per byte, we have a total of 4x8=32 bits to work with


//int val = str.charAt(i) - 'a';
//In order to understand what is going on here, we must realize that all characters have a numeric value
for (int i = 0; i < 256; i++)
{
char val = (char)i;
System.out.print(val);
}


//The output is something like:
//             !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
//There seems to be ~15 leading spaces that do not copy paste well, so I had to use real spaces instead


//To only print the characters from 'a' on forward:
System.out.println();
System.out.println();


for (int i=0; i < 256; i++)
{
char val = (char)i;
//char val2 = val + 'a'; //incompatible types. required: char found: int
int val2 = val + 'a';  //shift to the 'a', we must use an int here otherwise the compiler will complain
char val3 = (char)val2;  //convert back to char. there should be a more elegant way of doing this.
System.out.print(val3);
}


//Notice how the following does not work:
System.out.println();
System.out.println();


for (int i=0; i < 256; i++)
{
char val = (char)i;
int val2 = val - 'a';
char val3 = (char)val2;
System.out.print(val3);
}
//I'm not sure why this spills out into 2 lines:
//EDIT I cant seem to copy this into stackoverflow!


System.out.println();
System.out.println();


//So back to our original algorithm:
//int val = str.charAt(i) - 'a';
//We convert the i'th character of the String to a character, and shift it to the right, since adding shifts to the right and subtracting shifts to the left it seems


//if ((checker & (1 << val)) > 0) return false;
//This line is quite a mouthful, lets break it down:
System.out.println(0<<0);
//00000000000000000000000000000000
System.out.println(0<<1);
//00000000000000000000000000000000
System.out.println(0<<2);
//00000000000000000000000000000000
System.out.println(0<<3);
//00000000000000000000000000000000
System.out.println(1<<0);
//00000000000000000000000000000001
System.out.println(1<<1);
//00000000000000000000000000000010 == 2
System.out.println(1<<2);
//00000000000000000000000000000100 == 4
System.out.println(1<<3);
//00000000000000000000000000001000 == 8
System.out.println(2<<0);
//00000000000000000000000000000010 == 2
System.out.println(2<<1);
//00000000000000000000000000000100 == 4
System.out.println(2<<2);
// == 8
System.out.println(2<<3);
// == 16
System.out.println("3<<0 == "+(3<<0));
// != 4 why 3???
System.out.println(3<<1);
//00000000000000000000000000000011 == 3
//shift left by 1
//00000000000000000000000000000110 == 6
System.out.println(3<<2);
//00000000000000000000000000000011 == 3
//shift left by 2
//00000000000000000000000000001100 == 12
System.out.println(3<<3);
// 24


//It seems that the -  'a' is not necessary
//Back to if ((checker & (1 << val)) > 0) return false;
//(1 << val means we simply shift 1 by the numeric representation of the current character
//the bitwise & works as such:
System.out.println();
System.out.println();
System.out.println(0&0);    //0
System.out.println(0&1);       //0
System.out.println(0&2);          //0
System.out.println();
System.out.println();
System.out.println(1&0);    //0
System.out.println(1&1);       //1
System.out.println(1&2);          //0
System.out.println(1&3);             //1
System.out.println();
System.out.println();
System.out.println(2&0);    //0
System.out.println(2&1);       //0   0010 & 0001 == 0000 = 0
System.out.println(2&2);          //2  0010 & 0010 == 2
System.out.println(2&3);             //2  0010 & 0011 = 0010 == 2
System.out.println();
System.out.println();
System.out.println(3&0);    //0    0011 & 0000 == 0
System.out.println(3&1);       //1  0011 & 0001 == 0001 == 1
System.out.println(3&2);          //2  0011 & 0010 == 0010 == 2, 0&1 = 0 1&1 = 1
System.out.println(3&3);             //3 why?? 3 == 0011 & 0011 == 3???
System.out.println(9&11);   // should be... 1001 & 1011 == 1001 == 8+1 == 9?? yay!


//so when we do (1 << val), we take 0001 and shift it by say, 97 for 'a', since any 'a' is also 97


//why is it that the result of bitwise & is > 0 means its a dupe?
//lets see..


//0011 & 0011 is 0011 means its a dupe
//0000 & 0011 is 0000 means no dupe
//0010 & 0001 is 0011 means its no dupe
//hmm
//only when it is all 0000 means its no dupe


//so moving on:
//checker |= (1 << val)
//the |= needs exploring:


int x = 0;
int y = 1;
int z = 2;
int a = 3;
int b = 4;
System.out.println("x|=1 "+(x|=1));  //1
System.out.println(x|=1);     //1
System.out.println(x|=1);      //1
System.out.println(x|=1);       //1
System.out.println(x|=1);       //1
System.out.println(y|=1); // 0001 |= 0001 == ?? 1????
System.out.println(y|=2); // ??? == 3 why??? 0001 |= 0010 == 3... hmm
System.out.println(y);  //should be 3??
System.out.println(y|=1); //already 3 so... 0011 |= 0001... maybe 0011 again? 3?
System.out.println(y|=2); //0011 |= 0010..... hmm maybe.. 0011??? still 3? yup!
System.out.println(y|=3); //0011 |= 0011, still 3
System.out.println(y|=4);  //0011 |= 0100.. should be... 0111? so... 11? no its 7
System.out.println(y|=5);  //so we're at 7 which is 0111, 0111 |= 0101 means 0111 still 7
System.out.println(b|=9); //so 0100 |= 1001 is... seems like xor?? or just or i think, just or... so its 1101 so its 13? YAY!


//so the |= is just a bitwise OR!
}


public static boolean isUniqueChars(String str) {
int checker = 0;
for (int i = 0; i < str.length(); ++i) {
int val = str.charAt(i) - 'a';  //the - 'a' is just smoke and mirrors! not necessary!
if ((checker & (1 << val)) > 0) return false;
checker |= (1 << val);
}
return true;
}


public static boolean is_unique(String input)
{
int using_int_as_32_flags = 0;
for (int i=0; i < input.length(); i++)
{
int numeric_representation_of_char_at_i = input.charAt(i);
int using_0001_and_shifting_it_by_the_numeric_representation = 1 << numeric_representation_of_char_at_i; //here we shift the bitwise representation of 1 by the numeric val of the character
int result_of_bitwise_and = using_int_as_32_flags & using_0001_and_shifting_it_by_the_numeric_representation;
boolean already_bit_flagged = result_of_bitwise_and > 0;              //needs clarification why is it that the result of bitwise & is > 0 means its a dupe?
if (already_bit_flagged)
return false;
using_int_as_32_flags |= using_0001_and_shifting_it_by_the_numeric_representation;
}
return true;
}

我还假设您的示例来自 破解代码访谈一书,我的答案与这个上下文有关。

为了使用这个算法来解决这个问题,我们必须承认我们只是将字符从 a 传递到 z (小写)。

由于只有26个字母,并且这些字母在我们使用的编码表中被正确排序,这保证了所有潜在的差异 str.charAt(i) - 'a'都小于32(int 变量 checker的大小)。

正如斯诺贝尔解释的那样,我们将使用 checker变量作为位数组。让我们举个例子:

这么说吧 str equals "test"

  • 第一遍(i = t)

Check = = 0(0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

In ASCII, val = str.charAt(i) - 'a' = 116 - 97 = 19
What about 1 << val ?
1          == 00000000000000000000000000000001
1 << 19    == 00000000000010000000000000000000
checker |= (1 << val) means checker = checker | (1 << val)
so checker = 00000000000000000000000000000000 | 00000000000010000000000000000000
checker == 524288 (00000000000010000000000000000000)
  • 第二次通过(i = e)

Checker = = 524288(000000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000

val = 101 - 97 = 4
1          == 00000000000000000000000000000001
1 << 4     == 00000000000000000000000000010000
checker |= (1 << val)
so checker = 00000000000010000000000000000000 | 00000000000000000000000000010000
checker == 524304 (00000000000010000000000000010000)

等等,直到我们通过条件在检查器中找到一个特定字符的已经设置的位

(checker & (1 << val)) > 0

希望能有帮助

上面已经提供了几个很好的答案。所以我不想重复之前说过的话。但确实想增加一些东西,以帮助上述程序,因为我只是通过相同的程序,并有几个问题,但花了一些时间后,我对这个程序有更清晰的了解。

首先,“检查器”用于跟踪字符串中已经遍历的字符,以查看是否有任何字符被重复。

现在“ Checker”是一个 int 数据类型,因此它只能有32位或4字节(取决于平台) ,所以这个程序只能对32个字符范围内的字符集正确工作。这就是为什么,这个程序从每个字符中减去‘ a’,以便使这个程序只运行小写字符。但是,如果你混合小写和大写字符,那么它将不会工作。

顺便说一下,如果你不从每个字符中减去‘ a’(参见下面的语句) ,那么这个程序将只对大写字符的字符串或只有小写字符的字符串正确工作。因此,上述程序的范围也从小写字符增加到大写字符,但它们不能混合在一起。

int val = str.charAt(i) - 'a';

然而,我想写一个通用程序,使用位操作,应该工作的任何 ASCII 字符,而不用担心大写,小写,数字或任何特殊字符。为了做到这一点,我们的“检查器”应该大到足以存储256个字符(ASCII 字符集大小)。但是在 Java 中 int 不能工作,因为它只能存储32位。因此在下面的程序中,我使用了 JDK 中提供的 BitSet 类,它可以在实例化 BitSet 对象时传递任何用户定义的大小。

下面是一个程序,它的功能和上面的程序一样,都是使用 Bitwise 运算符编写的,但是这个程序可以处理来自 ASCII 字符集的任何字符串。

public static boolean isUniqueStringUsingBitVectorClass(String s) {


final int ASCII_CHARACTER_SET_SIZE = 256;


final BitSet tracker = new BitSet(ASCII_CHARACTER_SET_SIZE);


// if more than  256 ASCII characters then there can't be unique characters
if(s.length() > 256) {
return false;
}


//this will be used to keep the location of each character in String
final BitSet charBitLocation = new BitSet(ASCII_CHARACTER_SET_SIZE);


for(int i = 0; i < s.length(); i++) {


int charVal = s.charAt(i);
charBitLocation.set(charVal); //set the char location in BitSet


//check if tracker has already bit set with the bit present in charBitLocation
if(tracker.intersects(charBitLocation)) {
return false;
}


//set the tracker with new bit from charBitLocation
tracker.or(charBitLocation);


charBitLocation.clear(); //clear charBitLocation to store bit for character in the next iteration of the loop


}


return true;


}

我认为所有这些答案确实解释了它是如何工作的,然而我想通过重命名一些变量,添加一些其他变量,并添加评论,来更好地说明我是如何看待它的:

public static boolean isUniqueChars(String str) {


/*
checker is the bit array, it will have a 1 on the character index that
has appeared before and a 0 if the character has not appeared, you
can see this number initialized as 32 0 bits:
00000000 00000000 00000000 00000000
*/
int checker = 0;


//loop through each String character
for (int i = 0; i < str.length(); ++i) {
/*
a through z in ASCII are charactets numbered 97 through 122, 26 characters total
with this, you get a number between 0 and 25 to represent each character index
0 for 'a' and 25 for 'z'


renamed 'val' as 'characterIndex' to be more descriptive
*/
int characterIndex = str.charAt(i) - 'a'; //char 'a' would get 0 and char 'z' would get 26


/*
created a new variable to make things clearer 'singleBitOnPosition'


It is used to calculate a number that represents the bit value of having that
character index as a 1 and the rest as a 0, this is achieved
by getting the single digit 1 and shifting it to the left as many
times as the character index requires
e.g. character 'd'
00000000 00000000 00000000 00000001
Shift 3 spaces to the left (<<) because 'd' index is number 3
1 shift: 00000000 00000000 00000000 00000010
2 shift: 00000000 00000000 00000000 00000100
3 shift: 00000000 00000000 00000000 00001000


Therefore the number representing 'd' is
00000000 00000000 00000000 00001000


*/
int singleBitOnPosition = 1 << characterIndex;


/*
This peforms an AND between the checker, which is the bit array
containing everything that has been found before and the number
representing the bit that will be turned on for this particular
character. e.g.
if we have already seen 'a', 'b' and 'd', checker will have:
checker = 00000000 00000000 00000000 00001011
And if we see 'b' again:
'b' = 00000000 00000000 00000000 00000010


it will do the following:
00000000 00000000 00000000 00001011
& (AND)
00000000 00000000 00000000 00000010
-----------------------------------
00000000 00000000 00000000 00000010


Since this number is different than '0' it means that the character
was seen before, because on that character index we already have a
1 bit value
*/
if ((checker & singleBitOnPosition) > 0) {
return false;
}


/*
Remember that
checker |= singleBitOnPosition is the same as
checker = checker | singleBitOnPosition
Sometimes it is easier to see it expanded like that.


What this achieves is that it builds the checker to have the new
value it hasnt seen, by doing an OR between checker and the value
representing this character index as a 1. e.g.
If the character is 'f' and the checker has seen 'g' and 'a', the
following will happen


'f' = 00000000 00000000 00000000 00100000
checker(seen 'a' and 'g' so far) = 00000000 00000000 00000000 01000001


00000000 00000000 00000000 00100000
| (OR)
00000000 00000000 00000000 01000001
-----------------------------------
00000000 00000000 00000000 01100001


Therefore getting a new checker as 00000000 00000000 00000000 01100001


*/
checker |= singleBitOnPosition;
}
return true;
}

阅读伊万上面的答案确实帮了我很大的忙,尽管我的措辞有些不同。

(1 << val)中的 <<是位移运算符。它接受 1(在二进制中表示为 000000001,前面有多少个零,可以根据内存分配) ,并通过 val空格将其左移。因为我们假设只有 a-z 并且每次减去 a,每个字母的值将为0-25,这将是该字母在 checker整数的布尔表示中从右边开始的索引,因为我们将在 checker val中将 1向左移动。

在每个检查的末尾,我们看到 |=操作符。这将合并两个二进制数,如果在该索引的任一操作数中存在 1,则用 1替换所有 0。在这里,这意味着无论 1存在于 (1 << val)中的哪个位置,这个 1都将被复制到 checker中,而所有 checker现有的1将被保留。

正如您可能猜到的,1在这里作为 true 的布尔标志。当我们检查一个字符是否已经在字符串中表示时,我们比较 checker,它在这一点上实质上是一个在已经表示的字符的索引处的布尔标志数组(1值) ,它实质上是一个布尔值数组,在当前字符的索引处有一个 1标志。

&操作员完成这个检查。与 |=类似,如果两个操作数在该索引处都有 1,则 &操作符将复制到 1 |=1上。因此,本质上,只有 checker中已经存在的、也在 (1 << val)中表示的标志才会被复制过来。在这种情况下,这意味着只有当当前字符已经被表示时,才会有一个 1出现在 checker & (1 << val)的结果中的任何地方。如果该操作的结果中有一个 1,那么返回的布尔值就是 |=0,该方法返回 false。

我猜这就是位向量也叫 位数组的原因。因为,即使它们不属于数组数据类型,它们的使用方式也类似于数组用于存储布尔标志的方式。

简单说明(下面有 JS 代码)

  • 每个机器代码的整数变量是 32位数组
  • 所有位明智的操作都是 32-bit
  • 它们是不可知的 OS/CPU 架构或选择的语言数字系统,例如 DEC64为 JS。
  • 这种重复查找方法类似于 在大小为32的数组中存储字符,如果在字符串中找到 a,我们设置 0th索引,1st用于 b等等。
  • 字符串中的重复字符将占用其对应的位,或者在本例中将其设置为1。
  • Ivan 已经解释过 : 这个索引计算在前一个答案 中是如何工作的。

业务概要:

  • 在字符的 checkerindex之间执行 还有操作
  • 内部都是 Int-32-Arrays
  • 这是两者之间的一个比特智能操作。
  • 检查 if操作的输出为 1
  • 如果 output == 1
    • checker变量在两个数组中都设置了 那个特定的索引 -th 位
    • 所以这是个复制品。
  • 如果 output == 0
    • 到目前为止还没有找到这个角色
    • 在字符的 checkerindex之间执行 或者操作
    • 因此,将 index-th 位更新为 1
    • 将输出分配给 checker

假设:

  • 我们假设我们会得到所有的小写字母
  • 32号就够了
  • 因此,我们从 96号作为参考点开始索引计数,考虑到 aAscii代码是 97

下面给出的是 JavaScript源代码。

function checkIfUniqueChars (str) {


var checker = 0; // 32 or 64 bit integer variable


for (var i = 0; i< str.length; i++) {
var index = str[i].charCodeAt(0) - 96;
var bitRepresentationOfIndex = 1 << index;


if ( (checker & bitRepresentationOfIndex) > 1) {
console.log(str, false);
return false;
} else {
checker = (checker | bitRepresentationOfIndex);
}
}
console.log(str, true);
return true;
}


checkIfUniqueChars("abcdefghi");  // true
checkIfUniqueChars("aabcdefghi"); // false
checkIfUniqueChars("abbcdefghi"); // false
checkIfUniqueChars("abcdefghii"); // false
checkIfUniqueChars("abcdefghii"); // false

注意 ,在 JS 中,尽管整数为64位,但总是对32位执行位明智的操作。

如果字符串是 aa,那么:

// checker is intialized to 32-bit-Int(0)
// therefore, checker is
checker= 00000000000000000000000000000000

I = 0

str[0] is 'a'
str[i].charCodeAt(0) - 96 = 1


checker 'AND' 32-bit-Int(1) = 00000000000000000000000000000000
Boolean(0) == false


// So, we go for the '`OR`' operation.


checker = checker OR 32-bit-Int(1)
checker = 00000000000000000000000000000001

I = 1

str[1] is 'a'
str[i].charCodeAt(0) - 96 = 1


checker= 00000000000000000000000000000001
a      = 00000000000000000000000000000001


checker 'AND' 32-bit-Int(1) = 00000000000000000000000000000001
Boolean(1) == true
// We've our duplicate now

让我们逐行分解代码。

我们正在启动一个检查器,它将帮助我们找到重复的值。

Int val = str.charAt (i)-‘ a’; 我们得到字符串第 i 个位置的字符的 ASCII 值,然后用‘ a’的 ASCII 值减去它。由于假设字符串只是较低的字符,所以字符数限制为26。Hece,‘ val’的值总是 > = 0。

如果((检查 & (1 < val)) > 0)返回 false;

检查器 | = (1 < val) ;

现在是棘手的部分。让我们考虑一个有字符串“ abcda”的例子。这应该理想地返回 false。

对于循环迭代1:

检查器: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Val: 97-97 = 0

1 < 0:000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

(1 < val) : 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

因此检查器: 0000000000000000000000000000000000001

对于循环迭代2:

检查器: 0000000000000000000000000000000000000001

Val: 98-97 = 1

1 < 1:0000000000000000000000000000000000000010

(1 < val) : 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

因此检查器: 0000000000000000000000000000000011

对于循环迭代3:

检查器: 00000000000000000000000000000000011

Val: 99-97 = 2

1 < 2:00000000000000000000000000000000000000100

(1 < val) : 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

因此检查器: 00000000000000000000000000000000111

对于循环迭代4:

检查器: 00000000000000000000000000000000111

Val: 100-97 = 3

1 < 3:00000000000000000000000000000000001000

(1 < val) : 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

因此检查器: 00000000000000000000000000000001111

对于循环迭代5:

检查程序: 000000000000000000000000000001111

Val: 97-97 = 0

1 < 0:000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Checker & (1 < val) : 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

所以还是假的。

之前的帖子很好地解释了代码块的作用,我想用 BitSet java 数据结构添加我的简单解决方案:

private static String isUniqueCharsUsingBitSet(String string) {
BitSet bitSet =new BitSet();
for (int i = 0; i < string.length(); ++i) {
int val = string.charAt(i);
if(bitSet.get(val)) return "NO";
bitSet.set(val);
}
return "YES";
}
Line 1:   public static boolean isUniqueChars(String str) {
Line 2:      int checker = 0;
Line 3:      for (int i = 0; i < str.length(); ++i) {
Line 4:          int val = str.charAt(i) - 'a';
Line 5:          if ((checker & (1 << val)) > 0) return false;
Line 6:         checker |= (1 << val);
Line 7:      }
Line 8:      return true;
Line 9:   }

假设输入 var inputChar = "abca"; //find if inputChar has all unique characters

开始吧

Line 4: int val = str.charAt(i) - 'a';

以上行在 inputChar 中查找第一个字符的二进制值,即 ,在 ascii 中查找 A = 97,然后将97转换为二进制变为 1100001

在 Javascript 中,例如: "a".charCodeAt().toString(2)返回1100001

checker = 0//二进制32位表示 = 0000000000000000000

checker = 1100001 | checker;//Checker 变成1100001(在32位表示中它变成00000000... ... 00001100001)

但我希望我的位掩码(int checker)只设置一位,但检查器是1100001

Line 4:          int val = str.charAt(i) - 'a';

现在上面的代码很方便,我只需要总是减去97(a 的 ASCII 值)

val = 0; // 97 - 97  Which is  a - a
val = 1; // 98 - 97 Which is b - a
val = 1;  // 99 - 97 Which is c - a

让我们使用重置的 val

第5行和第6行解释得很清楚: “伊万回答

以防有人用位向量在字符串中寻找与 kotlin 等价的唯一字符

fun isUnique(str: String): Boolean {
var checker = 0
for (i in str.indices) {
val bit = str.get(i) - 'a'
if (checker.and(1 shl bit) > 0) return false
checker = checker.or(1 shl bit)
}
return true
}

档号: https://www.programiz.com/kotlin-programming/bitwise