混淆身份

我正在寻找一种方法来加密/混淆一个整数 ID 到另一个整数。更准确地说,我需要一个函数 int F(int x)

  • X <-> F (x)是双射(如果 x! = y,F (x) ! = F (y))
  • 给定 F (x) ,很容易得到 x-因此 F 不是散列函数
  • 给定 x 和 F (x) ,很难/不可能求出 F (y) ,像 x ^ 0x1234这样的函数不起作用

为了明确起见,我不是在寻找一个强大的加密解决方案,这只是模糊处理。想象一下,一个网页应用程序的网址,如 example.com/profile/1example.com/profile/2等。配置文件本身并不是秘密,但我想防止偶尔偷窥者查看/获取所有配置文件一个接一个,所以我宁愿隐藏他们后面的东西,如 example.com/profile/23423example.com/profile/80980234等。尽管数据库存储的令牌可以非常容易地完成这项工作,但我很好奇是否有一些简单的数学运算可用于此。

我不清楚的一个重要要求是,结果应该看起来“随机”,也就是说,给定一个序列 x,x+1,...,x+nF(x),F(x+1)...F(x+n)不应该形成任何类型的进展。

25623 次浏览

Not sure how "hard" you need it to be, how fast, or how little memory to use. If you have no memory constraints you could make a list of all integers, shuffle them and use that list as a mapping. However, even for a 4 byte integer you would need a lot of memory.

However, this could be made smaller so instead of mapping all integers you would map only 2 (or worst case 1) byte and apply this to each group in the integer. So, using 2 bytes a integer would be (group1)(group2) you would map each group through the random map. But that means that if you only change group2 then the mapping for group1 would stay the same. This could "fixed" by mapping different bits to each group.

So, *(group2) could be (bit 14,12,10,8,6,4,2,0) so, adding 1 would change both group1 and group2.

Still, this is only security by obscurity, anyone that can feed numbers into your function (even if you keep the function secret) could fairly easily figure it out.

What you're describing here seems to be the opposite of a one-way function: it's easy to invert but super difficult to apply. One option would be to use a standard, off-the-shelf public-key encryption algorithm where you fix a (secret, randomly-chosen) public key that you keep a secret and a private key that you share with the world. That way, your function F(x) would be the encryption of x using the public key. You could then easily decrypt F(x) back to x by using the private decryption key. Notice that the roles of the public and private key are reversed here - you give out the private key to everyone so that they can decrypt the function, but keep the public key secret on your server. That way:

  1. The function is a bijection, so it's invertible.
  2. Given F(x), x is efficiently computable.
  3. Given x and F(x), it is extremely difficult to compute F(y) from y, since without the public key (assuming you use a cryptographically strong encryption scheme) there is no feasible way to encrypt the data, even if the private decryption key is known.

This has many advantages. First, you can rest assured that the crypto system is safe, since if you use a well-established algorithm like RSA then you don't need to worry about accidental insecurity. Second, there are already libraries out there to do this, so you don't need to code much up and can be immune to side-channel attacks. Finally, you can make it possible for anyone to go and invert F(x) without anyone actually being able to compute F(x).

One detail- you should definitely not just be using the standard int type here. Even with 64-bit integers, there are so few combinations possible that an attacker could just brute-force try inverting everything until they find the encryption F(y) for some y even if they don't have the key. I would suggest using something like a 512-bit value, since even a science fiction attack would not be able to brute-force this.

Hope this helps!

Generate a private symmetric key for use in your application, and encrypt your integer with it. This will satisfy all three requirements, including the hardest #3: one would need to guess your key in order to break your scheme.

Obfuscation is not really sufficient in terms of security.

However, if you are trying to thwart the casual onlooker, I'd recommend a combination of two methods:

  • A private key that you combine with the id by xor'ing them together
  • Rotating the bits by a certain amount both before and after the key has been applied

Here is an example (using pseudo code):

  def F(x)
x = x XOR 31415927       # XOR x with a secret key
x = rotl(x, 5)           # rotate the bits left 5 times
x = x XOR 31415927       # XOR x with a secret key again
x = rotr(x, 5)           # rotate the bits right 5 times
x = x XOR 31415927       # XOR x with a secret key again
return x                 # return the value
end

I haven't tested it, but I think this is reversible, should be fast, and not too easy to tease out the method.

You want the transformation to be reversible, and not obvious. That sounds like an encryption that takes a number in a given range and produces a different number in the same range. If your range is 64 bit numbers, then use DES. If your range is 128 bit numbers then use AES. If you want a different range, then your best bet is probably Hasty Pudding cipher, which is designed to cope with different block sizes and with number ranges that do not fit neatly into a block, such as 100,000 to 999,999.

Obfuscate it with some combination of 2 or 3 simple methods:

  • XOR
  • shuffle individual bits
  • convert to modular representation (D.Knuth, Vol. 2, Chapter 4.3.2)
  • choose 32 (or 64) overlapping subsets of bits and XOR bits in each subset (parity bits of subsets)
  • represent it in variable-length numberic system and shuffle digits
  • choose a pair of odd integers x and y that are multiplicative inverses of each other (modulo 232), then multiply by x to obfuscate and multiply by y to restore, all multiplications are modulo 232 (source: "A practical use of multiplicative inverses" by Eric Lippert)

Variable-length numberic system method does not obey your "progression" requirement on its own. It always produces short arithmetic progressions. But when combined with some other method, it gives good results.

The same is true for the modular representation method.

Here is C++ code example for 3 of these methods. Shuffle bits example may use some different masks and distances to be more unpredictable. Other 2 examples are good for small numbers (just to give the idea). They should be extended to obfuscate all integer values properly.

// *** Numberic system base: (4, 3, 5) -> (5, 3, 4)
// In real life all the bases multiplied should be near 2^32
unsigned y = x/15 + ((x/5)%3)*4 + (x%5)*12; // obfuscate
unsigned z = y/12 + ((y/4)%3)*5 + (y%4)*15; // restore


// *** Shuffle bits (method used here is described in D.Knuth's vol.4a chapter 7.1.3)
const unsigned mask1 = 0x00550055; const unsigned d1 = 7;
const unsigned mask2 = 0x0000cccc; const unsigned d2 = 14;


// Obfuscate
unsigned t = (x ^ (x >> d1)) & mask1;
unsigned u = x ^ t ^ (t << d1);
t = (u ^ (u  >> d2)) & mask2;
y = u ^ t ^ (t << d2);


// Restore
t = (y ^ (y >> d2)) & mask2;
u = y ^ t ^ (t << d2);
t = (u ^ (u >> d1)) & mask1;
z = u ^ t ^ (t << d1);


// *** Subset parity
t = (x ^ (x >> 1)) & 0x44444444;
u = (x ^ (x << 2)) & 0xcccccccc;
y = ((x & 0x88888888) >> 3) | (t >> 1) | u; // obfuscate


t = ((y & 0x11111111) << 3) | (((y & 0x11111111) << 2) ^ ((y & 0x22222222) << 1));
z = t | ((t >> 2) ^ ((y >> 2) & 0x33333333)); // restore

Do anything with the bits of the ID that won't destroy them. For example:

  • rotate the value
  • use lookup to replace certain parts of the value
  • xor with some value
  • swap bits
  • swap bytes
  • mirror the whole value
  • mirror a part of the value
  • ... use your imagination

For decryption, do all that in reverse order.

Create a program that will 'encrypt' some interesting values for you and put them in a table you can examine. Have same program TEST your encryption/decryption routine WITH all set of values that you want to have in your system.

Add stuff to the above list into the routines until your numbers will look properly mangled to you.

For anything else, get a copy of The Book.

If xor is acceptable for everything but inferring F(y) given x and F(x) then I think you can do that with a F(y)8. First choose a secret one-way function. For example S(s) = MD5(secret ^ s). Then F(x) = (s, S(s) ^ x) where s is chosen randomly. I wrote that as a tuple but you can combine the two parts into an integer, e.g. F(x) = 10000 * s + S(s) ^ x. The decryption extracts the salt s again and uses F'(F(x)) = S(extract s) ^ (extract S(s)^x). Given x and F(x) you can see s (though it is slightly obfuscated) and you can infer F(y)3 but for some other user F(y)4 with a different random salt F(y)5 the user knowing F(x) can't find F(y)7.

I wrote an article on secure permutations with block ciphers, which ought to fulfil your requirements as stated.

I'd suggest, though, that if you want hard to guess identifiers, you should just use them in the first place: generate UUIDs, and use those as the primary key for your records in the first place - there's no need to be able to convert to and from a 'real' ID.

I found this particular piece of Python/PHP code very useful:

https://github.com/marekweb/opaque-id

I wrote some JS code using some of the ideas in this thread:

const BITS = 32n;
const MAX = 4294967295n;
const COPRIME = 65521n;
const INVERSE = 2166657316n;
const ROT = 6n;
const XOR1 = 10296065n;
const XOR2 = 2426476569n;




function rotRight(n, bits, size) {
const mask = (1n << bits) - 1n;
// console.log('mask',mask.toString(2).padStart(Number(size),'0'));
const left = n & mask;
const right = n >> bits;
return (left << (size - bits)) | right;
}


const pipe = fns => fns.reduce((f, g) => (...args) => g(f(...args)));


function build(...fns) {
const enc = fns.map(f => Array.isArray(f) ? f[0] : f);
const dec = fns.map(f => Array.isArray(f) ? f[1] : f).reverse();


return [
pipe(enc),
pipe(dec),
]
}


[exports.encode, exports.decode] = build(
[BigInt, Number],
[i => (i * COPRIME) % MAX, i => (i * INVERSE) % MAX],
x => x ^ XOR1,
[x => rotRight(x, ROT, BITS), x => rotRight(x, BITS-ROT, BITS)],
x => x ^ XOR2,
);

It produces some nice results like:

1 1352888202n 1 'mdh37u'
2 480471946n 2 '7y26iy'
3 3634587530n 3 '1o3xtoq'
4 2225300362n 4 '10svwqy'
5 1084456843n 5 'hxno97'
6 212040587n 6 '3i8rkb'
7 3366156171n 7 '1jo4eq3'
8 3030610827n 8 '1e4cia3'
9 1889750920n 9 'v93x54'
10 1017334664n 10 'gtp0g8'
11 4171450248n 11 '1wzknm0'
12 2762163080n 12 '19oiqo8'
13 1621319561n 13 'qtai6h'
14 748903305n 14 'cdvlhl'
15 3903018889n 15 '1sjr8nd'
16 3567473545n 16 '1mzzc7d'
17 2426613641n 17 '144qr2h'
18 1554197390n 18 'ppbudq'
19 413345678n 19 '6u3fke'
20 3299025806n 20 '1ik5klq'
21 2158182286n 21 'zoxc3y'
22 1285766031n 22 'l9iff3'
23 144914319n 23 '2ea0lr'
24 4104336271n 24 '1vvm64v'
25 2963476367n 25 '1d0dkzz'
26 2091060108n 26 'ykyob0'
27 950208396n 27 'fpq9ho'
28 3835888524n 28 '1rfsej0'
29 2695045004n 29 '18kk618'
30 1822628749n 30 'u559cd'
31 681777037n 31 'b9wuj1'
32 346231693n 32 '5q4y31'

Testing with:

  const {encode,decode} = require('./obfuscate')


for(let i = 1; i <= 1000; ++i) {
const j = encode(i);
const k = decode(j);
console.log(i, j, k, j.toString(36));
}

XOR1 and XOR2 are just random numbers between 0 and MAX. MAX is 2**32-1; you should set this to whatever you think your highest ID will be.

COPRIME is a number that's coprime w/ MAX. I think prime numbers themselves are coprime with every other number (except multiples of themselves).

INVERSE is the tricky one to figure out. These blog posts don't give a straight answer, but WolframAlpha can figure it out for you. Basically, just solve the equation (COPRIME * x) % MAX = 1 for x.

The build function is something I created to make it easier to create these encode/decode pipelines. You can feed it as many operations as you want as [encode, decode] pairs. These functions have to be equal and opposite. The XOR functions are their own compliments so you don't need a pair there.


Here's another fun involution:

function mixHalves(n) {
const mask = 2n**12n-1n;
const right = n & mask;
const left = n >> 12n;
const mix = left ^ right;
return (mix << 12n) | right;
}

(assumes 24-bit integers -- just change the numbers for any other size)