没有 cookie 或本地存储的用户识别

小开

最佳答案

简介

如果我没理解错的话，你需要识别一个你没有唯一标识符的用户，所以你需要通过匹配随机数据来找出他们是谁。你不能可靠地存储用户的身份信息，因为:

Cookies 可以被删除
IP 地址可能改变
浏览器可以更改
浏览器缓存可能被删除

Java Applet 或 Com Object 可能是一个使用硬件信息散列的简单解决方案，但是现在人们对安全非常敏感，很难让人们在他们的系统上安装这类程序。这使得您只能使用 Cookie 和其他类似的工具。

饼干和其他类似的工具

您可以考虑构建一个数据配置文件，然后使用概率测试来标识 可能使用者。一个有用的配置文件可以通过以下一些组合生成:

IP 地址
- 真正的 IP 地址
- 代理 IP 地址(用户经常重复使用同一个代理)
饼干
- HTTP Cookies
- 会话饼干
- 第三方饼干
- Flash Cookies (大多数人都不知道怎么删除这些)
网络错误(因为错误得到修复而不太可靠，但仍然有用)
- PDF Bug
- 闪电虫
- 爪哇虫
浏览器
- 单击“跟踪”(许多用户在每次访问时访问相同的页面系列)
- 浏览器指纹已安装的插件(人们通常有各种各样的，有点独特的插件集)
- 缓存图像(人们有时会删除他们的 cookie，但是留下缓存图像)
- 使用斑点
- URL (浏览器历史记录或 Cookie 可能在 URL 中包含唯一的用户 ID，例如 https://stackoverflow.com/users/1226894或 http://www.facebook.com/barackobama?fref=ts)
- 系统字体检测 (这是一个鲜为人知但通常是唯一的密钥签名)
HTML5 & Javascript
- HTML5本地存储
- HTML5地理定位 API 和反座标化
- 体系结构，操作系统语言，系统时间，屏幕分辨率等。
- 网络信息 API
- 电池状态 API

当然，我列出的项目只是唯一标识用户的几种可能方式。还有很多。

有了这组随机数据元素来构建数据配置文件，下一步是什么？

下一步是开发一些模糊逻辑，或者更好的是，一个人工神经网络(使用模糊逻辑)。在任何一种情况下，这个想法是训练您的系统，然后将其训练与贝叶斯推断结合起来，以提高结果的准确性。

Artificial Neural Network

PHP 的 NeuralMesh库允许生成人工神经网络。要实施贝叶斯推断，请浏览以下连结:

在这一点上，你可能会想:

为什么一个看似简单的任务需要这么多的数学和逻辑？

基本上，因为它是 不是一个简单的任务。实际上，您要实现的是 纯概率。例如，鉴于以下已知用户:

User1 = A + B + C + D + G + K
User2 = C + D + I + J + K + F

当你收到以下资料:

B + C + E + G + F + K

你实际上在问的问题是:

接收到的数据(B + C + E + G + F + K)实际上是 User1或 User2的概率是多少？这两个匹配哪个是 大部分可能的？

为了有效地回答这个问题，您需要了解频率与概率格式以及为什么联合概率可能是更好的方法。这里的细节太多了(这就是为什么我给你链接) ，但是一个很好的例子是医疗诊断向导应用程序，它使用综合症状来识别可能的疾病。

考虑一下构成数据概要文件(上面示例中的 B + C + E + G + F + K)的一系列数据点作为症状，未知用户作为疾病。通过识别疾病，您可以进一步识别适当的治疗方法(将此用户视为 User1)。

显然，我们已经识别了多于1个症状的疾病更容易识别。事实上，我们能识别的症状越多，我们的诊断就越简单和准确。

还有其他选择吗？

当然。作为一种替代措施，您可以创建自己的简单计分算法，并基于精确匹配。这不像概率那样有效，但对您来说实现起来可能更简单。

作为一个例子，考虑下面这个简单的得分表:

+-------------------------+--------+------------+
|        Property         | Weight | Importance |
+-------------------------+--------+------------+
| Real IP address         |     60 |          5 |
| Used proxy IP address   |     40 |          4 |
| HTTP Cookies            |     80 |          8 |
| Session Cookies         |     80 |          6 |
| 3rd Party Cookies       |     60 |          4 |
| Flash Cookies           |     90 |          7 |
| PDF Bug                 |     20 |          1 |
| Flash Bug               |     20 |          1 |
| Java Bug                |     20 |          1 |
| Frequent Pages          |     40 |          1 |
| Browsers Finger Print   |     35 |          2 |
| Installed Plugins       |     25 |          1 |
| Cached Images           |     40 |          3 |
| URL                     |     60 |          4 |
| System Fonts Detection  |     70 |          4 |
| Localstorage            |     90 |          8 |
| Geolocation             |     70 |          6 |
| AOLTR                   |     70 |          4 |
| Network Information API |     40 |          3 |
| Battery Status API      |     20 |          1 |
+-------------------------+--------+------------+

对于您可以根据给定的请求收集的每一条信息，授予相关的分数，然后使用 重要性解决分数相同时的冲突。

概念证明

对于一个简单的概念证明，请看看感知器。感知器是一种通常用于模式识别应用的 RNA 模型。甚至还有一个完美实现它的旧 PHP 类，但是您可能需要根据自己的目的对其进行修改。

尽管 Perceptron 是一个很好的工具，但它仍然可以返回多个结果(可能的匹配) ，因此使用 Score and different 比较对于识别这些匹配的 最好的仍然很有用。

假设

存储关于每个用户的所有可能的信息(IP、 cookies 等)

哪里的结果是精确匹配，增加1分

如果结果不是一个精确的匹配，减少1分

期望

生成 RNA 标签

生成模拟数据库的随机用户

生成单个未知用户

生成未知用户 RNA 和值

该系统将合并 RNA 信息，并传授感知器

训练感知器后，系统会有一系列的权重

您现在可以测试未知用户的模式，感知器将生成一个结果集。

存储所有正面匹配

首先按 Score 对匹配进行排序，然后按差异(如上所述)进行排序

输出两个最接近的匹配，如果没有找到匹配，则输出空结果

概念证明规范

$features = array( 'Real IP address' => .5, 'Used proxy IP address' => .4, 'HTTP Cookies' => .9, 'Session Cookies' => .6, '3rd Party Cookies' => .6, 'Flash Cookies' => .7, 'PDF Bug' => .2, 'Flash Bug' => .2, 'Java Bug' => .2, 'Frequent Pages' => .3, 'Browsers Finger Print' => .3, 'Installed Plugins' => .2, 'URL' => .5, 'Cached PNG' => .4, 'System Fonts Detection' => .6, 'Localstorage' => .8, 'Geolocation' => .6, 'AOLTR' => .4, 'Network Information API' => .3, 'Battery Status API' => .2 ); // Get RNA Lables $labels = array(); $n = 1; foreach ($features as $k => $v) { $labels[$k] = "x" . $n; $n ++; } // Create Users $users = array(); for($i = 0, $name = "A"; $i < 5; $i ++, $name ++) { $users[] = new Profile($name, $features); } // Generate Unknown User $unknown = new Profile("Unknown", $features); // Generate Unknown RNA $unknownRNA = array( 0 => array("o" => 1), 1 => array("o" => - 1) ); // Create RNA Values foreach ($unknown->data as $item => $point) { $unknownRNA[0][$labels[$item]] = $point; $unknownRNA[1][$labels[$item]] = (- 1 * $point); } // Start Perception Class $perceptron = new Perceptron(); // Train Results $trainResult = $perceptron->train($unknownRNA, 1, 1); // Find matches foreach ($users as $name => &$profile) { // Use shorter labels $data = array_combine($labels, $profile->data); if ($perceptron->testCase($data, $trainResult) == true) { $score = $diff = 0; // Determing the score and diffrennce foreach ($unknown->data as $item => $found) { if ($unknown->data[$item] === $profile->data[$item]) { if ($profile->data[$item] > 0) { $score += $features[$item]; } else { $diff += $features[$item]; } } } // Ser score and diff $profile->setScore($score, $diff); $matchs[] = $profile; } } // Sort bases on score and Output if (count($matchs) > 1) { usort($matchs, function ($a, $b) { // If score is the same use diffrence if ($a->score == $b->score) { // Lower the diffrence the better return $a->diff == $b->diff ? 0 : ($a->diff > $b->diff ? 1 : - 1); } // The higher the score the better return $a->score > $b->score ? - 1 : 1; }); echo "<br />Possible Match ", implode(",", array_slice(array_map(function ($v) { return sprintf(" %s (%0.4f|%0.4f) ", $v->name, $v->score,$v->diff); }, $matchs), 0, 2)); } else { echo "<br />No match Found "; }

输出:

Possible Match D (0.7416|0.16853),C (0.5393|0.2809)

“ D”的 Print _ r:

echo "<pre>"; print_r($matchs[0]); Profile Object( [name] => D [data] => Array ( [Real IP address] => -1 [Used proxy IP address] => -1 [HTTP Cookies] => 1 [Session Cookies] => 1 [3rd Party Cookies] => 1 [Flash Cookies] => 1 [PDF Bug] => 1 [Flash Bug] => 1 [Java Bug] => -1 [Frequent Pages] => 1 [Browsers Finger Print] => -1 [Installed Plugins] => 1 [URL] => -1 [Cached PNG] => 1 [System Fonts Detection] => 1 [Localstorage] => -1 [Geolocation] => -1 [AOLTR] => 1 [Network Information API] => -1 [Battery Status API] => -1 ) [score] => 0.74157303370787 [diff] => 0.1685393258427 [base] => 8.9 )

如果 Debug = true，则可以看到输入(传感器和期望) ，初始权重，输出(传感器，和，网络) ，错误，修正和最终权重。

+----+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+----+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----------+ | o | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 | x15 | x16 | x17 | x18 | x19 | x20 | Bias | Yin | Y | deltaW1 | deltaW2 | deltaW3 | deltaW4 | deltaW5 | deltaW6 | deltaW7 | deltaW8 | deltaW9 | deltaW10 | deltaW11 | deltaW12 | deltaW13 | deltaW14 | deltaW15 | deltaW16 | deltaW17 | deltaW18 | deltaW19 | deltaW20 | W1 | W2 | W3 | W4 | W5 | W6 | W7 | W8 | W9 | W10 | W11 | W12 | W13 | W14 | W15 | W16 | W17 | W18 | W19 | W20 | deltaBias | +----+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+----+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----------+ | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 0 | -1 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | -1 | -1 | 1 | -19 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 19 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | -1 | -1 | 1 | -19 | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | +----+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+-----+----+---------+---------+---------+---------+---------+---------+---------+---------+---------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----+----+----+----+----+----+----+----+----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----------+

X1到 x20表示由代码转换的特性。

// Get RNA Labels $labels = array(); $n = 1; foreach ( $features as $k => $v ) { $labels[$k] = "x" . $n; $n ++; }

这是在线演示

使用类别:

class Profile { public $name, $data = array(), $score, $diff, $base; function __construct($name, array $importance) { $values = array(-1, 1); // Perception values $this->name = $name; foreach ($importance as $item => $point) { // Generate Random true/false for real Items $this->data[$item] = $values[mt_rand(0, 1)]; } $this->base = array_sum($importance); } public function setScore($score, $diff) { $this->score = $score / $this->base; $this->diff = $diff / $this->base; } }

改良感知器类别

class Perceptron { private $w = array(); private $dw = array(); public $debug = false; private function initialize($colums) { // Initialize perceptron vars for($i = 1; $i <= $colums; $i ++) { // weighting vars $this->w[$i] = 0; $this->dw[$i] = 0; } } function train($input, $alpha, $teta) { $colums = count($input[0]) - 1; $weightCache = array_fill(1, $colums, 0); $checkpoints = array(); $keepTrainning = true; // Initialize RNA vars $this->initialize(count($input[0]) - 1); $just_started = true; $totalRun = 0; $yin = 0; // Trains RNA until it gets stable while ($keepTrainning == true) { // Sweeps each row of the input subject foreach ($input as $row_counter => $row_data) { // Finds out the number of columns the input has $n_columns = count($row_data) - 1; // Calculates Yin $yin = 0; for($i = 1; $i <= $n_columns; $i ++) { $yin += $row_data["x" . $i] * $weightCache[$i]; } // Calculates Real Output $Y = ($yin <= 1) ? - 1 : 1; // Sweeps columns ... $checkpoints[$row_counter] = 0; for($i = 1; $i <= $n_columns; $i ++) { /** DELTAS **/ // Is it the first row? if ($just_started == true) { $this->dw[$i] = $weightCache[$i]; $just_started = false; // Found desired output? } elseif ($Y == $row_data["o"]) { $this->dw[$i] = 0; // Calculates Delta Ws } else { $this->dw[$i] = $row_data["x" . $i] * $row_data["o"]; } /** WEIGHTS **/ // Calculate Weights $this->w[$i] = $this->dw[$i] + $weightCache[$i]; $weightCache[$i] = $this->w[$i]; /** CHECK-POINT **/ $checkpoints[$row_counter] += $this->w[$i]; } // END - for foreach ($this->w as $index => $w_item) { $debug_w["W" . $index] = $w_item; $debug_dw["deltaW" . $index] = $this->dw[$index]; } // Special for script debugging $debug_vars[] = array_merge($row_data, array( "Bias" => 1, "Yin" => $yin, "Y" => $Y ), $debug_dw, $debug_w, array( "deltaBias" => 1 )); } // END - foreach // Special for script debugging $empty_data_row = array(); for($i = 1; $i <= $n_columns; $i ++) { $empty_data_row["x" . $i] = "--"; $empty_data_row["W" . $i] = "--"; $empty_data_row["deltaW" . $i] = "--"; } $debug_vars[] = array_merge($empty_data_row, array( "o" => "--", "Bias" => "--", "Yin" => "--", "Y" => "--", "deltaBias" => "--" )); // Counts training times $totalRun ++; // Now checks if the RNA is stable already $referer_value = end($checkpoints); // if all rows match the desired output ... $sum = array_sum($checkpoints); $n_rows = count($checkpoints); if ($totalRun > 1 && ($sum / $n_rows) == $referer_value) { $keepTrainning = false; } } // END - while // Prepares the final result $result = array(); for($i = 1; $i <= $n_columns; $i ++) { $result["w" . $i] = $this->w[$i]; } $this->debug($this->print_html_table($debug_vars)); return $result; } // END - train function testCase($input, $results) { // Sweeps input columns $result = 0; $i = 1; foreach ($input as $column_value) { // Calculates teste Y $result += $results["w" . $i] * $column_value; $i ++; } // Checks in each class the test fits return ($result > 0) ? true : false; } // END - test_class // Returns the html code of a html table base on a hash array function print_html_table($array) { $html = ""; $inner_html = ""; $table_header_composed = false; $table_header = array(); // Builds table contents foreach ($array as $array_item) { $inner_html .= "<tr>\n"; foreach ( $array_item as $array_col_label => $array_col ) { $inner_html .= "<td>\n"; $inner_html .= $array_col; $inner_html .= "</td>\n"; if ($table_header_composed == false) { $table_header[] = $array_col_label; } } $table_header_composed = true; $inner_html .= "</tr>\n"; } // Builds full table $html = "<table border=1>\n"; $html .= "<tr>\n"; foreach ($table_header as $table_header_item) { $html .= "<td>\n"; $html .= "<b>" . $table_header_item . "</b>"; $html .= "</td>\n"; } $html .= "</tr>\n"; $html .= $inner_html . "</table>"; return $html; } // END - print_html_table // Debug function function debug($message) { if ($this->debug == true) { echo "<b>DEBUG:</b> $message"; } } // END - debug } // END - class

结论

识别一个没有唯一标识符的用户不是一个简单的任务。它依赖于收集足够数量的随机数据，您可以通过各种方法从用户那里收集这些随机数据。

即使你选择不使用人工神经网络，我建议你至少使用一个有优先级和可能性的简单转移矩阵——我希望上面提供的代码和示例能够给你足够的信息。