Where and how is the _ (underscore) variable specified?

Most people are aware of _'s special meaning in IRB as the holder of the last return value, but that is not what I'm asking about here.

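Instead, I'm asking about _ when it is used as a variable name in plain old Ruby code. Here it appears to have special behavior, akin to a "don't care" variable (à la Prolog). Here are some examples illustrating its unique behavior: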

lambda { |x, x| 42 }            # SyntaxError: duplicated argument name
lambda { |_, _| 42 }.call(4, 2) # => 42
lambda { |_, _| 42 }.call(_, _) # NameError: undefined local variable or method `_'
lambda { |_| _ + 1 }.call(42)   # => 43
lambda { |_, _| _ }.call(4, 2)  # 1.8.7: => 2
# 1.9.3: => 4
_ = 42
_ * 100         # => 4200
_, _ = 4, 2; _  # => 2

These were all run in Ruby directly (with puts added), not in IRB, to avoid conflicting with its additional functionality.

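This is all the result of my own experimentation, though, because I cannot find any documentation on this behavior anywhere (admittedly, it's not the easiest thing to search for). Ultimately, I'm curious how all of this works internally, so that I can better understand exactly what is special about _. So I'm asking for references to documentation, and preferably to the Ruby source code (and perhaps RubySpec), that reveal how _ behaves in Ruby.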

Note: most of this arose out of this discussion with @Niklas B.


_ is a valid identifier. Identifiers can do more than contain underscores; an identifier can also consist of a single underscore.

_ = o = Object.new
_.object_id == o.object_id
# => true

You can also use it as a method name:

def o._; :_ end
o._
# => :_

Of course, it is not exactly a readable name, nor does it pass any information to the reader about what the variable refers to or what the method does.
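That said, one place where _ does tell the reader something is as a placeholder for a value you intend to ignore. Here is a minimal sketch of that convention; the method names tens and middle_of are made up for illustration and do not come from any library:

```ruby
# Hypothetical helpers showing `_` as a "don't care" placeholder.

# Ignore each key of the hash and keep only the (scaled) value.
def tens(pairs)
  pairs.map { |_, value| value * 10 }
end

# Ignore the first and last elements when destructuring an array.
def middle_of(triple)
  _, middle, _ = triple
  middle
end

p tens({ "a" => 1, "b" => 2 })
p middle_of([1, 2, 3])
```

In both methods _ is still an ordinary local variable; the name only signals that its value is not used.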

IRB, in particular, sets _ to the value of the last expression:

$ irb
> 'asd'
# => "asd"
> _
# => "asd"

As the IRB source code shows, it simply assigns the last value to _:

@workspace.evaluate self, "_ = IRB.CurrentContext.last_value"
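To make the idea concrete, here is a rough sketch of how an evaluator could expose the previous result as _ by assigning to a local variable in a binding. This is not IRB's actual implementation; the ToyRepl class and its evaluate method are names I made up for illustration:

```ruby
# A toy evaluator (hypothetical, not IRB's real code) that stores the
# result of each evaluated snippet in a local variable named `_`.
class ToyRepl
  def initialize
    # A fresh binding in which the snippets will be evaluated.
    @binding = Object.new.instance_eval { binding }
    # Define `_` up front so the very first snippet may refer to it.
    @binding.local_variable_set(:_, nil)
  end

  def evaluate(source)
    value = @binding.eval(source)
    # Same idea as IRB's "_ = <last value>" assignment.
    @binding.local_variable_set(:_, value)
    value
  end
end

repl = ToyRepl.new
repl.evaluate("'asd'")
p repl.evaluate("_")
```

Nothing here relies on _ being special: the evaluator treats it as a normal local variable that happens to be reassigned after every expression.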

Did some repository exploring. Here's what I found:

On the last lines of the file id.c, there is the call:

REGISTER_SYMID(idUScore, "_");

Grepping the source for idUScore gave me two seemingly relevant results:

shadowing_lvar_gen seems to be the mechanism through which the formal parameter of a block replaces a variable of the same name that exists in another scope. It is the function that seems to raise "duplicated argument name" SyntaxError and the "shadowing outer local variable" warning.

After grepping the source for shadowing_lvar_gen, I found the following in the changelog for Ruby 1.9.3:

Tue Dec 11 01:21:21 2007 Yukihiro Matsumoto

  • parse.y (shadowing_lvar_gen): no duplicate error for "_".

Which is likely to be the origin of this line:

if (idUScore == name) return name;

From this, I deduce that in a situation such as proc { |_, _| :x }.call :a, :b, one _ variable simply shadows the other.
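A quick way to observe the exemption from Ruby itself is to check whether a snippet parses. The helper name parses? below is my own, and this is only a sketch that relies on eval raising SyntaxError for a duplicated parameter name:

```ruby
# Hypothetical helper: true if the source evaluates without a SyntaxError.
def parses?(source)
  eval(source)
  true
rescue SyntaxError
  false
end

p parses?("lambda { |x, x| 42 }")   # duplicated ordinary name
p parses?("lambda { |_, _| 42 }")   # duplicated underscore

# The block from the paragraph above can be called normally.
p proc { |_, _| :x }.call(:a, :b)
```

The first snippet is rejected as a duplicated argument name, while the second one is accepted.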


Here's the commit in question. It basically introduced these two lines:

if (!uscore) uscore = rb_intern("_");
if (uscore == name) return;

From a time when idUScore did not even exist, apparently.

There is some special handling in the source to suppress the "duplicate argument name" error. The error message only appears in shadowing_lvar_gen inside parse.y, the 1.9.3 version looks like this:

static ID
shadowing_lvar_gen(struct parser_params *parser, ID name)
{
    if (idUScore == name) return name;
    /* ... */

and idUScore is defined in id.c like this:

REGISTER_SYMID(idUScore, "_");

You'll see similar special handling in warn_unused_var:

static void
warn_unused_var(struct parser_params *parser, struct local_vars *local)
{
    /* ... */
    for (i = 0; i < cnt; ++i) {
        if (!v[i] || (u[i] & LVAR_USED)) continue;
        if (idUScore == v[i]) continue;
        rb_compile_warn(ruby_sourcefile, (int)u[i], "assigned but unused variable - %s", rb_id2name(v[i]));
    }
}

You'll notice that the warning is suppressed on the second line of the for loop.

The only special handling of _ that I could find in the 1.9.3 source is above: the duplicate name error is suppressed and the unused variable warning is suppressed. Other than those two things, _ is just a plain old variable like any other. I don't know of any documentation about the (minor) specialness of _.

In Ruby 2.0, the idUScore == v[i] test in warn_unused_var is replaced with a call to is_private_local_id:

if (is_private_local_id(v[i])) continue;
rb_warn4S(ruby_sourcefile, (int)u[i], "assigned but unused variable - %s", rb_id2name(v[i]));

and is_private_local_id suppresses warnings for variables that begin with _:

if (name == idUScore) return 1;
/* ... */
return RSTRING_PTR(s)[0] == '_';

rather than just _ itself. So 2.0 loosens things up a bit.
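For readers who don't want to dig through the C, here is a loose Ruby paraphrase of the 2.0 check quoted above. The method names private_local_name? and names_to_warn_about are hypothetical, and the elided part of the C function is not reproduced:

```ruby
# Rough Ruby paraphrase of the quoted is_private_local_id logic.
# Hypothetical names; the part elided as /* ... */ in C is left out.
def private_local_name?(name)
  name = name.to_s
  return true if name == "_"   # the bare underscore
  name.start_with?("_")        # any name with a leading underscore
end

# Mirrors the loop in warn_unused_var: names that pass the check above
# are skipped, the rest would get an "assigned but unused" warning.
def names_to_warn_about(unused_names)
  unused_names.reject { |name| private_local_name?(name) }
end

p names_to_warn_about(%w[_ _tmp count])
```

Under this sketch, _ and _tmp are exempt and only count would be reported.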