How do I remove leading whitespace characters from a Ruby HEREDOC?

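I'm having a problem with a Ruby heredoc. It returns the leading whitespace from each line, even though I'm including the - operator, which is supposed to suppress all leading whitespace characters. My method looks like this: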

    def distinct_count
        <<-EOF
            \tSELECT
            \t CAST('#{name}' AS VARCHAR(30)) as COLUMN_NAME
            \t,COUNT(DISTINCT #{name}) AS DISTINCT_COUNT
            \tFROM #{table.call}
        EOF
    end

And my output looks like this:

    => "            \tSELECT\n            \t CAST('SRC_ACCT_NUM' AS VARCHAR(30)) as
    COLUMN_NAME\n            \t,COUNT(DISTINCT SRC_ACCT_NUM) AS DISTINCT_COUNT\n
    \tFROM UD461.MGMT_REPORT_HNB\n"

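This is, of course, correct in this particular case, except for all the spaces between the opening " and the first \t. Does anyone know what I'm doing wrong here?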


Not much you can do that I know of, I'm afraid. I usually do:

def distinct_count
  <<-EOF.gsub /^\s+/, ""
    \tSELECT
    \t CAST('#{name}' AS VARCHAR(30)) as COLUMN_NAME
    \t,COUNT(DISTINCT #{name}) AS DISTINCT_COUNT
    \tFROM #{table.call}
  EOF
end

That works but is a bit of a hack.

EDIT: Taking inspiration from Rene Saarsoo below, I'd suggest something like this instead:

class String
  def unindent
    gsub(/^#{scan(/^\s*/).min_by{|l|l.length}}/, "")
  end
end

def distinct_count
  <<-EOF.unindent
    \tSELECT
    \t CAST('#{name}' AS VARCHAR(30)) as COLUMN_NAME
    \t,COUNT(DISTINCT #{name}) AS DISTINCT_COUNT
    \tFROM #{table.call}
  EOF
end

This version should handle when the first line is not the one farthest to the left too.

<<- in Ruby will only ignore leading space for the ending delimiter, allowing it to be properly indented. It does not strip leading space on lines inside the string, despite what some documentation online might say.
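A quick way to confirm this is to inspect the string a `<<-` heredoc actually produces. This is only a minimal sketch; the method name `dash_heredoc` is made up for illustration:

```ruby
# <<- lets the closing delimiter be indented, but it leaves the
# indentation of the content lines untouched.
def dash_heredoc
  <<-EOF
    SELECT 1
  EOF
end

text = dash_heredoc
p text                      # => "    SELECT 1\n" (leading spaces kept)
p text.gsub(/^[ \t]+/, '')  # => "SELECT 1\n"     (stripped by hand)
```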

You can strip leading whitespace yourself by using gsub:

<<-EOF.gsub /^\s*/, ''
\tSELECT
\t CAST('#{name}' AS VARCHAR(30)) as COLUMN_NAME
\t,COUNT(DISTINCT #{name}) AS DISTINCT_COUNT
\tFROM #{table.call}
EOF

Or if you just want to strip spaces, leaving the tabs:

<<-EOF.gsub /^ */, ''
\tSELECT
\t CAST('#{name}' AS VARCHAR(30)) as COLUMN_NAME
\t,COUNT(DISTINCT #{name}) AS DISTINCT_COUNT
\tFROM #{table.call}
EOF

Like the original poster, I too discovered the <<-HEREDOC syntax and was pretty damn disappointed that it didn't behave as I thought it should behave.

But instead of littering my code with gsub-s I extended the String class:

class String
  # Removes leading whitespace from each line of a string,
  # but only as much whitespace as the first line has.
  #
  # Meant to be used with heredoc strings like so:
  #
  # text = <<-EOS.unindent
  #   This line has no indentation
  #     This line has 2 spaces of indentation
  #   This line is also not indented
  # EOS
  #
  def unindent
    lines = each_line.to_a

    # [ \t] rather than \s, so blank lines keep their newline;
    # matching with * also copes with an unindented first line.
    first_line_ws = lines.first.to_s[/\A[ \t]*/]
    re = /^[ \t]{0,#{first_line_ws.length}}/

    lines.map { |line| line.sub(re, "") }.join
  end
end

Here's a far simpler version of the unindent script that I use:

class String
  # Strip leading whitespace from each line that is the same as the
  # amount of whitespace on the first line of the string.
  # Leaves _additional_ indentation on later lines intact.
  def unindent
    gsub(/^#{self[/\A[ \t]*/]}/, '')
  end
end

Use it like so:

foo = {
  bar: <<-ENDBAR.unindent
    My multiline
      and indented
        content here
    Yay!
  ENDBAR
}
#=> {:bar=>"My multiline\n  and indented\n    content here\nYay!\n"}

If the first line may be indented more than the others, and you want (like Rails) to unindent based on the least-indented line, you may instead wish to use:

class String
  # Strip leading whitespace from each line that is the same as the
  # amount of whitespace on the least-indented line of the string.
  def strip_indent
    mindent = scan(/^[ \t]+/).min_by(&:length)
    # Return the string unchanged (rather than nil) when no line is indented.
    mindent ? gsub(/^#{mindent}/, '') : self
  end
end

Note that if you scan for \s+ instead of [ \t]+ you may end up stripping newlines from your heredoc instead of leading whitespace. Not desirable!
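The same pitfall shows up when the pattern is used directly with gsub. A small sketch with a made-up sample string that contains a blank line:

```ruby
text = "  first\n\n  second\n"

# \s also matches "\n", so the match that starts at the empty line
# swallows the newline together with the next line's indentation.
p text.gsub(/^\s+/, '')     # => "first\nsecond\n"

# [ \t] only matches spaces and tabs, so the blank line survives.
p text.gsub(/^[ \t]+/, '')  # => "first\n\nsecond\n"
```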

If you can't use Ruby 2.3 or newer, but do have Rails 3.0 or newer, try #strip_heredoc. This example from the docs prints the first three lines with no indentation, while retaining the last two lines' two-space indentation:

if options[:usage]
  puts <<-USAGE.strip_heredoc
    This command does such and such.
 
    Supported options are:
      -h         This message
      ...
  USAGE
end

The documentation also notes: "Technically, it looks for the least indented line in the whole string, and removes that amount of leading whitespace."

Here was its Rails 3-era implementation from active_support/core_ext/string/strip.rb:

class String
  def strip_heredoc
    indent = scan(/^[ \t]*(?=\S)/).min.try(:size) || 0
    gsub(/^[ \t]{#{indent}}/, '')
  end
end

And you can find the matching tests in this version of test/core_ext/string_ext_test.rb.
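For projects without ActiveSupport, the same idea can be approximated in plain Ruby. This is a hedged sketch rather than the Rails source: the helper name `strip_least_indent` is hypothetical, and `map`/`min` stand in for ActiveSupport's `try`.

```ruby
# Hypothetical helper (not part of Ruby or Rails): remove the indentation
# of the least-indented non-blank line from every line.
def strip_least_indent(text)
  # Leading spaces/tabs of each line that has visible content.
  indents = text.scan(/^[ \t]*(?=\S)/)
  width = indents.map(&:size).min || 0
  text.gsub(/^[ \t]{#{width}}/, '')
end

usage = <<-USAGE
    This command does such and such.

    Supported options are:
      -h         This message
USAGE

puts strip_least_indent(usage)
```

The blank line is skipped when measuring indentation because the lookahead `(?=\S)` requires a visible character on the line.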

Note: As @radiospiel pointed out, String#squish is only available in the ActiveSupport context.

I believe ActiveSupport's String#squish is closer to what you're really looking for:

Here is how I would handle your example:

def distinct_count
  <<-SQL.squish
    SELECT
      CAST('#{name}' AS VARCHAR(30)) as COLUMN_NAME,
      COUNT(DISTINCT #{name}) AS DISTINCT_COUNT
      FROM #{table.call}
  SQL
end
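Outside of ActiveSupport, roughly the same effect can be had with gsub and strip. This is only an approximation of squish, and the table name `accounts` is a placeholder:

```ruby
sql = <<-SQL
    SELECT
      COUNT(*) AS total
    FROM accounts
SQL

# Collapse every run of whitespace (newlines included) into one space,
# then trim both ends.
one_line = sql.gsub(/\s+/, ' ').strip
p one_line  # => "SELECT COUNT(*) AS total FROM accounts"
```

Note that this flattens the query onto a single line, which is fine for SQL but not for text where line breaks matter.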

Some other answers find the indentation level of the least indented line, and delete that from all lines, but considering the nature of indentation in programming (that the first line is the least indented), I think you should look for the indentation level of the first line.

class String
def unindent; gsub(/^#{match(/^\s+/)}/, "") end
end
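The two strategies only differ when the first line is not the least indented one. A small sketch with a made-up sample string shows the difference:

```ruby
text = "    first\n  second\n"

# Strategy 1: use the first line's indentation as the prefix to remove.
first_indent = text[/\A[ \t]*/]
by_first = text.gsub(/^#{first_indent}/, '')

# Strategy 2: use the smallest indentation found on any non-blank line.
min_indent = text.scan(/^[ \t]*(?=\S)/).min_by(&:size).to_s
by_least = text.gsub(/^#{min_indent}/, '')

p by_first  # => "first\n  second\n"  (second line left alone)
p by_least  # => "  first\nsecond\n"  (relative indentation kept)
```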

Another easy-to-remember option is to use the unindent gem:

require 'unindent'

p <<-end.unindent
    hello
      world
end
# => "hello\n  world\n"

The <<- form of heredoc only ignores leading whitespace for the end delimiter.

With Ruby 2.3 and later you can use a squiggly heredoc (<<~) to suppress the leading whitespace of content lines:

def test
  <<~END
    First content line.
      Two spaces here.
    No space here.
  END
end

test
# => "First content line.\n  Two spaces here.\nNo space here.\n"

From the Ruby literals documentation:

The indentation of the least-indented line will be removed from each line of the content. Note that empty lines and lines consisting solely of literal tabs and spaces will be ignored for the purposes of determining indentation, but escaped tabs and spaces are considered non-indentation characters.
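Both rules from that passage can be checked with a short sketch (the sample text is invented for illustration):

```ruby
# The least-indented non-blank line (two spaces here) decides how much
# indentation <<~ removes; the empty line is ignored for that purpose.
text = <<~EOS
    indented by four

  indented by two
EOS
p text    # => "  indented by four\n\nindented by two\n"

# An escaped tab counts as content, not indentation, so it is kept
# while the literal spaces in front of it are removed.
tabbed = <<~EOS
  \tstarts with a tab
  plain
EOS
p tabbed  # => "\tstarts with a tab\nplain\n"
```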

I collected the answers above and came up with this:

class Match < ActiveRecord::Base
  has_one :invitation
  scope :upcoming, -> do
    joins(:invitation)
    .where(<<-SQL_QUERY.strip_heredoc, Date.current, Date.current).order('invitations.date ASC')
      CASE WHEN invitations.autogenerated_for_round IS NULL THEN invitations.date >= ?
      ELSE (invitations.round_end_time >= ? AND match_plays.winner_id IS NULL) END
    SQL_QUERY
  end
end

It generates clean SQL and stays within ActiveRecord scopes.

I needed something to use with system, whereby I could split long sed commands across lines and then remove both the indentation AND the newlines...

def update_makefile(build_path, version, sha1)
  system <<-CMD.strip_heredoc(true)
    \\sed -i".bak"
    -e "s/GIT_VERSION[\ ]*:=.*/GIT_VERSION := 20171-2342/g"
    -e "s/GIT_VERSION_SHA1[\ ]:=.*/GIT_VERSION_SHA1 := 2342/g"
    "/tmp/Makefile"
  CMD
end

So I came up with this:

class ::String
  def strip_heredoc(compress = false)
    stripped = gsub(/^#{scan(/^\s*/).min_by(&:length)}/, "")
    compress ? stripped.gsub(/\n/," ").chop : stripped
  end
end

Default behavior is to not strip newlines, just like all the other examples.
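To see both modes side by side without reopening String, here is a hedged stand-alone sketch. The method name `strip_and_join` is hypothetical, and it measures indentation with `[ \t]` instead of `\s` so that blank lines cannot skew the result:

```ruby
# Hypothetical stand-alone variant of the idea above, written as a plain
# method so it does not monkey-patch String.
def strip_and_join(text, compress = false)
  indent = text.scan(/^[ \t]*(?=\S)/).min_by(&:size).to_s
  stripped = text.gsub(/^#{indent}/, '')
  # Joining with spaces turns the heredoc into a single command line;
  # chop drops the space left by the final newline.
  compress ? stripped.gsub("\n", ' ').chop : stripped
end

cmd = <<-CMD
    echo
      one
      two
CMD

p strip_and_join(cmd)        # => "echo\n  one\n  two\n"
p strip_and_join(cmd, true)  # => "echo   one   two"
```

The extra spaces that remain inside the compressed string are harmless when it is passed to a shell, which splits arguments on any run of whitespace.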