Using the star sign in grep

I am trying to search for the substring "abc" in a specific file in Linux/bash.

So I do:

grep '*abc*' myFile

It returns nothing.

But if I do:

grep 'abc' myFile

It returns matches correctly.

Now, this is not a problem for me. But what if I want to grep for a more complex string, say

*abc * def *

How would I accomplish it using grep?


Try grep -E for extended regular expression support

Also take a look at:

The grep man page

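The "star sign" is only meaningful if there is something in front of it to repeat. If there isn't, grep's default (basic) regular expression syntax treats it as a literal asterisk; other tools may treat it as an error. For example:

'*xyz'    matches a literal '*' followed by xyz (nothing for the star to repeat)
'a*xyz'   means zero or more occurrences of 'a' followed by xyz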

'*' works as a modifier for the previous item. So 'abc*def' searches for 'ab' followed by 0 or more 'c's followed by 'def'.

What you probably want is 'abc.*def', which searches for 'abc' followed by any number of characters (including none), followed by 'def'.
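A minimal sketch of the difference between the two patterns. The sample lines and the temporary file are made up for illustration and are not from the question:

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'abdef' 'abcccdef' 'abc 123 def' 'xyz' > "$sample"

# 'abc*def': "ab", then zero or more "c", then "def" with nothing in between.
star_only=$(grep 'abc*def' "$sample")

# 'abc.*def': "abc", then any run of characters, then "def".
dot_star=$(grep 'abc.*def' "$sample")

printf 'abc*def matched:\n%s\n' "$star_only"
printf 'abc.*def matched:\n%s\n' "$dot_star"
rm -f "$sample"
```

With this data the first pattern should pick up abdef and abcccdef, while the second should pick up abcccdef and "abc 123 def".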

The asterisk is just a repetition operator, but you need to tell it what to repeat. In /*abc*/ the second * applies to the c, so that part matches ab followed by zero or more c's; the first * has nothing to repeat, so grep takes it as a literal asterisk, which is why the pattern found nothing in your file. If you want to match anything, you need to say .* (the dot means any character, within certain guidelines). If you just want to match abc, you can simply say grep 'abc' myFile. For your more complex match, use .* as well: grep 'abc.*def' myFile will match a line that contains abc followed by def, with anything (or nothing) in between.

Update based on a comment:

* in a regular expression is not the same as * in the console. In the console, * is part of a glob construct and just acts as a wildcard (for instance, ls *.log will list all files that end in .log). In regular expressions, however, * is a modifier, meaning that it only applies to the character or group preceding it. If you want * in a regular expression to act as a wildcard, you need to use .* as previously mentioned: the dot is a wildcard character, and the star, when modifying the dot, means find zero or more dots, i.e. zero or more of any character.
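To see the two meanings side by side, here is a rough sketch that lists the same files once with a shell glob and once with a regex. The scratch directory and the file names are hypothetical:

```shell
# Scratch directory with a few made-up file names.
workdir=$(mktemp -d)
touch "$workdir/app.log" "$workdir/db.log" "$workdir/notes.txt"

# Unquoted, the shell expands *.log into file names before ls even runs.
globbed=$(cd "$workdir" && ls *.log)

# Quoted, the pattern reaches grep untouched and is read as a regex:
# '.*' is any run of characters, '\.' a literal dot, '$' the end of the line.
regexed=$(ls "$workdir" | grep '.*\.log$')

printf 'glob:\n%s\n' "$globbed"
printf 'regex:\n%s\n' "$regexed"
rm -rf "$workdir"
```

Both commands should list app.log and db.log, but the glob is resolved by the shell while the regex is resolved by grep.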

The dot character means match any character, so .* means zero or more occurrences of any character. You probably mean to use .* rather than just *.

Use grep -P, which enables support for Perl-style regular expressions (a GNU grep option; the pattern below also works without it).

grep -P "abc.*def" myfile

The expression you tried, like those that work on the shell command line in Linux for instance, is called a "glob". Glob expressions are not full regular expressions, and regular expressions are what grep uses to specify the strings to look for. Here is an (old, small) post about the differences. Glob expressions (as in "ls *") are interpreted by the shell itself.

It's possible to translate from globs to REs, but you typically need to do so in your head.

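If you don't need regular expressions, your grep variant of choice could be fgrep (equivalent to grep -F), which treats the pattern as a plain string: fgrep 'abc' myFile finds the substring, and a * in the pattern is matched literally rather than acting as a wildcard.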
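A quick sketch of how fixed-string matching treats the star, using grep -F and some made-up sample lines:

```shell
# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' 'xxabcxx' 'a*b' 'aab' > "$sample"

# -F searches for the pattern as a plain string, so 'abc' finds the substring.
fixed_abc=$(grep -F 'abc' "$sample")

# With -F the star is just a character: only the line containing "a*b" matches.
fixed_star=$(grep -F 'a*b' "$sample")

# Without -F, 'a*b' is a regex (zero or more "a", then "b"),
# so every line that contains a "b" matches.
regex_star=$(grep 'a*b' "$sample")

printf 'grep -F abc:\n%s\n' "$fixed_abc"
printf 'grep -F a*b:\n%s\n' "$fixed_star"
printf 'grep a*b:\n%s\n' "$regex_star"
rm -f "$sample"
```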

This may be the answer you're looking for:

grep abc MyFile | grep def

Only thing is... it will output lines where "def" is before OR after "abc".
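To illustrate the ordering caveat, here is a small sketch comparing the pipe with a single ordered pattern. The sample lines are invented for the example:

```shell
# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' 'abc then def' 'def then abc' 'only abc' > "$sample"

# Two greps in a pipe: keeps lines containing both strings, in either order.
either_order=$(grep 'abc' "$sample" | grep 'def')

# One pattern: keeps only lines where "abc" comes before "def".
abc_first=$(grep 'abc.*def' "$sample")

printf 'pipe:\n%s\n' "$either_order"
printf 'abc.*def:\n%s\n' "$abc_first"
rm -f "$sample"
```

The pipe should return the first two lines, while the single pattern should return only "abc then def".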

This worked for me:

grep ".*${expr}", with double quotes (so the shell expands the variable) and with the star preceded by a dot. Here ${expr} holds whatever string you are searching for.

So in your case:

grep ".*abc.*" myFile
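For what it's worth, grep already matches anywhere in the line, so the surrounding .* should not change which lines are selected. A sketch under that assumption, with made-up sample lines and the variable name expr borrowed from above:

```shell
# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' 'xabcx' 'endsabc' 'nothing' > "$sample"
expr='abc'

# Double quotes let the shell substitute ${expr} before grep sees the pattern.
with_dots=$(grep ".*${expr}.*" "$sample")

# The bare string selects the same lines, because grep searches for substrings.
plain=$(grep "${expr}" "$sample")

# To require the string at the end of the line, anchor it with '$'.
at_end=$(grep "${expr}\$" "$sample")

printf 'with dots:\n%s\n' "$with_dots"
printf 'plain:\n%s\n' "$plain"
printf 'anchored:\n%s\n' "$at_end"
rm -f "$sample"
```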

Standard unix grep.

$ cat a.txt
123abcd456def798
123456def789
Abc456def798
123aaABc456DEF

* matches the preceding character zero or more times.

$ grep -i "abc*def" a.txt




$

It would match, for instance "abdef" or "abcdef" or "abcccccccccdef". But none of these are in the file, so no match.

. means "match any character". Together with *, .* means match any character any number of times.

$ grep -i "abc.*def" a.txt
123abcd456def798
Abc456def798
123aaABc456DEF

So we get matches. There are a lot of online references about regular expressions, which is what is being used here.

To summarize the other answers, here are some examples that show how the regex and the glob behave.

There are three files:

echo 'abc' > file1
echo '*abc' > file2
echo '*abcc' > file3

Now I run the same commands on these 3 files; let's see what happens.

(1)

grep '*abc*' file1

As you said, this one returns nothing. A * wants to repeat whatever is in front of it. The first * has nothing in front of it to repeat, so grep treats it as a literal * character. The string in the file is abc, which contains no *, so there is no match. The second *, after the c, means the c is repeated zero or more times.

(2)

grep '*abc*' file2

This one returns *abc: the line starts with a literal *, so it matches the pattern *abc*.

(3)

grep '*abc*' file3

This one returns *abcc, because there is a * at the front and two c's at the end, so it matches the pattern *abc*.

(4)

grep '.*abc.*' file1

This one returns abc, because .* indicates zero or more repetitions of any character.