LC_ALL=C - set the locale to C so the pattern is matched byte by byte; otherwise many extended chars will not match (even though they appear to be encoded as bytes >= 0x80)
\x00-\x08 - non-printable control chars 0 - 8 decimal
\x0E-\x1F - more non-printable control chars 14 - 31 decimal
\x80-\xFF - non-ASCII (8-bit) bytes, 128 - 255 decimal
-c - print count of matching lines instead of lines
-P - Perl-style regexps (needed here for the \xNN escapes)
Instead of -c you may prefer to use -n (and optionally -b) or -l
-n, --line-number
-b, --byte-offset
-l, --files-with-matches
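
As a minimal sketch of how the pieces above fit together, assuming GNU grep built with -P support: the sample file content and the variable names (tmp, count, lines) are made up for illustration, and the character class is simply the three ranges listed above joined into one bracket expression.

```shell
# Sample file (hypothetical content): line 2 contains a UTF-8 e-acute
# (bytes 0xC3 0xA9), line 3 contains a BEL control char (0x07).
# Lines 1 and 4 are plain ASCII; the tab on line 4 is 0x09, outside the ranges.
tmp=$(mktemp)
printf 'plain ascii\ncaf\303\251\nbell\007here\ntab\there\n' > "$tmp"

# -c: count the lines that contain at least one byte in the listed ranges
count=$(LC_ALL=C grep -Pc '[\x00-\x08\x0E-\x1F\x80-\xFF]' "$tmp")

# -n: report line numbers instead (keep only the number before the first colon)
lines=$(LC_ALL=C grep -Pn '[\x00-\x08\x0E-\x1F\x80-\xFF]' "$tmp" | cut -d: -f1 | paste -sd, -)

echo "matching lines: $count (line numbers: $lines)"
rm -f "$tmp"
```

Tab, newline and carriage return (0x09 - 0x0D) are deliberately left out of the ranges, which is why the last sample line is not reported.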
( [\302-\337][\200-\277]|
[\340][\240-\277][\200-\277]|
[\355][\200-\237][\200-\277]|
[\341-\354\356-\357][\200-\277][\200-\277]|
[\360][\220-\277][\200-\277][\200-\277]|
[\361-\363][\200-\277][\200-\277][\200-\277]|
[\364][\200-\217][\200-\277][\200-\277] )
* please delete all newlines, spaces, and tabs inside ( .. ) before using the pattern
* feel free to use interval quantifiers such as {2} or {3} to collapse
the redundant listings of [\200-\277], but don't change them to
[\200-\277]+, as that might accept invalid encodings
with either too few or too many continuation bytes
* although some historical UTF-8 references consider 5- and
6-byte encodings to be valid, the Unicode Standard (as of
Unicode 13) only allows sequences of up to 4 bytes
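
One possible way to try the pattern, sketched under the assumption that it is fed to GNU grep with -P in the C locale (the notes above do not say which tool it was written for). The variable names and the sample bytes are hypothetical; the pattern is the alternation above with whitespace removed and the repeated [\200-\277] collapsed with {2} and {3}.

```shell
# The alternation on a single line, with repeated continuation-byte classes
# collapsed into {2} / {3} (never into "+").
utf8_multibyte='([\302-\337][\200-\277]|[\340][\240-\277][\200-\277]|[\355][\200-\237][\200-\277]|[\341-\354\356-\357][\200-\277]{2}|[\360][\220-\277][\200-\277]{2}|[\361-\363][\200-\277]{3}|[\364][\200-\217][\200-\277]{2})'

# Sample file (hypothetical content), one case per line:
#   1 ASCII only              4 U+1F600, 4 bytes (F0 9F 98 80)
#   2 e-acute, 2 bytes        5 overlong "/" (C0 AF), invalid
#   3 euro sign, 3 bytes      6 encoded surrogate (ED A0 80), invalid
tmp=$(mktemp)
printf 'ascii only\ncaf\303\251\n\342\202\254\n\360\237\230\200\n\300\257\n\355\240\200\n' > "$tmp"

# Count the lines containing at least one well-formed multibyte sequence
valid=$(LC_ALL=C grep -Pc "$utf8_multibyte" "$tmp")

echo "lines with a well-formed multibyte sequence: $valid"
rm -f "$tmp"
```

The overlong and surrogate lines are not counted because their lead bytes fall outside every branch of the alternation or are followed by a second byte outside the allowed range.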