Python slice how-to: I know Python slice notation, but how can I use the built-in slice object for it?

What is the built-in function slice for, and how do I use it?
I know the direct, Pythonic way of slicing: l1[start:stop:step]. What I want to know is, if I have a slice object, how do I use it?


The slice function returns slice objects. Slice objects are one of Python's internal types, which are optimized for read performance - all of their attributes are read-only.
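A quick sketch of what "read-only" means in practice. The variable names here are mine, chosen only for illustration:

```python
# A slice simply stores the three values it was created with.
sl = slice(1, 10, 2)
print(sl.start, sl.stop, sl.step)   # 1 10 2

# Omitted fields are stored as None, just like in [:4] notation.
print(slice(4))                     # slice(None, 4, None)

# The attributes cannot be reassigned...
try:
    sl.start = 5
    reassigned = True
except AttributeError:
    reassigned = False
print(reassigned)                   # False

# ...so to "change" a slice you build a new one from the old values.
shifted = slice(sl.start + 1, sl.stop, sl.step)
print(shifted)                      # slice(2, 10, 2)
```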

Customizing how slices are handled can be useful if you wish to change the default behaviour. For example, lxml uses slice notation to access DOM elements (though I haven't confirmed myself how they implemented it).

You create a slice by calling slice with the same fields you would use if doing [start:end:step] notation:

sl = slice(0,4)

To use the slice, just pass it as if it were the index into a list or string:

>>> s = "ABCDEFGHIJKL"
>>> sl = slice(0,4)
>>> print(s[sl])
ABCD
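The same slice object is not tied to one sequence or one operation. As a small sketch (the name FIRST_FOUR is just an example), it can be reused for reading, assigning and deleting on a list:

```python
FIRST_FOUR = slice(0, 4)              # a named, reusable slice

letters = list("ABCDEFGHIJKL")
first = letters[FIRST_FOUR]
print(first)                          # ['A', 'B', 'C', 'D']

# The same object works for slice assignment...
letters[FIRST_FOUR] = ["w", "x", "y", "z"]
after_assign = "".join(letters)
print(after_assign)                   # wxyzEFGHIJKL

# ...and for deletion.
del letters[FIRST_FOUR]
after_delete = "".join(letters)
print(after_delete)                   # EFGHIJKL
```

Giving the slice a name also documents what the numbers mean, which bare [0:4] does not.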

Let's say you have a file of fixed-length text fields. You could define a list of slices to easily extract the values from each "record" in this file.

data = """\
0010GEORGE JETSON    12345 SPACESHIP ST   HOUSTON       TX
0020WILE E COYOTE    312 ACME BLVD        TUCSON        AZ
0030FRED FLINTSTONE  246 GRANITE LANE     BEDROCK       CA
0040JONNY QUEST      31416 SCIENCE AVE    PALO ALTO     CA""".splitlines()




fieldslices = [slice(*fielddef) for fielddef in [
    (0, 4), (4, 21), (21, 42), (42, 56), (56, 58),
]]
fields = "id name address city state".split()


for rec in data:
    for field, sl in zip(fields, fieldslices):
        print("{} : {}".format(field, rec[sl]))
    print('')


# or this same code using itemgetter, to make a function that
# extracts all slices from a string into a tuple of values
import operator
rec_reader = operator.itemgetter(*fieldslices)
for rec in data:
    for field, field_value in zip(fields, rec_reader(rec)):
        print("{} : {}".format(field, field_value))
    print('')

Prints:

id : 0010
name : GEORGE JETSON
address : 12345 SPACESHIP ST
city : HOUSTON
state : TX


id : 0020
name : WILE E COYOTE
address : 312 ACME BLVD
city : TUCSON
state : AZ


id : 0030
name : FRED FLINTSTONE
address : 246 GRANITE LANE
city : BEDROCK
state : CA


id : 0040
name : JONNY QUEST
address : 31416 SCIENCE AVE
city : PALO ALTO
state : CA
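A possible variation on the same idea, as a sketch only: keep the slices in a dict keyed by field name and return each record as a dict with the padding stripped. LAYOUT and parse_record are names I made up for this example, and the sample line is built with explicit column widths so that it matches the slice boundaries used above.

```python
# Field layout using the same boundaries as the slices above: name -> slice.
LAYOUT = {
    "id": slice(0, 4),
    "name": slice(4, 21),
    "address": slice(21, 42),
    "city": slice(42, 56),
    "state": slice(56, 58),
}

def parse_record(line):
    """Return a dict of field name -> value with padding stripped."""
    return {name: line[sl].strip() for name, sl in LAYOUT.items()}

# Build a sample line with explicit widths so the columns line up.
line = "{:<4}{:<17}{:<21}{:<14}{:<2}".format(
    "0010", "GEORGE JETSON", "12345 SPACESHIP ST", "HOUSTON", "TX")

record = parse_record(line)
print(record["name"])    # GEORGE JETSON
print(record["state"])   # TX
```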

Square brackets following a sequence denote either indexing or slicing depending on what's inside the brackets:

>>> "Python rocks"[1]    # index
'y'
>>> "Python rocks"[1:10:2]    # slice
'yhnrc'

Both of these cases are handled by the __getitem__() method of the sequence (or __setitem__() if on the left of an equals sign). The index or slice is passed to the method as a single argument; Python does this by converting the slice notation (1:10:2 in this case) to a slice object: slice(1, 10, 2).
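To see that equivalence directly, here is a short check using the same string. The three spellings below should give the same result:

```python
text = "Python rocks"

via_notation = text[1:10:2]                      # slice notation
via_object = text[slice(1, 10, 2)]               # explicit slice object
via_method = text.__getitem__(slice(1, 10, 2))   # what the brackets call

print(via_notation, via_object, via_method)      # yhnrc yhnrc yhnrc

# On the left of an equals sign, __setitem__ receives the slice instead.
nums = [0, 1, 2, 3, 4]
nums.__setitem__(slice(0, 2), ["a", "b"])        # same as nums[0:2] = ["a", "b"]
print(nums)                                      # ['a', 'b', 2, 3, 4]
```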

So if you are defining your own sequence-like class or overriding the __getitem__ or __setitem__ or __delitem__ methods of another class, you need to test the index argument to determine if it is an int or a slice, and process accordingly:

def __getitem__(self, index):
    if isinstance(index, int):
        ...    # process index as an integer
    elif isinstance(index, slice):
        start, stop, step = index.indices(len(self))    # index is a slice
        ...    # process slice
    else:
        raise TypeError("index must be int or slice")
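Filled in, the skeleton above might look like the following. Squares is a hypothetical class invented for this sketch; returning a list for a slice is one possible design choice, not the only one.

```python
class Squares:
    """Hypothetical read-only sequence: item i is i * i, for 0 <= i < n."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, index):
        if isinstance(index, int):
            if index < 0:               # support negative indexes
                index += len(self)
            if not 0 <= index < len(self):
                raise IndexError("index out of range")
            return index * index
        elif isinstance(index, slice):
            # indices() clamps the slice to this sequence's length
            start, stop, step = index.indices(len(self))
            return [i * i for i in range(start, stop, step)]
        else:
            raise TypeError("index must be int or slice")

sq = Squares(10)
print(sq[3])        # 9
print(sq[-1])       # 81
print(sq[2:5])      # [4, 9, 16]
print(sq[::-3])     # [81, 36, 9, 0]
```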

A slice object has three attributes: start, stop and step, and one method: indices, which takes a single argument, the length of the object, and returns a 3-tuple: (start, stop, step).
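A few calls to indices() show how it resolves None and negative or out-of-range values against a given length:

```python
print(slice(None, None, None).indices(5))   # (0, 5, 1)
print(slice(-3, None).indices(10))          # (7, 10, 1)
print(slice(1, 100, 2).indices(10))         # (1, 10, 2)
print(slice(None, None, -1).indices(5))     # (4, -1, -1)

# The tuple plugs straight into range() to get the selected positions.
positions = list(range(*slice(1, 100, 2).indices(10)))
print(positions)                            # [1, 3, 5, 7, 9]
```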

>>> class sl:
...     def __getitem__(self, *keys): print(keys)
...
>>> s = sl()
>>> s[1:3:5]
(slice(1, 3, 5),)
>>> s[1:2:3, 1, 4:5]
((slice(1, 2, 3), 1, slice(4, 5, None)),)
>>>

While trying to answer the question "Subset a string based on variable", I recalled that numpy has a syntactically nice way to define slice objects:

>>> import numpy as np
>>> s = "The long-string instrument is a musical instrument in which the string is of such a length that the fundamental transverse wave is below what a person can hear as a tone."
>>> z = np.s_[18:26]  # in this case same as slice(18, 26, None)
>>> s[z]
'strument'

The problem solved here is how to store the slice in a variable for later use, and np.s_ lets you do just that. Yes, it's not built-in, but as that original question was redirected here, I feel my answer belongs here as well. Also, numpy was one of the reasons such advanced slicing abilities were added to Python, IIRC.

An example of a more complex "slicing":

>>> data = np.array(range(6)).reshape((2, 3))
>>> z = np.s_[:1, 1:2]
>>> data[z]
array([[1]])
>>> data
array([[0, 1, 2],
       [3, 4, 5]])
>>> z
(slice(None, 1, None), slice(1, 2, None))

where z is now a tuple of slices.
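If numpy is not available, the same trick can be sketched in a few lines of plain Python. This is my own minimal imitation, not NumPy's actual code, and _SliceMaker and s_ are names assumed for the example: an object whose __getitem__ just returns whatever key the brackets produced.

```python
class _SliceMaker:
    """Hypothetical helper: indexing it returns the key it was given."""

    def __getitem__(self, key):
        return key

s_ = _SliceMaker()

one = s_[18:26]
many = s_[:1, 1:2]
print(one)                     # slice(18, 26, None)
print(many)                    # (slice(None, 1, None), slice(1, 2, None))

tail = "instrument"[s_[2:]]
print(tail)                    # strument
```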

Slice objects let you programmatically generate and manipulate slices. Especially for multidimensional numpy arrays, and especially if you don't know the dimensionality in advance, you might want to construct slices on-the-fly to specify the axes or dimensions that you want.

import numpy as np
dimension = np.random.randint(1, 10)  # anywhere from 1 to 9 dimensions
shape = []
for d in range(dimension):
    shape.append(np.random.randint(1, 10))  # each axis has length 1 to 9
zz = np.random.rand(*shape)  # rand() takes the axis lengths as separate arguments
print(zz)
>>> array([[0.68379351, 0.50854469, 0.64578775, 0.73441699, 0.28977396],
           [0.88797164, 0.81603025, 0.63978659, 0.22677299, 0.93455738],
           [0.0892855 , 0.28048706, 0.04262895, 0.9353467 , 0.13062249],
           [0.88561035, 0.93378367, 0.12124208, 0.25600301, 0.96035638]])

Here our data ended up being two dimensional (4-by-5), but there was no guarantee of that. How will you request slices from zz?

One problem is that I can't manipulate Python's slice notation. It's not valid syntax outside of a slicing operation.

my_slice = 2:3:1
>>> SyntaxError: invalid syntax

What if I could just build up the exact slice request I wanted in a loop, the way I can build up a string? Wouldn't that be great? Sure, you could use a string to do it, but that would be messy and would require eval.

your_slice_definitions = [(2, 3, 1), *[(None, None, None)] * (zz.ndim - 1)]
my_slice_str = ""
for slice_start, slice_end, slice_step in your_slice_definitions:
    my_slice_str += "{}:{}:{},".format(slice_start, slice_end, slice_step)
eval("zz[" + my_slice_str + "]")

So here we are: slice objects let you do this. You can assemble lists and tuples of them on-the-fly, pass them as function parameters, sort them, shuffle them, and so on.

my_slices = []
for slice_start, slice_end, slice_step in your_slice_definitions:
    my_slices.append(slice(slice_start, slice_end, slice_step))
print(zz[tuple(my_slices)])  # a multi-axis index must be a tuple, not a list
>>> array([[0.0892855 , 0.28048706, 0.04262895, 0.9353467 , 0.13062249]])
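The loop can also be wrapped in a small helper. This is only a sketch, and axis_index is a hypothetical name: it builds the index tuple for any number of dimensions, applying one slice on a chosen axis and leaving the others whole. The helper itself is plain Python, so it runs without numpy.

```python
def axis_index(ndim, axis, sl):
    """Hypothetical helper: tuple that applies `sl` on one axis and
    keeps every other axis whole (the equivalent of ':')."""
    index = [slice(None)] * ndim
    index[axis] = sl
    return tuple(index)

idx = axis_index(3, 1, slice(2, 3))
print(idx)
# (slice(None, None, None), slice(2, 3, None), slice(None, None, None))
```

With an array like zz, zz[axis_index(zz.ndim, 0, slice(2, 3, 1))] should select the same data as writing 2:3:1 on the first axis and : on all the others.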