如何添加一个单一的项目到熊猫系列

我如何添加一个单一的项目到熊猫 Series实例?

我正在寻找类似于

>>> x = Series()
>>> N = 4
>>> for i in xrange(N):
>>>     x.some_appending_function(i**2)
>>> print(x)
0 | 0
1 | 1
2 | 4
3 | 9

同样,我怎样才能添加一个单一的行熊猫 DataFrame

187047 次浏览

How to add single item. This is not very effective but follows what you are asking for:

x = p.Series()
N = 4
for i in xrange(N):
x = x.set_value(i, i**2)

produces x:

0    0
1    1
2    4
3    9

Obviously there are better ways to generate this series in only one shot.

For your second question check answer and references of SO question add one row in a pandas.DataFrame.

You can use the append function to add another element to it. Only, make a series of the new element, before you append it:

test = test.append(pd.Series(200, index=[101]))

Adding to joquin's answer the following form might be a bit cleaner (at least nicer to read):

x = p.Series()
N = 4
for i in xrange(N):
x[i] = i**2

which would produce the same output

also, a bit less orthodox but if you wanted to simply add a single element to the end:

x=p.Series()
value_to_append=5
x[len(x)]=value_to_append

If you have an index and value. Then you can add to Series as:

obj = Series([4,7,-5,3])
obj.index=['a', 'b', 'c', 'd']


obj['e'] = 181

this will add a new value to Series (at the end of Series).

TLDR: do not append items to a series one by one, better extend with an ordered collection

I think the question in its current form is a bit tricky. And the accepted answer does answer the question. But the more I use pandas, the more I understand that it's a bad idea to append items to a Series one by one. I'll try to explain why for pandas beginners.

You might think that appending data to a given Series might allow you to reuse some resources, but in reality a Series is just a container that stores a relation between an index and a values array. Each is a numpy.array under the hood, and the index is immutable. When you add to Series an item with a label that is missing in the index, a new index with size n+1 is created, and a new values values array of the same size. That means that when you append items one by one, you create two more arrays of the n+1 size on each step.

By the way, you can not append a new item by position (you will get an IndexError) and the label in an index does not have to be unique, that is when you assign a value with a label, you assign the value to all existing items with the the label, and a new row is not appended in this case. This might lead to subtle bugs.

The moral of the story is that you should not append data one by one, you should better extend with an ordered collection. The problem is that you can not extend a Series inplace. That is why it is better to organize your code so that you don't need to update a specific instance of a Series by reference.

If you create labels yourself and they are increasing, the easiest way is to add new items to a dictionary, then create a new Series from the dictionary (it sorts the keys) and append the Series to an old one. If the keys are not increasing, then you will need to create two separate lists for the new labels and the new values.

Below are some code samples:

In [1]: import pandas as pd
In [2]: import numpy as np


In [3]: s = pd.Series(np.arange(4)**2, index=np.arange(4))


In [4]: s
Out[4]:
0    0
1    1
2    4
3    9
dtype: int64


In [6]: id(s.index), id(s.values)
Out[6]: (4470549648, 4470593296)

When we update an existing item, the index and the values array stay the same (if you do not change the type of the value)

In [7]: s[2] = 14


In [8]: id(s.index), id(s.values)
Out[8]: (4470549648, 4470593296)

But when you add a new item, a new index and a new values array is generated:

In [9]: s[4] = 16


In [10]: s
Out[10]:
0     0
1     1
2    14
3     9
4    16
dtype: int64


In [11]: id(s.index), id(s.values)
Out[11]: (4470548560, 4470595056)

That is if you are going to append several items, collect them in a dictionary, create a Series, append it to the old one and save the result:

In [13]: new_items = {item: item**2 for item in range(5, 7)}


In [14]: s2 = pd.Series(new_items)


In [15]: s2  # keys are guaranteed to be sorted!
Out[15]:
5    25
6    36
dtype: int64


In [16]: s = s.append(s2); s
Out[16]:
0     0
1     1
2    14
3     9
4    16
5    25
6    36
dtype: int64

As far as @joaqin's solution is deprecated, because set_value method will be removed in a future pandas release, I would mention the other option to add a single item to pandas series, using .at[] accessor.

>>> import pandas as pd
>>> x = pd.Series()
>>> N = 4
>>> for i in range(N):
...     x.at[i] = i**2

It produces the same output.

>>> print(x)
0    0
1    1
2    4
3    9

Here is another thought n for appending multiple items in one line without changing the name of series. However, this may be not as efficient as the other answer.

>>> df = pd.Series(np.random.random(5), name='random')
>>> df


0    0.363885
1    0.402623
2    0.450449
3    0.172917
4    0.983481
Name: random, dtype: float64




>>> df.to_frame().T.assign(a=3, b=2, c=5).squeeze()


0    0.363885
1    0.402623
2    0.450449
3    0.172917
4    0.983481
a    3.000000
b    2.000000
c    5.000000
Name: random, dtype: float64
import pandas as pd
import numpy as np


ser1 = pd.Series(np.linspace(1, 10, 2))
element = np.nan
ser1 = ser1.append(pd.Series(element))