Error in Python script: "Expected 2D array, got 1D array instead:"?

I am following this tutorial to make this ML prediction:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style


style.use("ggplot")
from sklearn import svm


x = [1, 5, 1.5, 8, 1, 9]
y = [2, 8, 1.8, 8, 0.6, 11]


plt.scatter(x,y)
plt.show()


X = np.array([[1,2],
[5,8],
[1.5,1.8],
[8,8],
[1,0.6],
[9,11]])


y = [0,1,0,1,0,1]
X.reshape(1, -1)


clf = svm.SVC(kernel='linear', C = 1.0)
clf.fit(X,y)


print(clf.predict([0.58,0.76]))

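I'm using Python 3.6 and I get the error "Expected 2D array, got 1D array instead:". I think the script was written for an older version, but I don't know how to convert it to work on 3.6.

Already tried: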

X.reshape(1, -1)

You just need to give the predict method a 2D array of the same form as the training data, containing the one value (or more) that you want to process. In short, you can just replace

[0.58,0.76]

with

[[0.58,0.76]]

And it should work.

EDIT: This answer became popular so I thought I'd add a little more explanation about ML. The short version: we can only use predict on data that is of the same dimensionality as the training data (X) was.

In the example in question, we give the computer a bunch of rows in X (with 2 values each) and we show it the correct responses in y. When we want to predict using new values, our program expects the same - a bunch of rows. Even if we want to do it to just one row (with two values), that row has to be part of another array.
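To make the "bunch of rows" idea concrete, here is a minimal sketch that reuses the data from the question and predicts for one row and for two rows at once. The second test point is a made-up value chosen only for illustration.

```python
import numpy as np
from sklearn import svm

# Training data from the question: six samples, two features each.
X = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8], [1, 0.6], [9, 11]])
y = [0, 1, 0, 1, 0, 1]

clf = svm.SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# One sample: still a 2D structure, shape (1, 2).
single = clf.predict([[0.58, 0.76]])

# Two samples at once: shape (2, 2), one prediction per row.
# The second point is a hypothetical value added for illustration.
batch = clf.predict([[0.58, 0.76], [10.58, 10.76]])

print(X.shape)  # (6, 2)
print(single)   # [0]
print(batch)    # [0 1]
```

The number of columns has to match the training data (2 here); the number of rows is up to you.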

The problem is occurring when you run prediction on the array [0.58,0.76]. Fix the problem by reshaping it before you call predict():

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style


style.use("ggplot")
from sklearn import svm


x = [1, 5, 1.5, 8, 1, 9]
y = [2, 8, 1.8, 8, 0.6, 11]


plt.scatter(x,y)
plt.show()


X = np.array([[1,2],
[5,8],
[1.5,1.8],
[8,8],
[1,0.6],
[9,11]])


y = [0,1,0,1,0,1]


clf = svm.SVC(kernel='linear', C = 1.0)
clf.fit(X,y)


test = np.array([0.58, 0.76])
print(test)        # Produces: [0.58 0.76]
print(test.shape)  # Produces: (2,) meaning a 1D array with 2 elements


test = test.reshape(1, -1)
print(test)        # Produces: [[0.58 0.76]]
print(test.shape)  # Produces: (1, 2) meaning 1 row, 2 cols


print(clf.predict(test)) # Produces [0], as expected
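A side note on the attempt in the question: calling X.reshape(1, -1) on its own line has no effect, because reshape returns a new array instead of changing X in place. The training data X was also already 2D, so it was the argument to predict that needed reshaping. A small NumPy-only sketch of that behaviour:

```python
import numpy as np

X = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8], [1, 0.6], [9, 11]])

# reshape returns a new array; the result is discarded here,
# so X keeps its original shape.
X.reshape(1, -1)
shape_after_call = X.shape

# To use the reshaped data, the result has to be assigned.
sample = np.array([0.58, 0.76])
sample_2d = sample.reshape(1, -1)

print(shape_after_call)  # (6, 2)
print(sample.shape)      # (2,)
print(sample_2d.shape)   # (1, 2)
```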

Convert the X and Y matrices (the independent and dependent variables, respectively) from int64 arrays to DataFrames so that they go from 1D to 2D, i.e. X = pd.DataFrame(X) and Y = pd.DataFrame(Y), where pd is the pandas module. Feature scaling then no longer raises the error.

I faced the same issue, except that the instance I wanted to predict was a pandas.Series object.

Well I just needed to predict one input instance. I took it from a slice of my data.

df = pd.DataFrame(list(BiogasPlant.objects.all()))
test = df.iloc[-1:]       # sliced it here

In this case, you'll need to convert it into a 1-D array and then reshape it.

test2d = test.values.reshape(1, -1)

From the docs, values will convert Series into a numpy array.
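As a rough illustration of the Series case, the sketch below builds a small stand-in Series (the values and the index labels "a", "b", "c" are hypothetical) and turns it into a single-row 2D input in two ways: via .values plus reshape, and via to_frame() followed by a transpose.

```python
import pandas as pd

# Hypothetical single instance with three features.
row = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])

# Option 1: NumPy array reshaped to one row, shape (1, 3).
as_array = row.values.reshape(1, -1)

# Option 2: DataFrame with one row and three columns.
# to_frame() alone gives shape (3, 1), hence the transpose.
as_frame = row.to_frame().T

print(row.shape)       # (3,)
print(as_array.shape)  # (1, 3)
print(as_frame.shape)  # (1, 3)
```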

I faced the same problem. You just have to make it an array, and you have to use double square brackets so that it becomes a single row of a 2D array: the outer brackets create the array and the inner brackets make the values one element (row) of it.

So simply replace the last statement with:

print(clf.predict(np.array([[0.58,0.76]])))

With only one feature, my DataFrame selection turns into a Series. I had to convert it back to a DataFrame, and then it worked.

# assumes: import pandas as pd
if isinstance(X, pd.Series):
    X = X.to_frame()

I use the below approach.

reg = linear_model.LinearRegression()
reg.fit(df[['year']],df.income)


reg.predict([[2136]])

I was facing the same issue earlier, but I found a solution: you can try reg.predict([[3300]]).

The API used to allow a scalar value, but now you need to pass a 2D array.

Just put the argument inside double square brackets:

regressor.predict([[values]])

That worked for me.
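If the input might arrive as a scalar, a flat list, or an already 2D array, one possible way to normalize it is np.atleast_2d. The helper name as_samples below is hypothetical, and it assumes that a scalar or 1D input means ONE sample (one row), which may not be what you want for single-feature data.

```python
import numpy as np

def as_samples(values):
    """Hypothetical helper: return `values` as a 2D array of rows.

    Assumption: a scalar or 1D input is treated as a single sample.
    """
    return np.atleast_2d(np.asarray(values, dtype=float))

print(as_samples(3300).shape)              # (1, 1)
print(as_samples([0.58, 0.76]).shape)      # (1, 2)
print(as_samples([[1, 2], [5, 8]]).shape)  # (2, 2)
```

The result can then be passed to predict in place of hand-written double brackets.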

Just enclose your NumPy object in two pairs of square brackets, or vice versa.

For example:

If your x is initially [8,9,12,7,5],

change it to x = [ [8,9,12,7,5] ].

That should fix the dimension issue.

You can do it like this:

np.array(x)[:, None]
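Note that the last two suggestions produce different shapes, so which one is right depends on what the values mean. A short sketch, assuming x holds the example values from above:

```python
import numpy as np

x = [8, 9, 12, 7, 5]

# Extra brackets: ONE sample with five features, shape (1, 5).
one_sample = np.array([x])

# [:, None] adds a column axis: FIVE samples with one feature each, shape (5, 1).
five_samples = np.array(x)[:, None]

# reshape(-1, 1) gives the same column layout as [:, None].
same_as_column = np.array(x).reshape(-1, 1)

print(one_sample.shape)    # (1, 5)
print(five_samples.shape)  # (5, 1)
print(np.array_equal(five_samples, same_as_column))  # True
```

For a model trained on a single feature, the (5, 1) layout is usually the one that matches; for a model trained on five features, the (1, 5) layout is.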