[Solved] ValueError: Found array with dim 4. TSNE expected <= 2.

Problem description

Traceback (most recent call last):
  File "/home/visionx/nickle/temp/SimCLR/linear_evaluation.py", line 229, in
    ).fit_transform(all_nodes_unnormalized_scores)
  File "/home/visionx/anaconda3/envs/simclr-pt/lib/python3.11/site-packages/sklearn/utils/_set_output.py", line 157, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
  File "/home/visionx/anaconda3/envs/simclr-pt/lib/python3.11/site-packages/sklearn/base.py", line 1152, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "/home/visionx/anaconda3/envs/simclr-pt/lib/python3.11/site-packages/sklearn/manifold/_t_sne.py", line 1111, in fit_transform
    embedding = self._fit(X)
  File "/home/visionx/anaconda3/envs/simclr-pt/lib/python3.11/site-packages/sklearn/manifold/_t_sne.py", line 841, in _fit
    X = self._validate_data(
  File "/home/visionx/anaconda3/envs/simclr-pt/lib/python3.11/site-packages/sklearn/base.py", line 605, in _validate_data
    out = check_array(X, input_name="X", **check_params)
  File "/home/visionx/anaconda3/envs/simclr-pt/lib/python3.11/site-packages/sklearn/utils/validation.py", line 951, in check_array
    raise ValueError(
ValueError: Found array with dim 4. TSNE expected <= 2.

Solution

1. Reason: dimension mismatch. The input array has 4 dimensions, but TSNE expects an array with at most 2 dimensions.


Step one: print the shape of the input.

(128, 3, 224, 224)

Step two: reshape the input to two dimensions, keeping the batch axis and flattening the rest:
x_i = x_i.reshape(128, -1)
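As a minimal sketch of the two steps above (the random array and the name `x_i` are stand-ins for the SimCLR batch from the traceback):

```python
import numpy as np

# Hypothetical batch of 128 RGB images of size 224x224, as in the traceback above
x_i = np.random.rand(128, 3, 224, 224).astype(np.float32)
print(x_i.shape)  # (128, 3, 224, 224)

# Flatten everything except the batch axis so each sample becomes one row;
# using x_i.shape[0] instead of a hard-coded 128 also handles a smaller last batch
flat = x_i.reshape(x_i.shape[0], -1)
print(flat.shape)  # (128, 150528) -- 3 * 224 * 224 = 150528
```

The resulting 2-D array can be passed to TSNE.fit_transform directly.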
2. Description of each parameter

tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=5000)
plot_only = 500  # only plot the first 500 points
# Run t-SNE dimensionality reduction on the intermediate-layer output
low_dim_embs = tsne.fit_transform(flat_representation[:plot_only, :])
# After t-SNE the data is two-dimensional; plot it using the 2-D coordinates and the true class labels
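A self-contained version of the snippet above, with a small random matrix standing in for `flat_representation` (toy sizes, and no `n_iter` argument so it runs unchanged on recent scikit-learn versions, where that parameter was renamed):

```python
import numpy as np
from sklearn.manifold import TSNE

# Toy stand-in for the intermediate-layer output: 100 samples, 64 features
flat_representation = np.random.rand(100, 64)

plot_only = 50  # only embed the first 50 points
tsne = TSNE(n_components=2, perplexity=10, init='pca', random_state=0)
low_dim_embs = tsne.fit_transform(flat_representation[:plot_only, :])
print(low_dim_embs.shape)  # (50, 2) -- two coordinates per embedded sample
```

Each row of `low_dim_embs` is the 2-D coordinate of one sample, ready to be passed to a scatter plot together with the true class labels.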
2.1. sklearn.manifold.TSNE is defined as follows:

class sklearn.manifold.TSNE(n_components=2, perplexity=30.0, early_exaggeration=4.0, learning_rate=1000.0, n_iter=1000, n_iter_without_progress=30, min_grad_norm=1e-07, metric='euclidean', init='random', verbose=0, random_state=None, method='barnes_hut', angle=0.5)

2.2. Parameters:

n_components: int, optional (default: 2). Dimension of the embedded space.

perplexity: float, optional (default: 30). Larger datasets usually require a larger perplexity. Consider choosing a value between 5 and 50. Since t-SNE is fairly insensitive to this parameter, the exact choice is not critical.

early_exaggeration: float, optional (default: 4.0). The choice of this parameter is not very critical.

learning_rate: float, optional (default: 1000). The learning rate can be a critical parameter; it should be between 100 and 1000. If the cost function increases during initial optimization, the early exaggeration factor or the learning rate may be too high. If the cost function gets stuck in a bad local minimum, increasing the learning rate can sometimes help.

n_iter: int, optional (default: 1000). Maximum number of iterations for the optimization. Should be at least 200.

n_iter_without_progress: int, optional (default: 30). Maximum number of iterations without progress before the optimization is aborted.

New in version 0.17: parameter n_iter_without_progress to control the stopping condition.

min_grad_norm: float, optional (default: 1e-7). If the gradient norm falls below this threshold, the optimization is aborted.

metric: string or callable, optional. The metric to use when calculating distance between instances in the feature array. If metric is a string, it must be one of the options allowed by scipy.spatial.distance.pdist for its metric parameter, or a metric listed in pairwise.PAIRWISE_DISTANCE_FUNCTIONS. If metric is "precomputed", X is assumed to be a distance matrix. Alternatively, if metric is a callable, it is called on each pair of instances (rows) and the resulting value recorded. The callable should take two arrays from X as input and return a value indicating the distance between them. The default is "euclidean", which is interpreted as the squared Euclidean distance.

init: string, optional (default: "random"). Initialization of the embedding. Possible options are "random" and "pca". PCA initialization cannot be used with precomputed distances and is usually more globally stable than random initialization.

random_state: int, RandomState instance, or None (default)
Controls the pseudo-random number generator seed. If None, the numpy.random singleton is used. Note that different initializations may lead to different local minima of the cost function.

method: string (default: 'barnes_hut')
By default the gradient calculation algorithm uses the Barnes-Hut approximation, which runs in O(N log N) time. method='exact' runs a slower but exact algorithm in O(N^2) time. The exact algorithm should be used when the nearest-neighbor error needs to be better than 3%. However, the exact method cannot scale to millions of examples. New in version 0.17: approximate optimization method via Barnes-Hut.

angle: float (default: 0.5)
Only used if method='barnes_hut'. This is the trade-off between speed and accuracy for Barnes-Hut t-SNE. 'angle' is the angular size (called theta in [3]) of a distant node as measured from a point. If this size is below 'angle', the node is used as a summary node for all points contained within it. The method is not very sensitive to changes in this parameter in the range 0.2-0.8. An angle below 0.2 quickly increases computation time, while an angle above 0.8 quickly increases the error.
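A small sketch contrasting the two gradient methods described above (toy random data; both produce an embedding of the same shape, but 'exact' becomes much slower as the number of samples grows):

```python
import numpy as np
from sklearn.manifold import TSNE

# Toy data: 60 samples, 10 features
X = np.random.rand(60, 10)

# Barnes-Hut approximation (default): O(N log N); angle trades speed for accuracy
emb_bh = TSNE(n_components=2, perplexity=5, method='barnes_hut',
              angle=0.5, random_state=0).fit_transform(X)

# Exact gradient: O(N^2); use when the approximation error matters and N is small
emb_exact = TSNE(n_components=2, perplexity=5, method='exact',
                 random_state=0).fit_transform(X)

print(emb_bh.shape, emb_exact.shape)  # (60, 2) (60, 2)
```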
—————-
Copyright statement: This article is an original article by CSDN blogger “Chen Shancai” and follows the CC 4.0 BY-SA copyright agreement. Please attach the original source link and this statement when reprinting.
Original link: https://blog.csdn.net/qq_44702847/article/details/90044884

Finished with flowers

Once people have sustenance, it is easy to forget their dreams