I use PyTorch's DataLoader to process data for an NLP task. While defining collate_fn, I kept running into the following ValueErrors:
ValueError: not enough values to unpack (expected 2, got 1)
ValueError: too many values to unpack (expected 2)
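Both messages come from Python's own sequence unpacking rather than from PyTorch. A minimal sketch reproduces them without a DataLoader, using made-up toy batches and a hypothetical helper named `try_unpack`:

```python
def try_unpack(batch):
    """Run `for x, y in batch` and return the error message, or None if it works."""
    try:
        for x, y in batch:
            pass
    except ValueError as err:
        return str(err)
    return None


# Each batch element must hold exactly two values for `x, y` to unpack.
too_few = try_unpack([[1]])         # one value per element
too_many = try_unpack([[1, 2, 3]])  # three values per element
ok = try_unpack([(1, 2)])           # exactly two values per element

print(too_few)   # a "not enough values to unpack" message
print(too_many)  # a "too many values to unpack" message
print(ok)        # None, because the unpacking succeeded
```

So whenever either error shows up, the thing to inspect is the structure of each element in batch.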
The line that raises the error is:
    def collate_fn(batch):
        for x, y in batch:  # error line
            ...
        ...
        return
(The input and output of collate_fn are only briefly introduced here. For a very detailed treatment, see XJTU-Qidong's post "The use of the collate_fn function in pytorch & how to pass parameters to the collate_fn function".)
The error is raised on the line that iterates over batch, so start there. The input of collate_fn is batch, and each element of batch comes from the __getitem__ method of the dataset class. The structure of the dataset class is as follows:
    from torch.utils.data import Dataset

    class dataset(Dataset):
        def __init__(self, path):
            self.x = []
            self.y = []

        def __len__(self):
            ...

        def __getitem__(self, idx):
            ...
            return self.x, self.y
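To see what collate_fn receives from a __getitem__ like this one, the batch assembly can be emulated in plain Python. This is only a sketch: `WholeListDataset` and `assemble_batch` are hypothetical names, the data is made up, and the list comprehension stands in for what DataLoader does with a map-style dataset (one __getitem__ call per index, collected into a list).

```python
class WholeListDataset:
    """Hypothetical stand-in for the class above, without a torch dependency."""

    def __init__(self):
        # Toy data, made up for illustration.
        self.x = ["a b", "c d", "e f"]
        self.y = [0, 1, 0]

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        # Same return statement as the original: idx is ignored.
        return self.x, self.y


def assemble_batch(ds, indices):
    # One __getitem__ call per index, gathered into a list.
    return [ds[i] for i in indices]


ds = WholeListDataset()
batch = assemble_batch(ds, [0, 1])
first_x, first_y = batch[0]

print(len(batch))            # one entry per requested index
print(first_x)               # the whole x list, not a single sample
print(batch[0] == batch[1])  # True: every entry is the same full dataset
```

Every element of the batch carries the entire dataset, so nothing downstream sees a single sample.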
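x and y make up the batch content: the batch is a list of batch_size items, each of which should hold one (x, y) pair. The problem arises here: __getitem__ returns the entire lists self.x and self.y instead of the single sample at idx, so the items in the batch do not have the per-sample structure that collate_fn expects. In my case the fix was to index both lists and return each sample as a dict (collate_fn then reads the fields by key, e.g. item['x']), so change the return statement of __getitem__ to: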
    return {'x': self.x[idx], 'y': self.y[idx]}
The output can be checked with the following code snippet:
    data = dataset(path)  # instantiate the dataset class; __init__ runs automatically
    print(data[idx]['x'])
    print(data[idx]['y'])
The output should be a pair of extracted x and y values.
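Putting the pieces together, a minimal sketch of the corrected flow might look like the following. It assumes toy token-id data and hypothetical names (`DictDataset`, and a `collate_fn` that simply returns two lists), and it emulates batch assembly with a list comprehension instead of importing torch.

```python
class DictDataset:
    """Hypothetical dataset whose __getitem__ returns one sample as a dict."""

    def __init__(self):
        # Toy token ids and labels, made up for illustration.
        self.x = [[1, 2, 3], [4, 5], [6]]
        self.y = [0, 1, 0]

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        # Index both lists so that only the sample at idx is returned.
        return {'x': self.x[idx], 'y': self.y[idx]}


def collate_fn(batch):
    # batch is a list of dicts, so read each field by key.
    xs = [item['x'] for item in batch]
    ys = [item['y'] for item in batch]
    return xs, ys


ds = DictDataset()
batch = [ds[i] for i in range(2)]  # stand-in for DataLoader's batch assembly
xs, ys = collate_fn(batch)

print(xs)  # [[1, 2, 3], [4, 5]]
print(ys)  # [0, 1]
```

With a real dataset, the same function would be passed as `DataLoader(data, batch_size=2, collate_fn=collate_fn)`. Note that writing `for x, y in batch` over dict samples would unpack the keys 'x' and 'y' rather than the values, which is why the sketch reads each field by key.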