PyTorch CNN Tutorial: The “Flatten Layer” Problem Every Beginner Faces (And How to Fix It)

If you’ve started building a convolutional neural network (CNN) from scratch in PyTorch, you’ve probably hit this confusing moment:

“What should be the input size for my Linear (fully connected) layer?”

The Common Problem

In a typical CNN, you stack:

Conv → ReLU → Pool → Conv → ReLU → Pool → Flatten → Linear        
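
As a concrete sketch, one of those Conv → ReLU → Pool blocks might look like this in PyTorch. The channel counts, kernel size, and padding here are illustrative assumptions, not a specific model:

import torch
from torch import nn

# One Conv -> ReLU -> Pool block; a second block like it follows in the stack above
conv_block_1 = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, padding=1),  # illustrative sizes
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),  # halves the height and width
)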

Everything works fine until you reach this line:

nn.Linear(in_features=???, out_features=10)        

And now you're stuck.

Why?

Because after multiple convolution and pooling layers, your tensor shape changes. So beginners try to manually calculate:

channels × height × width        

Example:

10 × 7 × 7 = 490        

But this leads to:

  • Shape errors whenever the architecture changes
  • Shape errors whenever the input size changes
  • Confusion and wasted time

The Simple Trick I Use

Instead of guessing, let PyTorch give you the answer.

We use a dummy tensor.

The Dummy Tensor Trick

Here’s the clean and reliable way:

with torch.no_grad():
    # input_shape is the number of input channels (e.g. 1 for grayscale 28x28 images)
    dummy = torch.randn(1, input_shape, 28, 28)
    x = self.conv_block_1(dummy)
    x = self.conv_block_2(x)
    # same as x.shape[1] * x.shape[2] * x.shape[3], since the batch dimension is 1
    n_features = x.numel()

What’s happening here?

Let’s break it down:

1. Create a fake input

dummy = torch.randn(1, input_shape, 28, 28)

This simulates a single image (a batch of size 1) with the same channel count and spatial size as your real input.

2. Pass through convolution layers

x = self.conv_block_1(dummy)
x = self.conv_block_2(x)

Now x has the exact shape your model produces before flattening.

3. Use .numel()

n_features = x.numel()

This is the key step.

👉 .numel() means “number of elements” in the tensor.

If:

x.shape = [1, 10, 7, 7]

Then:

numel = 1 × 10 × 7 × 7 = 490

Because the dummy batch holds only one sample, that is exactly how many values each flattened sample will feed into the Linear layer.
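
You can check this in a throwaway snippet (the shape is just the example above):

import torch

x = torch.randn(1, 10, 7, 7)
print(x.numel())   # 490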

Use it in your model

self.classifier = nn.Sequential(
    nn.Flatten(),
    nn.Linear(n_features, output_shape)
)        
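
nn.Flatten() turns each sample of shape [channels, height, width] into a single vector of channels × height × width values, so x.numel() from the single-sample dummy lines up exactly with the Linear layer's in_features.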

Why this works and why it’s better

  • No manual shape calculation
  • Works even if you change layers
  • Prevents shape mismatch errors
  • Keeps your code clean and flexible

Let the model define its own shape.

So Your Final Architecture Looks Like This:

Image from author's IDE
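
For reference, here is a minimal sketch of a complete model built around this trick. The class name TinyCNN, the 10-channel blocks, the 3×3 kernels, and the 28×28 MNIST-style input are illustrative assumptions, not the exact code shown in the screenshot:

import torch
from torch import nn

class TinyCNN(nn.Module):
    def __init__(self, input_shape: int, output_shape: int):
        super().__init__()
        self.conv_block_1 = nn.Sequential(
            nn.Conv2d(input_shape, 10, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.conv_block_2 = nn.Sequential(
            nn.Conv2d(10, 10, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )

        # The dummy tensor trick: measure the flattened size instead of computing it by hand
        with torch.no_grad():
            dummy = torch.randn(1, input_shape, 28, 28)
            x = self.conv_block_2(self.conv_block_1(dummy))
            n_features = x.numel()

        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(n_features, output_shape),
        )

    def forward(self, x):
        return self.classifier(self.conv_block_2(self.conv_block_1(x)))

# Example usage with MNIST-sized input: 1 channel, 28x28 images, 10 classes
model = TinyCNN(input_shape=1, output_shape=10)
logits = model(torch.randn(32, 1, 28, 28))
print(logits.shape)   # torch.Size([32, 10])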

Final Takeaway

Don’t hardcode dimensions. Don’t rely on manual calculations.

Use a dummy input and .numel() to get the correct input size automatically. Keep learning, keep growing.

I’m currently learning deep learning step by step and sharing everything I understand in simple terms.

Follow along if you're on the same journey 🚀

