JourneyToCoding

Code for Fun

The Transformer, distinguished by its use of self-attention and multi-head attention, is a deep learning model built on the encoder-decoder architecture. It is used in both CV and NLP: BERT is built from the Transformer's encoder stack, while GPT is built from its decoder stack.
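
Not from the post itself, but a minimal sketch of how such an encoder-decoder Transformer can be instantiated, assuming PyTorch's `nn.Transformer`; all shapes and hyperparameters below are illustrative.

```python
# A minimal sketch, assuming PyTorch; sizes are illustrative only.
import torch
import torch.nn as nn

model = nn.Transformer(d_model=32, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2)

src = torch.rand(10, 1, 32)  # (source length, batch, d_model) -> encoder input
tgt = torch.rand(7, 1, 32)   # (target length, batch, d_model) -> decoder input

out = model(src, tgt)        # decoder output conditioned on the encoded source
print(out.shape)             # torch.Size([7, 1, 32])
```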

Read more »

An attention mechanism is a neural-network layer added to deep learning models to focus on specific parts of the input, based on different weights assigned to different parts. Just as the neural network is an effort to mimic the human brain in a simplified manner, the attention mechanism is an attempt to implement the same behavior of selectively concentrating on a few relevant things while ignoring others.
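
As a concrete illustration (a sketch of scaled dot-product attention in NumPy; the names `q`, `k`, `v` and all shapes are assumptions for the example), the softmax turns similarity scores into the per-part weights described above:

```python
# A minimal sketch of scaled dot-product attention in NumPy.
import numpy as np

def attention(q, k, v):
    """q: (n_q, d), k: (n_k, d), v: (n_k, d_v)."""
    scores = q @ k.T / np.sqrt(k.shape[-1])        # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ v, weights                    # output = weighted sum of values

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(2, 4)), rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
out, w = attention(q, k, v)
print(w.round(2))  # each row: how much attention one query pays to each input part
```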

Read more »

The encoder-decoder architecture views neural networks from a new perspective: it treats the network as a kind of signal processor that encodes the input into an intermediate state and decodes that state to generate the output.
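
A minimal sketch of this view, assuming PyTorch; the layer sizes are arbitrary, and the point is only the encode -> state -> decode flow:

```python
# A minimal encoder-decoder sketch in PyTorch; sizes are illustrative.
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    def __init__(self, in_dim=16, state_dim=4, out_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, state_dim), nn.ReLU())
        self.decoder = nn.Linear(state_dim, out_dim)

    def forward(self, x):
        state = self.encoder(x)    # encode the input into an intermediate state
        return self.decoder(state) # decode the state to generate the output

x = torch.rand(1, 16)
print(EncoderDecoder()(x).shape)   # torch.Size([1, 16])
```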

Read more »

A CNN is good at processing spatial information, but it is poor at processing sequential information. An RNN (Recurrent Neural Network) handles sequences better than other neural networks because it maintains a hidden state that carries information from earlier steps forward to later ones.
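
A minimal sketch of a vanilla RNN cell in NumPy (random weights, arbitrary sizes, both assumptions for the example), showing how the hidden state `h` carries sequence information across time steps:

```python
# A minimal vanilla-RNN sketch in NumPy; weights and sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 3, 5
W_xh = rng.normal(scale=0.1, size=(d_in, d_h))  # input -> hidden
W_hh = rng.normal(scale=0.1, size=(d_h, d_h))   # hidden -> hidden (the recurrence)
b = np.zeros(d_h)

xs = rng.normal(size=(7, d_in))  # a sequence of 7 time steps
h = np.zeros(d_h)                # initial hidden state
for x in xs:
    h = np.tanh(x @ W_xh + h @ W_hh + b)  # new state depends on input AND old state
print(h)                         # final state summarizes the whole sequence
```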

Read more »

A CNN can be viewed as a special kind of MLP. Why do we still need CNNs when an MLP can work well? The answer involves a classic trade-off in computing: memory versus speed. CNNs are widely used in image processing, and an image is represented in the computer by millions of pixels, each of which is a feature. A fully connected layer over that many features would require more parameters than a GPU can reasonably store. Hence we use CNNs, whose shared convolution kernels drastically compress the number of parameters while still extracting features from the image.
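
A back-of-the-envelope comparison makes the trade-off concrete (the image size, hidden width, and channel counts below are illustrative assumptions, not figures from the post):

```python
# Parameters of one fully connected layer vs. one conv layer
# on a hypothetical 1000 x 1000 RGB image.
pixels = 1000 * 1000 * 3

# MLP: every one of `hidden` units connects to every pixel.
hidden = 1000
mlp_params = pixels * hidden + hidden   # weights + biases

# CNN: one 3x3x3 kernel shared across the whole image, 64 output channels.
cnn_params = 3 * 3 * 3 * 64 + 64        # kernels + biases

print(f"MLP layer: {mlp_params:,} parameters")  # 3,000,001,000
print(f"CNN layer: {cnn_params:,} parameters")  # 1,792
```

The difference is about six orders of magnitude, which is exactly why weight sharing matters.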

Read more »

The Discrete Fourier Transform (DFT) is a linear transform that converts a finite sequence of equally spaced samples of a function in the time domain into a same-length sequence of equally spaced samples in the frequency domain, where each output sample is a complex value encoding the amplitude and phase of one frequency component.
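
A minimal sketch of the DFT formula X[k] = Σ_{n=0}^{N-1} x[n]·e^{-2πi·kn/N} in NumPy, checked against `np.fft.fft`; the test signal is an arbitrary example:

```python
# A direct (O(N^2)) DFT sketch in NumPy, verified against the built-in FFT.
import numpy as np

def dft(x):
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    return np.exp(-2j * np.pi * k * n / N) @ x  # one complex sample per frequency

x = np.cos(2 * np.pi * 3 * np.arange(8) / 8)    # 3 cycles over 8 samples
print(np.allclose(dft(x), np.fft.fft(x)))       # True
print(np.abs(dft(x)).round(2))                  # energy at bins 3 and 5 (conjugates)
```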

Read more »