More powerful deep learning with transformers (Ep. 84)

Some of the most powerful NLP models, like BERT and GPT-2, have one particular component in common: they all use the transformer architecture. But deep learning transformers are not limited to language. They are a very powerful tool that can perform well across several domains, and the machine learning community already deploys them widely. Moreover, the component common to all transformer architectures is the self-attention mechanism.

The most powerful aspect of self-attention is that the mechanism itself is completely unsupervised. In fact, one does not need labelled data to apply self-attention. As with any unsupervised method, one just needs the input data.

Moreover, the self-attention mechanism is based on a very simple linear algebra operation that modern CPUs (and GPUs) can perform very quickly. That operation is the dot product, computed between every pair of input vectors, which in practice amounts to a single matrix multiplication.
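To make the idea concrete, here is a minimal sketch of basic self-attention in plain Python. It is an illustration under stated assumptions, not the code discussed in the episode: the function names and the toy input are made up for this example, and the sketch uses the simplest, parameter-free form of self-attention. Transformers used in practice add learned query, key and value projections and scale the dot products, which this sketch leaves out.

```python
import math


def softmax(row):
    """Turn a list of raw scores into positive weights that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Parameter-free self-attention (illustrative sketch).

    x is a list of n input vectors, each a list of d floats.
    Returns (outputs, weights): n output vectors and the n x n weight matrix.
    """
    # Raw scores: the dot product between every pair of input vectors.
    scores = [[sum(a * b for a, b in zip(xi, xj)) for xj in x] for xi in x]
    # Normalise each row so the weights for one output sum to 1.
    weights = [softmax(row) for row in scores]
    # Each output vector is a weighted sum of all input vectors.
    d = len(x[0])
    outputs = [
        [sum(w * xj[k] for w, xj in zip(w_row, x)) for k in range(d)]
        for w_row in weights
    ]
    return outputs, weights


# Hypothetical toy input: three 2-dimensional vectors, no labels involved.
x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(x)
```

No labels appear anywhere: the weights come only from how similar the input vectors are to each other. In this toy input the first vector attends more strongly to the second one, which points in almost the same direction, than to the third.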


In this episode I explain some technical aspects of these methods. Moreover, I elucidate the reasons behind their effectiveness. Why do they actually work so well?
With a very simple example that comes from the field of recommender systems, I explain the nuts and bolts of both self-attention and the transformer architecture built on top of it. Needless to say, such an approach can be applied to many domains out there, from images to sound, text and, of course, numerical samples.

Don’t forget to subscribe to our newsletter to receive our updates straight to your inbox (spam is not included!).

Wouldn’t it be great to discuss previous episodes? How about proposing new ones? We are sure there is that one topic you would like to know more about. Come join the discussion on our Discord server and chat with us.



