The dawn of large neural networks and their transformative impact on AI
The last two years have seen dramatic growth in neural networks. Models such as BERT-Large, with its 345 million parameters, were considered huge just three years ago but are being dwarfed by modern DNNs that are frequently more than 1,000 times larger. This growth is driven not by the academic vanity of the AI community but by the exceptional few-shot learning capability of these models. In contrast to their smaller counterparts, which require thousands or tens of thousands of training samples to adapt to new tasks, large models achieve state-of-the-art performance with just a handful of examples, fundamentally transforming the way we do AI. This revolution started in natural language processing but is already transforming countless other disciplines, such as computer vision, automatic speech recognition, chemical compound modeling, and protein modeling, where very large unsupervised models are becoming common practice. In this talk, we will dive deeper into the properties of large DNNs and discuss the process of creating them, from data preparation through distributed training to production deployment.