22 February 2024

Trends and Innovations in the World of Artificial Intelligence in 2024

What models and applications will be at the forefront of the sector this year?

2023 was a crucial year for AI: with the launch of ChatGPT and the spread of generative models in the consumer market, Artificial Intelligence captured the interest of the general public as well as of companies, including many that until a few months earlier had considered themselves completely unrelated to the topic. This surge in popularity also coincided with renewed excitement in applied research: from refining neural network architectures to experimenting with new models that enable robots to move autonomously, announcements of new techniques and presentations of innovative AI-based tools arrive continuously. In such a lively context, it is difficult (but essential) to stay up to date on every technological evolution and to identify the applications that will prove most effective. For this reason, drawing on the academic expertise and field experience of the Aidia team, we have compiled a list of the most relevant AI trends for 2024: a point of reference for the aspects of the sector that deserve the most attention.

Innovations in academic research

Given the developments of the past year, with the great interest in the generative field and some interesting innovations in the robotics sector, it is easy to believe that the next trends, regarding the creation of new models and techniques, will primarily concern the generative sector and secondly the learning systems designed for robotics. Other aspects to watch in the research field will be experiments with new types of neural architectures (increasingly often inspired by physics and natural phenomena) and attempts to combine the potential of Quantum Computing with Artificial Intelligence.

Generative AI: video generation and refinement of diffusion models

In the most talked-about field of AI, generative AI, we will likely see increasingly sophisticated video generation tools emerge. The recent progress of Runway, the startup that co-created Stable Diffusion and launched in November the first model that produces realistic videos, and the launch, just a few days ago, of Sora, OpenAI's video generation model, suggest that 2024 will be the year of generative AI for video production. The next iterations of these and other models (such as Imagen and Make-A-Video) will likely produce longer, more detailed videos, free of some of the most noticeable imperfections of the first generation: for example, the lack of persistence of subjects that momentarily leave the scene, or the inability to keep the sizes of objects in consistent proportion. The improvement in the ability to generate high-quality videos (and images) will go hand in hand with the refinement of diffusion models, the foundational models behind this type of Artificial Intelligence. Diffusion models (or "probabilistic diffusion models") learn to generate original images by first learning to "degrade" them, progressively inserting noise (i.e., interference) into the images of the training dataset, and then learning to reverse the process, recovering the original images from their noised versions. After many iterations, this yields a system that can generate new images starting from pure noise. For this reason, the next innovations in this field are expected to aim at accelerating sampling: new techniques to better capture the specific distribution of a dataset and to generate consistent outputs even when the available data is scarce.
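To make the "degrading" phase described above concrete, here is a minimal toy sketch (not the code of any particular model) of the forward noising step of a diffusion model: an image is pushed toward pure Gaussian noise according to a noise schedule. The schedule parameters and function names are illustrative.

```python
import numpy as np

def forward_diffusion(x0, step, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Return a noised version of image x0 at a given diffusion step.

    Uses the standard closed-form shortcut: instead of adding noise
    one step at a time, jump directly to step t with
    x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise,
    where a_bar_t is the cumulative product of (1 - beta).
    A trained model learns the reverse: recovering x0 from x_t.
    """
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas_cumprod = np.cumprod(1.0 - betas)
    noise = np.random.randn(*x0.shape)
    a_bar = alphas_cumprod[step]
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise
```

At early steps the output is still close to the original image; at the final step it is essentially pure noise, which is exactly the starting point the generative (reverse) process learns to work from.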
These developments in the video field will certainly be accompanied by further refinements of Large Language Models, the basis of some of the most widespread generative AI applications: just think of Google's new model, Gemini, launched two months ago, and Mixtral, the largest European LLM, presented just a few weeks ago: both significant steps forward compared to their direct predecessors.

AI for robotics: multimodal models and new learning techniques

Robotics, driven by innovations in Machine Learning and AI, is undergoing a profound renewal, one that could turn into a real technological revolution within a few years. This renewal is driven in particular by two strands of research in Machine and Deep Learning: innovations in multimodal learning (a type of Deep Learning that can process different kinds of data in parallel, such as text and images) and progress in learning methods. In the last six months, for example, several new frameworks and architectures have been made public (such as OK-Robot, which combines language models and visual recognition with algorithms for spatial navigation and object manipulation), as well as original methods to "teach" the models underlying robots how to move and manipulate their surroundings. In October, for example, a group of researchers from Berkeley and MIT, in collaboration with DeepMind, proposed using realistic simulators to train models for robotics, while other studies have focused on new types of learning, such as self-supervised learning and Human Guided Exploration (HuGE). These increasingly concrete developments suggest that the coming year will be full of news.

From medicine to scientific research, the practical applications of Artificial Intelligence have increasingly concrete consequences in the real world; but which among them will be the most important? Which applications will have the greatest effects? Given the breadth of the question, the most interesting innovations could emerge both in scientific research, for example in the discovery and testing of new materials, and in the refinement of tools and applications that tangibly improve business productivity; equally significant innovations could emerge in video games and entertainment and in the sector dedicated to customer assistance. Many of these innovations are closely tied to the progress and refinement of generative models, which could open up new spaces for personalization and interaction, as well as simply new computational capabilities.

A new type of business productivity: augmented work

If humans are more prone to errors, fatigue, and unpredictable variations in productivity, AI systems are, on the other hand, far more limited when it comes to creatively solving a problem, inventing, or making strategic decisions. For this reason, the greatest opportunities in the world of work will come from combining these two elements: giving the workforce tools for agility and speed that minimize repetitive operations, and pairing AI with human control and guidance that can direct and optimize its work. Still in its early stages last year, so-called augmented work will become increasingly widespread in the coming months, with the refinement of digital assistants and automation tools, but also with the emergence of new work paradigms designed and structured to truly harness the power of Artificial Intelligence.

Enhancing video games with AI

Among the most fascinating applications, and those with the greatest development prospects, are the ones related to entertainment: support for creating graphic content and the ability to generate stories, conversations, and new environments open up new possibilities in cinema, and even more so in video games. Game studios have been trying for years to guarantee players immersive and highly personalized experiences, with bot-controlled characters and dynamic plots that vary with the player's choices. Usually, however, these rely on procedural generation algorithms that combine the repetition of human-authored "patterns" with algorithmically generated random elements: techniques that can expand game environments but cannot provide a truly tailored experience for users. The capabilities of generative AI, on the other hand, could drastically transform the gaming experience: enabling original, interactive dialogue with NPCs (non-player characters controlled by the game), generating vast worlds, or devising new levels and difficulty settings in real time. Given how young the technology is (the first AI systems capable of creating images are barely three years old), there have not yet been concrete applications of generative models in video games: the first could emerge in the coming months.
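The "human-made patterns plus random elements" approach mentioned above can be illustrated with a toy sketch (all names and templates here are invented for illustration, not taken from any real game engine): a map is assembled by repeatedly picking from a small set of hand-authored tiles, so variety comes only from recombination, never from genuinely new content.

```python
import random

# Hand-authored "patterns": tiny room templates designed by a human.
TEMPLATES = [
    ["###", "#.#", "###"],  # closed room
    ["...", ".#.", "..."],  # open area with a pillar
]

def generate_map(cols, rows, seed=None):
    """Tile a grid by picking a random template for each cell:
    repetition of human-made patterns plus algorithmic randomness."""
    rng = random.Random(seed)
    grid = []
    for _ in range(rows):
        row_templates = [rng.choice(TEMPLATES) for _ in range(cols)]
        for line in range(3):  # each template is 3 lines tall
            grid.append("".join(t[line] for t in row_templates))
    return grid
```

However large the generated map, every 3x3 tile is one of the two authored templates; this is the ceiling of the technique, and it is precisely what generative models promise to go beyond.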

Lisa Bartali


Marketing Specialist at AIDIA, graduated in International Studies in Florence, passionate about history, economics, and the bizarre things of the world.

Aidia

At Aidia, we develop AI-based software solutions, NLP solutions, Big Data Analytics, and Data Science. Innovative solutions to optimize processes and streamline workflows. To learn more, contact us or send an email to info@aidia.it.
