AI research

An old AI architecture shows off some new tricks

Summary GigaGAN shows that Generative Adversarial Networks are far from obsolete and could be a faster alternative to Stable Diffusion in the future. Current generative AI models for images are diffusion models trained on large datasets that generate images based on text descriptions. They have replaced GANs (Generative Adversarial Networks), which were widely used in …

Meta's DINOv2 is a foundation model for computer vision

Summary Meta's DINOv2 is a foundation model for computer vision. The company demonstrates the model's strengths and wants to combine DINOv2 with large language models. In May 2021, AI researchers at Meta presented DINO (Self-Distillation with no labels), an AI model trained with self-supervision for image tasks such as classification or segmentation. With DINOv2, Meta is now …

Instruct-NeRF2NeRF lets you edit NeRFs via text prompt

Summary Instruct-NeRF2NeRF uses methods from generative AI models and can edit 3D scenes according to text input. Earlier this year, researchers at the University of California, Berkeley demonstrated InstructPix2Pix, a method that allows users to edit images in Stable Diffusion using text instructions. The method makes it possible to replace objects in images or change …

Zip-NeRF is another step towards a digital time machine

Summary People take photos for many reasons, one of which is to capture memories. The next generation of keepsake photos may be NeRFs, which get a quality upgrade at high speed with Zip-NeRF. Google researchers demonstrate Zip-NeRF, a NeRF model that combines the advantages of grid-based techniques and the mipmap-based mip-NeRF 360. Grid-based NeRF methods, …

OpenAI CEO sees ‘end of an era’ in number of parameters

Newsletter In recent years, the potential progress of large language models has been measured primarily by the number of parameters. Sam Altman, CEO of OpenAI, believes that this practice is no longer useful. Altman compares the race to increase the number of parameters in large language models to the race to increase the clock speed …

Google’s medical language model “Med-PaLM 2” enters pilot phase with first customers

Summary Google is rolling out Med-PaLM 2 on a limited basis for initial testing. Update April 14, 2023: Google Cloud announces that Med-PaLM 2 will be rolled out to select Google Cloud customers for a “limited test” in the coming weeks. The goal, the company says, is to explore safe, responsible and meaningful use scenarios. …

Here’s how OpenAI’s DALL-E 3 could leapfrog the competition

Summary All generative AI models for images currently use diffusion models. OpenAI presents an alternative that is significantly faster and could power new models like DALL-E 3. DALL-E 2, Stable Diffusion, or Midjourney use diffusion models that gradually synthesize an image from noise during image generation. The same iterative process is used in audio or …

An image model at Midjourney’s level?

Summary A new beta version of Stable Diffusion delivers much more aesthetic and photorealistic results than the previous version. Will this make commercial offerings obsolete? While Stable Diffusion is the most developed open-source image model, it can’t always match the quality and especially the accessibility of commercial competitors like Midjourney. Its strength so far is …

Sims running on ChatGPT are a glimpse into the social future of AI

Summary Characters in video games could soon feel even more realistic. But these AI sims could also be helpful outside of gaming. In a new paper, researchers from Google and Stanford University simulate human behavior using large language models. The paper, titled “Generative Agents: Interactive Simulacra of Human Behavior,” relies on ChatGPT and offers more …

Google’s medical language model Med-PaLM 2 passes exam questions

Summary Med-PaLM is Google’s variant of the PaLM language model optimized for medical questions. The latest version is designed to answer medical questions reliably at an expert level. Last December, Google unveiled Med-PaLM, a version of Google’s giant PaLM (Pathways Language Model) language model optimized for answering medical questions. Med-PaLM was developed using a special …
