How AI can empower local journalism: Part 1 - Image Generation
Over the past few months, I’ve spent time exploring the many different ways AI could enhance local journalism, equipping journalists and newsrooms with additional tools that could help them increase efficiency, reduce costs, produce more engaging content and, most importantly, thrive rather than survive.
In the coming weeks, I will publish a series of posts, each focusing on a specific AI application, including image generation, text generation, sentiment analysis, speech-to-text, language models, video, data analysis and more.
We've already started incorporating some of these functionalities into our free journalist tool suite, which you can access here. As we continue to develop the platform we will be integrating most of the features I will discuss in this series.
In my first post I want to discuss AI-powered image generation and how journalists could leverage its potential.
When someone mentions AI-powered images, you probably think of the recent viral pictures of the Pope in a puffa jacket or Donald Trump in handcuffs. Unfortunately, as with any new technology, people will be waiting in the wings to exploit its flaws and weaknesses. But there are a great number of use cases where AI image generation could help news and journalism.
In my opinion, images of current affairs, real-life events, famous people and a few other categories should always be real. AI will not replace photojournalists or the many areas they specialise in, such as breaking news, documentary, sports, conflict and war, travel and environmental photography. These people tell stories through their images. Their images create feelings and draw people into articles. Trying to replace these images with AI would not only invite controversy, bad reporting and possible legal issues; it would also be bad practice and unethical.
On the other hand, AI-powered images could have a massive role in feature stories, evergreen content and areas such as lifestyle, where articles are quite often accompanied by stock photography. Currently publications spend considerable amounts of money on stock photography. Stock photos can cost anything from a few dollars per image up to $30-$40 per image, and sometimes much more. Multiply that across a day or a month, and it soon adds up. Yes, there are free sites such as Unsplash and Pexels, but they often just don’t have the exact image or quality required to complement an article. Using these alternatives to save money often results in a less effective image, leading to less visual impact and fewer click-throughs.
Pretty soon that won’t be the case! Imagine if you could generate the perfect stock photo to accompany your article. Imagine if it perfectly correlated with your headline and had the exact impact required to pull users in. Now imagine if that technology was free and you didn’t have to worry about copyright issues.
Well, that's exactly what we are working on. Currently the technology is a little clunky, but it is close to being ready (hopefully by fall 2023). To date, the models that power AI image generation have mainly focused on people, animals and large objects like cars.
Here are some examples of images I generated in just a few seconds each.
I generated this image based on a headline about DNA From Beethoven’s hair unlocking medical and family secrets.
Prompt used for this image:
beautiful ornate treehouse in a gigantic pink cherry blossom tree :: on a high blue grey and brown cliff with light snow and pink cherry blossom trees :: Roger Deakins and Moebius and Alphonse Much and Guweiz :: Intricate details, very realistic, cinematic lighting, volumetric lighting, photographic, --ar 9:20 --no blur bokeh defocus dof --s 4000
Prompt used for this image:
photograph close up portrait 62-year-old tough decorated general, CLEAN SHAVEN, serious, stoic cinematic 4k epic detailed 4k epic detailed photograph shot on kodak detailed bokeh cinematic hbo dark moody --ar 17:22 --beta --upbeta
A picture in a potential interior design piece.
This was inspired by an article about retirement.
This was inspired by a healthy eating article.
The images presented above look good, but they required extensive prompting to produce the detail and realism necessary to make them worthy of professional use.
They are also the best-looking versions from quite a few attempts. Currently the AI is not error-free and can make mistakes such as giving people six fingers or three ears. This happens because the AI is trained on millions of images with accompanying text descriptions, but not always on the rules and boundaries that govern reality. The AI might, for example, have seen 10,000 images of hands without ever being taught that a hand can't have six fingers. Constraints of this kind are what it needs before it is truly capable of being used in a professional environment.
This picture demonstrates some of the limitations with more generic image prompts. It was meant to be "tablets on a table next to a glass of water", but obviously lacks the intended clarity and accuracy. Many image models have not yet been trained on a broad enough range of images to accurately compose everyday objects. Hopefully over the next few months this will change as the focus on training models expands from faces, animals and larger objects such as cars and trains.
The process also needs to be simplified, so that with a few simple words, or better yet, the title of an article, a journalist could generate a usable image.
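As a rough illustration of what that simplification could look like, here is a minimal Python sketch that turns a headline into a fuller prompt by appending style keywords automatically. The function name `headline_to_prompt` and both keyword lists are hypothetical, loosely modelled on the hand-written prompts shown earlier; this is not our production code, and the resulting strings would still need to be passed to a text-to-image model.

```python
import re

# Hypothetical defaults, loosely modelled on the hand-written prompts above.
DEFAULT_STYLE = ("photographic", "very realistic", "cinematic lighting", "intricate details")
DEFAULT_NEGATIVE = ("blur", "extra fingers", "text", "watermark")


def headline_to_prompt(headline, style=DEFAULT_STYLE, negative=DEFAULT_NEGATIVE):
    """Turn an article headline into a (prompt, negative_prompt) pair."""
    # Collapse runs of whitespace and drop trailing sentence punctuation.
    subject = re.sub(r"\s+", " ", headline).strip().rstrip(".!?")
    if not subject:
        raise ValueError("headline must not be empty")
    # The headline becomes the subject; the style keywords are appended after it.
    prompt = ", ".join([subject, *style])
    negative_prompt = ", ".join(negative)
    return prompt, negative_prompt


prompt, negative_prompt = headline_to_prompt("Healthy eating on a budget")
print(prompt)
print(negative_prompt)
```

The point of the sketch is that the journalist only supplies the headline, while the tool supplies the long list of qualifiers that currently has to be typed by hand.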
We are currently using Stable Diffusion, a deep-learning text-to-image model released under the CreativeML OpenRAIL-M licence by Stability AI. This licence means we are free to use the software in our own way and adapt it to suit journalism, provided we respect the licence's use-based restrictions.
What about image rights, I hear you ask! As we build our system, we will only use open-source image models. This means the AI is trained using images that are freely available and do not infringe copyright. All the images generated from our system will therefore be free to use and publish with no restrictions or copyright implications for the end users.
Our goal is to offer a tool for journalists that will take a headline or a few keywords and generate a great-looking feature image for use in the article.
If a local newsroom currently purchases just 2 stock images a day at $5 each, that's $70 per week or $3,640 per year. We want to remove that cost altogether. Extrapolate that out across thousands of local publications and that adds up to millions of dollars of potential savings and money that can be put back into local news in other ways.
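The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: the per-newsroom numbers come from the example above, while the count of 1,000 publications is an illustrative assumption and not a measured figure.

```python
DAYS_PER_WEEK = 7
WEEKS_PER_YEAR = 52


def stock_photo_cost(images_per_day, price_per_image):
    """Return (weekly, yearly) stock photo spend in dollars for one newsroom."""
    weekly = images_per_day * price_per_image * DAYS_PER_WEEK
    yearly = weekly * WEEKS_PER_YEAR
    return weekly, yearly


weekly, yearly = stock_photo_cost(images_per_day=2, price_per_image=5)
print(f"Per newsroom: ${weekly} per week, ${yearly:,} per year")

# Illustrative assumption only: 1,000 publications with the same spend.
publications = 1_000
print(f"Across {publications:,} publications: ${yearly * publications:,} per year")
```

Changing `images_per_day` or `price_per_image` shows how quickly the total grows for newsrooms that buy more, or more expensive, images.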
We plan to have a tool that can generate tens of thousands of images per day without the need to pass on any costs to journalists or newsrooms. We will release more details on this in the coming months.
Hopefully this was an insightful look at how AI image generation may play a part in journalism in the very near future. Next week I will focus on generative content models and how platforms like ChatGPT and Bard can give journalists and publications skills and insights they would otherwise have to outsource.
You can check out some cool examples of other AI-generated images here: https://prompthero.com