Visualizing a crime dataset with realistic faces using the DALL-E API
In my previous post, How to Use DALL-E for Data Visualization, I explored how text-to-image generation can be used for data visualization.
I started by looking at Chernoff Faces, a visualization that displays multivariate data on the shape of a human face by mapping fields to facial features. Typically, Chernoff faces are cartoonish line drawings.
But I was curious whether DALL-E’s ability to generate more realistic faces could help convey the data more effectively.
I decided to run a simple A/B test. I started by choosing a dataset, US crime rates by state, and selected a subset of the available categories, including burglary and robbery.
With the data set in hand, I created two visualizations:
- The first visualization used a cartoonish Chernoff face, created with Peter Wolf’s aplpack (Another Plot Package) package for R.
- The second visualization used photorealistic faces created with DALL-E, driven by text prompts generated by a Python script.
For this part of the process, I used RStudio and aplpack to generate the faces. Nathan Yau has a great article on how to visualize data with cartoonish Chernoff-style faces, so I won’t repeat what he wrote. Read that excellent article if you want to follow along and create your own faces.
Here are the results of Visualization #1:
To visualize data sets using DALL-E, I created a Python script that does three important things:
First, it creates a mapping that defines which facial feature is associated with each metric. In the mapping below, you can see that the field burglary is mapped to hair length and robbery is mapped to eye shape.
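A minimal sketch of what such a mapping could look like. The robbery labels come from the description later in this post; the burglary (hair length) labels are assumptions for illustration, as the article only mentions "long-haired" and "bald" faces.

```python
# Metric -> facial-feature mapping. Each metric gets a feature name and an
# ordered list of labels, from lowest to highest crime rate.
FACE_MAPPING = {
    "burglary": {
        "feature": "hair length",
        "labels": ["bald", "short hair", "long hair"],  # assumed labels
    },
    "robbery": {
        "feature": "eye shape",
        "labels": ["closed eyes", "relaxed eyes", "fearful eyes wide open"],
    },
}
```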
Second, it uses pandas to open and read the CSV file, and pandas.qcut to split the values into equal-sized buckets based on the number of labels in the mapping.
This means that qcut tries to split the data into equally sized bins, one for each label. For example, robbery has three labels: “closed eyes,” “relaxed eyes,” and “fearful eyes wide open,” so the robbery values will be binned into one of those three buckets.
The script stores these discretized, or binned, values as a new field in the data frame. Here’s what it looks like:
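A sketch of this binning step, using made-up robbery values for a handful of states rather than the real dataset:

```python
import pandas as pd

# Hypothetical robbery rates for six states (illustrative values only).
df = pd.DataFrame({
    "state": ["A", "B", "C", "D", "E", "F"],
    "robbery": [12.0, 45.5, 88.1, 150.2, 230.7, 410.3],
})

labels = ["closed eyes", "relaxed eyes", "fearful eyes wide open"]

# qcut splits the values into len(labels) equal-sized quantile bins and
# the result is stored as a new column in the data frame.
df["robbery_binned"] = pd.qcut(df["robbery"], q=len(labels), labels=labels)

print(df)
```

With six values and three labels, each bin ends up with two states: the lowest rates get “closed eyes” and the highest get “fearful eyes wide open.”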
Here is a sample output for a subset of the data frame.
Third, for each row in the data set, it generates prompts from a template. The template populates its values using the binned values from the previous step.
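This step can be sketched as a simple format string. The exact wording of the author’s template isn’t shown in the post, so the phrasing below is an assumption:

```python
# Hypothetical prompt template; each placeholder is filled with a binned
# label from the data frame.
PROMPT_TEMPLATE = "A face with {hair} and {eyes}"

def build_prompt(row):
    """Fill the template with the binned values for one row of the data."""
    return PROMPT_TEMPLATE.format(
        hair=row["burglary_binned"],
        eyes=row["robbery_binned"],
    )

example = {"burglary_binned": "long hair",
           "robbery_binned": "fearful eyes wide open"}
print(build_prompt(example))
```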
Putting this all together, we have the final script to generate the prompts. Running it produces the following output:
With the DALL-E API now available, automating the process of creating the actual images was a snap.
After some experimentation, I added “30-year-old” and “white male” to all prompts to try to normalize the baseline photos as much as possible. I also added “mugshot photograph” to try to control how much of the person’s face appears in the image. Considering the data set we are working with, mugshots also seemed fitting.
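Generating an image per prompt can then be a small function. This is a sketch using the openai Python package’s image endpoint as it existed when the DALL-E API launched (openai < 1.0); newer versions of the package expose the same capability through `client.images.generate` instead.

```python
def generate_face(prompt, size="512x512"):
    """Send one prompt to the DALL-E API and return the generated image URL.

    Assumes OPENAI_API_KEY is set in the environment.
    """
    import openai  # imported lazily so the function is cheap to define

    response = openai.Image.create(prompt=prompt, n=1, size=size)
    return response["data"][0]["url"]
```

Calling it with a full prompt, e.g. `generate_face("Mugshot photograph of a 30-year-old white male with long hair and fearful eyes wide open")`, returns a URL to the generated image.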
And the final result…
Chernoff’s main idea behind using faces is that humans recognize faces easily and notice even small changes without difficulty.
In my informal n=1 A/B test, I found it easier to see subtle changes in the photorealistic faces than in the cartoons.
When choosing a state to live in, you may want to stay away from states with scared, long-haired dudes (at least when it comes to burglaries and robberies).
On the other end of the spectrum, you have the states represented by sleepy-looking bald people.
With the right field → facial feature mapping, the resulting photorealistic image conveys emotion in a way the cartoonish counterpart does not, making some faces stand out.
There were a few challenges, not the least of which was the limited ability to control features in the generated image, which made it difficult to make subtle distinctions between individual faces.
But for painting in broad strokes? I think it can be useful.
What do you think?