Data science is a balancing act: math and science have their role to play, but so do art and communication. Storytelling can be the binding force that unites them all. In this talk, Wendy Foster, Director of Engineering and Data at Shopify, explores how to tell an effective data story and illustrates it with examples from Shopify's practice.
Wendy Foster: Thank you for joining me today. I'm Wendy Foster, and I'm a director of engineering and data at Shopify. I'm going to be talking a little bit about data storytelling, and frameworks to support data storytelling, today. Just before I kick off, a couple of housekeeping notes: I will be in a separate tab managing my presentation, but I encourage folks who are watching to leave any questions in the comment section as we go through the presentation. I've left a lot of time at the end to loop back and answer some of your questions, if you have any. So let's get this kicked off.
The place I actually wanted to start was back in 2010, when Drew Conway published his now famous Venn diagram of data science. That model of overlapping skill sets, which you can see on your screen right now, really supported a new way of thinking about the job of the traditional statistician. This was as the technological foundations of the discipline had begun to shift with the emergence of data science as a field. As the new discipline began to evolve, Conway's Venn diagram was unpacked to emphasize the critical component of creativity.
We most commonly generalize this to communication, and that was and remains essential to the successful data science work that we do today. So in a meaningful way, Conway's creative hacker mindset component became an early differentiator of the data scientist in the rapidly changing landscape surrounding applied practices in industry. I think he calls this out in a blog post he has about the creation of this diagram as well. For Conway it was really important to capture the inherent interdisciplinary nature of the emerging discipline, so the emphasis that he placed on the hacking and substantive dimensions is actually pretty telling. The ability for us to ask the right questions about the world and test those assumptions using statistical methods requires the ability to creatively and compellingly activate and express those hypotheses and results.
So the interlock between dimensions became the field of play of the data scientist and of the discipline as it began to mature. Currently, one of the largest challenges an applied data scientist faces in industry is how to communicate insights in actionable ways, specifically back to their organization. Applied data science has historically been challenged by organizational design and expectations. When I think of this, I think: where in an organization does data science report into? Is it engineering, finance, product… Also, how do they interact with the business and product development teams? Is it a center of excellence? Is it embedded in a service line or product line? And as well, how is data science work derived, prioritized, and disseminated back to partners and stakeholders? To be certain, the shape of the role of the data scientist in industry has changed and diversified. What I've seen is that insular research roles are increasingly rare and demand for product analytics expertise has increased, pushing more directly on the creative communication aspects of the data science role.
More recently, I'll say, the accelerated implementation of generative AI in products has made time to value for insights more critical than ever for understanding, deriving value from, and evaluating these new use cases. Given these challenges, industry-based data scientists often struggle to, I think, one, frame and communicate insights back to their industry partners in accessible and actionable terms, and two, create impact for their work in an environment where the speed of decision making can often seem at odds with how data science really works. When I was thinking about this a little more deeply, for myself, it feels like there's an impedance mismatch between data science work and the organizations and products that work is intended to support. I think there's a correctness in that feeling: when the output of data science work is not speaking the language of the business, and the work itself is not proximate to the impact domains, the effort, the outcomes, and the value are sidelined and lost. And I think that's critically important for us to both interrogate and resolve.
So what is data storytelling, and why do I think data storytelling can help address some of the challenges that data science faces in industry contexts? At its core, data storytelling is about taking the step beyond the simple relaying of data points. It's about trying to make sense of the world and leveraging storytelling to present insights to stakeholders in a way they can understand, reflect on, reference, and, probably most importantly, put into action. Data scientists can inform and influence through data storytelling by creating personal touch points between the audience and their analysis via right-sized artifacts and novel communication modalities.
To give you some concrete examples: here at Shopify we have a wide range of insight presentation formats. The ones I have found most valuable for teams are annotated pulses, which are short, automated, contextually embedded insights delivered to domain-specific audiences, typically via Slack channels, and slide docs, a data storytelling format adapted from the design studio Duarte's principles for document design. Slide docs are what I'll spend most of the time talking about today. Duarte's slide docs framework is a way of using presentation software, like PowerPoint, to create visual reports that are intended to be read, not presented. Additionally, unlike a chart or a dashboard, what the slide doc gives you is a well-framed narrative, and it gives you that for free. You can think of this a bit like a policy brief that you would get in government: you can pack a dense amount of information and visuals into an easily digestible format that your stakeholders can read.
I think the most important thing here, too, is that it's an artifact they can return to, so it supports active reflection and decision making. And again, I can't emphasize this enough: it's intended to be a long-lived artifact. As an example, storytelling points baked into Shopify slide docs include the data question we're trying to answer, a description of our findings, a graph or visualization of the data, recommendations based on the findings, a link to the in-depth report, and how to contact the storyteller.
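As a minimal sketch of the storytelling points listed above, one could imagine capturing them in a small checklist structure. Everything here is an assumption made for illustration: the `SlideDoc` class, its field names, and the `missing_points` helper are hypothetical and are not a Shopify or Duarte API, and the example question is made up.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SlideDoc:
    """Hypothetical checklist of the storytelling points in a slide doc."""

    data_question: str              # the question the analysis tries to answer
    findings: str                   # short description of what was found
    visualization: str              # path or URL to a chart (placeholder)
    recommendations: list[str] = field(default_factory=list)
    report_link: str = ""           # link to the in-depth report
    storyteller_contact: str = ""   # how to reach the author

    def missing_points(self) -> list[str]:
        """Return the names of storytelling points that are still empty."""
        return [name for name, value in vars(self).items() if not value]


# Illustrative draft with only the question filled in.
draft = SlideDoc(
    data_question="Which channels drove repeat purchases last quarter?",
    findings="",
    visualization="",
)
print(draft.missing_points())
```

A check like this would only confirm that each point is present; it says nothing about whether the story itself is compelling, which remains the creative part of the exercise.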
Overall, though, preparing a slide doc is a creative exercise, and there's no one correct way to present the data. But it does demand an understanding of your audience and the shaping of the story that is most compelling and actionable. The slide doc should have a beginning, a middle, and an end. Slide docs are intended as a format to support guided exploration, helping to create those personal touch points with the data that I talked about a little earlier, and this allows stakeholders to make a better-informed decision at the end.
Okay, so to map some of these concepts more concretely to each other, we can consider the critical elements of a good story. Number one, we have a main character. Every story needs its hero, and the central figure, our main character, in a data story is, probably obviously, the business problem. So you need to make sure to clearly identify the problem; summarize what you explored when considering that problem, not just on the technical side but on the domain side as well; and provide any reframing of the problem that is necessary, both to get deeper insight and to connect the problem back to your audience.
Then number two, we have the setting. This is really setting the stage for your story with context: what background information is key to understanding the problem? You have to remember you're not just telling the story, you're also providing direction for the interpretation, so ideally we want that to be in as unbiased a way as possible. I always tell folks to remember, too, that creating a data story doesn't mean force-fitting the data into a preset narrative. As data scientists, it is our job to analyze the data, uncover the unique narrative or story arc that it presents, and convey that back to the audience.
Number three, the narrator. To guide your audience effectively, you need to be able to speak to them in a way they understand and that resonates with them. Ideally you communicate your data story in the language of the receiver. For example, if you're communicating to a non-technical audience, you try to avoid using jargon they won't be familiar with, and if you have to use technical terms or acronyms, be sure to define them so you're all on the same page. The language of the presentation is critical here: it's the primary component of how you engage your audience and drive actions, and this is what will translate back into value for your business. The engagement part is crucial there.
And then number four, we have the plot. We talk about this internally as not leaving your audience hanging. It's really about being clear as to what happens next. The most compelling stories guide the reader to a response, and data can direct the action by providing suggestions for next steps. As an authentic partner, it really is about helping your stakeholders figure out different approaches to solving the problem.
So in other words, I really think it's: be the author of your own conclusions, and avoid offloading that responsibility back to the reader. Using data to tell compelling stories isn't just an internal lever at Shopify. This is a part I'm pretty excited to share, because for us it's also a framework that we productize for our merchants' analytics experiences. We have an interactive notebooks product that provides our merchants with a data storytelling experience, and these experiences showcase a comprehensive overview of their store performance. One powerful way that our notebooks product has supported data storytelling is our BFCM (that's Black Friday and Cyber Monday) notebook templates. The BFCM period is a peak selling period for our merchants, and for e-commerce more generally, of course. And it's a story that our merchants want to share back out to the world and explore deeply with some degree of autonomy.
We do already have existing features for our merchants that show how their business is performing, through reports, dashboards, and contextual analytics. But with notebooks we wanted to take it to the next level: we really wanted to give them more agency over, and a personal connection back to, their own data via this interactive notebook experience. In saying all of that, we do understand it can be overwhelming for our merchants, or anyone, to have access to a massive set of data but not know how to explore it. People might not know where to start, or they may feel scared that they'll do it wrong. So what the BFCM notebook provided was really a scaffold to support merchants' data exploration. It's an interactive visual companion that enables our merchants to dive into their performance data. Simple examples from the BFCM notebook template are total sales, top-performing products, and buyer locations during their busiest sales season.
Starting with total sales as an example, merchants could drill into their data to understand their results based on products, days of the week, location, or any dimension they were interested in exploring. If they wanted to go even deeper, they could click over the visualizations to see the queries that power them. Being able to see the queries is also a bit of a pedagogical tool: it enables them to start thinking about writing queries of their own. Outside of BFCM, the general notebooks product provides structured templates that give end users starting points for building and telling their own data stories and presenting those narratives back to their stakeholders. I think of this truly as a bit of a virtuous cycle. What we have seen from deploying data storytelling frameworks in both internal and external contexts is an increase in the time end users spend on ad hoc querying, or exploring their own data, outside of structured reporting and dashboard experiences. Qualitative feedback from external end users has also surfaced that being able to create their own data stories has been a catalyst for increasing confidence in data understanding, and for feeling safe and supported in modifying the query templates that we provide with the notebooks product. The templates can be extended, or they can create new stories from their own data on their own as well.
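To make the drill-down idea concrete, here is a rough sketch of totaling sales by a chosen dimension. It is not the notebooks product or its query language: the `ORDERS` rows, the field names, and the `total_sales_by` helper are all made up for illustration and assume a simple list of order records.

```python
from collections import defaultdict

# Made-up order rows for illustration only; not Shopify data or schema.
ORDERS = [
    {"product": "mug", "weekday": "Fri", "location": "CA", "sales": 40.0},
    {"product": "mug", "weekday": "Mon", "location": "US", "sales": 25.0},
    {"product": "tee", "weekday": "Fri", "location": "US", "sales": 60.0},
    {"product": "tee", "weekday": "Mon", "location": "CA", "sales": 30.0},
]


def total_sales_by(rows, dimension):
    """Sum sales per value of `dimension`, largest total first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["sales"]
    # Sort descending so the top performers lead the story.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# The same total, explored along three different dimensions.
for dim in ("product", "weekday", "location"):
    print(dim, total_sales_by(ORDERS, dim))
```

Swapping the `dimension` argument is the code-level analogue of a merchant choosing a different lens on the same total, and reading a small function like this is roughly the kind of step that seeing the underlying query is meant to encourage.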
The thing I love about this product is that the personal connection and the heightened ownership carry over into how they're able to more effectively leverage reporting tools outside of the notebooks environment. It really makes them feel confident in being able to understand their own data. So turning data storytelling into an experience has given our internal partners and our external merchants the confidence to explore and reflect on the stories their own data tells, which in turn, I think, empowers them to take ownership of it and convert it to actions. This, to me, is the core ROI of insight generation. While this today was just a lightning talk that begins to touch on some of the potential of data storytelling, I really do hope there are kernels of inspiration here you can take forward into your own practices, for yourselves and for your teams. For me, the primary takeaways, especially when I first started experimenting with these practices, were the importance of making data personal, making it connected, and prioritizing its engagement possibilities. I think of this fundamentally as making data tactile. Data science is as much a creative enterprise as it is a scientifically grounded one, and my ask for the field is to lean into that as strongly as possible.
So I see that there's a question from the community: what are the key elements of a compelling data-driven story that can engage both technical and non-technical stakeholders? I don't know if anyone wants to make a comment around that. In my experience, I've seen a lot of focus and engagement around actionability. I don't know if that resonates.
How will the role of data storytelling evolve for machine learning practitioners in the future?
This is a really good question, and it's one I've put a not insubstantial amount of thought into, especially over the last couple of weeks. A lot of my focus has been on the rapid acceleration of implementation for generative AI products and tools that's happening today. I think the excitement generated around those product opportunities is the ability to solve problems, especially workflow automation and creative enterprise problems, that have been particularly hard to solve previously. But again, I'm going to go back to the point that, for all kinds of data endeavors, one of the hardest pieces to communicate is the ROI. Especially when we're talking about deriving new use cases for new AI tools, telling the story of how and why these investments are important and successful is going to be critical. Maybe it's about moving beyond the excitement that we can do this now and starting to talk about the value that it's bringing our end users. Hopefully that answers a little bit of the question. I don't know that the actual parameters of it have to evolve or change; I think the formats are pretty easily transposable.
Okay, please feel free to post any other questions that you may have; otherwise we can probably do a wrap-up.
So I want to thank everybody so much for joining me today. I wish you much happy storytelling.