AI success depends on multiple factors, but the key to innovation is the quality and accessibility of an organization’s proprietary data.
I sat down with Teresa Tung to discuss the opportunities of proprietary data and why it is so critical to value creation with AI. Tung is a researcher whose work spans breakthrough cloud technologies, including the convergence of AI, data and computing capacity. She’s a prolific inventor, holding over 225 patents and applications. As Accenture’s Global Lead of Data Capability, Tung leads the vision and strategy that ensure the company is prepared for ever-changing data advancements.
We discussed a host of topics, including Teresa’s six insights, and concluded with her advice for business leaders using or interested in AI.
Susan Etlinger (SE): In your recent article, “The new data essentials,” you laid out the notion that proprietary data is an organization’s competitive advantage. Would you elaborate?
Teresa Tung (TT): Until now, data has been treated as a project. When new insights are needed, it can take months to source the data, access it, analyze it, and publish insights. If those insights spur new questions, that process must be repeated. And if the data team has bandwidth limitations or budget constraints, even more time is needed.
“Instead of treating it as a project—an afterthought—proprietary data should be treated as a core competitive advantage.”
Generative AI models are pre-trained on an existing corpus of internet-scale data, which makes it easy to begin on day one. But they don’t know your business, people, products or processes and, without that proprietary data, models will deliver the same results to you as they do to your competitors.
Companies invest every day in products based solely on their opportunity. We know the opportunity of data and AI—improved decision making, reduced risk, new paths to monetization—so shouldn’t we think about investing in data similarly?
SE: Since so much of a company’s proprietary knowledge sits within unstructured data, can you talk about its importance?
TT: Yes, most businesses run on structured data—data in tabular form. But most data is unstructured. From voice messages to images to video, unstructured data is high fidelity. It captures nuance. Here’s an example: if a customer calls customer support and leaves a product review, that data could be extracted by its components and transferred to a table. But without nuanced inputs like the customer’s tone of voice or even curse words, there isn’t a complete and accurate picture of that transaction.
Unstructured data has historically been challenging to work with, but generative AI excels at it. It actually needs unstructured data’s rich context to be trained. It’s so important in the age of generative AI.
SE: We hear a lot about synthetic data these days. How do you think about it?
TT: Synthetic data is necessary to fill in data gaps. It enables companies to explore multiple scenarios without the extensive costs or risks associated with real data collection.
Advertising agencies can run various campaign images to forecast audience reactions, for example. For automotive manufacturers training self-driving cars, pushing cars into dangerous situations isn’t an option. Synthetic data teaches AI—and therefore the car—what to do in edge situations, including heavy rain or a surprise pedestrian crossing.
Then there’s the idea of knowledge distillation. If you’re using the technique to create data with a larger language model (let’s say, a 13-billion-parameter model), that data can be used to fine-tune a smaller model, making the smaller model more efficient, cost-effective, or deployable to a smaller device.
AI is so hungry. It needs representative data sets of good scenarios, edge conditions, and everything in between to be relevant. That’s the potential of synthetic data.
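As an editorial illustration of the gap-filling idea above (not a description of any tooling mentioned in the interview), here is a minimal Python sketch. It counts how often each condition appears in a small set of real records and generates labeled synthetic records to top up the underrepresented ones. The field names, the `make_scenario` and `fill_gaps` helpers, and the random sampling are all hypothetical; a real pipeline would likely use a simulator or a generative model instead.

```python
import random
from collections import Counter


def make_scenario(weather, rng):
    """Build one hypothetical synthetic driving scenario (illustrative fields only)."""
    return {
        "weather": weather,
        "pedestrian_crossing": rng.random() < 0.5,
        "speed_kph": round(rng.uniform(20, 100), 1),
        "synthetic": True,  # label synthetic rows so they stay distinguishable from real data
    }


def fill_gaps(records, conditions, target, seed=0):
    """Return synthetic records that top up each condition to `target` rows."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    counts = Counter(r["weather"] for r in records)
    synthetic = []
    for condition in conditions:
        # Only generate what is missing; well-covered conditions get nothing.
        for _ in range(max(0, target - counts[condition])):
            synthetic.append(make_scenario(condition, rng))
    return synthetic


# Assumed toy data: plenty of clear-weather records, one heavy-rain record, no fog.
real = [{"weather": "clear", "synthetic": False}] * 5
real.append({"weather": "heavy_rain", "synthetic": False})

extra = fill_gaps(real, ["clear", "heavy_rain", "fog"], target=5)
balanced = Counter(r["weather"] for r in real + extra)
print(balanced)
```

Keeping a `synthetic` flag on every generated record is a deliberate choice in this sketch: it lets a team audit or remove synthetic rows later without guessing where they came from.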
SE: Unstructured data is generally data that human beings generate, so it’s often case-specific. Can you share more about why context is so important?
TT: Context is key. We can capture it in a semantic layer or a domain knowledge graph. It’s the meaning behind the data.
Think about every domain expert in a workplace. If a company runs a 360-degree customer data report that spans domains or even systems, one domain expert will analyze it for prospective customers, another for customer service and support, and another for customer billing. Each of these experts wants to see all the data but for their own purpose. Knowing trends within customer support may influence a marketing campaign approach, for example.
Words often have different meanings, as well. If I say, “that’s hot for summer,” context will determine whether I was implying temperature or trend.
Generative AI helps surface the right information at the right time to the right domain expert.
SE: Given the pace and power of intelligent technologies, data and AI governance and security are top of mind. What trends are you noticing or forecasting?
TT: New opportunities come with new risks. Generative AI is so easy to use, it makes everybody a data worker. That’s the opportunity and the risk.
Because it’s easy, generative AI embedded in apps can lead to unintended data leakage. For this reason, it’s critical to think through all the implications of generative AI apps to reduce the risk that they inadvertently reveal confidential information.
We need to rethink data governance and security. Everyone in an organization needs to be aware of the risks and of what they’re doing. We also need to think about new tooling like watermarking and confidential compute, where generative AI algorithms can be run within a secure enclave.
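To make the leakage risk concrete, here is a minimal Python sketch of one possible guardrail: masking obviously sensitive strings before text is passed to a generative AI app. This is an illustration only, not a technique described in the interview. The `PATTERNS` table and `redact` helper are hypothetical, and two regular expressions are far from sufficient for real data protection; they only show the shape of the idea.

```python
import re

# Hypothetical patterns for illustration; real deployments need far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def redact(text):
    """Replace each match with a placeholder and report how many were masked."""
    found = {}
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        found[label] = count
    return text, found


prompt = "Contact jane.doe@example.com about card 4111 1111 1111 1111 today."
safe_prompt, report = redact(prompt)
print(safe_prompt)
print(report)
```

Returning the counts alongside the masked text gives a governance team something to log and review, which fits the point that everyone working with these apps needs visibility into what they are sharing.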
SE: You’ve said generative AI can jumpstart data readiness. Can you elaborate on that?
TT: Sure. Generative AI needs your data, but it can also help your data.
By applying it to your existing data and processes, generative AI can build a more dynamic data supply chain, from capture and curation to consumption. It can classify and tag metadata, and it can generate design documents and deployment scripts.
It can also support the reverse engineering of an existing system prior to migration and modernization. It’s common to think data can’t be used because it’s in an old system that isn’t yet cloud enabled. But generative AI can jumpstart the process; it can help you understand data, map relationships across data and concepts, and even write the program including the testing and documentation.
Generative AI changes what we do with data. It can simplify and speed up the process by replacing one-off dashboards with interactivity, like a chat interface. We should spend less time wrangling data into structured formats by doing more with unstructured data.
SE: Finally, what advice would you give to business and technology leaders who want to build competitive advantage with data?
TT: Start now or get left behind.
We’ve woken up to the potential AI can bring, but its potential can only be reached with your organization’s proprietary data. Without that input, your result will be the same as everyone else’s or, worse, inaccurate.
I encourage organizations to focus on getting their digital core AI-ready. A modern digital core is the technology capability to drive data in AI-led reinvention. It’s your organization’s mix of cloud infrastructure, data and AI capabilities, and applications and platforms, with security designed into every level. Your data foundation—as part of your digital core—is essential for housing, cleansing and securing your data, ensuring it’s high quality, governed and ready for AI.
Without a strong digital core, you don’t have the proverbial eyes to see, brain to think, or hands to act.
Your data is your competitive differentiator in the era of generative AI.
Teresa Tung, Ph.D., is Global Data Capability Lead at Accenture. A prolific inventor with over 225 patents, Tung specializes in bridging enterprise needs with breakthrough technologies.
Learn more about how to get your data AI-ready:
- Get to know the Microsoft Intelligent Data Platform.
- Learn why a strong data and AI foundation is essential to every organization’s success, today and tomorrow.
- Learn how to develop an intelligent data strategy that endures in the era of AI with the downloadable e-book.
Visit Azure Innovation Insights for more executive perspective and guidance on how to transform your business with cloud.
The post 6 insights to make your data AI-ready, with Accenture’s Teresa Tung appeared first on Microsoft Azure Blog.