State of AI Engineering: Survey
Industries are racing to integrate large language models (LLMs) into their core operations. From better summarizing medical research to navigating complex case law, many early movers are seeing outsized benefits. The professionals building LLM systems — AI engineers, developers, data scientists, and industry leaders — play a pivotal role in shaping this future. This survey delves into the minds of these builders, offering a glimpse into their plans and challenges as they navigate the complexities of deploying LLMs in production environments.
Methodology
To capture the current state of AIOps and AgentSREs, we surveyed 471 AI engineers, data scientists, developers, and others across a range of sectors. The survey, which was promoted in technical communities and via email to adopters of open source tools, includes questions about deployment strategies, model preferences, and the primary obstacles faced in implementing LLMs.
Small Language Models; Big Impact
Although models like Microsoft’s Phi-3 and OpenAI’s GPT-4o mini launched only recently, organizations are already moving quickly to use them in production because of their cost-efficient performance. Over half of AI teams (57.7%) say they are planning a production deployment of a small language model in the next 12 months or “as fast as possible,” and 5.9% are already in production.
LLMs Escape the Workplace
AI engineers and developers are also using generative AI outside of work. Nearly half (45.9%) say they use LLMs to code personal projects. Over one in ten also report using LLMs to create social media posts (11.5%) and apply to jobs (11.3%). A further 3.4% even admit to using LLMs on dating apps.
Rising Tide Lifts All Foundation Models
Despite so many new models launching over the past year, OpenAI’s early dominance does not appear to be eroding. Adoption of most foundation models is up year over year. Despite its oldest model being under a year old, Mistral is now used by 23.1% of AI teams.
Use Cases
The most common use case cited by AI teams that have or are planning production deployments of LLMs is chatbots, followed by code generation, summarization, and structured data extraction. Chatbots appear evenly split between internal-facing and external-facing applications. Common “Other” responses include text and image generation (e.g., for product documentation), entity extraction, and translation.
Barriers To Implementation
Privacy is cited as the top implementation barrier to production deployments of LLMs, followed closely by accuracy of responses and hallucinations. The most common “Other” answer is cost, followed by various calls for better guardrails and safeguards.
Prompt Engineering and New Tools Enter the Mainstream
Most AI teams using LLMs today rely on prompt engineering. In terms of tooling, roughly two in five (41.0%) rely on LLM observability to better evaluate and trace generative AI applications, followed closely by vector DBs and AI memory (39.7%).
Preferences: Open-Source Versus Proprietary
Developers and AI teams are nearly evenly divided among the different options for proprietary and open-source models. Over the past year, there appears to be a surge in interest in third-party cloud-hosted models (e.g., GPT-4o on Azure) and a slight decrease in proprietary fine-tuned models (e.g., Llama 3 deployed on a private cloud).
Split On Regulation
Do AI engineers and others building with LLMs favor more regulation of AI? No, it seems. The 18.9% who favor more regulation appear to be outnumbered both by the one-third who prefer to hold off on new regulation or to better enforce existing regulations and by the 31.9% who are neutral.
Serving Language
When it comes to serving languages, Python rules the day.
Conclusion
The survey highlights a decisive shift towards widespread adoption of LLMs, driven by their potential to revolutionize everything from medical research to business operations and customer experiences. AI engineers are poised to navigate challenges like cost and expertise, with a focus on practical applications and a mix of open-source and proprietary models. As the industry evolves, we anticipate a surge in tools and best practices to streamline LLM deployment and maximize their impact.