Four Tips on How To Read AI Research Papers Effectively

Amber Roberts

Machine Learning Engineer

According to a recent survey, over two-thirds (66.9%) of developers and machine learning teams are planning production deployments of LLM apps in the next 12 months or “as fast as possible” – and 14.1% are already in production!

Given the rapid rate of progress and constant drumbeat of new foundation models, orchestration frameworks and open source libraries – as well as the workaday challenges of getting an app into production – it can be difficult to find the time to digest and read the dizzying array of cutting-edge AI research papers hitting arXiv.

That task has never been more critical, however, as the time between academic discovery and industry application moves from years to weeks. How can teams read AI research papers without losing nuance, with an eye toward pragmatic application, while balancing real-world challenges?

In a recent webinar with Deep Learning AI, we explored strategies for understanding and applying the latest research, reducing mean time to application. Here are four takeaways.

Follow the Right People

The onslaught of new research makes a filtering mechanism essential for deciding what to read. Investing time in building a list of researchers and industry leaders to follow, based on your specific industry or interest areas, pays off quickly. A few of our favorite follows: Yannic Kilcher, DAIR_AI, Yann LeCun, Andrej Karpathy, swyx, Jeff Dean, and Greg Kamradt, to name only a few. Joining technical communities like Cerebral Valley and attending AI research paper community readings can also help.

Identify the Type of Paper and Break It Down Accordingly

While there are probably dozens of archetypes of AI research papers across academia and industry, many fall under three general categories that offer a useful shorthand for practitioners.


Survey Papers

Surveys typically give a detailed overview of a certain topic, summarizing where the field currently stands in a specified area. Generally, the goal of a survey paper is to map what is happening in a given field and identify trends, common patterns, and research opportunities.

An example is illustrative. Say you are wondering whether to read “A Survey of Large Language Models.” This might be useful if you want to:

  • Use one of the surveyed LLMs or compare them against your current LLM
  • Compare open vs. closed source capabilities
  • Compare pre-training, data curation and prompting methods
  • Compare architectures and parameters (Encoder/Decoder, Size, Normalization, Activation, Bias, Attention patterns)
  • Compare cost, compute or hardware components
  • Review the comparative capacities and evaluations

It’s worth keeping in mind that survey papers are less suited to offering a technical deep dive into a specific model or introducing novel ideas.

Benchmarking and Dataset Papers

Benchmarking papers are usually the first wave after a breakthrough paper because they often define how we evaluate new breakthroughs. Examples in the world of LLMs that many will recognize include MMLU, HellaSwag, and TruthfulQA. These papers typically introduce a dataset for testing or roll out a new evaluation approach on an existing dataset. The goal is to use the new dataset or evaluation metric to evaluate an LLM’s capabilities, learn a model’s limitations based on what and how it is evaluated, or consider how to expand benchmark coverage.

These papers are worth reading when you want to use a metric to benchmark your current LLM, compare model costs against performance benchmarks, or modify a benchmark to better fit your use case.
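As a rough illustration of the first use case, here is a minimal sketch of scoring a model against a multiple-choice benchmark. The dataset items and the `first_choice` "model" are hypothetical stand-ins for a real benchmark split (e.g., MMLU-style questions) and your own model's API call.

```python
def benchmark_accuracy(dataset, answer_fn):
    """Score a model over (question, choices, answer) items and return accuracy."""
    correct = 0
    for item in dataset:
        # answer_fn stands in for a call to whatever LLM you are evaluating
        prediction = answer_fn(item["question"], item["choices"])
        if prediction == item["answer"]:
            correct += 1
    return correct / len(dataset)

# Toy items standing in for a real benchmark dataset.
toy_dataset = [
    {"question": "2 + 2 = ?", "choices": ["3", "4", "5"], "answer": "4"},
    {"question": "Capital of France?", "choices": ["Paris", "Rome"], "answer": "Paris"},
]

# A stub "model" that always picks the first choice.
def first_choice(question, choices):
    return choices[0]

print(benchmark_accuracy(toy_dataset, first_choice))  # 0.5
```

Swapping in a different `answer_fn` (or a modified dataset) is exactly the kind of comparison these papers enable: same harness, different model or benchmark variant.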

A few things to look out for in these papers:

  • Bias in the dataset
  • Does a single definitive answer exist?
  • Does the question provide enough context?
  • Does it count if they get the right answer the wrong way?

Generally, these papers are less useful for introducing new capabilities or for providing detailed technical breakdowns of a model.

Breakthrough Papers

Breakthrough papers – think Mixtral of Experts, QLoRA: Efficient Finetuning of Quantized LLMs, or LLaMA: Open and Efficient Foundation Language Models – are must-reads because they represent major leaps forward in the field. Reading them effectively means understanding what novel idea is being introduced, how it impacts the current landscape, and where it might be applied. These papers do not usually provide a good overview of a whole space or detail the datasets used to evaluate a model.

Be An Active, Agile Reader

Approach each paper knowing that it’s a piece of a larger puzzle. Recognize that what you’re reading today might be challenged or built upon tomorrow. This field evolves rapidly, and maintaining an open, inquisitive mind is essential. For technical readers, this means constantly questioning and validating findings, even when they come from reputed sources or established theories. To that end, getting hands-on is critical. Implementing a model from a paper in a notebook, replicating a study, or even proposing an alternative approach can help you not only understand the paper better but also contribute to the field.

Follow Real-Time Progress In the Field

Traditional, peer-reviewed research takes a long time. Even preprint papers involve lengthy review of results and collaboration among many authors. In an industry where foundation model breakthroughs and new frameworks upend traditional machine learning use cases overnight, however, it is important to stay abreast of research in all its forms.

To that end, our co-founder and Chief Product Officer Aparna Dhinakaran recently started releasing biweekly research on social media and our blog, tackling burning questions from customers and publishing repeatable open source results that internal teams review for “holes” before publication. We are encouraged to see others embracing this approach on fast-moving topics, with the understanding that moving fast means we might make mistakes – and that’s OK so long as we own up to them.


As AI continues to grow, these skills will be increasingly crucial for staying abreast of the latest developments and making meaningful contributions.