Keeping up with research
It depends on which field you are in, but if you are a graduate student in machine learning (ML), a thought that has probably crossed your mind is: how do I keep up with this enormous volume of research? Hundreds of papers get uploaded to ArXiv every day, and every other week there’s a new open-source large language model released by a big tech company. Because I work in ML, the analogies I draw will be related to ML, but the general advice on how to read can be applied to any field (and not only to research papers but to books too).
As a disclaimer, note that everyone has to come up with their own way of reading and coping with the literature, so take what follows with a grain of salt.
I recently came across an article by Morgan Housel that essentially said that a good strategy for reading [books] is to take lots of inputs, inundate yourself with information, but have a strong filter to retain what’s best. I have been an avid reader of his work, which gives a great outsider perspective on finance and investing, with examples from history that have withstood the test of time. Simply put, his articles are short, incisive, fun to read, and always leave you with something more to think about.
Starting with books, it often happens that we start reading a book and, when it stops being interesting, we push through just so that we can tell ourselves that we have finished it. I have been guilty of this numerous times. This is the way we were taught to approach reading in school, so it feels innate in that sense. But some really smart people do not follow this norm. For example, Charlie Munger does not go beyond the first chapter, and Naval Ravikant skims and jumps around different chapters. Neither of them carries the “burden” of not getting through a book.
But why should you read more? This is a question I have asked myself, and the answer fascinated me. It all started to make sense. First, when you only go through books of the same genre that you’re comfortable with and know what to expect (like reading research papers in your field), you limit your ability to connect the dots across different fields. Second, when you increase your bandwidth and let new ideas in, you also increase the chances of finding that idea (or set of ideas) that might have a great application in your field. In purely ML terms, supposing that each neuron is an idea that you’ve come across, by increasing the width of your neural network, you are essentially setting the stage for those lottery ticket neurons (ideas) that would be just enough to solve your problem. Another great example comes from the history of mankind, as put forth by Matt Ridley in his book The Rational Optimist. The point here is that true innovations, i.e. the innovations that have a significant impact on the world, are the combination and recombination of several existing ideas.
Need for a filter
Our brains have finite capacity, so a strong/narrow filter is crucial after absorbing a lot of information. It is here that one has to be ruthless in terms of what’s retained. For research papers, if you find yourself going over an introductory paragraph or an abstract multiple times to understand what the contribution is or what the paper is trying to say, save yourself some time and move on. Lots of fish in the sea. Unless you’re trying to reproduce a paper, go through the paper superficially and try to extract its main idea. Practically, this means getting a high-level understanding of the paper such that it can be explained in a few sentences.
It is important to understand that training your filter on a slew of research papers takes time and effort. But the better your filter gets, the easier it is to get the gist of the paper(s), and eventually to mix and match them to get a creative, novel idea. The importance of reading more and having a good filter can be summarized as Housel does in his article:
Without flooding your brain with inputs you’ll be stuck in the tiny world of what you’ve personally experienced. But without a strong filter, you’ll be overwhelmed with choice and paralyzed by inaction.
Okay so how can I do it?
As mentioned before, everyone has to come up with an approach that works best for them. I am going to briefly describe the approach I am taking to keep up with the research in my field. As a note, what I am going to describe is a miniature version of this post by Maya Gosztyla. And, I will only focus on the “finding” part as there are several articles out there explaining how to read and organize literature that do a much better job.
A classical way to get the relevant literature delivered to your inbox is by signing up for keyword alerts on Google Scholar or alerts from the impactful journals in your field. If you are taking this route, make sure that you have these alerts categorized into labels in your inbox so that it doesn’t get flooded.
In addition to that (and this is what helped me the most), subscribing to RSS feeds and using a feed aggregator app is a great way to pool all the relevant literature. RSS (Really Simple Syndication) is a web feed format that lets users and applications pick up updates to a website as they are published. ArXiv has its own RSS feed for each category, as do most major journals. Feedly is a great RSS feed aggregator that allows you to have up to three folders with the free version, and you can add as many RSS feeds as you like within these folders. Important papers can also be saved by adding them to the “Read Later” board. Some examples of RSS feeds:
- Machine Learning (cs.LG) on ArXiv - http://arxiv.org/rss/cs.LG
- Image and Video Processing (eess.IV) - http://arxiv.org/rss/eess.IV
- Elsevier’s Medical Image Analysis - http://rss.sciencedirect.com/publication/science/13618415
One of the good features of Feedly is that you don’t have to “find” the exact RSS feed link. Its search function already has a good collection of RSS feeds that can be easily subscribed to. In my personal experience, I have found that subscribing to relevant feeds (like the ones above) automatically brings the latest literature and I just have to set some time aside to go through them. It’s just convenient.
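If you would rather script this than rely on an app, the same idea (pull a feed, then apply a filter) can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions: the feed below is a made-up sample, the function names `fetch_feed`, `parse_items` and `filter_items` are my own, and the parser assumes a plain RSS 2.0 layout (`item` elements with `title`, `link` and `description`). Real feeds may use a different layout (Atom, or namespaced elements), in which case the parsing would need adjusting.

```python
import urllib.request
import xml.etree.ElementTree as ET

# A small hand-written RSS 2.0 document standing in for a real feed.
# The titles, links and descriptions are invented for illustration.
SAMPLE_RSS = """
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Self-supervised segmentation of MRI scans</title>
      <link>https://example.org/paper-1</link>
      <description>A hypothetical abstract about medical imaging.</description>
    </item>
    <item>
      <title>Scaling laws for language models</title>
      <link>https://example.org/paper-2</link>
      <description>A hypothetical abstract about large language models.</description>
    </item>
    <item>
      <title>Denoising low-dose CT</title>
      <link>https://example.org/paper-3</link>
      <description>A hypothetical abstract that mentions image segmentation.</description>
    </item>
  </channel>
</rss>
"""


def fetch_feed(url):
    """Download a feed and return its text (not called here, needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_items(rss_text):
    """Turn the <item> elements of an RSS 2.0 document into plain dictionaries."""
    root = ET.fromstring(rss_text.strip())
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
        })
    return items


def filter_items(items, keywords):
    """Keep items whose title or summary contains any keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    kept = []
    for item in items:
        text = (item["title"] + " " + item["summary"]).lower()
        if any(keyword in text for keyword in lowered):
            kept.append(item)
    return kept


# Take lots of inputs, then apply a narrow filter.
for paper in filter_items(parse_items(SAMPLE_RSS), ["segmentation"]):
    print(paper["title"], "-", paper["link"])
```

To try it on a live feed, pass the output of `fetch_feed(...)` with one of the feed URLs above to `parse_items` instead of `SAMPLE_RSS`. A keyword match is a crude filter, of course; it only narrows down what you then skim yourself.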
It can be challenging to keep up with the vast amount of literature currently being produced. I hope that this post has left you with something to think about when it comes to reading. Good luck with your research!