Objectives:
- Normalize the struggle inherent in developing literature-search skills, especially around tasks that seem trivially easy but are in fact difficult (like googling the correct keywords).
- Promote a growth mindset: this is a skill that is learned, not inborn.
- Teach students how to break down the literature search into smaller, simpler tasks – see guide on how to conduct a literature search.
Pre-class work
- Read Et al. for all: Citations as a Tool for Racial Equity, Inclusion, and Justice, and answer:
- What are the negative effects of citation bias?
- What can we do to adopt more conscientious citation practices?
- What do you find to be the most challenging aspect of doing a literature search?
- Describe your current strategy for doing a literature search for each of the following components: finding papers, reading papers, and tracking papers.
In class [slides]
This unit generally follows the structure of our guide on how to conduct a literature search.
- [15min] Students introduce themselves with the slides they made
- [15min] Introduction: recap of last class, group discussion of reading about citation biases
- What were everyone’s takeaways from last class about how to skim a paper?
- Has anyone tried this strategy over the last week? How did it go?
- What are some takeaways from the pre-class work about citation biases?
Insight: (a) Citations are often used as a measure of success, problematically leading to inequities. (b) More broadly, everything we do as researchers leaves a fingerprint – who we collaborate with, who we cite, etc.
- [5min] Overview: how to do a literature search?
- What is a literature search? What is it for? Why is it hard?
- Present a step-by-step guide on how to conduct a literature search
- [15min] Small group activity #1: find the relevant keywords for a given research prompt (see example below)
- [10min] Regroup: What keywords did you find? How did you know they were relevant? What was hard about this? (Probably: knowing which keywords to search for, verifying that a paper is relevant.)
- [5min] Presenting a guide to conduct a literature search
- [20min] Small-group activity #2: given a research prompt and a list of papers, which papers are relevant, and in what ways are they relevant (i.e. which category does each belong to)? (see example below)
- [20min] Regroup: go over list of papers and discuss
- [5min] Go over paper-tracking software (e.g. Zotero, the Google Scholar web extension); show an example document that keeps track of related work
- [5min] In-class survey
- Describe a new strategy you learned (if any) for doing a literature search for each of the following components: finding papers, reading papers, and tracking papers.
- Is there anything else you took away? If so, what is it?
Insight: Students found it valuable to see how the instructors skim the paper abstracts, which keywords they focused on, which technical terms they glossed over, etc. to determine whether a paper is relevant.
Example Exercise
Given the research problem below (purposefully selected to be in a CS-adjacent field – statistical genetics – so that no student has prior knowledge):
- Search for relevant keywords online. What are they? How did you know if they are relevant?
- Given the list of papers below, which are relevant? In what way(s) are they relevant/irrelevant to the problem statement?
Problem: There is significant bias in the data collection used in Genome-Wide Association Studies (GWAS) towards populations that identify as of European ancestry. As a result, statistical models of the data are substantially less predictive for populations identifying as of non-European ancestry.
Significance: Increased health disparities between populations identifying as of European vs. non-European ancestry.
Goal: To develop statistical methodology capable of both using existing data and effectively leveraging data from a more inclusive collection process to generalize well to populations identifying as of non-European ancestry.
Relevance: For each of the papers below, in what way is it (or is it not) relevant to the problem statement? For example:
- The paper is irrelevant
- The paper motivates the problem
- The paper solves the same problem
- The paper solves a similar problem
- The paper solves a subproblem
List of papers:
- Non-parametric genetic prediction of complex traits with latent Dirichlet process regression models
- Current clinical use of polygenic scores will risk exacerbating health disparities
- Understanding the population structure correction regression
- Leveraging fine-mapping and non-European training data to improve trans-ethnic polygenic risk scores
- Genes mirror geography within Europe
- Improving Polygenic Prediction in Ancestrally Diverse Populations
- Getting Genetic Ancestry Right for Science and Society
- Multi-group Gaussian Processes
- Human Demographic History Impacts Genetic Risk Prediction across Diverse Populations
- Toward a fine-scale population health monitoring system
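For instructors who want students to record their decisions in a structured way, the relevance categories above can be kept in a small tracking script rather than a free-form document. The following is a minimal sketch, not part of the original exercise: the names `Relevance`, `PaperNote`, and `group_by_relevance` are hypothetical, and the entries are placeholders, not answers to the exercise.

```python
from dataclasses import dataclass, field
from enum import Enum


class Relevance(Enum):
    # Categories taken from the "For example" list in the exercise.
    IRRELEVANT = "irrelevant"
    MOTIVATES = "motivates the problem"
    SAME_PROBLEM = "solves the same problem"
    SIMILAR_PROBLEM = "solves a similar problem"
    SUBPROBLEM = "solves a subproblem"


@dataclass
class PaperNote:
    # One record per paper a student has skimmed (hypothetical structure).
    title: str
    relevance: Relevance
    keywords: list[str] = field(default_factory=list)
    note: str = ""


def group_by_relevance(notes):
    """Return a dict mapping each Relevance category to the titles filed under it."""
    groups = {category: [] for category in Relevance}
    for n in notes:
        groups[n.relevance].append(n.title)
    return groups


# Placeholder entries for illustration only; they are not answers to the exercise.
notes = [
    PaperNote("Placeholder paper A", Relevance.MOTIVATES, ["keyword 1"], "cite in the introduction"),
    PaperNote("Placeholder paper B", Relevance.IRRELEVANT),
    PaperNote("Placeholder paper C", Relevance.MOTIVATES),
]

for category, titles in group_by_relevance(notes).items():
    print(f"{category.value}: {titles}")
```

Grouping by category makes it easy to compare how different small groups classified the same paper during the regroup discussion.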