Research mentorship for data science students
RISE Research

TL;DR: This post explains what original data science research looks like for high school students, which journals publish it, and how RISE Research mentorship guides students from a raw question to a published paper. RISE Scholars pursuing data science research are accepted to Top 10 universities at three times the standard rate. The Summer 2026 Priority Deadline is April 1st. Read on to understand what the process actually involves, then book a free Research Assessment to see if it fits your timeline.
Why Data Science Research Is One of the Most Competitive Edges a High Schooler Can Build
Most high school students who love data science spend their time building projects: a sentiment analysis tool, a Kaggle competition entry, a personal finance dashboard. These are valuable. But they are not research. And university admissions committees at MIT, Stanford, and Carnegie Mellon know the difference.
Original data science research means formulating a question that has not been answered, collecting or curating a dataset to answer it, applying rigorous methodology, and contributing a finding that other researchers can build on. That is a fundamentally different activity from building a portfolio project, and it signals a higher level of intellectual maturity to admissions readers.
Research mentorship for data science students bridges that gap. With the right PhD mentor, a high school student in Grade 10 or 11 can produce work that belongs in an academic journal, not just a GitHub repository. This post covers what that research looks like, who the mentors are, where the work gets published, and how the RISE Research program structures the entire process from first conversation to final submission.
What Kind of Data Science Research Can a High School Student Actually Do?
High school students can conduct original, publishable data science research without access to proprietary datasets, supercomputing clusters, or university labs. The most credible student projects use publicly available datasets, open-source tools, and well-defined methodologies that a PhD mentor can guide end to end.
The range of approaches is wider than most students expect. Data science research at the high school level spans statistical analysis, machine learning model development, natural language processing, computational social science, and data-driven policy analysis. Each of these methodologies can produce a genuine contribution to the academic literature when the research question is sharp and the execution is rigorous.
Here are five specific research directions that RISE students have pursued or that fit well within the program structure:
Predicting Student Dropout Rates Using Longitudinal Educational Data: A supervised learning study using publicly available datasets from national education agencies, targeting journals focused on educational data mining.
Algorithmic Bias in Hiring Recommendation Systems: A Systematic Literature Review: A qualitative and quantitative synthesis of published bias audits, suitable for journals covering AI ethics and fairness.
Forecasting Air Quality Index Variations Using Time-Series Machine Learning Models: A computational study drawing on open EPA or WHO atmospheric datasets, connecting data science to environmental outcomes.
Sentiment Dynamics on Social Media During Public Health Crises: An NLP Analysis: A natural language processing study using Twitter or Reddit public APIs, with applications in computational social science.
Identifying Socioeconomic Predictors of Broadband Access Gaps Using U.S. Census Data: A regression-based analysis addressing digital equity, publishable in journals covering data science and public policy.
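To make the last direction more concrete, here is a minimal sketch of what a regression-based analysis can look like at its simplest: an ordinary least squares fit with a single predictor, written in plain Python. The function name, the variable names, and the five data points are all hypothetical and made up for illustration; they are not Census figures or results from any RISE project, and a real study would use many more observations, several predictors, and a statistics library.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot

    return slope, intercept, r_squared


# Made-up illustrative values, NOT real Census data:
# median household income (thousands of dollars) per region,
# and the share of households with broadband (percent).
median_income_k = [30, 40, 50, 60, 70]
broadband_pct = [55, 62, 71, 78, 84]

slope, intercept, r_squared = fit_simple_ols(median_income_k, broadband_pct)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
```

A fitted slope on its own is a description, not a finding. The research contribution comes from the question behind it, the choice of predictors, and how carefully the result is interpreted.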
The right topic depends on your interests within data science, whether that is machine learning, data ethics, computational biology, or social impact analytics. That alignment is exactly what the first mentorship session is designed to find.
The Data Science Mentors Who Guide RISE Students
RISE matches students to mentors based on subject fit and research overlap, not on who is available that week. For data science students, this means being paired with a PhD researcher whose own work intersects directly with the student's chosen direction.
Dr. Aisha Nkemdirim holds a PhD from MIT and researches algorithmic fairness, bias mitigation in machine learning systems, and responsible AI deployment. RISE students working on AI ethics, fairness audits, or NLP-based social research are frequently matched with her because her methodological expertise maps directly onto the kinds of questions they are pursuing.
Dr. Rohan Mehta completed his doctorate at Carnegie Mellon University, where he focused on predictive modeling and applied statistics in health informatics. Students interested in healthcare data, epidemiological modeling, or clinical prediction tools work with Dr. Mehta because his research background gives them a precise framework for structuring their own analysis.
Dr. Lena Hartmann holds a PhD from the University of Oxford and specializes in computational social science, with a focus on large-scale behavioral data and network analysis. Students whose data science projects intersect with economics, political science, or sociology benefit from her ability to bridge quantitative methods and social theory in a way that produces genuinely interdisciplinary research.
You can browse all data science mentors on RISE to see the full range of research backgrounds available.
What a Real Data Science Research Project Looks Like From Start to Finish
Priya was a Grade 11 student from Singapore with a strong foundation in Python and a genuine interest in how algorithmic systems affect real-world outcomes. She had completed several online machine learning courses and built a few personal projects, but she wanted her university application to reflect something more substantive than a portfolio of tutorials.
In her first session with her RISE mentor, Dr. Nkemdirim, Priya described her interest in hiring technology and fairness. Together, they narrowed the focus to a specific, answerable question: whether resume screening algorithms trained on historical hiring data systematically disadvantage candidates from lower-income educational backgrounds, using a publicly available synthetic hiring dataset from a published bias benchmark study.
Over eight weeks, Priya built a classification pipeline in Python, tested three model architectures, and ran a series of fairness audits using established metrics from the algorithmic accountability literature. Her mentor guided her through the statistical interpretation, helped her situate her findings within the existing research, and reviewed each section of her paper before submission.
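As a rough illustration of what a fairness audit of this kind can involve, the sketch below computes two widely used group fairness metrics on toy data: the demographic parity difference (the gap in selection rates between groups) and the equal opportunity difference (the gap in true positive rates). The function names, the group labels "A" and "B", and the records are hypothetical. They are not taken from Priya's paper or from her dataset, and the source does not say which metrics she used.

```python
from collections import defaultdict


def selection_rates(groups, predictions):
    """Share of positive predictions (e.g. 'advance to interview') per group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for g, p in zip(groups, predictions):
        totals[g] += 1
        positives[g] += p
    return {g: positives[g] / totals[g] for g in totals}


def true_positive_rates(groups, labels, predictions):
    """Among candidates whose true label is 1, share predicted positive, per group."""
    qualified = defaultdict(int)
    hits = defaultdict(int)
    for g, y, p in zip(groups, labels, predictions):
        if y == 1:
            qualified[g] += 1
            hits[g] += p
    return {g: hits[g] / qualified[g] for g in qualified}


def max_gap(rates):
    """Largest difference in a per-group rate; 0 means all groups are treated alike."""
    return max(rates.values()) - min(rates.values())


# Toy data, made up for illustration only.
groups      = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels      = [1, 1, 0, 0, 1, 1, 0, 0]   # 1 = actually qualified
predictions = [1, 1, 1, 0, 1, 0, 0, 0]   # 1 = model recommends the candidate

dp_gap = max_gap(selection_rates(groups, predictions))
eo_gap = max_gap(true_positive_rates(groups, labels, predictions))
print(f"demographic parity difference: {dp_gap:.2f}")
print(f"equal opportunity difference:  {eo_gap:.2f}")
```

Reporting more than one metric is a deliberate choice: different fairness criteria can disagree with one another, so a single number rarely settles whether a model treats groups equitably.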
Priya's paper was accepted by the Journal of Artificial Intelligence Research student track. She went on to receive an offer from University College London's Computer Science program and listed her published research as a central element of her personal statement. Her advice to younger students: the research question matters more than the tools. Get the question right first.
You can read more about how RISE students develop their projects on the RISE student projects page.
Which Journals Publish High School Data Science Research?
Several peer-reviewed and indexed journals actively publish rigorous data science research from pre-university authors. The most relevant for RISE students are the Journal of Emerging Investigators, the Undergraduate Journal of Mathematical Modeling, Cureus for data-driven health research, and the MIT Science Policy Review for data science work with policy implications.
The Journal of Emerging Investigators is peer-reviewed and specifically designed to publish research from pre-university and early undergraduate students. It accepts empirical and computational work across STEM disciplines, including data science, and its review process is rigorous enough to carry genuine weight on a university application.
The Undergraduate Journal of Mathematical Modeling publishes quantitative modeling work, making it a strong fit for students whose data science projects involve statistical modeling, simulation, or applied mathematics. It is indexed and peer-reviewed, and acceptance is competitive.
Cureus is an open-access, peer-reviewed medical and health sciences journal that accepts data-driven research, including machine learning applications in clinical settings. For students whose projects sit at the intersection of data science and health informatics, it is one of the most credible publication venues available.
The MIT Science Policy Review publishes work that connects scientific findings to policy questions. Data science students analyzing broadband access, algorithmic governance, or digital equity will find it a natural home for their research. Acceptance is selective, and publication there signals a level of analytical sophistication that admissions committees at top universities notice.
You can explore the full list of publication venues RISE students have used on the RISE publications page. Your mentor will advise on which journal is the right fit for your specific research question. Some topics suit more than one venue.
How RISE Data Science Research Mentorship Works, Week by Week
The program begins with a free Research Assessment, which is a 20-minute conversation, not an interview. There is no test, no portfolio requirement, and no prior research experience needed. The goal is simply to understand where the student is, what they find genuinely interesting within data science, and whether the timing works for the Summer 2026 cohort.
In the first two weeks, the student and mentor work together to develop the research question. This is not a process where a topic is assigned. The mentor asks questions, the student responds, and the question emerges from that conversation. For data science students, this phase also involves identifying the right dataset and confirming that the methodology is achievable within the program timeline.
Weeks three through eight are the active research phase. Students meet weekly with their PhD mentor for a focused session that covers methodology, analysis, and writing in parallel. For data science projects, this typically means reviewing code and statistical outputs in the early weeks, then shifting toward interpretation and academic writing as the analysis matures. The mentor does not write the paper. They ask the questions that help the student write it better.
In weeks nine and ten, the focus shifts to submission and application strategy. The paper is finalized and submitted to the target journal. Simultaneously, the student works with their mentor to articulate the research experience in their Common App or UCAS personal statement. The research does not just appear on an activities list. It becomes the intellectual backbone of the application narrative. RISE Scholars who go through this process are accepted to Top 10 universities at three times the standard rate, and the RISE results page documents those outcomes in detail.
The Summer 2026 cohort opens in April. If your child is serious about data science research and wants to publish original work before their university applications, book a free 20-minute Research Assessment to see if the timing works.
Frequently Asked Questions About Data Science Research Mentorship
Do I need access to proprietary or industry datasets to do real data science research?
No. The majority of RISE data science projects use publicly available datasets from sources like the U.S. Census Bureau, the WHO, NASA, the World Bank, or published academic benchmarks. Rigorous research depends on the quality of the question and the methodology, not on whether the data is proprietary.
Many of the most cited papers in machine learning and computational social science use open datasets precisely because reproducibility is a core standard in academic research. Your RISE mentor will help you identify a dataset that is both publicly accessible and rich enough to support an original contribution.
What coding or statistics background does a student need before starting?
Students should have some familiarity with Python or R before the program begins, but they do not need to be advanced programmers. A student who has completed an introductory Python course and understands basic statistics is ready to start. The mentor fills the methodological gaps that matter for the specific project.
What matters more than technical skill at the start is intellectual curiosity and the ability to ask a precise question. The research question drives everything else. Technical skills can be developed during the program. The instinct to ask a good question is what the mentor is looking for in the first session.
Is a data science research paper the same thing as a coding project or a Kaggle competition entry?
No. A research paper makes a claim, supports it with evidence, and situates it within the existing academic literature. A coding project or competition entry demonstrates technical skill but does not contribute a finding to a scholarly conversation. Admissions committees at MIT, Stanford, and similar institutions distinguish clearly between the two.
A published research paper tells an admissions reader that the student can think like a researcher: identify a gap, design a study, interpret results, and communicate findings to a scholarly audience. That is a different signal entirely from a strong GitHub profile, and it carries significantly more weight in a competitive application.
How does a published data science paper actually appear in a university application?
It appears in multiple places simultaneously. On the Common App, it is listed in the Activities section with the journal name and publication date. In the personal statement or supplemental essays, it provides a concrete intellectual narrative. For schools that accept research portfolios or additional materials, the paper itself can be submitted directly.
RISE Scholars report that their research becomes the most distinctive element of their application, precisely because it is specific, verifiable, and difficult to replicate. An admissions reader can look up the journal, find the paper, and confirm the contribution. That level of credibility is rare at the high school level. You can read about how RISE approaches the full admissions picture on the RISE about page.
How early should a student start data science research to have it ready for university applications?
Grade 10 or Grade 11 is the ideal starting point. A student who publishes in Grade 11 has the paper in hand before their Grade 12 application cycle begins, which gives them time to write about it thoughtfully rather than rushing to include it. Starting in Grade 12 is still possible, but the timeline is tighter and the margin for revision is smaller.
The earlier a student starts, the more options they have: a second paper, a conference presentation, or an award submission built on the original research. RISE students who begin in Grade 10 often have two publications and at least one award by the time they apply. You can see examples of those outcomes on the RISE awards page.
The Decision That Separates Good Data Science Students From Exceptional Applicants
Data science is one of the most popular subjects among high-achieving high school students globally. That popularity makes it harder, not easier, to stand out. Every competitive applicant to MIT, Stanford, or Carnegie Mellon has Python skills. Many have Kaggle rankings. Far fewer have published original research in a peer-reviewed journal under the guidance of a PhD mentor.
RISE Scholars who pursue data science research do not just add a line to their activities list. They develop a research identity: a specific intellectual contribution they can speak to in interviews, write about in essays, and point to as evidence of what they are capable of at the university level. That is what produces a 3x higher acceptance rate to Top 10 universities for RISE students compared to the general applicant pool.
The Summer 2026 Priority Deadline is April 1st. If this is the year your child moves from being strong at data science to doing something original with it, schedule a free Research Assessment and we will take it from there.