
10 Summer Data Science Programs for Middle School Students

If you’re curious about how data shapes the world around you, from public health to sports analytics to artificial intelligence, a summer data science program can be a strong place to start. These programs give you early exposure to college-level academics, structured research experiences, and hands-on analytical work, all while you’re still in middle school. You learn how to collect, clean, analyze, and interpret data using real tools and real-world datasets, building practical skills that extend far beyond the classroom. Many of these programs also introduce you to university campuses, research labs, or federal agencies, helping you understand what academic and professional pathways in data science look like. Compared with a year-long commitment, a focused summer program lets you explore the field in a concentrated format, and many offer funding or financial aid, so you can try it out without a major financial investment.


If you’re especially interested in data science, statistics, or machine learning, exploring a summer program (including online options) can help you test your interest while strengthening your analytical foundation. In narrowing down this list of top summer data science programs for middle school students, we prioritized programs that are academically rigorous, offer strong mentorship or networking opportunities, maintain low acceptance rates, or are hosted by prestigious universities, research institutes, or government organizations. Here is our list of 10 summer data science programs for middle school students.




Location: Remote

Cost: Varies, financial aid available

Acceptance rate / Cohort size: Not specified

Program dates: 8 weeks in the summer

Application deadline: Varies by cohort

Eligibility: Students in grades 6 to 8


The Lumiere Junior Explorer Program is an online mentorship-based research experience tailored for middle school students who want to create an academic project in a field they’re passionate about. You’ll work closely with a mentor from universities like MIT, Harvard, or Stanford, who supports you throughout the research and project development process. 


Throughout the program, you learn to conduct independent investigations, think critically, and complete a final project that reflects your area of interest. The program balances academic challenge with scheduling flexibility, featuring multiple application rounds during the year. Need-based scholarships are available, encouraging students from diverse backgrounds to participate.


Acceptance rate/cohort size: Enrollment-based program with limited seats

Location: MehtA+ (Virtual Program)

Cost: $1,990 USD; limited scholarships of up to $1,000 available based on financial need; a separate paid internship-study option is available for qualifying students

Program Dates: June 23 – August 1

Application Deadline: Applications close June 11 or earlier if slots fill

Eligibility: Students entering Grades 8–12 (ages 13–18); prior math background recommended; no formal prerequisites listed.


In this six-week virtual bootcamp, you study university-level machine learning concepts, beginning with foundational mathematics and Python programming. The syllabus covers supervised and unsupervised learning, with topics including k-nearest neighbors, support vector machines, neural networks, convolutional neural networks, reinforcement learning, and natural language processing. You work extensively with Python libraries such as NumPy, Pandas, Scikit-learn, TensorFlow, Keras, PyTorch, Matplotlib, and NLTK, while also learning version control through GitHub and cloud computing basics via AWS and GCP. A significant component of the program is team-based research: you complete a midterm and a final interdisciplinary machine learning project, write a conference-style research paper in LaTeX, and prepare a technical poster.
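To give you a feel for the first model on that list, here is a minimal sketch of k-nearest neighbors in plain Python. It is not taken from the program's materials: the data, the labels, and the function name `knn_predict` are made up for illustration. The idea is simply to find the k training points closest to a new point and let them vote on its label.

```python
import math
from collections import Counter

# Made-up training data: (hours_studied, hours_slept) -> label.
training_data = [
    ((1.0, 4.0), "tired"),
    ((2.0, 5.0), "tired"),
    ((1.5, 4.5), "tired"),
    ((6.0, 8.0), "rested"),
    ((7.0, 9.0), "rested"),
    ((6.5, 8.5), "rested"),
]

def knn_predict(point, data, k=3):
    """Return the most common label among the k training points closest to `point`."""
    # Sort training examples by straight-line (Euclidean) distance to the query point.
    by_distance = sorted(data, key=lambda item: math.dist(point, item[0]))
    nearest_labels = [label for _, label in by_distance[:k]]
    # Majority vote among the k nearest neighbors.
    return Counter(nearest_labels).most_common(1)[0][0]

print(knn_predict((6.2, 8.1), training_data))  # rested
print(knn_predict((1.2, 4.2), training_data))  # tired
```

In a course setting you would more likely reach for a library such as Scikit-learn, but writing the vote by hand shows what the library is doing for you.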


Location: Virtual

Application deadline: Rolling deadlines.

Program dates: 25 hours over 10 weeks (on weekends) during the spring cohort and 25 hours over 2 weeks (on weekdays) during the summer cohort.

Eligibility: Students in grades 6-8


The AI Trailblazers program by Veritas AI is a virtual program that teaches middle school students the fundamentals of artificial intelligence and machine learning. Over 25 hours, you will learn the basics of Python as well as topics like data analysis, regression, image classification, neural networks, and AI ethics. You learn through lectures and group sessions with a 5:1 student-to-mentor ratio. Previous student projects have included building a machine-learning model to classify music genres and creating a machine-learning algorithm that recommends a custom list of educational resources based on selected specifications.
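If "regression" is a new word for you, the short sketch below shows the simplest version: fitting a straight line to a handful of points. It is an illustration only, not material from the program. The data is invented, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = (how x and y vary together) / (how much x varies on its own).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: practice hours vs. quiz score (these points lie exactly on y = 5x + 50).
hours = [1, 2, 3, 4, 5]
scores = [55, 60, 65, 70, 75]

slope, intercept = fit_line(hours, scores)
print(slope, intercept)  # 5.0 50.0

# Use the fitted line to predict the score after 6 hours of practice.
predicted = slope * 6 + intercept
print(predicted)  # 80.0
```

Real data never falls this neatly on a line, which is exactly why the least-squares idea of "the line with the smallest total error" matters.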


Acceptance rate/cohort size: Selective admission requiring standardized test scores and teacher recommendation

Location: Michigan State University, East Lansing, MI

Cost: $1,045 commuter (before March 1) or $1,140 commuter (after March 1); $2,100 residential (before March 1) or $2,490 residential (after March 1)

Program Dates: June 22–June 26 or July 6–July 10

Application Deadline: Applications close May 8; admissions decisions are made on a rolling basis.

Eligibility: Students currently in Grades 7–8; must submit ACT/SAT/CogAT/IQ scores meeting minimum thresholds, teacher recommendation, grade report, and essay.


In this one-week academic enrichment program, you enroll in a structured STEM track that includes “Decoding Mathematical Data: An Introduction to Data Science.” The course focuses on statistical thinking, organizing large datasets, drawing conclusions from numerical information, and presenting results clearly. You engage in daily sessions that progress from understanding statistical reasoning to analyzing and interpreting real-world data. The curriculum emphasizes research methodology, critical analysis, and communication of findings. As part of the broader MST experience, you attend three two-hour classes daily, immersing yourself in advanced STEM coursework designed for academically talented middle school students.


Acceptance rate/cohort size: Highly selective; approximately 30–36 students selected from hundreds of applicants

Location: National Center for Health Statistics, Hyattsville, MD

Cost: Free

Program Dates: August 3 – August 7

Application Deadline: Applications open in February and close on March 19

Eligibility: Students entering Grades 6 or 7; teacher recommendation required; students may attend only once.


At this week-long in-person camp, you explore statistics and data science through hands-on investigative activities led by scientists from the CDC and partner agencies. You learn how to formulate research questions, collect and organize data, analyze datasets, and visually present results. Activities emphasize real-world public health data, helping you understand how statistics inform national health policy. Throughout the week, you work collaboratively in teams and engage directly with federal statisticians and researchers. By the end of the camp, you will have developed foundational skills in critical thinking, data interpretation, and statistical reasoning within an authentic government research environment.


Acceptance rate/cohort size: Application-based selection; exact cohort size information is not published

Location: The Ohio State University, Columbus, OH

Cost: Free

Program Dates: July 7–July 11 (tentative dates)

Application Deadline: March 30 (tentative dates)

Eligibility: Students entering Grades 7–9 in the fall; applicants must attend school within the State of Ohio


In this five-day in-person camp hosted by the Translational Data Analytics Institute, you explore core concepts in data science and analytics through collaborative, team-based learning. You work in small groups alongside mentors to analyze datasets and develop structured problem-solving approaches. Daily sessions introduce you to how data is collected, interpreted, and applied across fields such as business, biology, healthcare, and public policy. You also interact with university researchers and current college students to better understand academic pathways in data science. The program emphasizes analytical thinking, teamwork, and real-world applications rather than theoretical instruction. The week concludes with group presentations showcasing your analytical findings and project work.


Acceptance rate/cohort size: Enrollment-based course with limited seats; exact cohort information is not available

Location: University of California, San Diego Extension, San Diego, CA.

Cost: Varies by session; the course carries 1.50 units

Program Dates: Dates TBD

Application Deadline: Enrollment-based registration; students must enroll before the session start date

Eligibility: Middle school students; no prior programming required.


In this technical course, you are introduced to Python programming as a foundation for machine learning and data science. You write scripts using conditionals, loops, and functions before applying those skills to build an image classification model. Using Google Teachable Machine, you train a model with at least three classes and then integrate it into a Python-based application. A distinctive component is the hardware integration, where you test your classifier on a Raspberry Pi platform. The curriculum emphasizes applied coding, model testing, and debugging workflows. By the end of the course, you have built and deployed a basic machine learning application in both web and hardware environments.
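As a rough idea of how conditionals, loops, and functions come together around a classifier, here is a small Python sketch. It does not use Google Teachable Machine or a Raspberry Pi, and it is not from the course. The confidence scores are invented, and `pick_label` is a hypothetical function that decides what to do with a three-class model's output.

```python
def pick_label(scores, threshold=0.6):
    """Return the highest-scoring class name, or "unsure" if nothing clears the threshold."""
    best_label = None
    best_score = 0.0
    # Loop: check every class and remember the one with the highest confidence.
    for label, score in scores.items():
        if score > best_score:
            best_label = label
            best_score = score
    # Conditional: only trust the prediction if the model is confident enough.
    if best_score >= threshold:
        return best_label
    return "unsure"

# Hypothetical confidence scores for a three-class image model.
print(pick_label({"cat": 0.82, "dog": 0.11, "bird": 0.07}))  # cat
print(pick_label({"cat": 0.40, "dog": 0.35, "bird": 0.25}))  # unsure
```

The threshold is a design choice: raising it makes the application answer "unsure" more often instead of guessing.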


Acceptance rate/cohort size: Enrollment-based; cohorts average 6 students per group

Location: Berkeley Coding Academy, Virtual

Cost: Approximately $2,699

Program Dates: July 6–July 31 (Full AI Package); July 6–July 24 (Python + Data Science); July 13–July 31 (Data Science + AI)

Application Deadline: Rolling enrollment until seats fill

Eligibility: Students ages 12–14 (middle school eligible); no prior programming required for the full package; advanced package assumes prior Python experience.


In this four-week virtual program, you learn Python programming and apply it to data science and artificial intelligence workflows. Live online lectures introduce you to core topics such as data visualization, machine learning fundamentals, and neural networks, followed by small cohort sessions where you build projects and reinforce concepts. You work extensively in coding notebooks and develop a portfolio of completed assignments and machine learning models. Advanced students can focus more deeply on neural networks and AI model implementation. 


Acceptance rate/cohort size: Enrollment-based; class size is not publicly specified

Location: The Harker School, San Jose, CA

Cost: $790 (June 22–July 2 session); $870 (July 6–July 17 session)

Program Dates: June 22–July 2 (8:30–11:30 a.m. or 12:45–3:45 p.m.); July 6–July 17

Application Deadline: Rolling registration until seats fill

Eligibility: Students in Grades 6–8; no prerequisites; laptop required (tablets not permitted).


In this two-week course within Harker’s Summer Institute, you explore foundational data science concepts through structured, question-driven analysis. You examine how data is collected, organized, and interpreted, using the Pyret platform to code and manipulate datasets. The curriculum introduces probability, inference, sampling, and visualization tools such as histograms, box plots, and scatter plots. During the second week, you focus on dataset exploration and complete a data analysis project culminating in a presentation. The course builds statistical reasoning and critical thinking skills. 
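The course itself uses Pyret, but the same ideas carry over to other languages. As a hedged illustration in Python (with invented data and a hypothetical helper named `text_histogram`), the sketch below computes the five numbers a box plot is drawn from and prints a simple text histogram.

```python
import statistics
from collections import Counter

# Made-up data: number of books each student read over the summer.
books_read = [2, 3, 3, 4, 5, 5, 5, 6, 7, 9, 12]

# Five-number summary: minimum, first quartile, median, third quartile, maximum.
# These are the values a box plot displays.
q1, median, q3 = statistics.quantiles(books_read, n=4)
summary = (min(books_read), q1, median, q3, max(books_read))
print(summary)

def text_histogram(values, bin_width=3):
    """Group values into bins of `bin_width` and draw one '#' per value in each bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for start in sorted(counts):
        end = start + bin_width - 1
        lines.append(f"{start}-{end}: {'#' * counts[start]}")
    return lines

for line in text_histogram(books_read):
    print(line)
```

The histogram shows where most values pile up, while the five-number summary tells you how spread out they are. The two views answer different questions about the same dataset.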


Acceptance rate/cohort size: Enrollment-based; cohort size not publicly specified

Location: UCLA, Los Angeles, CA

Cost: Tuition required (varies by campus and housing option); financial details available upon inquiry.

Program Dates: June 28 – July 17

Application Deadline: Round Two deadline: March 24; rolling admissions thereafter based on availability

Eligibility: Students in Grades 6–8; must meet Institute for the Gifted eligibility criteria; no prior coding experience required.


In this three-week course, you learn foundational programming concepts in Python before applying them to data science workflows. The curriculum covers algorithms, variables, functions, object-oriented programming, and binary computation before transitioning into data visualization and introductory machine learning. You explore how data science intersects with fields such as business, policy, and technology through applied examples. Assignments focus on building coding modules and analyzing datasets to generate predictions and insights. The course emphasizes practical implementation, allowing you to develop a structured coding portfolio. Upon completion, you receive a certificate and formal recognition of your work.


Stephen is one of the founders of Lumiere and a graduate of Harvard College, where he earned an A.B. in Statistics. He founded Lumiere as a PhD student at Harvard Business School. Lumiere is a selective research program where students work one-on-one with a research mentor to develop an independent research paper.







We are an organization founded by Harvard and Oxford PhDs with the aim to provide high school students around the world access to research opportunities with top global scholars.

©2024 by Lumiere Education.
