Bringing Data Together

As mentioned in a previous post, I am in the process of updating the mathematical model used by my school to determine when students are ready to take college-level courses. This model is important to us because we send over a third of our juniors and half of our seniors to college each year and we don’t want to mistakenly send students to college before they are ready. Using this model, my team has gotten pretty good at determining readiness; last year our students passed 97% of the college courses they attempted.

Before the model can be applied, the data must first be brought together into a single database or spreadsheet. Depending on your systems, this can be a quick or time-consuming endeavor. For me, bringing together all of the data we have on students took a little over six hours. Here’s what I did:

Google Sheets

Because it is shareable and applies edits in real-time, I do all of my modeling in a single Google Sheet. For anyone who is an Excel devotee, this may sound crazy. It is. But, for me, the benefits outweigh the costs.

For this year’s update, I created a new Google Sheet called “Master Data File” where I pasted an export from our Student Information System (SIS) containing each student’s name, ID, DOB, sex, graduation year, and cumulative GPA. Because our SIS contains the most up-to-date information regarding student enrollments, I always start there and then use that data as a reference for gathering the rest. No need to gather data on a student no longer enrolled.

Microsoft Excel

So far, there is only one function I need that is not easily done in Google Sheets: consolidating data. At one time, I would spend hours manually inputting data from one system’s export file to another. Excel can consolidate data from two spreadsheets in minutes.

 The Consolidate function is in the "Data" ribbon on Microsoft Excel.


For example, data downloaded from the College Board website looks different than data taken from our SIS. The College Board data includes some students who have left my school, is missing data for students who are newly enrolled, and may have other formatting differences that would make a simple copy/paste impossible to do.

As long as I have a single column that uniquely identifies each individual student (student ID, “Last Name, First Name” combinations, etc.), Excel can consolidate the data from both sources into a single row to be included in the master file.
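For readers who would rather script this step than use Excel, here is a minimal Python sketch of the same idea: start from the SIS export and look up each student's row in a second source by a shared unique ID. It is not how Excel's Consolidate works internally; it is a plain keyed lookup. The column names (`student_id`, `sat_total`, and so on) and the sample rows are made up for illustration.

```python
def consolidate(master_rows, other_rows, key="student_id"):
    """Attach columns from other_rows to master_rows using a shared unique key.

    Only students present in master_rows are kept, so anyone who has left
    the school drops out. Students missing from other_rows get None.
    """
    # Index the second source by its key so each lookup is a single step.
    lookup = {row[key]: row for row in other_rows}

    # Collect every non-key column that appears in the second source.
    extra_columns = []
    for row in other_rows:
        for column in row:
            if column != key and column not in extra_columns:
                extra_columns.append(column)

    merged = []
    for row in master_rows:
        combined = dict(row)  # copy so the original export is not changed
        match = lookup.get(row[key], {})
        for column in extra_columns:
            combined[column] = match.get(column)
        merged.append(combined)
    return merged


# Hypothetical rows standing in for the SIS and College Board exports.
sis = [
    {"student_id": "1001", "name": "Doe, Jane", "gpa": 3.6},
    {"student_id": "1002", "name": "Roe, Sam", "gpa": 3.1},  # newly enrolled
]
college_board = [
    {"student_id": "1001", "sat_total": 1180},
    {"student_id": "0999", "sat_total": 1050},  # no longer enrolled
]

master = consolidate(sis, college_board)
```

Starting from the SIS rows mirrors the approach described earlier: the enrollment list decides who appears in the master file, and every other source only fills in columns.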

Data Brought Together

Here’s the data I consolidated into the single Google Sheet for each student organized by source:

Student Information System

  • Demographic Information: used for sorting and aggregated data analysis.
  • High School Grade Point Average: used as a primary indicator of future college success. This topic will be expanded upon further in a later post.

College Board

  • PSAT 8/9, 10, and 11: We give the PSAT to all students every year in grades 8 through 11. While we do not yet use this data in our model, I decided to pull it in hopes of future analysis and reporting.
  • SAT: In Michigan, all 11th graders are required to take the new SAT. Our community college partner accepts SAT scores for determining college course placement, so we use these scores as part of our readiness model.
  • Accuplacer: While this is technically a College Board product, we get this data from our college partner. Beginning in the 9th grade, our students take this college placement assessment each year until they place into college-level coursework.

ACT

  • ACT: Now that the state of Michigan has moved from the ACT to the SAT for its college readiness assessment, we only have a few students each year who take this assessment. For those who do, though, I need to consider their scores when determining readiness.
  • Compass: Until this year, our college partner used the ACT’s Compass assessment for determining college placement. This assessment was replaced by Accuplacer but we still consider Compass data in determining students’ college readiness.

Other

  • Agency Score: Each year, we ask our teachers to rate each student’s skill at exercising agency on a scale of 0-5. Agency, for those not familiar with the concept, is one’s ability to be an “agent” of his or her own learning. It consists of two components, both a part of our instructional model: 1) the ability to complete tasks to specification and on time, and 2) growing from challenging work and setbacks. I simply ask teachers to rate each student and take the average of their input. More on this measure of college readiness later.

When recording assessment data, I like to separate it by the year it was taken relative to the student. I like to know what each student’s score was each year they took it. This allows me to see growth or stagnation in student performance, and makes analysis and reporting of data much easier to do.
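To make the "one score per year" layout concrete, here is a small Python sketch that turns a long list of test records into one row per student, with a separate column for each test and grade level. The field names and scores are invented for the example; my actual sheet is laid out by hand.

```python
def scores_by_grade(records):
    """Pivot (student, test, grade, score) records into one row per student."""
    wide = {}
    for rec in records:
        student_id = rec["student_id"]
        # Create the student's row the first time we see them.
        row = wide.setdefault(student_id, {"student_id": student_id})
        # One column per test and grade, e.g. "PSAT_grade_9".
        column = f"{rec['test']}_grade_{rec['grade']}"
        row[column] = rec["score"]
    return list(wide.values())


# Made-up records for illustration only.
records = [
    {"student_id": "1001", "test": "PSAT", "grade": 9, "score": 880},
    {"student_id": "1001", "test": "PSAT", "grade": 10, "score": 950},
    {"student_id": "1002", "test": "PSAT", "grade": 9, "score": 910},
]

rows = scores_by_grade(records)

# With one column per grade, year-over-year growth is a simple subtraction.
growth = rows[0]["PSAT_grade_10"] - rows[0]["PSAT_grade_9"]
```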

Next up: what I do with this data once I have it all in one location.

Modeling Future Student Success

Over the next few weeks, I will be updating the mathematical model I created to predict students' future success in college. That model, which my school has been using and revising for the past four years, looks for patterns in academic and behavioral data to help predict individual students' likelihood of earning passing scores in college coursework.

I created the model in response to learning that standardized test scores alone left far too many edge cases to accurately predict future academic success. Too many students had previously scored well on tests yet did poorly in college classes. Similarly, some students we thought could handle college coursework did not score well on traditional measures of college "readiness."

Using this model, my school sends a third of its juniors and half of its seniors to college. Last year, these students passed 97% of the courses attempted. Ninety-three percent passed with a C or better.

To learn more about my school and why we send so many students to college while still in high school, I recommend reading my post from June titled Early College For All.

There is nothing magical about the model. It simply applies what is already known about past students' success to predict how well current students might do in college coursework.

The model uses three primary sources of data:

  1. Standardized college placement or college readiness scores: I have used data from different assessments over the years with relatively similar results (Compass, Accuplacer, ACT, and SAT).
  2. High school grade point average: in my school, the strongest predictor of future academic success is past student success.
  3. Teachers' subjective assessment of student "agency:" Each winter, I ask my faculty to evaluate each student on how well they are perceived to grow through challenging work and complete work on time.

Each year, the weight applied to each of these data sources has changed to reflect what we've learned about past student success. Last year, high school GPA and test scores were weighted about evenly. Agency, while found to be an accurate predictor, was weighted very little (approximately 10%) due to its subjective nature and the potential for perceived bias.
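As a rough illustration of how weights like these could be applied, here is a minimal Python sketch. The exact weights (0.45, 0.45, 0.10), the 4.0 GPA scale, and the use of a test percentile are all assumptions made for the example, not the actual formula my school uses.

```python
from statistics import mean

# Hypothetical weights: GPA and test score about even, agency about 10%.
WEIGHTS = {"gpa": 0.45, "test": 0.45, "agency": 0.10}


def readiness_score(hs_gpa, test_percentile, agency_ratings):
    """Combine three indicators into a single 0-1 score (illustrative only)."""
    gpa_part = hs_gpa / 4.0                    # assumes a 4.0 GPA scale
    test_part = test_percentile / 100.0        # assumes scores given as percentiles
    agency_part = mean(agency_ratings) / 5.0   # teachers rate agency from 0 to 5
    return (
        WEIGHTS["gpa"] * gpa_part
        + WEIGHTS["test"] * test_part
        + WEIGHTS["agency"] * agency_part
    )


# One made-up student rated by three teachers.
score = readiness_score(hs_gpa=3.4, test_percentile=70, agency_ratings=[4, 3, 5])
```

Putting each measure on a common 0-1 scale before weighting keeps any one measure from dominating simply because its raw numbers are larger.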

Over the coming weeks, as I update the model, I hope to share more of the details that go into its creation and revision. I see great value in having more schools analyzing data in this way and think it's a simple enough process that can be replicated with a bit of time and effort.

Disclaimer: I am not a mathematician and do not claim to be an expert in inferential statistics. I am simply a practitioner with a good memory of his Statistics 101 class. I welcome any feedback from readers with stronger mathematical grounding.

If you have questions about this model that you would like me to expand upon or would simply like to learn more, feel free to leave a comment or reach out by email.

New Tech GPA Stronger Predictor of College Success

A few weeks ago, I shared that my dual enrollment students' high school GPA was the strongest predictor of college success — stronger even than scores on college placement exams. Last week, it struck me that half the group of students we sent (our juniors) were taught in 100% New Tech courses before dual enrolling in college. The other half were seniors who were taught in traditional classes one year ahead of our New Tech initiative. What a great opportunity for data comparison!

For those unfamiliar with New Tech, let me explain: 

Three years ago, my district contracted with the New Tech Network to support change in our high school in three key areas:

  1. Empowering students through increased voice and choice in their learning.
  2. Engaging students in deeper learning of course content through wall-to-wall implementation of project- and problem-based learning as our instructional model.
  3. Enabling students to foster their own learning by providing them with 1-to-1 technology and teaching them to use it effectively.

As part of this initiative, we spent over 2.5 million dollars renovating spaces, buying furniture and technology, and training teachers and leaders. As a result, our staff is now working collaboratively to design authentic projects. We've moved our teacher desks into one of two "Bullpens" where teachers meet between classes and during prep. We integrate courses whenever integration makes sense. Our students take classes like "GeoDesign," "BioLit," "American Studies," and "Civic Reasoning." Each of these classes has two teachers and more time to learn from their work. We are doing a lot of things differently. And better.

To put things back into perspective, we have two groups of students dual enrolling this year: seniors and juniors. Both were educated by the same teachers in the same school. The juniors are part of our New Tech initiative. The seniors are not. The circumstances are begging for further analysis!

To start, let me describe the students. Last semester, we had 67 students dual enroll: thirty-nine juniors and twenty-eight seniors. Both groups represent what we would consider our "top third" performers (more juniors dual enrolled because their class size was larger). The average high school GPAs for the groups were close: 3.39 and 3.32, respectively.

They were also demographically similar. Both groups had a few more boys than girls. Students receiving free or reduced lunch were underrepresented: they made up only 18% of dual enrolled students versus 55% of total high school enrollment, about a third of their share of the school. They were racially similar, 99% white, which is consistent with our district and community makeup.

The one demographic difference that stands out to me is the obvious one: seniors are, on average, one year older than juniors. They also have one more year of high school experience and are one year closer to entering college full-time. While I cannot say that this information is statistically significant, after working in high schools for the past ten years, it feels anecdotally significant.

In college, they also performed similarly when looking at the average. Seniors passed 96% of college classes with a GPA of 3.01. Juniors passed 92% of college classes with a GPA of 2.90. Failure was experienced by just three students, one senior and two juniors.

One other comparison that seems notable is that both juniors and seniors took similar courses in college with one potentially significant exception: being farther ahead in the curriculum, more seniors took advanced math than juniors (46% vs 13%, respectively).

Where performance differences become noticeable is in the way individual GPA distributes across students. The graphs below demonstrate that difference by overlapping the distribution of high school and college GPAs for each group independently.

[Figure: overlapping distributions of high school GPA and college GPA, shown separately for seniors and juniors]

Generally speaking, it is clear that both groups performed better at the top of the GPA range in high school than they did in college; both groups saw fewer individual students with a college GPA in the 3.0–4.0 range. It is notable, however, that the size of the gap between high school GPA and college GPA at the top of the range is smaller for the New Tech juniors than it is for the seniors (this will be highlighted later). And, while that gap continues to exist, albeit in the opposite direction, for seniors in the middle of the GPA range (1.5–3.0), it seems to disappear for juniors. At the bottom of the range, of course, more juniors than seniors earned a GPA below a 1.5.

The degree to which high school GPA and college GPA move together can be further illustrated in the following two scatterplots:

Seniors: N=28, r=+0.65, r^2=0.418

Juniors: N=39, r=+0.84, r^2=0.705

As previously reported, there was a strong positive correlation between high school GPA and college GPA for all dual enrolled students (r=+0.74). As this data shows, the correlation was higher for juniors (r=+0.84) than it was for seniors (r=+0.65). And, while I do not yet have the mathematical chops to tell you whether or not this difference (0.19 in r) is groundbreaking, I can tell you that I find it encouraging.
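One common way to check whether two correlations from separate groups differ by more than chance is Fisher's r-to-z test. I did not run this as part of my original analysis; the Python sketch below is offered as a starting point for anyone who wants to, and it assumes the two groups are independent and that GPA is roughly normally distributed.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two correlations from independent groups.

    Returns the z statistic and a two-sided p-value.
    """
    # Fisher's transformation turns each r into an approximately normal value.
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    # Standard error of the difference between the transformed values.
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Juniors (r=+0.84, N=39) versus seniors (r=+0.65, N=28).
z_stat, p_value = compare_correlations(0.84, 39, 0.65, 28)
```

With the figures reported here, this works out to a z of about 1.7 and a two-sided p-value a little under 0.09. That falls short of the conventional 0.05 cutoff, though with groups this small the test does not have much power either way.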

As an educator, I strive to give students accurate information about their potential to succeed after high school. I find it satisfying to learn that our New Tech initiative may be increasing that accuracy. 

Time will tell whether or not this trend will continue. I don't want to make any broad claims about why our New Tech educated students' GPAs are better predictors of college success. I will, however, close with some wonders:

  1. I wonder what effect our measurement of skills (collaboration, agency, oral & written communication) in addition to content is having on high school success as it relates to college success? 
  2. I wonder if this trend will continue with our next group of New Tech students who dual enroll? Specifically, I wonder if the model will apply equally to lower high school GPA-earning students?
  3. I wonder if other New Tech high schools have found similar results. 
  4. I wonder if I will be satisfied if the only quantifiable difference between our New Tech educated students' college success and those students taught in our traditional high school is this increase in our ability to predict said success? I wonder if our community would be satisfied?
  5. I wonder what questions I'm not asking that may have compelling answers in this data?

Our New Tech students are taking the ACT for the first time next week. We will also begin scheduling our second group of Early College participants. I can't wait to add this data to the mix for further analysis to see how they compare.

The Problem with Boys

As previously mentioned, my high school is now dual enrolling more students than ever — about ten times more. A quarter of all juniors and seniors took half their classes at the community college last semester as part of our early college efforts.

By most measures, these students did very well. As a group, they earned over 95% of the credits they attempted with an average GPA over 3.0. They were, after all, able to dual enroll because of their past performance on standardized tests and high school coursework. They went to college because we thought they were "ready."

Yet, unsurprisingly, not all students performed equally well. About 15% of our dual enrolled students ended the semester with a college GPA below a 2.0.  A few students even experienced their first academic failure in college. So, even within our high average of success, not all students shared the same experience. 

 First Semester 2014-15 Dual Enrollment GPA Distribution (N=67)


We consider this fact — that some students didn't do as well as expected — to be a really big deal. It means that our algorithm for credentialing students for college readiness isn't yet perfect. To be clear, we didn't expect it to be, and while we acknowledge that reaching "perfect" isn't probable, wanting perfect gives us reason to dig into our data in hopes of finding some clues that will help us identify relative risk in the future.

Our biggest takeaway?

Boys did much worse in college coursework than girls — a whole grade point worse, on average.

 Girls earned college GPAs that were 1.05 points higher than boys, on average.


This is despite the fact that girls and boys performed equally on both the COMPASS and ACT assessments, which we use to determine eligibility for college-level coursework. We're talking less than 0.01 difference between boys and girls on these tests.

Being a boy had a stronger negative effect on student success than any other factor we looked at, including free/reduced lunch status and high school GPA. At the same time, these factors still added to the risk. Going to college as a boy receiving free lunch with a high school GPA below 3.0 was clearly tough: these students earned an average GPA below 1.5 in college.

The average college GPA for girls receiving free lunch with a high school GPA below 3.0: a respectable 2.5.  
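Comparisons like these come from splitting students into subgroups and averaging each subgroup's college GPA. Here is a minimal Python sketch of that kind of breakdown. The records are made up for illustration and are not our students' data, and the field names are my own invention.

```python
from collections import defaultdict
from statistics import mean


def mean_college_gpa_by_group(students):
    """Average college GPA for each (sex, free lunch, HS GPA below 3.0) group."""
    groups = defaultdict(list)
    for s in students:
        key = (s["sex"], s["free_lunch"], s["hs_gpa"] < 3.0)
        groups[key].append(s["college_gpa"])
    return {key: round(mean(values), 2) for key, values in groups.items()}


# Made-up records for illustration only.
students = [
    {"sex": "M", "free_lunch": True, "hs_gpa": 2.8, "college_gpa": 1.2},
    {"sex": "M", "free_lunch": True, "hs_gpa": 2.9, "college_gpa": 1.6},
    {"sex": "F", "free_lunch": True, "hs_gpa": 2.7, "college_gpa": 2.4},
    {"sex": "F", "free_lunch": False, "hs_gpa": 3.6, "college_gpa": 3.5},
]

averages = mean_college_gpa_by_group(students)
```

With only 67 students, some of these subgroups will hold just a handful of people, so the averages are best read as clues rather than firm conclusions.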

What now?

We certainly can't increase our requirements for boys above those for girls without raising some eyebrows. What we can do is educate parents and students on the relative risks of going to college and how our data should inform that risk. While hope will likely spring eternal for most, some students may delay college entry in hopes of better results down the road.

We can also raise our expectations overall since doing so would result in sending fewer students with high school GPAs below 3.0. Even though most boys saw their GPA decline in college, the decline was less detrimental to students who started college with a high school GPA above 3.0. This seems obvious. It is good to have data to back this up now.

Lastly, I think it's crucial that we think of new ways to support students, specifically these struggling boys, while in college. To do this appropriately, we're going to have to get to know our boys a bit better to start to decipher what is going on. Is it maturity? Is it social expectations? Is it video games? We need to learn more about what is going on with them so that we can build in better supports for them to be successful.

Predicting College Success

I spent my morning analyzing the grades of the sixty-seven juniors and seniors who dual enrolled from my school this past semester. Of the 464 college credits attempted, 440 were earned, giving us a pass rate just a hair under ninety-five percent. Half the group had a college GPA above a 3.43. I'd say this is pretty good news for our first cohort of New Tech students taking college classes.

One of the goals of my analysis was to assess how well we predicted college readiness amongst these young advanced students. While only four of the sixty-seven students who dual enrolled experienced failure, some students still performed worse than expected. Pushing students to college too early could potentially blemish their college transcript. Defining "ready" has therefore become a really big deal.

Aligning our thinking with both our college partner and the state, we placed the greatest weight on students' college entrance exam scores last year. In deciding who got to go, we let test scores trump all other valid readiness indicators such as high school GPA, teacher perception, etc.

So, how did that work out for us?

The worst predictor of student success for us was their score earned on the COMPASS, taken by our current juniors who had not yet taken the ACT. The COMPASS is used by our community college partner to place students into courses at appropriate levels. For us, it turned out that the COMPASS provided only a minor ability to predict college success (r=0.25).

 The correlation between student COMPASS scores and college GPA was a low r=+0.25.


Coming in second was the ACT assessment, taken by all juniors in the state of Michigan. The ACT proved to be a fair predictor of college success (r=0.44).

 The correlation between student ACT scores and college GPA was a moderate r=+0.44.


The best predictor of college success turned out to be student high school GPA (r=0.74).

 The correlation between student high school GPA and college GPA was a high r=+0.74.
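For anyone curious what goes into an r value like the ones above, here is a small Python sketch of the Pearson correlation coefficient written out by hand. The GPA values in the example are made up; they are not our students' grades.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two measures move together around their means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each measure varies on its own.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_movement / (spread_x * spread_y)


# Made-up values for illustration only.
hs_gpa = [2.5, 3.0, 3.2, 3.6, 3.9]
college_gpa = [2.0, 2.4, 3.1, 3.0, 3.8]

r = pearson_r(hs_gpa, college_gpa)
```

A value near +1 means students with higher high school GPAs also tended to earn higher college GPAs, while a value near 0 means one measure tells you little about the other.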


While the state of Michigan allows schools to use varied methods of determining college readiness before allowing students to dual enroll, it is interesting that they will not allow GPA to be a primary determining factor, given its apparent ability to correctly predict student success.

What we will most likely do in the future, given this data, is create a single numerical value for each student that takes into account their college entrance exam score and their high school GPA. This would appear to provide some additional predictive ability (r=+0.82 to r=+0.86) not possible using test scores alone.
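One simple way to build a single value like this is to put the exam score and the GPA on the same scale first, then average them. The Python sketch below uses z-scores for that. It is only an assumption about how such a value could be built, not the formula we settled on, and the sample numbers are made up.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Rescale values to have mean 0 and standard deviation 1."""
    center = mean(values)
    spread = pstdev(values)
    return [(v - center) / spread for v in values]


def composite_scores(exam_scores, hs_gpas):
    """Average each student's standardized exam score and standardized GPA."""
    exam_z = z_scores(exam_scores)
    gpa_z = z_scores(hs_gpas)
    return [(e + g) / 2 for e, g in zip(exam_z, gpa_z)]


# Made-up values for illustration only.
act = [18, 21, 24, 27]
gpa = [2.8, 3.6, 3.0, 3.8]

composite = composite_scores(act, gpa)
```

Standardizing first matters because an ACT score and a GPA live on very different scales; without it, the exam score would swamp the GPA in any simple average.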

UPDATE—January 30, 2015: Looking at this with fresh eyes, I think it's important to point out that we used the minimum COMPASS and ACT scores required for college-level coursework placement with our community college partner as our cutoff for allowing students to dual enroll. We did not use the state minimum scores, which are higher. It is logical that using the higher scores would have increased these assessments' predictive ability. We are choosing to use the lower scores to increase access with the hope of keeping risk to a minimum for our students.