The Limits of Data

I’m a big fan of data.  I read The Circle, which kind of creeped me out, but I also loved the access to data the book explored.  I often find that when I look at data, I see things I didn’t expect to see.  For example, I look at tests and how people answer questions.  The questions I think are hard sometimes have the most right answers.  It doesn’t mean the question isn’t hard; it means there’s another story there.

The data issue from my last post is a case in point.  That’s raw data there, but it’s been interpreted by someone (or group of someones) in a certain way.  Subjects and industries have been grouped arbitrarily.

Often people bring up data as a way to solve problems.  If we just have enough data, then we can fix X problem.  Think about data and students and schools.  We’ve been trying to use data to create better schools forever.  It’s hit or miss at best.  Data only tells us so much.  And what it does tell us is often our own interpretation.  It’s rare that you crunch some numbers and a graph appears and it’s suddenly clear what needs to be done.

I’ve been thinking about my own data lately as I try to lose the handful of pounds I’ve gained in the last few months.  I invested in a Fitbit to help.  And I’ve gone back to tracking my calorie intake and my weight.  What my initial data tells me is that I’m pretty sedentary naturally, that I have to make myself move.  I don’t think that’s a desire on my part, just a function of where and how I live.  I have to drive most places to get stuff.  To walk more than a couple of miles, I have to plan it.  It doesn’t happen naturally.  Also, I eat more than one might think.  I like food and cutting back is a challenge.

Looking at losing weight as just a numbers game can be helpful. It’s useful to know that I went 162 calories over today based on my intake and my activity.  But knowing that doesn’t necessarily make me want to get out of bed right now and go for a 2-mile walk (which is what I’d have to do to burn off those extra calories).   There’s a human element to all of this.  Yes, the data gives me valuable feedback, but I have to do something with that feedback.

Data can tell us all kinds of things, but it’s still up to us to either interpret the data or act on it or both.  And sometimes our interpretations or actions are wrong.  And then what?

CMK Day 3: From exhaustion to success

I woke up way too early, thinking about my project. I basically finished but am working to make it faster and better. I also conceived a new project and I might work on that today as well.

I like my project but, as I said yesterday during reflection, I sort of feel like I didn’t push myself far enough. Another CS teacher said the same thing. What do you do when you are at least moderately familiar with everything at a conference? If I’d wanted to truly go beyond my comfort zone, I would have done more with electronics or woodworking. But my goal wasn’t to go outside my comfort zone. It was to work with data and art.

And when people have seen what I’ve done, they’re impressed. Somebody just told me I could make a whole class out of my project. And I think that’s exactly what I’m going to do!

Here’s the video of my project:

Weather Data as Art from CMK 2014 on Vimeo.

Analyzing Data Analysis

On Friday, I introduced the computation part of our data analysis project.  I was very excited about this and created an example using Google spreadsheets.  Even though I think another tool would be more powerful, I stuck with spreadsheets since most of the students are completely unfamiliar with anything else.

What we want the students to do is to take a question from the survey we conducted and break it down not just by how many people answered it a certain way, but also by a piece of demographic data.  So, they might look at the question of whether people expect to have children and see whether more women or men expect to have children.  To do that, you need to make a statement like “if ‘yes’ [to children question] and ‘male’”.  And you have to do that for all combinations.  I walked through my example in the class and eyes glazed over.  Admittedly, I went fairly quickly, but these are mostly seniors, and I would hope they would have some experience with formulas in Excel or Google spreadsheets.  But no.  Nothing wrong with that, really, but something I want to correct going forward.  I do know that one of our math teachers teaches some simple formulas during a single class period, but it’s out of context and, as far as I know, they never return to it.
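Here’s a rough sketch of that “if ‘yes’ and ‘male’” counting in Python, for anyone who wants to see it outside a spreadsheet.  The survey rows are made up for illustration (they are not our actual survey data), and the function name cross_tab is just my label for it.

```python
from collections import Counter

# Hypothetical survey rows: (gender, answer to "Do you expect to have children?").
# These values are invented for illustration, not taken from the real survey.
responses = [
    ("female", "yes"),
    ("male", "yes"),
    ("female", "no"),
    ("male", "no"),
    ("female", "yes"),
    ("male", "yes"),
    ("male", "no"),
]


def cross_tab(rows):
    """Count how many rows fall into each (gender, answer) combination."""
    return Counter(rows)


counts = cross_tab(responses)

# Print one line per combination, e.g. every "yes and male" pairing.
for (gender, answer), n in sorted(counts.items()):
    print(f"{gender:6} {answer:3} {n}")
```

In a spreadsheet, the same idea is one conditional-count formula per combination (COUNTIFS does this in both Excel and Google spreadsheets), which is why the formulas matter so much for this project.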

In order for our students to complete this project, they have to use formulas.  Well, they could do it by hand, but that would be so time-consuming and crazy.  So I’m thinking I need to run a workshop for the teachers on ways they can incorporate this skill and I need to find out more about where it could be used.

I was talking about this with Mr. Geeky, and he pointed out that most people are not good at this kind of analysis.  They don’t even think to ask questions that drill down into the data, questions like, “What is the income breakdown? Or gender breakdown? Or racial breakdown?”  They don’t know the difference between mean and median and how important looking at both might be.  I often use the classic example of a bar where the average (mean) income of the customers is $40k.  Bill Gates walks in and now the average income is over $1 million.  Now the average tells you almost nothing about the customers in the bar, while the median has barely moved.  One thing that computing offers is ways to slice data quickly so that you can start to see questions to ask and you can start trying to answer them with the data.  This makes me even more convinced that this assignment is an important one.  I’m looking forward to its outcome.
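The bar example is easy to try out in a few lines of Python.  This is only a sketch: the ten customer incomes are numbers I made up so that the mean comes out to $40k, and the newcomer’s income is a placeholder, not anyone’s real figure.

```python
from statistics import mean, median

# Hypothetical incomes for ten bar customers, chosen so the mean is $40k.
customers = [
    25_000, 30_000, 32_000, 35_000, 38_000,
    40_000, 42_000, 48_000, 50_000, 60_000,
]

# Placeholder income for the very wealthy newcomer (an assumed figure).
newcomer_income = 1_000_000_000

before_mean = mean(customers)
before_median = median(customers)

with_newcomer = customers + [newcomer_income]
after_mean = mean(with_newcomer)
after_median = median(with_newcomer)

print(f"Before: mean = {before_mean:,.0f}, median = {before_median:,.0f}")
print(f"After:  mean = {after_mean:,.0f}, median = {after_median:,.0f}")
```

One extreme value drags the mean far away from every ordinary customer, while the median stays close to where it started.  That’s the reason to look at both.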