Thursday, August 29, 2013

Data quality vs cost

How often do we see surveys and other after-the-fact data capture used to gather people's impressions and desires? Too often, I fear. Here's a story about one that went very wrong.
An unnamed university used to have students fill out their instructor ratings on paper towards the end of the semester. In one of the last lectures, the professor would hand out the forms and leave the room for 10 or 15 minutes while the students filled them in. The students would put the completed forms into an envelope and place the envelopes in a basket. The reviews were anonymous, and the simple act of sealing them in an envelope gave students the confidence they needed.
The data for each class was then collated, tabulated and made available to the instructor as average, maximum and minimum values, without identifying any individual student. A relatively straightforward system. A burden on some poor administrator perhaps - entering the data could be a bit tedious, especially for some of the larger classes.
Then the bean counters got control. Aha, we could save money by giving the students an online survey form instead. They would fill that out. It would still be anonymous. The administrator would be freed up. What could go wrong? Answer - just about everything.
Students are strongly discouraged from taking computing equipment into the classroom, so they now have to fill in the online survey at some other time. The URL is one more painful link to remember when they get to that other place. Many of the students mainly use smartphones or tablets, and the survey form does not suit that form factor. And there is no value to the individual student in filling in the form anyway: it is for instructor appraisal and measurement, not for the students' benefit.
The completion rate went from about 70% for the paper forms to less than 20% for the online forms. The data quality suffered. Instructors are now deprived of valuable feedback, and the school is desperate for it. Rather than fixing the root cause of the problem, the school is now suggesting that instructors give "extra credit" to students who do the survey. That is of course wrong on many levels. Here are students receiving a grade for something unrelated to their academic progress (ethics, anyone?). And the instructor isn't supposed to know which students actually filled in the survey anyway.
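To put rough numbers on that drop in completion rate, here is a minimal sketch. The class size of 100 is hypothetical, "less than 20%" is taken as exactly 20%, and the function name is mine. It also assumes, optimistically, that the students who respond are a random sample of the class, which is unlikely to hold when filling in the form takes extra effort.

```python
import math


def relative_standard_error(class_size: int, completion_rate: float) -> float:
    """Standard error of the mean rating, in units of the ratings' standard deviation.

    Assumes respondents are a simple random sample of the class (an
    optimistic assumption) and applies the finite population correction.
    """
    responses = round(class_size * completion_rate)
    if responses <= 0:
        raise ValueError("no responses, nothing to estimate")
    if class_size <= 1:
        return 0.0
    # Finite population correction: uncertainty shrinks as the sample
    # approaches the whole class.
    fpc = (class_size - responses) / (class_size - 1)
    return math.sqrt(fpc / responses)


CLASS_SIZE = 100  # hypothetical class size, not from the story

paper = relative_standard_error(CLASS_SIZE, 0.70)   # about 70% on paper
online = relative_standard_error(CLASS_SIZE, 0.20)  # "less than 20%" online

print(f"paper:  {paper:.3f}")
print(f"online: {online:.3f}")
print(f"ratio:  {online / paper:.1f}x")
```

Under these assumptions the average rating from the online form is roughly three times as uncertain as the one from the paper form, and that is before allowing for the likelihood that the few students who do respond are not typical of the class.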
The key message here is that if you want good data, make the capture of the data as unobtrusive to the data source as possible - even if you have to do work at the back end to make the data usable.
At some level that's one axis of what "big data" is about: capturing the data at the point of use without requiring any extra steps, then analyzing that data in the ways you want to.

1 comment:

  1. I hear that this practice is being replaced at at least one university. I just hope they didn't spend too much public money on it.
