The Long Life of a Data Trail

With educational technology, data can be collected and shared in various ways.

February 01, 2016
Bill Fitzgerald Director, Privacy Initiative


Within educational technology, tech companies can acquire data via multiple routes. The simplest is a direct sign-up: A teacher creates an account to use a service, and the teacher is the only person using the service. BetterLesson is an example of a site like this -- a teacher creates an account, and only teacher data gets collected. The site is primarily teacher-focused.

A second model includes sites that get their initial data from a teacher or school sign-up, but then, as part of the service offered on the site, they acquire student information. Basic gradebook applications and some simple student-information systems work like this; the teacher or school signs up for the application, and in the process of using it they enter student names, grades, notes, parent information, and other details. The actual data added will vary based on the needs of the application, but information is shared about people without their direct involvement or consent. Online IEP programs also fit this description. While the teacher or school provisions the account, the vendor ends up getting additional data through the teacher's use of the account. In this model, students and parents do not use the service directly, but data about them is collected and stored in the service as teachers use the service.

A third model involves a teacher, school, or district creating an account on the service and either creating accounts for their students/parents or using an invitation process. In this version, teachers, students, and potentially parents sign up for and interact with the service. The data trail here involves information about all participants that includes personally identifiable information, location data (via IP addresses and/or phone GPS location), and behavioral and interaction data pulled from time spent using the service. Many edtech products work this way, including Edmodo, Remind, ClassDojo, Schoology, most digital text offerings from traditional publishers such as Pearson and McGraw-Hill, learning programs such as Agilix Buzz, and app ecosystems such as Amplify tablets, iPads, and Chromebooks.

A fourth model that collects both learner and parent data includes apps for kids, marketed either to parents or children. Examples of applications like this include most educational apps sold in the Apple and Google app stores and some online learning sites. Because these apps are built to be used outside of schools, the data collected by them is not considered an educational record under the Family Educational Rights and Privacy Act (FERPA) and is therefore covered by the privacy policies and terms of service put in place by the app vendor. If the app is primarily intended for children under 13, parental sign-off (and therefore parental data) is often required for use.

The fifth model includes highly structured data stores to collect longitudinal data, and glue services that integrate multiple external services. These applications can support both storage and analysis of data collected in a variety of applications. A very incomplete list of examples here includes Knewton, Infinite Campus, eScholar, Schoolnet, Learnsprout, and Clever.
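
One way to compare the first four models is to ask, for each, who actually uses the service and whose data ends up stored in it. The sketch below is only an illustration of that comparison: the class, the field names, and the short labels are hypothetical, and the fifth model is left out because the description above does not say who creates the account.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class CollectionModel:
    """Illustrative summary of one data-collection model (names are hypothetical)."""
    name: str
    direct_users: FrozenSet[str]   # people who sign in and use the service
    data_subjects: FrozenSet[str]  # people whose data is stored by the service

    def indirect_subjects(self) -> List[str]:
        """People whose data is collected even though they never use the service."""
        return sorted(self.data_subjects - self.direct_users)


EVERYONE = frozenset({"teacher", "student", "parent"})

MODELS = [
    CollectionModel("teacher-only sign-up",
                    frozenset({"teacher"}), frozenset({"teacher"})),
    CollectionModel("teacher enters student data",
                    frozenset({"teacher"}), EVERYONE),
    CollectionModel("school-provisioned accounts", EVERYONE, EVERYONE),
    CollectionModel("apps marketed to parents or children",
                    frozenset({"student", "parent"}),
                    frozenset({"student", "parent"})),
]

for model in MODELS:
    unseen = model.indirect_subjects()
    print(f"{model.name}: {', '.join(unseen) if unseen else 'none'}")
```

Only the second model leaves people in the data trail who never touch the service, which is why gradebooks and IEP programs are easy to overlook.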

Context of EdTech Data Collection

The context around educational data is arguably different from that of data collected in consumer technology. In both K–12 and higher education, schools can sign up for services that students use directly, and in many of these cases student data is uploaded before students or parents are consulted. For example, if a teacher signs up for Remind, parents aren't asked whether their contact information can be shared as part of an "invite" feature. While many consumer tech apps include invite features, edtech apps are used within a different context. When a student or a parent sees an app or an invite coming from a school or a teacher, there is a level of implied trust. Increasingly, the implicit trust that students and parents give schools and districts appears to be unearned.

Data Privacy Plays Out over Time

Unless data collected by an app is deleted or destroyed -- and this includes data in backups and in systems that provide redundancy -- we need to start thinking of data trails as timeless. This means that a data trail can be transferred from one entity to another if the conditions allow these transfers to occur. In technology, the terms and conditions and privacy policies are where we can see the conditions in which our data trails are preserved.

While privacy policies and terms of service should be read in full (free log-in is required for access), for the purposes of this post we are going to focus on two specific sections that can be used to gut the terms in any policy: how policies can be changed, and how data is treated in case of a sale, merger, or bankruptcy.

Changes to Terms

Over time, a site's policies should change. Unfortunately, many sites specify that terms can be changed at any point, with no notice to users and no explicit sign-off from users. Many sites state that visiting a site or logging in to a site means that a user accepts the updated terms -- so, the simple act of reading updated terms gets interpreted as "acceptance" of the terms. Even on sites that have better notification policies (and at this point, the "best" policies generally include an email and a banner on the top of the site), users often have no recourse (aside from stopping their use of the site) if they don't like the updated policies. Additionally, many sites do not allow users to delete their data from a site, so even if they stop using a site, their data is still stuck in the site.

To summarize: Most sites reserve the right to change their terms whenever they want, with minimal notification and no option to remove data. In the case of a site where a learner has been added to a site by a school or district, learners have even less recourse.

Data Transfer During Sale, Merger, or Bankruptcy

While a sale, a merger, and a bankruptcy are three very different events, most privacy policies treat them identically: If it happens, user information is an asset that gets sold.

Edmodo's privacy policy, shown in the screenshot below, is pretty standard for edtech terms:

[Screenshot: Edmodo privacy policy]

For a start, the weak terms used throughout edtech could be improved by the following four changes:

  • users opt in to changed terms and/or export data as part of an account cancellation process;
  • users opt in when data is transferred to a new owner;
  • user account cancellation or data deletion is a regular feature available to users of the app;
  • and, in case of a bankruptcy or a going-out-of-business, user data gets destroyed and is not treated as an asset.

These four changes would ensure that user awareness and buy-in are included as factors when terms are changed. There are additional ways that vendor practices could be improved, but starting with these four would be a solid beginning.
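
The four changes listed above can also be read as a checklist to hold against any privacy policy. The sketch below is a minimal illustration, assuming someone has already read the policy and recorded a yes or no for each item; the class and field names are hypothetical and do not correspond to any real evaluation tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PolicyTerms:
    """Hypothetical yes/no summary of a vendor's terms, filled in by a human reader."""
    opt_in_or_export_on_changed_terms: bool
    opt_in_on_transfer_to_new_owner: bool
    users_can_cancel_or_delete: bool
    data_destroyed_on_bankruptcy: bool


def missing_protections(terms: PolicyTerms) -> List[str]:
    """Return the protections from the list above that the policy lacks."""
    checks = [
        (terms.opt_in_or_export_on_changed_terms,
         "opt-in to changed terms and/or data export on cancellation"),
        (terms.opt_in_on_transfer_to_new_owner,
         "opt-in when data is transferred to a new owner"),
        (terms.users_can_cancel_or_delete,
         "account cancellation or data deletion available to users"),
        (terms.data_destroyed_on_bankruptcy,
         "user data destroyed, not sold, in a bankruptcy"),
    ]
    return [label for present, label in checks if not present]


# Example: a policy that offers deletion but none of the other protections.
example = PolicyTerms(False, False, True, False)
for gap in missing_protections(example):
    print("Missing:", gap)
```

A policy that returns an empty list here would still need to be read in full; the checklist only covers the four items named above.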

Who Cares Where Data Ends Up? It's Only School!

Data brokers care.

They have created lists of victims of sexual assault and lists of people with sexually transmitted diseases. Lists of people who have Alzheimer's, dementia, and AIDS. Lists of the impotent and the depressed. There are lists of "impulse buyers." Lists of suckers -- gullible consumers who have shown that they're susceptible to "vulnerability-based marketing." And lists of those deemed commercially undesirable because they live in or near trailer parks or nursing homes. Not to mention lists of people who have been accused of wrongdoing, even if they were not charged or convicted.

In the housing market, we have examples of where information from data brokers was used to discriminate based on race. If you're looking for a relatively benign example (if you can get through the marketing speak) of how data from brokers can be mashed up to create profiles, spend some time on this ZIP code profiler put out by ESRI. It's worth noting that the profiles were created based on aggregated data of individuals, so that the summaries here are the result of millions of data profiles on individuals. The description of how the site was put together provides a superficial glimpse of how data on individuals from multiple sources can be combined to tell a story.

Now, imagine the increased accuracy that could be added to personal profiles if they were fleshed out with data sets that contain personal information, starting with habits formed in elementary school.

The life of a data trail matters.


When a student is signed up for a service by a school or teacher, data is collected from people who have no say in forming the relationship or shaping the terms of the deal. In some cases, this involves student work being sold without student knowledge. Treating student data (which really is a track record of learning, personal interest, and growth) as an asset to be bought or sold puts edtech companies on very shaky ground, both pedagogically and ethically. Given that a learning record is also -- to an extent -- a snapshot of behavior, and that behavioral information is gold for marketers, this raises a real question: Why should education records ever have the possibility of ending up outside an educational context? Treating records as a financial asset that can be acquired in a merger or bankruptcy ensures that some records end up being used outside an education context. The combination of the "fail faster" mantra of VC-funded tech and ongoing deals -- over $8 billion worth in 2014 alone -- ensures that student data is getting sold and used outside an educational context.

Image credit: Scott Payne, released under a CC0 license
