Our survey of encryption practices found over half of edtech services support encryption.
In the last two weeks of October, we ran automated tests on 1,221 logins used by 1,128 vendors for technology used in schools and by youth. We surveyed login locations (login URLs) to assess the level of basic support for encryption. While our list of sites is not exhaustive, it is representative of websites from small developers, well-established companies, start-ups, privately held companies, and enterprise-level applications.
Our findings indicate that a significant number of vendors do not provide even basic support for encryption. While 52 percent of the 1,221 login URLs we surveyed require encryption, 25 percent do not support encryption at all, and an additional 20 percent support encryption but do not require an encrypted connection.
A complete lack of support for encryption was observed in applications from vendors of all sizes: small services, early- and middle-stage start-ups, privately held companies that have been in use for years, and enterprise applications that are used in thousands of districts with millions of learners.
The write-up that follows describes our methodology and goes into more detail about what this survey does and does not cover. These details provide essential context that helps us use the information uncovered by this survey constructively. This snapshot indicates that there is room for significant improvement within software used with youth to support teaching and learning.
Approximately 90 days from now — in February 2017 — we will publish a follow-up post that highlights if, or how, things have changed from our initial survey results. As we attempt to highlight in this post, this survey is most useful as a snapshot of current data-security practices within the education technology industry. As we rerun this survey over the next few months, we will begin to get a sense of the trends within the industry. Hopefully, we will see an increase in the use of encryption.
Our work with the Privacy Evaluation Initiative begins with a quick triage of the application. In this first triage step, we take a quick look at some basic elements of the application, including whether the site uses encryption to protect data in transit. An initial test for encryption is drop-dead simple: Just look for an “s” in the beginning of the URL. If it’s there (i.e., the URL begins with “https://”), the site uses encryption. If it’s not there (i.e., the URL begins with only “http://”), then the site does not use encryption. We cover testing for encryption in more detail in the “Encryption” chapter of our Information Security Primer.
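The quick check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tooling used during triage; the function name `uses_https_scheme` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def uses_https_scheme(url: str) -> bool:
    """Return True when the URL begins with the https scheme."""
    # urlsplit normalizes the scheme to lowercase, so "HTTPS://" also matches.
    return urlsplit(url).scheme == "https"


print(uses_https_scheme("https://example.com/login"))  # True
print(uses_https_scheme("http://example.com/login"))   # False
```

Like the manual check, this only looks at the address itself. It says nothing about whether the server actually answers over https, which is why the survey goes on to send real requests.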
While triaging applications, we noticed that, in some cases, applications do not use encryption to protect logins. As we have discussed before, encryption is a basic requirement for applications that collect usernames, passwords, and other personally identifiable information. Using encryption to protect data in transit is widely recognized as a best practice and, in some cases, even a legal requirement to satisfy a “reasonable” standard of security. While testing for encryption is not difficult, doing it on a site-by-site basis is more time-consuming than it needs to be. Rather than continue to test sites individually, we decided to automate the process and survey encryption practices in aggregate.
An automated survey is not a replacement for a detailed evaluation performed and reviewed by individuals, but automated surveying allows us to:
- spot encryption trends across more than 1,000 sites;
- identify individual sites that appear to be following best practices;
- and identify individual sites that may have insufficient encryption.
A related side note: “Encryption,” “privacy,” and “security” are often used interchangeably. This post explains some of the similarities and differences between the terms.
We started with a list of approximately 2,450 web applications used in education or focused on youth. While our list is not exhaustive, it is representative and includes websites from small developers, well-established companies, start-ups, privately held companies, and enterprise-level applications. This survey looked only at login URLs from web applications. We looked explicitly at login URLs (and not, for example, home page URLs) to ensure that we were analyzing the location where information was collected and passed from a user to the web application.
Out of our original list of 2,450 sites, we excluded the following:
- all sites that don’t require a login;
- all sites that are no longer in business;
- and all sites that for whatever technical reason failed to load properly.
This narrowed our list to 1,128 web services with 1,221 login locations. The difference between the number of web services and the number of logins exists because some sites contain different logins for different types of users (e.g., separate login URLs for students, parents, and teachers).
In our survey, we performed three actions at each login location:
- sent an https request to the login URL and checked the http status code (i.e., asked whether https was supported);
- sent an http request to access the login URL to see if the site redirected to https (i.e., asked whether https was required);
- and inspected the site header for the presence of HTTP Strict Transport Security (HSTS) directives.
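The three actions above might look roughly like the following Python sketch. It is not the survey's actual test script, which is not published here. The names `fetch_once` and `probe_login` are hypothetical, the sketch inspects only the first response (it does not follow redirect chains), and it treats any connection or certificate failure as "no response" (status 0), which is an assumption about how such cases could be recorded.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_once(url, timeout=10):
    """Request a URL without following redirects.

    Returns (status, headers) with lowercase header names, or (0, {}) when
    no usable response arrives (refused connection, TLS error, timeout).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        conn.request("GET", target)
        resp = conn.getresponse()
        return resp.status, {k.lower(): v for k, v in resp.getheaders()}
    except (OSError, http.client.HTTPException):
        return 0, {}
    finally:
        conn.close()


def probe_login(login_url, fetch=fetch_once):
    """Run the three checks against one login URL and return the raw signals."""
    parts = urlsplit(login_url)
    https_url = parts._replace(scheme="https").geturl()
    http_url = parts._replace(scheme="http").geturl()

    # 1. Is https supported? Record the status code of the https request.
    https_status, https_headers = fetch(https_url)
    # 2. Is https required? See whether the http request is redirected to https.
    http_status, http_headers = fetch(http_url)
    location = http_headers.get("location", "")

    return {
        "https_status": https_status,
        "redirects_to_https": (http_status in REDIRECT_CODES
                               and location.lower().startswith("https://")),
        # 3. Does the https response carry an HSTS header?
        "has_hsts": "strict-transport-security" in https_headers,
    }
```

Passing the fetch function as a parameter keeps the decision logic separate from the network, so the same checks can be exercised against recorded responses. A real run would be `probe_login("https://example.com/login")`, with a placeholder URL substituted for an actual login location.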
We ran our test script three times, using two computers over three physical locations and three networks. We replicated the tests in different scenarios and over different networks to ensure that the results of the tests were not affected by connectivity issues or by any idiosyncrasies attributable to the computer running the test. In each case, we had fewer than 10 variations from our control data set of 1,221 login URLs. We manually reviewed these variations to ensure we had accurate results. In several cases, the variation was caused by an SSL certificate that had expired between our first and second test and was subsequently renewed prior to our third test — further indicating that variations in our results were accurately measuring real differences in responses received from login URLs.
After interpreting and discussing our results, we want to be clear that this survey is simply an indicator — rather than an absolute measure — of how sites use encryption. It is an accurate representation of trends across the sites we surveyed, but an authoritative statement about any specific site would require a more detailed evaluation. Overall, the survey has proven to be very accurate at identifying whether or not encryption is used, and it’s a very useful tool for identifying sites that require a more detailed evaluation.
In evaluating the survey results, we paid specific attention to sites that met the following criteria:
- used HSTS directives in their headers;
- both supported and required encryption but didn’t use HSTS directives in their headers;
- supported encryption but didn’t require it;
- or didn’t use encryption at all.
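As a rough illustration, the four groups above can be expressed as a small decision function. The enum and function names are hypothetical, and the ordering of the checks reflects an assumption: a site is counted in the HSTS group only if it also supports and requires encryption.

```python
from enum import Enum


class Category(Enum):
    REQUIRED_WITH_HSTS = "requires encryption and uses HSTS"
    REQUIRED_NO_HSTS = "requires encryption, no HSTS"
    SUPPORTED_NOT_REQUIRED = "supports encryption but does not require it"
    NO_ENCRYPTION = "does not use encryption"


def categorize(supports_https: bool, requires_https: bool, has_hsts: bool) -> Category:
    """Map the three survey signals onto the four groups listed above."""
    if not supports_https:
        return Category.NO_ENCRYPTION
    if not requires_https:
        return Category.SUPPORTED_NOT_REQUIRED
    if has_hsts:
        return Category.REQUIRED_WITH_HSTS
    return Category.REQUIRED_NO_HSTS


print(categorize(True, True, False).value)  # requires encryption, no HSTS
```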
C. What This Survey Misses
Due to the nature of automated testing and the need to focus on the specific login URLs, this survey misses scenarios that are outside standard implementations. The things missed by this survey are potentially issues in their own right. However, some of the elements we missed could have been a result of unknown login URLs that were encrypted, resulting in us giving a pass to sites with weak technical implementations. We highlight these elements here in an effort to provide more detail on what this survey measures and what it does not measure.
In preparing our data set for this survey, we noted which sites served their login URLs via a pop-up window. We also attempted to capture accurate information for when a site redirected to an authentication service (Google, Microsoft, Facebook, Twitter) for login to test the actual location of the login URL. While we made every effort to be 100 percent accurate in preparing our data set, there is always the possibility for some human error in our base list of login URLs.
In a few scenarios, we found that a page might be served without encryption, but a login form presented via a pop-up was encrypted. This practice is not common (an informal review found about one site out of 20 served the site unencrypted but served an encrypted login via a pop-up), but as a follow-up from this initial evaluation, we will be looking more closely at sites that serve their logins via a pop-up to get a more precise understanding of how widespread this practice is. While the practice of embedding an encrypted form in an unencrypted page is less than ideal, we note this option here because our survey records an encrypted form served within a pop-up as an unencrypted login. In general, sites that serve an encrypted login inside a pop-up from an unencrypted page are dealing with larger technical issues and need to upgrade their full sites to support encryption.
Similarly, this survey will also miss sites that have their main webpages served without encryption but that embed a login form that is served with encryption. This is not something we see frequently — an informal review of about 20 sites didn’t show this practice at all, and we have only seen this particular scenario once on a site within the last six weeks. Additionally, the practice of embedding an encrypted form within an unencrypted page is less than ideal for multiple reasons. One of the most obvious issues with this approach is that users cannot use their browsers to verify whether the site is actually using encryption. The best solution is for the vendor to upgrade its full site to support encryption. Even though we found this is not a common practice, we wanted to flag that our test would miss this particular scenario wherein the implementation is not ideal but encryption is technically supported.
Further, this survey will also miss sites that encrypt their login URLs but subsequently fail to encrypt (or fail to require encryption) once a user has logged in. If a vendor fails to require encryption for logged-in users, there exists a higher risk of users having their sessions hijacked by an unauthorized user. This is an unfortunate practice we see on a fairly regular basis; sites that only encrypt user logins but not the data collected after a user is logged in leave people using the site exposed to the risk of a session-hijacking attack.
This survey will also miss sites that do not encrypt their account registration processes. On many occasions, we have seen sites that encrypt logins but do not encrypt the account-registration process. This is a security risk — and this risk is increased when an unencrypted registration process allows users to create a password.
This survey does not include mobile apps. We did not include login URLs or API endpoints for iOS, Android, Kindle, or Microsoft apps, although we plan on expanding the survey to include this information in the future.
This survey also does not look at the quality of a site’s encryption or any other infrastructure details. In addition, there are some outdated encryption methods that are no longer effective, and there are ways of implementing encryption that undercut the protection offered by encryption best practices (Observatory from Mozilla offers a range of insights on these details). For this test, a site that uses outdated encryption would still be considered a pass. We intentionally set a low bar to capture a baseline of usage for any form of encryption. Discussions about the quality of encryption are relevant, but those discussions are outside the scope of this survey.
The scenarios that fall outside our survey are not standard use cases, and except for the possibility of inaccuracies in our core data set, the elements missed by this survey are all practices that are not ideal (as part of a technically sound, secure implementation) for a variety of reasons. To ensure accuracy in our data set, the information has been manually reviewed by multiple independent individuals, all of whom have a high degree of confidence in the data set’s overall accuracy.
Our results are reported in three sections: “Status Codes,” “Support for Encryption,” and “Support for HTTP Strict Transport Security (HSTS).”
D1. Status Codes
The survey starts by looking at the http status codes. Status codes help provide an initial picture of if, or how, the service supports encryption. As we discuss later in the full summary, the full picture of how a vendor supports encryption comes from looking at a range of factors.
- Two hundred forty-five login URLs returned no response (denoted by “0” in the chart above) for an https request. This indicates that the login URL did not support encryption.
- Eight hundred seventy login URLs returned a 200 response code. This indicates that the login URL supported encryption but might not require it.
- Twenty-eight login URLs returned a 301 response code. This was one of the odder responses we saw: it indicates that the vendor made an explicit choice to redirect all requests for an encrypted connection to an unencrypted connection. Manual review confirmed the behavior indicated by the test.
- Fifty-eight login URLs returned a 302 response code. As with the 301 responses, the vast majority of the 302 responses indicated that the vendor had made an explicit choice to redirect a request for an encrypted connection to an unencrypted connection.
- Fifteen login URLs returned a 4xx response code. The 4xx response codes indicated various client-side errors. These responses were possibly due to shortcomings in the assumptions of our automation process. These errors remained consistent across tests. We reviewed these sites manually.
- Five login URLs returned a 5xx response code. The 5xx response codes indicated various server errors. These errors remained consistent across tests. We reviewed these sites manually.
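The 301 and 302 cases in the list above, where a request for an encrypted page is sent back to an unencrypted one, can be detected from the status code and the `Location` header. The following is a minimal sketch under that assumption; `is_downgrade_redirect` is a hypothetical helper, not part of the survey's script.

```python
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def is_downgrade_redirect(request_url: str, status: int, location: str) -> bool:
    """True when an https request is answered with a redirect to an http URL."""
    if status not in REDIRECT_CODES or not location:
        return False
    if urlsplit(request_url).scheme != "https":
        return False
    # Location may be relative; resolve it against the requested URL first.
    target = urljoin(request_url, location)
    return urlsplit(target).scheme == "http"


print(is_downgrade_redirect("https://example.com/login", 301,
                            "http://example.com/login"))  # True
print(is_downgrade_redirect("https://example.com/login", 302, "/signin"))  # False
```

A relative `Location` such as `/signin` keeps the https scheme of the original request, so it is not counted as a downgrade.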
From the survey results, the 1,115 login URLs that either returned no response or a 200 response code to a request for an encrypted connection provide the most accurate feedback on what we were testing for: the presence or absence of encryption. This covers approximately 91 percent of our data set.
The login URLs that returned a response code of 301, 302, 4xx, or 5xx (106 total, or 9 percent of all results) indicate potential issues with encryption, issues with assumptions in our automation, and, in some cases, issues with our server infrastructure. As noted above, a manual review of sites that returned a 3xx code indicated that the vendor had made an explicit choice to redirect a request for an encrypted connection to an unencrypted connection, which is precisely the opposite of what should happen. While a manual review of these results indicated issues with encryption that are discussed in this write-up, these sites appear to have additional technical issues that would require an additional evaluation. In short, the results of this survey suggest that, in some cases, testing for the presence or absence of encryption can be used to find indications of other issues in addition to issues related to encryption.
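The totals and percentages in this section follow directly from the status-code counts listed above. The short check below reproduces the arithmetic; the dictionary keys are just labels chosen for this example.

```python
# Counts of https responses by status code, as reported in this section.
status_counts = {
    "no response": 245,
    "200": 870,
    "301": 28,
    "302": 58,
    "4xx": 15,
    "5xx": 5,
}

total = sum(status_counts.values())
clear_signal = status_counts["no response"] + status_counts["200"]
needs_review = total - clear_signal

print(total)                              # 1221
print(clear_signal, needs_review)         # 1115 106
print(round(100 * clear_signal / total))  # 91
print(round(100 * needs_review / total))  # 9
```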
D2. Support for Encryption
- Three hundred ten login URLs (approximately 25 percent) did not support encryption at all. In this analysis, we define “does not support” as not responding to a request for an encrypted (https) page and not redirecting an unencrypted request.
- Two hundred forty-three login URLs (approximately 20 percent) supported, but did not require, encryption. In this analysis, we define “do not require” as a service that responds to an https request but will not redirect an unencrypted request to an encrypted connection.
- Six hundred thirty-two login URLs (52 percent) required encryption.
- Thirty-six login URLs (3 percent) returned results that were inconclusive and need review. In this context, “inconclusive” generally looks like a login URL with multiple redirects to a federated or social login that may or may not be encrypted, or a site that forces a redirect to an encrypted connection with an expired SSL certificate. While the result might support some form of encryption, the path to getting to that login is poorly implemented.
Taken together, roughly 45 percent of the login URLs surveyed (the 25 percent that did not support encryption plus the 20 percent that did not require it) failed to use or require basic encryption to protect information in transit.
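For readers who want to verify the percentages in this section, the calculation from the reported counts is shown below. The category labels are shorthand chosen for this example.

```python
# Counts of login URLs per category, as reported in this section.
support_counts = {
    "does not support": 310,
    "supports, not required": 243,
    "requires": 632,
    "inconclusive": 36,
}

total = sum(support_counts.values())
shares = {name: round(100 * n / total, 1) for name, n in support_counts.items()}
unprotected = support_counts["does not support"] + support_counts["supports, not required"]

print(total)   # 1221
print(shares)  # {'does not support': 25.4, 'supports, not required': 19.9, 'requires': 51.8, 'inconclusive': 2.9}
print(round(100 * unprotected / total, 1))  # 45.3
```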
D3. Support for HTTP Strict Transport Security (HSTS)
We also surveyed how many sites used HSTS directives in their headers. Of the 1,221 login URLs surveyed, only 14 percent (166 login URLs) of the sites used HSTS directives. If we limit our view to only the sites that require encryption, only 26 percent of these sites were using HSTS directives to protect information sent in transit.
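An HSTS policy is delivered in the `Strict-Transport-Security` response header as a list of semicolon-separated directives. The sketch below shows one way to read such a header value. The function name `parse_hsts` is hypothetical, and the survey itself only checked whether the directives were present, not what they contained. The `preload` token is used by browser preload lists rather than defined in the HSTS specification.

```python
def parse_hsts(header_value: str) -> dict:
    """Parse a Strict-Transport-Security header value into a small dict."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            value = value.strip().strip('"')  # the value may be quoted
            if value.isdigit():
                policy["max_age"] = int(value)
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


print(parse_hsts("max-age=31536000; includeSubDomains"))
# {'max_age': 31536000, 'include_subdomains': True, 'preload': False}
```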
Sites that collect usernames and passwords — and, in most cases, additional information about members — need to protect that information in transit with effective encryption. HSTS directives paired with strong encryption provide a foundation for reasonable protection of sensitive information in transit. While HSTS directives are best practice, especially for sites using cookies, we observed only 14 percent of sites implementing best practices.
If HSTS directives are not implemented, the next best option for a site is requiring https. Just over half the sites we surveyed met this basic requirement. If more of the sites requiring encryption also implemented HSTS directives in their headers, the web would be a more secure place.
While additional steps are required to provide reasonable security for user data, using encryption to protect data in transit between the web browser and the site is a basic step vendors must take.
As our analysis indicates, 45 percent of the applications surveyed don’t support a minimal level of encryption to support data security. Only 14 percent of all sites surveyed implement best practice: requiring encryption and using HSTS directives. As we discussed in an earlier post, the time when sites and services could get by without supporting encryption is rapidly coming to an end. Many schools and districts refuse to use software that doesn’t encrypt student data in transit. If you are a vendor who does not support encryption, the writing is on the wall: Drop what you are currently doing, and update your data-security practices.
The following are some of the observations in our data set from products that are used in thousands of school districts:
- A well-known vendor appears to have enabled encryption for districts only in states that have laws requiring reasonable security. This specific vendor has enabled encryption for districts in California but doesn’t appear to support encryption in other states with less specific privacy laws. More research is needed on the extent of this issue.
- A vendor whose products support students of all ages within K–12 does not support encryption at all in a subset of its product offerings.
- As noted above, multiple vendors take a request for an encrypted connection and explicitly redirect it to an unencrypted connection.
We have decided not to make our list of vendors or their data-security practices public at this time. We are sharing our survey data in the aggregate here because we want to see improvement in how vendors support encryption. Every vendor can and should test their sites to assess how they support encryption. We already indicate in our evaluations when an app doesn’t support encryption, and we are continuing to explore new ways of highlighting issues like these on our app-overview pages. In addition, we will be adding applications to our survey list and rerunning this evaluation approximately every 30 days.
Approximately 90 days from now — in late February or early March of 2017 — we will publish a follow-up post that highlights if, or how, things have changed from our initial survey results. As we attempt to highlight in this post, this survey is most useful as a snapshot of current data-security practices within the education technology industry. As we rerun this survey over the next few months, we will begin to get a sense of the trends within the industry. Hopefully, we will see an increase in the use of encryption. The need for strong and reliable security as the foundation for good software isn't going away. The sooner we embrace it, the better.