Ars Technica·3h ago·🇺🇸United States·Tech

Columbia University Data Breach Exposes Unaffiliated Individuals' Social Security Numbers

A weird text from my dad in February sent me on a months-long quest to solve a mystery that has been troubling an odd group of victims from a Columbia University data breach last year. That group? People with absolutely no connection to the school.

The text included a photo of a letter from Columbia, informing me that I was a victim of a data breach last June, one that exposed a wide range of sensitive information, including 1.8 million Social Security numbers.

Columbia’s public notices about the breach were addressed exclusively to “members of the Columbia community.” In the notices, Columbia warned that an “unauthorized party obtained information about students and applicants related to admissions, enrollment, and financial aid processes, as well as certain personal information associated with some Columbia employees.” Major news reports that followed only referenced people affiliated with Columbia as victims, while pointing out that the hacktivist behind the breach was reportedly motivated to expose Columbia’s history of “affirmative action-based” admissions.

But I don’t belong to the “Columbia community.” I have never applied for, attended, or worked for the school. And the letter sent to me—which arrived six months after the public notice—did not explain how Columbia obtained and exposed my SSN. All the letter said was that the breach affected “certain personal information about admissions, enrollment, and the financial aid process.” It directed me to sign up for free credit monitoring from Kroll Monitoring, a service Columbia hired to manage the hotline for victims.

It took a nightmare journey through Columbia’s victim support services before a Columbia official finally explained how decades of third-party data collection, combined with multiple unsuccessful data-removal initiatives, had led the school to warehouse data from so many unaffiliated people.

Did taking the SAT expose my SSN?

In my search for information, Kroll’s hotline felt like a dead end. The only option hotline staffers offered victims like me was to escalate the case, and if you called back, they would offer to re-escalate it. Supposedly, escalation would result in a callback with more information. When weeks passed without any follow-up, I tried a different route and contacted Columbia’s IT call center.

The call center responded immediately by email, and I was encouraged when I was told they were “actively looking into why your information was included among the affected data and will get back to you.” They asked for patience while they completed their review, but after a month without any response, I began to wonder whether there was a reason the support systems had no answers—and why Columbia wasn’t talking about unaffiliated victims in its public notices.

In April, I contacted Columbia’s communications office, hoping it could at least clarify whether there was any path for victims like me to get questions answered.

But even the comms team seemed evasive. After weeks of prodding, they offered only a theory: The school might have obtained my SSN back in 2001 when I was a high school junior taking the SAT. That explanation seemed plausible, they suggested, since the stolen data dated back decades. At that time, SSNs were commonly used as student identifiers. I was told that I had likely consented to sharing mine in order to receive admissions or scholarship information from Columbia.

But I had never shopped around for colleges and therefore wouldn’t have knowingly shared my personal information. I certainly never wanted to attend Columbia. I went to high school in Florida, where the state’s “Bright Futures” program covered full tuition for kids with good grades. My parents never talked about paying for school, so I had no idea how the process worked. I love a good deal, so I only applied to one school, and as a result, I sent my SAT scores to only one school: the University of Florida.

So I was skeptical of this theory, and I wasn’t alone. On social media and Reddit, I found dozens of posts from people similarly confused about why they received a breach notice. Some users deduced that their SSNs were likely shared when they took the SAT, the ACT, or the GRE, or possibly when filling out forms for financial aid, like the FAFSA. Others seemed to have received vague explanations from Columbia about testing programs that may have shared their SSNs, and like me, they assumed the College Board, which manages the SAT, had provided that data.

I asked the College Board if this theory could be true. A spokesperson disputed that any student’s SSN would have been shared with Columbia via an opt-in program called “Student Search.” The College Board confirmed that before 2018, when it stopped sharing SSNs entirely, the “only circumstance” in which it would have shared my SSN was if I had requested that my SAT scores be sent to Columbia, something I never did.

My frustration grew over four months of dead ends, until I had finally emailed Columbia enough times that it agreed to tell me what was really going on.

Columbia failed to delete SSN database

Columbia had already faced criticism for taking about two months to notify victims of the breach, since each day without notice increases the risk of identity theft. But for victims with no connection to the school, notification took even longer because, as the university explained, it required more time to track down their contact information.

I’m not sure when Columbia first attempted to contact me. The February letter mailed to my dad’s address—where I had not lived since graduating high school—claimed that Columbia had “previously disclosed” the breach to me, though it was my first notification. On Reddit, some users reported that they, too, had gotten notification letters mailed to their parents’ addresses. Others said Columbia managed to find their current addresses.

In discussions with Ars, a university official said that prior to 2012, Columbia received prospective student information, including Social Security numbers, from a wide range of sources. During that period, student recruitment services, scholarship programs, and testing programs often shared SSNs with Columbia, presumably with students’ consent.

A student might consent to share their SSN, the official said, to receive information about various schools or scholarship programs. Or they might directly request that a testing program share their SSN along with their scores. Ars reached out to the College Board and the ACT, which operate two major college testing programs, and confirmed that both stopped sharing SSNs as student identifiers. The College Board ended the practice in 2018, and ACT said it had stopped about a decade ago.

Columbia discontinued its use of SSNs as student identifiers in 2012, the official told Ars. Before the breach occurred, it had also intended to delete the SSNs it had collected. But despite completing initiatives to remove SSNs and other sensitive personal data from its systems, the official said, Columbia inadvertently missed a legacy database containing my SSN.

I’ve been assured that Columbia has since deleted my SSN from its system, and the school has reportedly accelerated its efforts to detect any other sensitive data still on its network. But I doubt the school will ever pinpoint the real source of my data, since the official also confirmed that some of the fields that would help identify data sources in cases like mine had been deleted.

As I now understand it, Columbia’s IT department has been working for months to identify any remaining data to respond to victims’ questions.

And this week, Columbia will finally start following up with victims who reached out to either Kroll or Columbia’s IT call center with questions about their data. Those two resources are still the recommended paths for unaffiliated victims seeking answers, Columbia’s official confirmed.

It’s also possible that some victims in this group may never have received notices. After Ars’ pressing, Columbia confirmed that it would publicly acknowledge this group of victims unaffiliated with the university for the first time in a blog post, which was published on Wednesday. The university also provided Ars with a lengthy statement addressing these victims, saying:

Columbia has been investigating questions raised by individuals with no known connection to the University about how their information came to be in our systems. Based on our examination, we believe that this information came to us through student recruitment services that, at the time, provided this type of information to colleges and universities from students who indicated they wanted to share it, whether to report a test score or to request further information about specific colleges, universities, or scholarship programs.

Investigations of this nature are complex and, unfortunately, take time. The University notified impacted individuals as soon as it was able to identify contact information. We are in the process of responding to individuals, including those with no apparent connection to the University, who have reached out with additional questions about the notification they received.

Columbia may face class action

It took about four months to learn that Columbia will likely never be able to determine how it got my SSN.

It’s unclear how many victims have no connection to Columbia or how many universities may be hoarding stores of sensitive data from the early days of SSN sharing. Columbia did not specify how many unaffiliated victims were affected, nor what portion of the exposed SSNs could be traced to people outside the Columbia community. When asked for an estimate, the official suggested that “the vast majority of notified individuals had a known affiliation with the university.”

Ars found that as early as 2005, as online identity theft began to rise, the Social Security Administration started urging universities to stop using SSNs as student identifiers and to limit their collection of the numbers. Columbia’s case shows that some universities went years without following that guidance. On Reddit, users reported receiving notifications suggesting their SSNs were likely shared after they took college placement tests in the 1990s.

“Didn’t they get this info on, like, a floppy disk?” one user asked. “Why would it have ever made its way into ‘the cloud’? Is that not the ultimate in gross negligence?”

Another user responded, “Yes! I’ve wondered the same! I guess I bubbled in my SSN on my SAT. How the hell did it get into a Columbia data set in 2025??!!”

A third wondered, “Why would my mid-’90s data ever have been uploaded *anywhere*?”

Many users wondered whether they could join a proposed class action lawsuit alleging that Columbia “failed to prevent the data breach because it did not adhere to commonly accepted security standards and failed to detect that their databases were subject to a security breach.”

Ars was unable to reach the case’s lead attorney to confirm whether victims unaffiliated with the school would be included if the class is certified. But while the named plaintiffs represent only people in the Columbia community, the proposed class definition suggests broader coverage, seeking to include “all persons whose PII was maintained on Defendant’s servers and compromised in the Data Breach.”

Columbia is currently engaged in private mediation with plaintiffs in that suit, and its response isn’t due until August 10. That allows time for a potential settlement outside of court, though such an agreement may not directly address other legal questions about Columbia’s data retention.

Hoarding SSNs for 20 years is “really indicting”

Educational institutions and ed tech companies remain attractive targets for hackers, since schools and firms inevitably store vast amounts of sensitive data.

Columbia’s incident last year was not the largest to rock the education sector. A breach at PowerSchool, which provides K–12 education software, compromised sensitive data belonging to over 60 million students, the nonprofit Electronic Frontier Foundation noted in its annual “Breachies” awards, which recognize the weirdest and most impactful data breaches. But while Columbia’s breach exposed far fewer students’ data, the school still made EFF’s “(dis)honorable mentions” list. Critics blasted the school for holding sensitive records on its own staff and students indefinitely, but nobody knew the school was holding onto even more data.

Bill Budington, a senior staff technologist at EFF, told Ars that it’s unusual that Columbia did not indicate in any public notice that some victims had no connection to the university. That omission stood out, he suggested, because Columbia “has some prestige, some trust that’s imbued in them.”

It’s not “just some shady data broker,” he said.

“It was clear that this was improperly stored data that then, given enough time, inevitably becomes a subject of a data breach,” Budington said. “And that’s something they should… take care to protect, even especially because it includes people that weren’t even affiliated with Columbia, didn’t even place their trust in Columbia in the first place.”

I asked Budington if anything could be done to stop other universities from hoarding historical SSN data in vulnerable online systems. He suggested that a more active Federal Trade Commission might investigate the data retention as an unfair and deceptive business practice.

This article was originally published by Ars Technica.
