Less than a week ago the world learned of a massive data breach called Collection #1. Speculation soon began about its origins – who got hacked and, perhaps more importantly, when?

The figure that has emerged in most headlines is staggering – 770+ million login credentials stolen. Had it happened in 2018 rather than in January of this year, that would have put it at #2 for the whole year.

But let’s back up a bit, because there are lots of misunderstandings here, perpetuated by portals in need of a catchy title. What is the Collection #1 data breach really?

Collection #1: Untangling fact from fiction

Let’s begin by saying it’s a misnomer to call Collection #1 “a data breach.” The clue is in the name, given to this massive collection of login credentials by Troy Hunt – the primary authority on all things breach and one of the people behind the awesome Have I Been Pwned (HIBP) service. The facts suggest that this is a collection of old breaches: the stolen user credentials came in 2,890 file directories (87 GB of data), which seem to correspond to various sources. Collection #1 happens to be the name of the parent directory.

[Image: Collection #1 breach directory list. Source: Troy Hunt]

Few facts about this collection are confirmed, but it seems reasonable to assume that the breaches behind all this leaked data happened at different times and probably span a period of at least a few years.

The headline figure of “770+ million login credentials” is also not entirely accurate. According to Hunt, this number actually refers to the number of unique email addresses. Meanwhile, the number of unique plaintext passwords is 21,222,975. The rest are hashed, buried in strings that contain control characters, or are fragments of SQL statements.
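To see why the headline number depends on what exactly is being counted, here is a minimal sketch that tallies unique email addresses, unique passwords, and unique pairs from a few made-up `email:password` lines. The sample rows and the `count_unique` function are illustrative assumptions, not a description of how Hunt processed the real data.

```python
def count_unique(lines):
    """Return (unique emails, unique passwords, unique email/password pairs)."""
    emails, passwords, pairs = set(), set(), set()
    for line in lines:
        # Split on the first colon only, since a password may contain colons.
        email, sep, password = line.strip().partition(":")
        if not sep or "@" not in email:
            continue  # skip rows that do not look like credentials
        email = email.lower()  # treat addresses case-insensitively
        emails.add(email)
        passwords.add(password)
        pairs.add((email, password))
    return len(emails), len(passwords), len(pairs)


# Made-up sample rows, including a duplicate and a junk line.
sample = [
    "alice@example.com:hunter2",
    "Alice@example.com:hunter2",
    "alice@example.com:correct horse",
    "bob@example.com:hunter2",
    "not a credential row",
]

print(count_unique(sample))  # (2, 2, 3)
```

Five rows yield two addresses, two passwords, and three distinct pairs, so any of those three numbers could end up in a headline.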

Finally, there is disagreement over whether any of the information is actually new.

Beyond Collection #1

[Image: Sanix data dump collections. Source: KrebsOnSecurity]

The source of Collection #1 has much more data to offer: Collections #2 – #5 (785 GB), as well as directories named “AP MYR&ZABUGOR #2” (25 GB) and “ANTIPUBLIC #1” (102 GB). Together, this could amount to billions of login credentials. All this data was initially offered for $45 but is currently available for free.

This and other information again suggests that at least the majority of this data is not new. For example, “ANTIPUBLIC #1” more than likely contains the Anti Public Combo List, which was initially released in 2016. Brian Krebs at KrebsOnSecurity got in touch via Telegram with the seller of Collection #1 and all the other folders, and learned that the data in this particular collection is (at least mostly) old. However, the seller claimed to have 4 more terabytes of credentials from more recent times.

Needless to say, this should be taken with a liberal grain of salt.

I have been pwned – what now?

First things first, you should check whether your login credentials were stolen. You can do that on the aforementioned Have I Been Pwned website – just enter your email address and the site will tell you whether you have anything to worry about. Perhaps you have an old email address you’re no longer using? Check that as well.
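Besides the email lookup on the website, HIBP also runs a Pwned Passwords range endpoint that lets you check a password without sending the password itself: only the first five characters of its SHA-1 hash leave your machine. The sketch below assumes the endpoint URL and the `SUFFIX:COUNT` response format as publicly documented by HIBP. The function names are made up for this example.

```python
import hashlib
import urllib.request

# Assumed endpoint: HIBP's public Pwned Passwords range API.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password):
    """Return the SHA-1 hex digest split into a 5-char prefix and the rest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_response(suffix, body):
    """Find our hash suffix in a 'SUFFIX:COUNT' response body; 0 if absent."""
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def pwned_count(password):
    """Ask the range endpoint how often this password appears in known dumps."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "pwned-check-sketch"},  # sent as a courtesy
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_response(suffix, body)
```

Calling `pwned_count("password")` makes one network request and returns the number of times that password shows up in the indexed dumps. A result of 0 only means the password is not in that data set, not that it is strong.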

If the site says you have been pwned, first of all:

  • Change your password on that account

This one’s obvious, but it’s not the only thing to consider. Studies show that lots of people use the same password across various accounts. Are you one of those people?

  • Change the passwords on related accounts/accounts with the same password

Alas, hackers are smart people – they know that many people reuse the same password over and over again. That’s why credential stuffing is so effective. Since people generally have one primary email address, which they use when registering for various services, all hackers need to do is take your credentials from one breached site and try them on other popular services – Facebook, Spotify, Netflix, Amazon, whatever it may be.
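To make the reuse problem concrete, here is a small sketch that takes a hypothetical list of your own accounts and shows which ones become exposed when a single service leaks. The `accounts` data and the `at_risk` function are invented for illustration. In practice a password manager's reuse report does the same job without keeping passwords in a script.

```python
def at_risk(accounts, breached_service):
    """Return the other services that share a password with the breached one."""
    leaked_password = accounts[breached_service]
    return sorted(
        service
        for service, password in accounts.items()
        if password == leaked_password and service != breached_service
    )


# Hypothetical accounts: three of the four reuse the same password.
accounts = {
    "forum": "hunter2",
    "email": "hunter2",
    "music": "hunter2",
    "bank": "q7!Lm-unique-elsewhere",
}

print(at_risk(accounts, "forum"))  # ['email', 'music']
print(at_risk(accounts, "bank"))   # []
```

A leak at the forum puts the email and music accounts on the change list too, while the bank account with its own password stays unaffected.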


  • Use a different password on all (or most) of your accounts

Yes, remembering them all is hard. We use so many different services nowadays! Therefore, you should go one of two ways: use long passphrases that you can memorize, or use strong random passwords and store them in a password manager. We have reviewed a few of the better password managers out there.
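Both options can be sketched with Python's standard `secrets` module, which is designed for security-sensitive randomness. The tiny `WORDS` list and both function names are assumptions made for this example. A real passphrase needs a word list of several thousand entries to be strong.

```python
import secrets
import string

# Demo word list only; far too small for real-world passphrases.
WORDS = [
    "amber", "basket", "cactus", "dolphin", "ember", "falcon",
    "garden", "harbor", "island", "jungle", "kettle", "lantern",
]


def make_passphrase(words, count=5, sep="-"):
    """Join `count` randomly chosen words into a memorable passphrase."""
    return sep.join(secrets.choice(words) for _ in range(count))


def make_password(length=20):
    """Build a random password from letters, digits, and punctuation."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(make_passphrase(WORDS))  # five random words joined by hyphens
print(make_password())         # 20 random characters
```

The passphrase is the one to memorize, for example as the master password of a password manager. The random passwords are meant to be stored, not remembered.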


While the Collection #1 “data breach” is neither a breach nor new, more may yet come of this story. The Telegram user “Sanixer,” who is behind this dump, claims to have 4 TB of not-yet-seen login credentials. Check back to see if there is any news in the Collection #1 leak story.