Facebook Email Service, Youku and others fall victim to data leaks.

11 April 2021
BREACHAWARE HQ

A total of 14 breach events were found and analysed, resulting in 213,340,517 exposed accounts containing 13 different types of personal data. The breaches found publicly and freely available included Facebook Email Service, Youku, Haute Look, BookChor and Cyber Services. Sign in to view the full library of breach events which includes, where available, reference articles relating to each breach.

Categories of Personal Data Discovered

Socio-Demographic Data, Contact Data, Locational Data, Social Relationships Data, Technical Data, Communications Data.

Data Breach Analysis

Among the breached services were Facebook Email Service, Youku, Haute Look, BookChor, and Cyber Services, each serving a markedly different demographic and function online. This breadth provides a snapshot of just how distributed, fragmented, and indiscriminate data exposure has become in the current digital landscape.

The inclusion of something described as the "Facebook Email Service" raises immediate curiosity. While Facebook has undergone many iterations of internal communication systems and messaging layers, the phrase here likely refers to an auxiliary system used either for internal notifications or third-party messaging integration. If legitimate, a breach from such a system could affect large volumes of metadata. Even if the content of the communications was not leaked, this type of metadata holds significant forensic value. It can reveal communication patterns, corporate structures, and personal habits, all of which are valuable for targeted phishing or intelligence gathering.

Youku, a leading Chinese video hosting service owned by Alibaba, introduces a very different threat vector. With millions of users consuming and uploading video content, Youku represents a platform where personal preferences, comment histories, login details, and location data might be stored. If the breach included session tokens or login credentials, it could allow for account takeovers, particularly dangerous in an environment where videos might be personal or politically sensitive. Although Youku is primarily used by a Chinese-speaking audience, its relevance on the global tech stage means that any breach associated with it has geopolitical resonance. Moreover, accounts may be linked to Alibaba Cloud or other Alibaba services, which multiplies the possible points of access for an attacker.

Haute Look, a now-defunct online fashion retail brand that was acquired by Nordstrom, underscores the persistent risk of legacy data. While the service may no longer be actively used by most of its historical customer base, dormant accounts often retain full names, email addresses, purchase histories, and potentially billing addresses. Retail-focused breaches are especially valuable to actors involved in financial fraud or identity theft. Customer profiles built from old accounts can be used to validate fraudulent credit applications, create synthetic identities, or tailor phishing messages that exploit a user’s shopping preferences.

BookChor, an Indian platform dedicated to affordable and second-hand book sales, represents another niche but meaningful source of leaked data. Users of BookChor typically provide standard registration details and often engage in transactional behaviour that includes address submission, mobile verification, and payment method information. The risk here is not necessarily limited to individual exposure. As an e-commerce platform operating in a country with fast-growing internet penetration and uneven enforcement of data protection standards, BookChor's breach could provide insights into regional consumer behaviour, postal code mapping, and other logistics-related data.

Cyber Services, a generic term that could point to multiple entities, is harder to pin down but is nonetheless illustrative. Breaches attributed to ambiguously named platforms often raise concerns about data handling practices and accountability. Whether it's a digital consultancy, VPN service, or security tools vendor, a platform falling under the umbrella of "Cyber Services" would be expected to maintain strong safeguards. If such a platform suffered a breach, it underscores the irony and risk of relying on cybersecurity providers who are themselves vulnerable.

With 13 different data types discovered among these breaches, this cluster stands out not just for its size, but for its complexity. The presence of so many varied data types raises the risk considerably, because each additional type gives an attacker another attribute to correlate. Unlike breaches that only compromise email-password pairs, these incidents can provide a near-complete digital profile of many users, suitable for identity theft, account chaining, or manipulation of social trust mechanisms across platforms.

It is also worth considering how attackers might synthesise the data from these 14 breaches to extract further value. If a user appears in multiple datasets, say, registered on both Youku and Haute Look, patterns in language, behaviour, device usage, or even geographic overlap can be identified. From there, social engineering attacks can be fine-tuned. A phishing email that mimics an order confirmation from a long-forgotten Haute Look account could appear more convincing when cross-referenced with delivery location data leaked via BookChor.

The total number of exposed accounts, over 213 million, rivals the size of entire national populations. Even assuming a substantial amount of duplication (users who appear in multiple datasets), the sheer volume of unique accounts compromised underscores the industrial scale of data commodification. In the wake of such breaches, entire generations of users may be walking around unaware that long-dormant accounts from forgotten services continue to circulate in forums, sold and resold like outdated but still active keys.
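The duplication point can be made concrete with a minimal sketch. It assumes that accounts are matched on a normalised email address, which is only one possible matching rule; the function names and the sample records are invented for illustration and are not taken from the breaches analysed here.

```python
def normalise_email(email: str) -> str:
    """Lower-case and trim an address so trivial variants compare equal."""
    return email.strip().lower()


def count_unique_accounts(datasets: dict[str, list[str]]) -> tuple[int, int]:
    """Return (total_records, unique_addresses) across all datasets."""
    total = sum(len(records) for records in datasets.values())
    unique = {
        normalise_email(email)
        for records in datasets.values()
        for email in records
    }
    return total, len(unique)


# Entirely made-up sample data; the service names are placeholders.
sample = {
    "service_a": ["alice@example.com", "bob@example.com"],
    "service_b": ["Alice@Example.com ", "carol@example.com"],
}

total, unique = count_unique_accounts(sample)
print(f"{total} records, {unique} unique addresses")
```

In this toy case four records collapse to three unique addresses. The same gap between record count and unique individuals applies, at a far larger scale, to a headline figure such as 213 million.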

Another under-discussed dimension of such large-scale breaches is how they interact with automated detection systems. Companies seeking to defend against credential stuffing or account takeover must continually tune their systems to flag suspicious login attempts. But when credentials leaked from these breaches are used with enough finesse (distributed across IPs, mimicking real devices, simulating time zone-appropriate access), detection becomes more challenging. This is especially true if breached data includes browser fingerprints or session token data, which attackers can use to avoid common velocity and anomaly-based flags.
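A simplified, defensive sketch shows why a velocity flag keyed on source IP alone struggles with distributed attempts. The class name, the threshold of three failures per 60 seconds, and the documentation-range IP addresses are all assumptions made for illustration; production systems combine many more signals.

```python
from collections import defaultdict, deque


class VelocityFlag:
    """Flag a key (e.g. a source IP) that exceeds `limit` failed logins
    within a sliding window of `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self._events = defaultdict(deque)  # key -> timestamps of failures

    def record_failure(self, key: str, timestamp: float) -> bool:
        """Record one failed login; return True if the key is over the limit."""
        events = self._events[key]
        events.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        while events and timestamp - events[0] > self.window:
            events.popleft()
        return len(events) > self.limit


# Hypothetical threshold: more than 3 failures per IP within 60 seconds.
per_ip = VelocityFlag(limit=3, window=60.0)

# Five failures from one address within five seconds trip the flag.
single_source = [per_ip.record_failure("198.51.100.7", float(t)) for t in range(5)]

# Five failures spread across five addresses never do.
distributed = [per_ip.record_failure(f"203.0.113.{i}", float(i)) for i in range(5)]

print(any(single_source), any(distributed))
```

The per-IP counter catches the concentrated burst but sees only one event per address in the distributed case, which is why defenders typically also key on the target account, device characteristics, or aggregate failure rates.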

In conclusion, this breach cluster presents an example of digital overextension. Platforms large and small, old and new, across different geographic and functional boundaries, continue to be compromised. And while some of the exposed services may no longer operate at scale or even exist, the data they generated persists, a static risk that grows in complexity as it is combined and leveraged. It is a reminder that in the digital world, decay does not imply disappearance. Once information is out, its half-life is effectively indefinite.

  • Key Stats
  • BREACH EVENTS
    14
  • EXPOSED ACCOUNTS
    213,340,517
  • EXPOSED DATUM TYPES
    13