Content Sources

Love in the Time of Big Data: Big Data's Role in Online Dating

“Big data” is a buzzword that has become increasingly popular as the Internet continues to reach into our everyday lives. Suddenly everything can be recorded and calculated, from what we say on social media to where we are at any given moment. According to Steve Lohr, big data is information that measures every small detail about an environment or situation. It’s challenging to explain big data using a broad definition because big data itself involves micro-level information. So to simplify it, let’s look at big data in the context of an emerging market: online dating.

What’s the Data?

Many popular online dating websites, such as eHarmony, market themselves on having complex algorithms that set up users with similar interests. The algorithm is fueled by the big data of online dating, mostly generated through long questionnaires. According to eHarmony executive Joseph Essas, qualitative data falls under three categories:

  • Data on psychological compatibility such as traits, values, beliefs
  • Data on interpersonal chemistry such as likes and dislikes and shared hobbies
  • Data on physical attraction such as general preferences like hair color or height

Another group of big data in online dating algorithms is attribute ranking. A questionnaire that includes attribute ranking expands or constricts the data pool depending on the importance of the attribute. So for example, if I identified that being taller than me was an attribute of high importance, the algorithm would only include men over 6 feet tall. Meanwhile, if I said that being taller than me was an attribute of low importance, the algorithm would include men of various heights.
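To make the height example concrete, here is a minimal sketch of attribute ranking in Python. It is only an illustration, not how any real dating site works: the `Profile` class, the `filter_by_height` function, and the "high"/"low" importance labels are all hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical, stripped-down dating profile."""
    name: str
    height_in: int  # height in inches


def filter_by_height(candidates, min_height_in, importance):
    """Shrink the pool only when height is ranked as highly important.

    Assumption: importance is either "high" or "low". Real questionnaires
    likely use finer-grained scales.
    """
    if importance == "high":
        # High importance: keep only candidates taller than the cutoff.
        return [c for c in candidates if c.height_in > min_height_in]
    # Low importance: height is ignored, so the whole pool stays in play.
    return list(candidates)


candidates = [Profile("A", 74), Profile("B", 68), Profile("C", 71)]

SIX_FEET = 72  # 6 feet expressed in inches
strict = filter_by_height(candidates, SIX_FEET, "high")
relaxed = filter_by_height(candidates, SIX_FEET, "low")

print([c.name for c in strict])   # ['A']
print([c.name for c in relaxed])  # ['A', 'B', 'C']
```

Ranking height as highly important cuts the pool from three candidates to one, which is the "constricting" effect described above.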

Why Does It Matter?

Amy Webb’s TED Talk “How I hacked online dating” explains that while these algorithms are good, users need to be aware of the data they input. Big data on online dating sites depends more heavily on what users choose to input than big data in most other settings. A retailer, for example, does use customer feedback to plan marketing promotions, but it can also draw on big data like sales, economic data, and even weather patterns, which do not depend on what customers say about themselves.

Back to online dating: Amy mentioned that you need to consider not only the result you want (that is, matches with desired characteristics), but also, on the flip side, what kind of data (characteristics) your desired matches are interested in. She found that the content of an online dating profile does matter, not only in terms of the characteristics, but also in terms of how you present those characteristics through word choice, profile length, and communication timing.

For now, the quality of the results relies on the quality of the data, but online dating services are continuing to fine-tune their algorithms to learn from user communication. Maybe someday, you won’t have to answer that “Are you a cat person or a dog person?” question for the algorithm to find some deep, vague underlying personality. Instead, the algorithm will recognize a picture you post with your dog and note that you probably like dogs.

Where’s All This Content Coming From? Pros & Cons of Various Content Sources

So here’s the good news: you don’t have to create all your own content. But all that content isn’t going to magically appear overnight. The content strategist’s job is not only to guide the content development process, but also to arrange pre-existing and future content. According to Halvorson and Rach, this content comes from six different places, all with their own strengths and weaknesses.

Original Content: Make it Extra-Special, Just for You

This should be pretty self-explanatory. If you want the best way to get personalized content tailored to your business and audience, make it yourself. It’s easier said than done though. Making your own content is work and takes time and money. But don’t worry – there are plenty of other ways to fill your site with content.

Co-Created Content: Embrace the Sell-Out

Whenever I put on make-up, I watch beauty vloggers on YouTube. Many of them mention that they receive products from cosmetic companies to review on their channel. This is co-created content. You provide the product and some incentive to a relevant partner who then creates the content for you.

The major benefit is that the partner brings along their audience, who trusts their voice. So, for example, if Kim Kardashian posts about a product she loves, her followers will likely seek out the product because they trust her opinion about make-up. Businesses can take advantage of this by sponsoring Kim Kardashian to showcase their product to her Twitter followers.

The downside to this is that people may distrust sponsored content because the opinions may not be genuine. Think about it: how much money would it take for you to say something good about a product? If the product is within reason, probably not that much. Instead, going back to the beauty vlogger example, cosmetic companies are sending vloggers a product and asking for their honest opinion, good or bad. Good reviews bring potential customers, and even a bad review can provide good feedback.

Aggregated Content: Bring in the Extended Family

Aggregated content is content pulled from other sources onto one site. You usually see aggregated content in the form of an RSS feed. Yahoo uses RSS feeds for weather and trending topics so the homepage doesn’t constantly need to be changed or updated by hand and so Yahoo can personalize the content for readers. That way, someone in Dallas can see the weather in Dallas without having to open up the Yahoo weather page. Neato.
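If you’re curious what pulling from a feed looks like under the hood, here is a rough sketch in Python using only the standard library. The feed below is made-up sample data, not a real Yahoo feed, and `headlines_for` is a hypothetical helper name. A real site would fetch the feed over the network instead of hard-coding it.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 sample standing in for a real weather feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Weather Feed</title>
    <item>
      <title>Dallas: Sunny, 85F</title>
      <link>https://example.com/dallas</link>
    </item>
    <item>
      <title>Austin: Cloudy, 80F</title>
      <link>https://example.com/austin</link>
    </item>
  </channel>
</rss>"""


def headlines_for(feed_xml, keyword):
    """Return the titles of feed items whose title mentions the keyword."""
    root = ET.fromstring(feed_xml)
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # Case-insensitive match so "dallas" and "Dallas" both work.
        if keyword.lower() in title.lower():
            matches.append(title)
    return matches


# Personalize the page: show only the reader's city.
print(headlines_for(SAMPLE_FEED, "Dallas"))  # ['Dallas: Sunny, 85F']
```

The page itself never changes here. Only the feed does, and the keyword filter is a stand-in for whatever personalization the site applies.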

Where aggregated content becomes aggravating content is when the feeds junk up the page. Does your user really need to see your Facebook page that hasn’t been updated in months? No. No they don’t. But as a content strategist, you should be making sure that Facebook page is updated regularly anyway.

Curated Content: Post Only the Best from Across the Web

Curated content is content that already exists that is hand-picked for your website. The content needs to fit your website’s theme and message. If your blog is for sharing pictures of cats, you shouldn’t randomly decide to share an article on U.S. politics. It goes against the message you are trying to share through your content strategy.

Curated content isn’t user-generated. That is, you have to go get the content yourself. Content you ask users to get for you is a whole other subgroup of content. We’ll get to that in a minute.

Licensed Content: Up Your Web Cred

Licensed content is your resources: the trustworthy sources of content that help users understand your content and make your content look more credible. This can include articles, videos, or stock photos of all those happy employees that totally look like the people you work with on a Monday morning.

Whether to use licensed content is a matter of debate among content strategists. It can increase your web cred when used correctly, but can also seem like generic fluff on your page. More than once, I’ve worked on redesign projects for websites that use obvious stock photos. It makes the company look less genuine, which is the total opposite of what licensed content should be doing.

User-Generated Content: Make Your Audience do the Work

I’ve saved the best and worst for last: the user-generated content. You ask your users for content, and they will deliver: for better or worse. Ideally, you’re getting genuine feedback or creative ideas from your audience at a low cost.

But this is the Internet we’re talking about. These are the people who tried to name a boat “Boaty McBoatface”. When you get serious content, it can be just as helpful as co-created content. But bad reviews and troll comments can be a challenge all on their own.