It is not possible to use the internet without being part of someone’s data experiment

Yes, that is right! Each time we go online to search for information, get entertained, use social media, shop, or buy and sell stocks (pretty much anything), we are part of someone’s data experiment! Whether we like it or not, whether we consider it an infringement of privacy or not, we simply have no other choice.

Next time you search on Google or Amazon, notice that the result URLs are extremely long. For example, this is the URL of the results page for the search “best mutual fund”:

https://www.google.co.in/search?ei=3S2ZW8WOPIz0vATXj7q4BQ&q=best+mutual+fund&oq=best+mutual+fund&gs_l=psy-ab.3..0j0i131k1l2j0l7.21789.23961.0.24344.16.16.0.0.0.0.171.1572.9j6.15.0..2..0…1.1.64.psy-ab..1.15.1569…35i39k1j0i67k1.0.64ZOfNH79f4

Now, I do not know what exactly those sequences of letters and numbers mean, but I am quite sure they are labels that are part of some machine learning experiment that Google is conducting. Gone are the days when a website had to be linked from many other websites to get search traffic. Today Google understands and recognises user engagement.
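To see that the long URL is really a list of labelled values, here is a minimal sketch using Python's standard library. It works on a shortened copy of the URL above (only the `ei`, `q` and `oq` parameters are kept). What `ei` stands for is not something I know; the sketch only shows that the labels can be separated from the search text.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the search URL above: only the first three
# parameters are kept, the rest of the query string is left out.
url = (
    "https://www.google.co.in/search"
    "?ei=3S2ZW8WOPIz0vATXj7q4BQ&q=best+mutual+fund&oq=best+mutual+fund"
)

parts = urlsplit(url)          # split into scheme, host, path, query
params = parse_qs(parts.query) # map each parameter name to a list of values

print("host:", parts.netloc)
for name, values in params.items():
    print(name, "=", values[0])
```

The `q` value comes back as the plain search text, with the `+` signs decoded into spaces. Everything else in the query string is extra information that travels along with the search.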

It knows what kind of websites people click on for each search term, how long they stay there, and whether they come back to the results unsatisfied. It has become so smart that it can now direct a search query typed in a garbled mix of Hindi and English to the right results page. All this is thanks to “big data”.

What is big data?

The simplest and most appealing definition is found in the book Big Data: A Revolution That Will Transform How We Live, Work and Think by Viktor Mayer-Schoenberger & Kenneth Cukier:

big data is the ability of society to harness information in novel ways to produce useful insights or goods and services of significant value

Essentially, it means collecting immense quantities of data and slicing and dicing it in a million ways to get insights. This is what I mean by the title. Each time you use the internet, you are part of someone’s experiment. Big data or small data, it does not matter.

If you have not read the above book, I strongly recommend it. It will blow your mind. The first chapter is free to download (legally). Even this little nugget gives you all the examples you need of how useful big data is.

Let us consider an example of big data. This is imaginary (as of now), but will hopefully drive the point home. Suppose I own the cafeteria in your office and want to cut unnecessary costs and effort and increase profit. The first thing I do is go beyond an electronic billing system that merely prints bills at preset prices: I connect it to a computer and store the details of every bill.

After a month or so, I sit down and analyze the data. I look for patterns in when people order what. For example, do they drink more coffee on Mondays? Do they enjoy ice cream more on Fridays? If the company has recurring project deadlines, does it show in the food and drink orders? Which food items sold the least? And so on. This is pretty basic stuff, but it can actually be effective.
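The "when do people order what" question can be sketched in a few lines of Python. The bill records below are invented purely for illustration, and the function name `totals_by_weekday` is my own; a real billing system would store far more detail.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical bill records: (date, item, quantity sold).
bills = [
    (date(2018, 9, 3), "coffee", 42),      # a Monday
    (date(2018, 9, 3), "ice cream", 5),
    (date(2018, 9, 7), "coffee", 25),      # a Friday
    (date(2018, 9, 7), "ice cream", 18),
    (date(2018, 9, 10), "coffee", 45),     # a Monday
    (date(2018, 9, 14), "ice cream", 21),  # a Friday
]

def totals_by_weekday(records):
    """Add up the quantity sold of each item for each day of the week."""
    totals = defaultdict(lambda: defaultdict(int))
    for day, item, quantity in records:
        totals[item][WEEKDAYS[day.weekday()]] += quantity
    return totals

totals = totals_by_weekday(bills)
for item, per_day in totals.items():
    busiest = max(per_day, key=per_day.get)
    print(f"{item}: busiest day is {busiest} ({per_day[busiest]} sold)")
```

With a month of real bills instead of six made-up rows, the same grouping would show whether the Monday coffee and Friday ice cream hunches hold up.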

How about we take it further? I install cameras around all the tables and watch how people eat. I play around with side-dish quantities and watch how people respond. Do they ask for more, or do they leave? What if I cut the salt a bit? Do they then reach for the salt dispenser on the table? What kinds of food get wasted the most? This is a pretty crude example, but this is exactly how big data works.

Call it spying if you will, call it tracking if you will, but the bottom line is this: on some of the most popular websites we visit every day, and during some of the most important things that we do online, someone is recording each of our moves and deriving useful insights from it.

Is big data good or bad?

Big data is not only good, it is also a useful, life-changing development. For example, in the chapter linked above, you can read about how Google can track the spread of a virus across a country and help health officials. Here are two articles that can shed more light: “How Google is Remaking Itself as a Machine-Learning-First Company” and “7 Machine Learning Applications at Google”. Even these are two years old. I will try to find newer articles.

Naturally, as with everything else, the bad is mixed in with the good. There are tons of web pages that discuss whether Google is spying on us, invading our privacy or selling our data. One thing is clear: Google has no need to sell our data. It knows how to make productive use of it. Two well-known, visible examples are its advertising services AdSense and AdWords. The spying and privacy debate boils down to opinions, but the simple truth is, whether we like it or not, there is no escape from this.

My gut feeling is that this is the reason why Paytm can afford to set up a direct mutual fund portal for free. They get user data in exchange. Unlike Kuvera, which sells such data analytics to others, Paytm does not need to do that (although at this point, more clarity is required).

Paytm Money Usage of Your Personal Data

The following is an extract from the Paytm privacy policy:

We use your Personal Data in our business operations for providing our or our partners’ products services and to perform, among other actions, the following:

  • To facilitate the transactions or report on these transactions;
  • To undertake research and analytics for offering or improving our services and their security and service quality;
  • To check and process your requests submitted to us for products/services and/or instructions or requests received from you in respect of these products/services;
  • To share with you, updates on changes to the products/services and their terms and conditions;
  • To take up or investigate any complaints/claims / disputes;
  • To respond to your queries or feedback submitted by you;
  • To verify your identity for us to provide products/services to you;
  • To carry credit checks, screenings or due diligence checks as lawfully required by us;
  • To monitor and review services from time to time;
  • To undertake financial/regulatory / management reporting, and create and maintain various risk management models;
  • For conducting audits and for record keeping purposes;
  • For selective offers and promotions.

We also use your Personal Data to fulfil the requirements of applicable laws/regulations and/or court orders / regulatory directives received by us.

I am no expert on privacy laws and usage, but that seems reasonable to me. Assuming Paytm direct MF takes off as intended, they can use the data to sell other products, influence how people buy and sell mutual funds or other products, and so on. This is big data at its best: a business exploiting user data to gain more business.

Freefincal itself serves as another example of such analytics (not big enough to be called big data, but the principles are the same). Suppose I do not collect any kind of analytics and simply write content. Now, whether people like me or not, most members of the FB group Asan Ideas For Wealth (AIFW, run by the incomparable Ashal Jauhari) know who I am.

That group has almost 49K members as on date. If I go by the number of people who like my posts there (without reading the article) and the number of people who comment on them (without reading the article), I would think, “Hey, many people like my content here, I must be getting most of my traffic from here”.

The truth is that I get only 3% of my traffic from all of Facebook (including my personal wall and the freefincal page). Recently my Twitter traffic has been inching up (I am @freefincal, btw). I get about 40% from Mr. Google and 40% from .. ahem .. people who visit me directly.

The point is, in the absence of analytics, I would be catering to the wrong sources and worrying about what every member of AIFW cribs about. Since I know it is only 3%, I can be a lot more selective about whom I take seriously there.
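Turning raw visit counts into shares like these is simple arithmetic. Here is a small sketch; the visit counts are hypothetical, chosen only so that the shares match the rough percentages quoted above, and "Other" is simply whatever is left over.

```python
# Hypothetical monthly visit counts, invented so that the shares match
# the rough percentages quoted above. "Other" is simply the remainder.
visits = {"Google": 4000, "Direct": 4000, "Facebook": 300, "Other": 1700}

total = sum(visits.values())
shares = {source: round(100 * count / total, 1)
          for source, count in visits.items()}

# Print the sources from the largest share to the smallest.
for source, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{source:<9}{share:>5}%")
```

A few hundred likes and comments can feel like a crowd, but next to the total they are a small slice.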

Analytics is not just useful. It is indispensable. For a business, effective online analytics is the difference between thriving and surviving.

Issues about privacy

Now, the data collected is divided into personally identifiable information such as name, PAN, Aadhaar number, address and mobile number, and personally non-identifiable information such as age, gender, city and state of residence, products purchased and products browsed.

No company that wants to remain in business will sell personally identifiable information. However, employees can do this illegally! How else do you think we can buy mobile and email lists online? How else do you think we get that 11 am call from trading houses or charities?

The sale of personally non-identifiable information does happen (e.g. Kuvera), but not everyone is comfortable with this. In a poll I conducted at AIFW about the sale of such data (in general), 240 members were okay with it and 173 were against it. So although a good number of people do not mind it in exchange for convenience, a good number do.
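For those who prefer percentages, the poll numbers above work out as follows (a quick calculation on the two counts quoted, nothing more):

```python
# Poll counts quoted above.
okay, against = 240, 173

total = okay + against
print(f"{total} votes: {okay / total:.1%} okay, {against / total:.1%} against")
# prints "413 votes: 58.1% okay, 41.9% against"
```

So the split is roughly 58 to 42: a majority is fine with it, but the minority is far too large to ignore.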

Non-identifiable information can also be sold to advertisers on a platform. This is the way Google, Facebook, Twitter, LinkedIn, Quora etc. operate. So the people who objected above should also have a problem with these sites!

The ads that you see on this page (if you have been kind enough to allow them) also operate on personally non-identifiable information. However, they also use your browsing history. For example, if you look for a product on Amazon and then go to an AdSense-enabled site, you will see Amazon ads for that product! This is personally identifiable, but the only person identifying it is you! You can prevent this by asking the browser to clear cookies when you exit (if you exit; I never close Chrome until it crashes). For EU users, even this can be disabled. I hope this is available to all users soon.

By the way, speaking of Chrome, it is one of the biggest sources of data for Google. I heard in a talk that Chrome was developed by a team led by Sundar Pichai. Its success is the reason why he is CEO today.

Big data can also be used for internal analytics for generating more sales, increasing efficiency, reducing costs etc. You can find ways to avoid the above, but this – analytics for profit and growth – cannot be avoided.

The case of Aadhaar

Aadhaar also has big data in mind. The privacy concerns regarding Aadhaar are two-dimensional. On the one hand, people do not like the government using Aadhaar to analyse the distribution and implementation of welfare schemes. On the other hand is the worry that the Aadhaar database will be hacked (and we see reports every day!) and someone will use Aadhaar to steal our identity and/or assets. This is a valid concern and must be addressed by the government in a better way.

The usefulness of big data for better governance is indisputable. What is disputable, however, is whether we need Aadhaar for such analytics. I think most people will answer: no.

Let me say it again because it sounds nice. Whether we like it or not, whether we think it is a breach of privacy or not: it is not possible to use the internet without being part of someone’s data experiment.

I am sure (well, I hope) many privacy, security and data experts will be reading this. If you have anything to add or correct, especially specifics, leave a comment below. I will publish select comments in a later post (anonymously if you like!).

15 thoughts on “It is not possible to use the internet without being part of someone’s data experiment”

  1. Hello Pattu,

    On a slightly unrelated topic than your today’s post but nevertheless wanted to understand your views. I have been using Zerodha for direct equity investing.
    However, of late they have forayed and are providing small cases as stock clusters that one can buy, do sips, customize the stock picks. Are there certain things to consider when picking these smallcases. Taking a step back, in your view do you think it even makes sense to invest in these small cases. Anything to consider? is it ok to do regular SIPs in them?

    Would be great to get your views on them. I was eyeing 2 specific smallcases that they offer – Straight flush and All Weather Investing. If not specific guidance on these specifc smallcases, even any general guidance/parameters to consider will be very helpful.

  2. Out of box….never thought that I am being targeted.
    Was aware that ads on page comes due cookies in my search history. But didn’t observed the url thing.
    Thanks.

  3. The era of data collection has begun, and it is a double edged sword. Some are okay with it, some are definitely not. As with many things in life, if you are in control of what you will buy what good is the data collected?

      1. Let the internet serve whatever it can, I will decide when to act and when not to. Otherwise I would have searched for best funds to invest instead of reading your blog, so thank you for that.

  4. The experiments are often called A/B testing. You can find in every request either in the URL or in the cookies sent.

  5. @Pattu Sir,
    Well it is possible to be anonymous on the web if we use the right tools and right practises.
    1. Use a browser that always functions in incognito mode, Or deletes cookies on exit.
    2. Use a search engine that doesn’t try to track/profile you (duck-duck-go) , give it a spin you would not regret it. Google’s search quality isn’t worth our souls that we sell to them.
    3. Install AdBlockPlus on your browser.
    4. Do not sign-into any app on phone, create a different id for different devices and make it harder for the platform to join your data across sessions/devices.

    Basically you want to negate that personally identifiable (anonymous) primary-key that joins your activities across sessions, devices in order to profile you.

    1. You can do all that but you cannot prevent websites you sign up with from tracking what parts of the website you use etc.

      1. @PattuSir,
        1. Sir if you signin and use a website, it is only reasonable to expect that your activity will contribute towards improving their services. The problematic thing is the stickiness of the ads that appear off the website, i.e on the network/partner websites.

        2. Most companies tie up with an AdExchange like DoubleClick(Google), AppNexus(MSFT) etc. in order to sell their ads in a more targeted manner. The steps i shared above will help you to avoid this ad targeting. Onsite tracking is arguable that it is in the best interest of both parties vs. the tracking and profiling that happens via the Ad display network which happens continuously where ever you go, which may not be in the best interest of both parties.

        3. To ask for zero onsite tracking would be similar to – say if your students who signed up for your class want no tests no assessments and provide no feedback. Then how would you know effectiveness of your course ? and where to emphasise ? The Ad-Network based tracking equivalent here would be if you enter into a deal with the most popular hangout locations (where students are likely to visit), most popular people (who can gauge the pulse of the students) and via these sources you continue to profile your students and gauge their sentiments about your course even when they are not in your class. That is problematic and is arguable that it is not in the best interest of both parties.
