Sunday 27 August 2017

Chinese downloads, Firebase, iTunesConnect and a numbers mismatch

In the last few months we have seen a significant spike in downloads for our app (My Day To-Do), a lot of which came from China, and all of this was very confusing for me at first. App Analytics in iTunesConnect reported that the number of people viewing our product page on the App Store was far lower than the number of downloads. Equally confusing, the data reported by our analytics platform (Firebase Analytics) did not match the download numbers. My frustration and confusion led me to embark on a journey to solve this mystery once and for all. I managed to gather enough knowledge and clues to draw a logical conclusion and formulate a theory that satisfies me to the point that this is no longer a concern. In this blog post I will share my theory in greater detail and try to justify it with a few facts, and I can only hope it helps others confused by all of this.

Background


I have talked about my efforts to capture the Chinese market in my earlier posts (e.g. Trying to capture the Chinese market), and recently we experienced a spike in downloads for our app. This was great and I was happy to see it, but... yes, there's a but: we observed a number of inconsistencies in the Analytics data, so we had a new problem.

Problem


There were two major inconsistencies that we observed in the Analytics data:

  1. In App Analytics in iTunesConnect, we saw more App Units than product page views.
  2. Firebase Analytics, which is supposed to paint a more precise picture of how users interact with our app, showed only a fraction of the number of users downloading our app.

Solution (My analysis of it all)


Before going any further, let's clarify one thing: over 70% of the total downloads for My Day To-Do were from China (中国).

App Analytics in iTunesConnect


I think this part of the problem confuses others more than it confuses me. When I first saw more App Units downloaded than product page views, it did not surprise me as much as it did everyone else. Why? App Analytics is a relatively recent addition to iTunesConnect, i.e. Apple only added it a few years ago, and I know that App Analytics is strictly "opt-in", i.e. users have to agree to send the data to Apple. I heard a deep discussion on this topic in one of these podcasts when Apple first launched App Analytics.


Why do I see more app units than product page views in App Analytics (iTunes)?


My understanding of what may be happening here is that a lot of people may not agree (i.e. not opt in) to send data to Apple when they first update or set up their new iPhone or iPad. If these people choose to download the app, they go to the product page on the App Store and download it; their download is recorded but their visit to the product page is not. The number of units downloaded is something Apple has tracked for a long time, so I doubt the App Units figure is wrong, whereas product page views are a recent addition, hence the discrepancy. The only thing I know for sure is that App Analytics is slow to update its App Units data: there are times when Sales & Trends shows more units than App Analytics, but that's just a reporting delay. The data matches after a few days have passed.
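
To make that reasoning concrete, here is a minimal Python sketch of the theory. It is only a toy model: the function name, the visitor count, the download rate and the opt-in rate are all made-up assumptions for illustration, not figures from iTunesConnect.

```python
def reported_metrics(visitors, download_rate, opt_in_rate):
    """Toy model: every download is counted, but a product page view
    is only counted when the visitor has opted in to sharing analytics."""
    app_units = round(visitors * download_rate)
    page_views = round(visitors * opt_in_rate)
    return app_units, page_views


# Hypothetical numbers: 1000 visitors, half of them download,
# and only a quarter have opted in to sharing data with Apple.
units, views = reported_metrics(visitors=1000, download_rate=0.5, opt_in_rate=0.25)
print(units, views)  # 500 250
```

Under these assumptions the reported App Units end up higher than the reported page views, even though every one of those downloads started with a visit to the product page.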

Update: Apparently, the 'Product Page Views' are no longer opt-in only! Given how new App Analytics is, my question would be: where is the server on which Apple stores this data physically located? Is it in China or somewhere else in the world? The rule (Chinese Govt) is that data on Chinese users must be stored on a server that is physically located within China. Look, my point is that at this stage I would not rely on the data reported by App Analytics in iTunesConnect.

Firebase Analytics data shows far fewer users than those downloading the app?


The confusing thing about Firebase Analytics in our case was that iTunesConnect Sales and Trends would tell us that 1000+ users downloaded our app, but Firebase would show only some 200+ users who actually opened the app and used it. Since a lot of the downloads for our app were coming from China, I was beginning to wonder whether they were fake downloads.

Fake Downloads


I thought about this for some time, but I highly doubt that this was the case! I mean, why would people download the app? We are not paying them anything, so what's their incentive? Combine this with the fact that
  1. We have invested a lot of time and effort into getting My Day To-Do into the Chinese market. Our efforts on localisation, Chinese social media sharing, getting user feedback on Chinese forums etc. are documented here.
  2. A prompt asking users to leave a review for the app was only added to My Day To-Do a month ago, and we have received at least 4 ratings and 2 reviews from China.
  3. Lastly, someone who left a review for the Lite version on the App Store managed to find the post by My Day To-Do's former intern on the Chinese blog calling for survey participants, and asked the intern for a promo code for the Pro version of the app! He/She was clearly interested enough in the app to make the effort to find the forum post.
All of the above tells me that people in China are actually interested in this product, giving me enough reasons to disregard this 'Fake Downloads' theory. Then I heard about something called the Great Firewall of China and people using VPN, and the plot thickened...

The Great Firewall of China

I realised that China has this nationwide thing called the Great Firewall of China (GFW), which is basically a filter that stops people from accessing internet resources prohibited by the Chinese government. This includes Facebook and Google, and since Firebase is now a part of Google, it would make sense for data collection by Firebase to be blocked in China. By that logic, we should get absolutely no data from China in our Firebase console, right? Well, not really, because we were seeing some session data resulting from usage in China! This was the most confusing thing about the whole process: what's causing some of this data to show up?

That little piece of session data being reported from China, why?


Doing some research on this was the hardest part as often I would come across info on sites in 中文 (Chinese, simplified Chinese) and my Chinese is still not strong enough to read it all. At this point this entire process became a Sherlock Holmes mystery more than anything else. 

Fortunately, my office is in a co-working space in the Computer Science building of one of the best universities in Australia, i.e. UNSW, and the university has a lot of visiting researchers from China! So how did I use this to my advantage? Well, for anyone I had seen in my office building a couple of times, I would walk up to them, introduce myself and start a conversation. I would generally start by asking which area of Computer Science they were researching, ask them about their undergrad, which in most cases they had completed in their home country, followed by how they were finding Australia, etc. I would build the conversation up to the point where I could ask them what they knew about the GFW.

This led me to the first clue - Virtual Private Network (VPN). I was often told that people in China use VPN to access stuff and bypass the GFW. So the little data that I am seeing in my Firebase console is from people accessing the internet on their phones via VPN. OK, that makes sense, but it was not good enough for me. I kept the interface of My Day To-Do clean and simple so it can be used by people with the least technical know-how, therefore I do not expect them to know how to set up a VPN and all. So I kept searching for more clues...


Clue 2: How GFW works. In summary, the GFW is not 100% effective. When you send a piece of data across the network, it carries, among other things, the destination it needs to reach. The GFW steps in while the data is being transmitted and changes that destination, and this blocking process does not always work. So even if it works, say, 99% of the time, some data will slip through and reach its intended destination. The rule of the Chinese government is that data for any Chinese user must be stored on a server physically located within China.

OK, let's look at the above with a real-world example: a user downloads Lite and pushes the 'Read (tasks)' button, at which point the app sends a request to firebase.google.com. Let's think of that as data being sent to a server located in California (USA), so the GFW steps in and changes the destination of the data to somewhere within China, and the data does not reach its intended destination. But as I said, this destination-changing process does not work 100% of the time, so some data may travel to a server in California as intended.
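
The same idea can be sketched as a back-of-the-envelope calculation in Python. This is a toy model built on my theory, not a description of how the GFW or Firebase actually behave: the function name is made up, the 99% figure is just the "say 99%" guess from above, and it assumes every downloader opens the app and nobody uses VPN.

```python
def expected_visible_users(downloads, china_percent, blocked_percent):
    """Toy model of how many downloaders an analytics backend outside
    China would see if a filter drops most, but not all, of the traffic."""
    china_users = downloads * china_percent // 100
    other_users = downloads - china_users
    # Only the share of Chinese traffic that slips past the filter is recorded.
    leaked_china_users = china_users * (100 - blocked_percent) // 100
    return other_users + leaked_china_users


# Hypothetical inputs: 1000 downloads, 70% from China, 99% of that traffic blocked.
visible = expected_visible_users(downloads=1000, china_percent=70, blocked_percent=99)
print(visible)  # 307
```

With these made-up inputs the analytics console would show only a fraction of the downloads, with a handful of sessions from China still trickling in, which is the same shape as what we observed.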

This analysis was satisfactory enough for me to stop investing any more time in this.

P.S. Here's the thing about almost all the Chinese people I have met: they are actually very friendly and have a lot of things to talk about. All you need is the initiative to start the conversation; once you do that, they have a lot to share. In my experience they are a great group of people to talk to, and their stories are both interesting and often quite funny.


Summary


Recently I answered a question on Game Dev on StackExchange, and here's my answer, which summarises everything mentioned above.

  1. iTunesConnect App Analytics is "OPT-IN": The users have to agree to send data to Apple for these things to show up in App Analytics in iTunesConnect.
  2. Analytics software: China (中国) has this thing they call the 'Great Firewall', and what it does is change the destination of data being transmitted to servers outside of China. From what I understand, data for Chinese (中国人) users must be stored within China, i.e. the server where the data for Chinese users is stored must be physically within China. I believe Google's servers are not, hence Google is blocked. Google owns Firebase now, ergo that's blocked too.
  3. My case: For my app the downloads have increased overall, with over 60% of those coming from China, and we are seeing some data from China; it's not like we get 0 data from there. We see some sessions, so why is that the case? After some digging around I found out that the Great Firewall (Google/Bing it) does not work 100% of the time. The process of changing the destination doesn't always succeed, so some of the data goes through; combine that with some tech-savvy users in China using VPN. However, in my case, it's unlikely that the users of my app are using VPN.

Conclusion


At this point, I need to work on adding new IAPs to Lite, iOS11 features, building the web version, preparing the files for Japanese localisation, preparing a job ad for a design intern, etc. There is no end to the work I need to do, but when I read that question on Stack Exchange I realised that this problem may be bothering a lot of people, so I had to write a blog post to share my findings. I can only hope it saves others the time and frustration that I had to endure.

Finally, I am working on My Day To-Do full-time right now so if you find my blog posts useful and want to support me you can buy the Pro version of My Day To-Do.

If you know any design interns, please put them in touch with me; we could really use some design help at My Day To-Do.
