WomanLog: Research Findings

The WomanLog app, developed by Pro Active App SIA, is a Latvia-based period tracking app with over 10 million downloads that features an 'Intelligent Assistant' chatbot.

Screenshot of WomanLog's website advertising the period tracker; it claims 20 million downloads and over 1.5 million monthly users.


To get started on the app, we completed a short onboarding questionnaire about which app mode we intended to use (e.g., standard) and the length of our cycle and period. Our answers to these questions were sent in the web traffic to the app developer's API:

Figure 5.1. The developer API, represented by the 'proactiveapp.com' URL, requested the app mode ('Tracking').
Figure 5.2. The response assigned a unique 'clientKey' to this particular user.
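To make this exchange concrete, here is a minimal sketch of the request/response pair. Only the 'Tracking' mode and the 'clientKey' field appear in our captures (Figures 5.1 and 5.2); every other field name and value below is an assumption for illustration.

```python
import json

# Illustrative sketch of the onboarding exchange with the developer API
# ('proactiveapp.com'). Only 'Tracking' and 'clientKey' come from our
# captures; all other keys and values are assumptions.
onboarding_request = {
    "mode": "Tracking",      # app mode selected in the questionnaire
    "cycleLength": 28,       # assumed field name for cycle length
    "periodLength": 5,       # assumed field name for period length
}

onboarding_response = {
    "clientKey": "a1b2c3d4e5",  # unique identifier assigned to this user
}

# From this point on, every log entry the app sends can be joined to the
# same 'clientKey', linking all of a user's inputs together server-side.
print(json.dumps(onboarding_response, indent=2))
```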

We were also asked to provide additional information about birth date, weight and height, which was optional but recommended (we skipped this). 

After the questionnaire, we were presented with an advertising network's lengthy consent form similar to Maya’s, for which we manually deselected our consent to the listed data-sharing activities and vendors for the purposes of personalised advertising and analytics:

Figure 5.3. The consent form pop-up from the advertising network for which we selected ‘Manage options’.
Figure 5.4. We manually deselected our consent, including for legitimate interests, for the processing purposes in the advertising network’s consent form.
Figure 5.5. Screenshot of the vendors (e.g., Amazon Ad Server, Pubmatic, Inc.) requesting data processing permission.

After manually deselecting all these options, we were directed to the cycle dashboard to utilise the app, without having to create an account. 

As with our onboarding questionnaire responses, every time we inputted information about our cycle, this data was sent to the app developer's API. This included information such as which medication the user took, which symptoms they logged and the start of their cycle:

Figure 5.6. The API request carrying our inputted data, in which we'd marked several 'symptomCountByType' entries.

For every entry we inputted, the web traffic response returned the user's 'clientKey' (Figure 5.7), linking all of our inputs to this uniquely identifiable key:

Figure 5.7. This is the same 'clientKey' as the one assigned from the start in Figure 5.2.

We note that in response to our findings, WomanLog stated that all communication between the app and their servers is HTTPS encrypted. 

Next, we tested the Intelligent Assistant feature, which is a paid-for period prediction and chatbot service powered by OpenAI (note the disclosure of OpenAI was only provided separately in WomanLog’s Privacy Policy and not in the app itself). Using this feature required us to create an account. 

After creating an account via email, we could see in the web traffic that when we launched the app, our email and password were requested by the API to authorise and identify our log-in:

Figure 5.8. See 'email' and 'password' sent to the API ('proactive.app' URL).

From here, we decided to test the chatbot with two goals in mind: 

1) to see whether the chatbot appeared to internalise and store the data we inputted, both in the main app dashboard and in the chat conversation; and

2) to see whether the data we provided in our inputs to the chatbot was intercepted by third parties. 

The chat environment appeared to be strictly controlled by the developer to respond only to ‘questions related to the WomanLog app, period tracking, women’s health, menstruation, and related topics’. The chatbot responded with this stock sentence every time we asked it a question it deemed beyond the scope of its response mechanism:

Figure 5.9. Screenshot of our chat window when we asked the chatbot something it could not answer or deemed not relevant to its purpose.

As for whether the chatbot could read the information about our period recorded in the cycle dashboard, we asked it questions about our specific cycle:

Figure 5.10. Our 'userTextMessage' for the chatbot in the request.
Figure 5.11. The chatbot's response in 'text', which in full instructs the user to refer to their cycle dashboard page instead.

The chatbot responded (Figure 5.11) that we should refer to the Intelligent Assistant summary breakdown in our dashboard for personalised period predictions. 

Most of our conversations with the chatbot followed a similar pattern whenever we made personalised queries based on what we had recorded in our cycle. However, the one type of personalised data from our main dashboard that the chatbot was able to output was our period dates, such as the start date:

Figure 5.12. We asked for the start date of our next period in 'userTextMessage'.
Figure 5.13. Screenshot of the chatlog where we asked for our next period date and the chatbot responded with the actual date.
Figure 5.14. This is a screenshot of the web traffic for the above interaction, where the variable 'infoType' stores our start date, displayed in the chat (Figure 5.13) via the action 'type: GET_INFO'.
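One plausible reading of this traffic is that the chatbot does not itself hold the user's cycle data: it returns a structured action ('type: GET_INFO' with an 'infoType'), which the app then resolves against the cycle data it already stores. The sketch below illustrates that pattern; everything beyond the field names 'type', 'infoType' and the 'NEXT_PERIOD' value seen in our captures is our assumption.

```python
from datetime import date, timedelta

# Illustrative sketch (our assumption, not WomanLog's confirmed design):
# the chatbot returns a structured action rather than the date itself,
# and the app fills in the value from locally stored cycle data.
chatbot_action = {"type": "GET_INFO", "infoType": "NEXT_PERIOD"}

# Hypothetical cycle data kept by the app.
last_period_start = date(2024, 3, 1)
cycle_length_days = 28

def resolve_action(action: dict) -> str:
    """Resolve a GET_INFO action against local cycle data."""
    if action["type"] == "GET_INFO" and action["infoType"] == "NEXT_PERIOD":
        predicted = last_period_start + timedelta(days=cycle_length_days)
        return predicted.isoformat()
    return "unsupported action"

print(resolve_action(chatbot_action))  # e.g. "2024-03-29"
```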

We then asked the chatbot to disclose other information like our log-in email:

Figure 5.15. See 'userTextMessage'.
Figure 5.16. See 'text'.

Above, the chatbot responded that it cannot access 'personal information' like the user's sign-up email. We then noticed a discrepancy in what the chatbot deemed to be ‘personal information’: when we asked it for personal period-related data like our start date, it responded that it cannot access 'personal period information' (Figure 5.17).

Figure 5.17. In the chatbot's response ('text'), it says it does not have access to 'personal period information'.

However, when we asked if our next period was considered personal information (which it was able to provide in the form of our start date in Figures 5.13 and 5.14), the chatbot responded that 'NEXT_PERIOD' was considered personal period information:

Figure 5.18. In our 'userTextMessage', we asked if 'next period' was personal information.
Figure 5.19. In 'text', the chatbot responds that next period is personal information.

As a side note, the headers in the web traffic show that all this activity was occurring with a webserver running 'nginx' (Figure 5.17). The server has "server_tokens" enabled, which means it reports its specific version number and, additionally, the operating system it is running on. Exposing the version and operating system gives a malicious actor extra information about the system, which could be used to search for vulnerabilities or exploits. We notified WomanLog of this, but their response to our findings did not address this point. 
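For reference, suppressing this disclosure is a one-line change in the nginx configuration. This is a standard hardening step, not something we can verify about WomanLog's own setup:

```nginx
# With server_tokens on (nginx's default), responses carry a header like:
#   Server: nginx/1.24.0 (Ubuntu)
# Turning it off reduces this to "Server: nginx", hiding version and OS.
http {
    server_tokens off;
}
```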

We also asked the chatbot if its third-party operator OpenAI could access our period data such as 'next period', to which the bot responded, 'No, OpenAI cannot see your next period. The WomanLog app uses your personal data to predict your menstrual cycle, but this information is private and not accessible to external entities':

Figure 5.20. See our input in 'userTextMessage'.
Figure 5.21. See the chatbot's response in 'text'.

Note the unique 'chatKey' recorded in all of our chatbot exchanges, which suggests each chat (and all its contents) is saved to an identifiable 'chatKey' ID. 

We did not see OpenAI URL paths in the web traffic, but a chatbot API like this is typically operated server-side rather than client-side, so calls to the OpenAI API would not appear in this device’s web traffic. Note that WomanLog is likely using the API as an OpenAI enterprise customer, as the restrictive outputs from the WomanLog chatbot suggest that the developers customised guardrails for the chatbot’s responses. 
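As a rough sketch of this architecture (entirely our assumption; every name below other than 'chatKey', which appears in our captures, is invented for illustration), the developer's backend would receive the user's message, attach its own guardrail system prompt, and forward the conversation server-to-server to OpenAI:

```python
# Illustrative server-side relay (our assumed architecture, not WomanLog's
# actual code). The device only ever talks to the developer's API; the
# backend forwards the chat to OpenAI, so no OpenAI URLs appear in the
# device's web traffic.

GUARDRAIL_PROMPT = (
    "Only answer questions related to the WomanLog app, period tracking, "
    "women's health, menstruation, and related topics."
)

def build_upstream_payload(chat_key: str, user_text: str) -> dict:
    """Assemble the server-to-server request the backend might send to a
    chat-completion API, attaching the developer's guardrail prompt."""
    return {
        "model": "gpt-4",  # hypothetical model choice
        "messages": [
            {"role": "system", "content": GUARDRAIL_PROMPT},
            {"role": "user", "content": user_text},
        ],
        # The developer's own conversation identifier; note that OpenAI's
        # servers would still see the message content itself.
        "metadata": {"chatKey": chat_key},
    }

payload = build_upstream_payload("chat-001", "When is my next period?")
```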

What we do know is that the endpoint for all those using the OpenAI API is OpenAI’s servers, and the communication is likely happening server-to-server (WomanLog server to OpenAI server) and then being relayed to the client, as far as we can tell. OpenAI said in their response to our findings that, in scenarios where it provides customers access to its models via its API platform, ‘OpenAI acts as a data processor of API inputs and outputs, and the customer acts as the data controller’. This means that the API customer (i.e., WomanLog) is responsible for its implementation of OpenAI’s API, including how their end users’ data is processed by the app. 

OpenAI also reiterated that access to these inputs and outputs is ‘strictly limited to: ‘(1) authorised employees that require access for engineering support, investigating potential platform abuse and legal compliance purposes and; (2) specialised third-party contractors who are bound by confidentiality and security obligations, solely to review for abuse and misuse.’ 

Beyond these in-app interactions, we also observed several appearances of third-party advertising SDKs from Google, which forwarded an ad placement in the app to the 'ADMOB' live bidding platform.

Figure 5.22. See 'platform:ADMOB'

It appears that our device data may be automatically collected and sent to Google Ads. We note that the app's Privacy Policy mentioned that 'general technical data, such as phone model, OS version, country and language, is transferred to WomanLog servers solely for statistical purposes', but there was no mention of Google Ads. Nor was it made clear in the consent form in Figure 5.3 what ‘personal data’ would be shared, for example whether device data was exempted from the consent agreement. More generally, there is a lack of clarity around consent to personal data sharing within apps: the user must agree to Google Play Store’s terms in order to even download an app from the Play Store, which may entail some degree of device data sharing via Google SDKs.

Numerous requests were also sent to Facebook's Graph API, such as requests to integrate Facebook's log-in feature:

Figure 5.23. See 'recovery_message' asking the user to log into the app again to reconnect their Facebook account.

Other requests to Facebook were gatekeeper checks, with the response being specific SDK features that either pass or do not pass the gatekeeper check:

Figure 5.24. See 'field: gatekeepers'.
Figure 5.25. Here, the app lists all the SDK features it uses.

Note in Figure 5.25 the app's call to the Facebook API responds with a list of all the SDK features the app is using, which actually aligns with one of PI’s 2019 recommendations that advocated for apps to more transparently disclose the features of SDKs they were using. 
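Gatekeeper checks of this kind are essentially remote feature flags: the SDK asks the server which of its features should be active, and the response toggles them on or off. Below is a generic sketch of how an SDK might consume such a response; this is not Facebook's actual wire format, and the feature names are illustrative.

```python
# Generic feature-flag ("gatekeeper") sketch -- not Facebook's actual wire
# format. The server's response lists SDK features with pass/fail values,
# and the SDK enables only the features that pass the check.
gatekeeper_response = {
    "field": "gatekeepers",
    "values": [
        {"key": "app_events_killswitch", "value": False},
        {"key": "auto_log_app_events_enabled", "value": True},
    ],
}

# The SDK activates only the features whose gatekeeper check passed.
enabled_features = {
    entry["key"]
    for entry in gatekeeper_response["values"]
    if entry["value"]
}
```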

We also noticed an instance of Facebook's ad network upon launching the app:

Figure 5.26. See 'adnw' in the 'feature_config', which stands for ad network. Even if it returns false, the ad network is still being called.

Firebase also appeared in the web traffic collecting device-related data for similar purposes as we've seen for the above apps, such as for its Crashlytics (crash reporting) tool or other analytics:

Figure 5.27. See 'clientInfo' variables, such as 'country', 'device', 'locale'.

Note that WomanLog clarified with us that ‘any third-party services integrated into the app (e.g., for analytics or ads) are configured to operate without access to personal or health-related data’. Indeed, in the above screenshots related to Facebook and Firebase we primarily see device data. The app also mentioned that its use of analytics (i.e., Google Analytics or Firebase) is ‘strictly to improve user experience and app performance. These services are configured to anonymise IP addresses and do not collect or transmit sensitive user data’. 

WomanLog's Privacy Policy disclosed the names of some of the third parties we observed, including Firebase (even specifying that Firebase is operated by Google), OpenAI, Apple Health and Google Fit, as well as the server it uses for hosting, which is EU-based Hetzner. However, the Policy did not mention its advertising integrations with Facebook's ad network and Google Ads.

Download the full report


What can I do?

If you want to make sure we can keep doing work like this, you can donate now to help PI keep holding governments and companies to account.
