kagermanov, Author of Kagermanov Blog, Your Homebrew ML Enthusiast

Analyzing the Distribution of Problems in Competition Reviews

In the highly competitive world of mobile app development, it is essential to stay ahead of the curve in optimization and marketing by using the best app store optimization (ASO) tools. One key area of focus is understanding the common problems and complaints in your competitors’ app reviews, so that you can address those issues in your own app. With ASO tools like SerpApi’s Apple App Store Search Scraper API and Apple App Store Reviews Scraper API, together with Natural Language Processing models such as facebook/bart-large-mnli and bert-large-uncased-whole-word-masking-finetuned-squad, we can analyze the distribution of problems in the competition’s reviews. These insights can inform your own app optimization: they could potentially increase download and install rates, win conversions from your competition, give key ideas for your app marketing, boost your app’s visibility, and support many other KPI objectives you can pursue with app analytics. You can find the full code at the end of this page.

Collecting Product IDs from the Apple App Store Search Scraper API

SerpApi’s Apple App Store Search Scraper API is a tool for gathering in-depth metrics, keyword rankings, the number of app downloads, and many other details about an app in the iOS App Store or Mac App Store. It is an essential tool for keyword tracking and gives you good insight into how the App Store’s ranking algorithm works. The first step is to gather the product IDs of apps related to our search term using the Apple App Store Search Scraper API. This can be done with the following code snippet:

import requests

def call_app_store_api(search_term, api_key):
  # Pass the query via `params` so requests URL-encodes the search term
  params = {"api_key": api_key, "engine": "apple_app_store", "term": search_term}
  response = requests.get("https://serpapi.com/search.json", params=params)
  # Each organic result exposes the app's product ID under "id"
  product_ids = [organic_result["id"] for organic_result in response.json()["organic_results"]]

  return product_ids

This function takes in a search term and API key and returns a list of product IDs for apps related to that search term. You may register to claim free credits and use your API Key in the tool.
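
As a rough illustration of the request and parsing steps, the sketch below builds the same query string with `urllib.parse.urlencode`, so that search terms containing spaces or special characters are escaped, and then pulls the IDs out of a response-shaped dictionary. The helper names `build_search_url` and `extract_product_ids`, the `demo-key` value, and the sample payload are hypothetical and only mirror the fields used above; no network call is made.

```python
from urllib.parse import urlencode

SERPAPI_ENDPOINT = "https://serpapi.com/search.json"


def build_search_url(search_term, api_key):
    """Build the App Store search URL with properly escaped parameters."""
    params = {
        "api_key": api_key,
        "engine": "apple_app_store",
        "term": search_term,
    }
    return f"{SERPAPI_ENDPOINT}?{urlencode(params)}"


def extract_product_ids(payload):
    """Return the "id" of every organic result, or [] if none are present."""
    return [result["id"] for result in payload.get("organic_results", [])]


# Hypothetical, hand-written payload that only imitates the fields used above.
sample_payload = {"organic_results": [{"id": 111}, {"id": 222}]}

url = build_search_url("iced coffee", "demo-key")
ids = extract_product_ids(sample_payload)
print(url)
print(ids)
```

Using `.get("organic_results", [])` is a design choice for this sketch: a search with no results then yields an empty list instead of raising a `KeyError`.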

Collecting Reviews from the Apple App Store Reviews Scraper API

Once we have the product IDs of the apps related to our search term, we can use the Apple App Store Reviews Scraper API to gather the reviews of each app with automation. This can be done with the following code snippet:

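import asyncio
import requests

async def call_apple_reviews_api(product_id, api_key):
  params = {"api_key": api_key, "engine": "apple_reviews", "product_id": product_id, "sort": "mostcritical"}
  # requests.get is blocking, so run it in a worker thread (Python 3.9+);
  # otherwise the calls collected by asyncio.gather would run one after another
  response = await asyncio.to_thread(requests.get, "https://serpapi.com/search.json", params=params)
  reviews = [review["text"] for review in response.json()["reviews"]]

  return reviews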

This function takes in the product ID of an App Store listing and a SerpApi API key, and returns a list of reviews for that app, sorted from most negative to least negative with the help of the sort=mostcritical parameter.
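
A blocking call such as `requests.get` holds the event loop while it waits, so one way to let several review requests overlap is to run each of them in a worker thread with `asyncio.to_thread` (available from Python 3.9). The following is a minimal sketch of that idea, not the API client itself: `fake_fetch_reviews` is a hypothetical stand-in that sleeps instead of making an HTTP request.

```python
import asyncio
import time


def fake_fetch_reviews(product_id):
    """Hypothetical stand-in for a blocking HTTP call such as requests.get."""
    time.sleep(0.1)  # simulate network latency
    return [f"review for {product_id}"]


async def fetch_all(product_ids):
    # asyncio.to_thread runs each blocking call in a worker thread,
    # so the waits overlap instead of happening one after another.
    tasks = [asyncio.to_thread(fake_fetch_reviews, pid) for pid in product_ids]
    # gather returns the results in the same order as the tasks
    return await asyncio.gather(*tasks)


start = time.perf_counter()
results = asyncio.run(fetch_all([1, 2, 3]))
elapsed = time.perf_counter() - start
print(results)
print(f"elapsed: {elapsed:.2f}s")
```

With three simulated requests of 0.1 seconds each, the elapsed time should be close to a single request rather than the sum of all three, although the exact figure depends on the machine.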

Collecting All Reviews for a Search Term

To gather all reviews for apps related to our search term, we can use the following function:

async def get_product_reviews(product_ids, api_key):
  reviews = await asyncio.gather(*[call_apple_reviews_api(product_id, api_key) for product_id in product_ids])
  reviews = list(chain(*reviews))
  
  return reviews

This function takes in a list of product IDs and a SerpApi API key, and returns a single flat list of all reviews for those apps. Notice that it doesn’t keep track of which review belongs to which app. It simply returns all the reviews of the top-ranked apps for a keyword relevant to your app.
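
If you do want to know which app each review came from, one option is to pair the product IDs with the per-app review lists before flattening them, since `asyncio.gather` returns its results in the same order as its inputs. The sketch below assumes that shape; `group_reviews_by_product` and the sample data are hypothetical.

```python
from itertools import chain


def group_reviews_by_product(product_ids, reviews_per_product):
    """Map each product ID to its own list of reviews."""
    return dict(zip(product_ids, reviews_per_product))


# Hypothetical data shaped like the result of asyncio.gather above:
# one list of review texts per product ID, in the same order.
product_ids = [101, 202]
reviews_per_product = [["Too many ads."], ["Crashes on launch.", "Overpriced."]]

grouped = group_reviews_by_product(product_ids, reviews_per_product)
# The flat list used by the rest of the pipeline can still be derived from it.
flat = list(chain.from_iterable(grouped.values()))
print(grouped)
print(flat)
```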

Classifying Reviews as Positive or Negative

Sometimes an app has only a few reviews, or a user gives a 1-star rating but writes a positive review. Cases like these can keep you from seeing the genuinely negative reviews of your competition. With natural language processing models, we can classify the reviews as positive or negative and keep only the absolutely negative ones. The following code snippet demonstrates this process:

def get_absolutely_negative_product_reviews(classifier, reviews):
  absolutely_negative_reviews = []
  candidate_labels = ['negative', 'positive']
  for review in reviews:
    result = classifier(review, candidate_labels)
    index = result["scores"].index(max(result["scores"]))
    classification = result["labels"][index]
    if classification == "negative":
      absolutely_negative_reviews.append(review)
  
  return absolutely_negative_reviews

This function takes in a classifier and a list of reviews and returns a list of only the absolutely negative reviews.
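
The function above keeps a review whenever "negative" is the top label, however narrow the margin. If you want a stricter notion of "absolutely negative", one possible variation is to also require a minimum score. The sketch below assumes the zero-shot pipeline's result shape (a dictionary with parallel "labels" and "scores" lists); the 0.9 threshold is an arbitrary assumption, and `fake_classifier` is a hypothetical stub with canned scores so that the example runs without downloading a model.

```python
def is_confidently_negative(result, threshold=0.9):
    """True if "negative" is the top label and its score reaches the threshold."""
    scores = dict(zip(result["labels"], result["scores"]))
    top_label = max(scores, key=scores.get)
    return top_label == "negative" and scores["negative"] >= threshold


def fake_classifier(text, candidate_labels):
    """Hypothetical stub imitating the shape of a zero-shot pipeline result.

    It ignores candidate_labels and returns canned (negative, positive) scores.
    """
    canned = {
        "Crashes every time.": [0.97, 0.03],
        "Okay, but the ads are annoying.": [0.62, 0.38],
        "Love it!": [0.02, 0.98],
    }
    negative, positive = canned[text]
    pairs = sorted(
        zip(["negative", "positive"], [negative, positive]),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return {
        "sequence": text,
        "labels": [label for label, _ in pairs],
        "scores": [score for _, score in pairs],
    }


reviews = ["Crashes every time.", "Okay, but the ads are annoying.", "Love it!"]
kept = [
    review
    for review in reviews
    if is_confidently_negative(fake_classifier(review, ["negative", "positive"]))
]
print(kept)
```

Here the mildly negative second review is dropped because its score is below the threshold; a lower threshold would keep it.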

Summarizing Absolutely Negative Reviews

To gain a clearer understanding of the specific problems mentioned in the absolutely negative reviews, we can use a question-answering pipeline to summarize them. This helps to find the user’s main concern in a long review, or in a mixed review with both positive and negative aspects. Imagine the following review:

Screenshots are nice. But they don’t represent the app.

What we want from this review is the concern about the developer’s honesty, which is expressed in the second sentence. So we want the question-answering pipeline to extract the part of the review that states the main problem. The following code snippet demonstrates this extraction process:

def get_summaries(qa_pipeline, absolutely_negative_reviews):
  summaries = []
  for context in absolutely_negative_reviews:
    result = qa_pipeline({
      'context': context,
      'question': 'What is the problem?'
    })["answer"]
    summaries.append(result)
  
  return summaries

This function takes in a question-answering pipeline and a list of absolutely negative reviews and returns a list of summaries for those reviews.
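
A question-answering result also carries a confidence score next to the extracted answer, so a possible refinement is to fall back to the full review when the model is unsure. The sketch below assumes the usual result shape with "score", "start", "end", and "answer" keys; the 0.1 cutoff is an arbitrary assumption, and `fake_qa` is a hypothetical stub used in place of the real model.

```python
def summarize_review(qa, review, min_score=0.1):
    """Return the extracted answer, or the whole review if confidence is low."""
    result = qa(question="What is the problem?", context=review)
    if result["score"] < min_score or not result["answer"].strip():
        return review
    return result["answer"]


def fake_qa(question, context):
    """Hypothetical stub imitating the shape of a question-answering result."""
    marker = "But "
    if marker in context:
        start = context.index(marker) + len(marker)
        return {"score": 0.8, "start": start, "end": len(context),
                "answer": context[start:]}
    # No clear answer found: low score and empty answer
    return {"score": 0.01, "start": 0, "end": 0, "answer": ""}


mixed = "Screenshots are nice. But they don't represent the app."
short = "Meh."
print(summarize_review(fake_qa, mixed))
print(summarize_review(fake_qa, short))
```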

Classifying Summaries and Returning Distribution of Problems

Understanding the ratio of problems in your competition’s reviews is a crucial part of app ranking, user acquisition, and overall store intelligence. To understand the distribution of problems mentioned in the summaries, we can classify each summary against a preset list of potential problems and compute the share of each problem. The following code snippet demonstrates this process:

def get_problems_distribution(classifier, possible_problems, summaries):
  summary_dicts = []
  for summary in summaries:
    result = classifier(summary, possible_problems)
    index = result["scores"].index(max(result["scores"]))
    classification = result["labels"][index]
    summary_dict = {"summary": summary, "classification": classification}
    summary_dicts.append(summary_dict)

  classifications = [d['classification'] for d in summary_dicts]
  counter = Counter(classifications)
  total = sum(counter.values())
  # Avoid dividing by zero when no summaries were classified
  ratios = {problem: (counter[problem]/total if total else 0.0) for problem in possible_problems}
  ratios = dict(sorted(ratios.items(), key=lambda item: item[1], reverse=True))
  
  return ratios

This function takes in a classifier, a list of possible problems, and a list of summaries, and returns the distribution of problems mentioned in the summaries.
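
To make the arithmetic concrete, here is a small worked example of the counting and sorting step on its own, with the classifier left out. The `distribution` helper and the hand-written labels are hypothetical; they stand in for the classifications produced above.

```python
from collections import Counter


def distribution(labels, possible_problems):
    """Share of each problem among the labels, sorted from most to least common."""
    counter = Counter(labels)
    total = sum(counter.values())
    if total == 0:
        # Nothing was classified, so every problem has a share of zero
        return {problem: 0.0 for problem in possible_problems}
    ratios = {problem: counter[problem] / total for problem in possible_problems}
    return dict(sorted(ratios.items(), key=lambda item: item[1], reverse=True))


possible_problems = ["bugs", "expensiveness", "dishonesty"]
labels = ["bugs", "expensiveness", "bugs", "bugs"]  # hypothetical classifications

ratios = distribution(labels, possible_problems)
print(ratios)
```

Three of the four labels are "bugs", so its share is 0.75; a problem that never occurs, such as "dishonesty" here, still appears in the result with a share of 0.0.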

Showing Results

Finally, the following code snippet can be used to print the results of the problem distribution:

print("-----")
print(f"Summary of problems for search term: {term}")
for ratio in ratios:
  print(f"{ratio}: {ratios[ratio]*100:.2f}%")

Here is an example output:

Summary of problems for search term: Coffee
unenjoyable: 29.28%
incompleteness: 27.03%
expensiveness: 12.61%
dishonesty: 9.01%
too many advertisements: 8.11%
inefficiency: 8.11%
bugs: 5.86%

How is App Store Optimization done?

I will explain different aspects of a possible ASO strategy using this tool. With the results of our analysis, we can see that the top problems among the competition’s negative reviews for the search term “Coffee” are unenjoyable experiences, incompleteness, and expensiveness. Here are some specific strategies to address these issues:

Unenjoyable experiences

To address this problem, it is crucial to gather customer feedback and gain a deep understanding of what is causing the negative experiences. This could involve conducting surveys, focus groups, or user testing to gather qualitative data. Additionally, tools such as sentiment analysis on customer reviews can provide valuable insight into whether the root causes are technical, such as app performance, or design-related. Once the root causes of the negative experiences have been identified, it is important to develop and implement solutions, supported by app intelligence, that improve the overall enjoyment of the app.

Incompleteness

To address this problem, it is essential to have a clear understanding of the features and functionality that customers expect from the app. This could involve conducting market research, surveying customers, assessing localization demands, monitoring bug notifications, or gathering data on the features of competitors’ apps. Once the desired features have been identified, it is important to prioritize and implement them in the app, while continuously gathering feedback to ensure they meet the needs of the customers.

Expensiveness

To address this issue, it is important to review and optimize pricing strategies regularly. This could involve conducting market research to understand the prices of similar apps and identifying opportunities to offer competitive pricing. Additionally, implementing in-app purchase options or subscription models can provide alternative revenue streams and make the app more accessible to a wider range of customers.

It is also essential to address the other problems, such as dishonesty, too many advertisements, inefficiency, and bugs, as these can also negatively impact the customer’s experience. To address these issues, it is crucial to have a robust testing and quality assurance process in place, as well as regular monitoring of customer feedback to identify and address any problems in a timely manner.

Overall, by gathering and analyzing customer feedback, understanding the competition, and implementing solutions to address the identified problems, it is possible to improve the overall performance of the app and outrank the competition in the search results.

Conclusion

In conclusion, outranking the competition in the search results requires a comprehensive approach: gathering and analyzing customer feedback, understanding the competition, and implementing solutions to the identified problems. By using SerpApi’s Apple App Store Search Scraper API and Apple App Store Reviews Scraper API together with natural language processing techniques such as sentiment analysis and question-answering pipelines, as well as A/B testing, it is possible to gain valuable insights into customer needs and preferences and to make data-driven decisions in line with your App Store Optimization strategy.

It is also important to stay up to date with the latest trends and technologies, and to keep gathering and analyzing customer feedback to identify new areas for improvement. These include finding the best keywords to compete on through keyword research, optimizing keywords in the metadata of your app description, Search Engine Optimization (SEO) for your app page on the App Store or Google Play Store (for Android apps), effective social media campaigns, and overall market intelligence.

By following these strategies, it is possible to improve the overall performance of the app and outrank the competition in the search results. As a startup, you can find the best app title and the app keywords to use in the app description, optimize your app for the most organic downloads and installs, and gather crucial information on the weak points of the developers behind the top apps for a keyword, all from the get-go, before entering the competition. I am grateful to the reader for their time and attention. You can find the full code below.

Full Code:

from collections import Counter
from transformers import pipeline
from itertools import chain
import requests
import asyncio

# Collect product_ids from Apple App Store Search Scraper API
def call_app_store_api(search_term, api_key):
  print("-----")
  print(f"Gathering product ids from SerpApi's Apple App Store Scraper API for search term: {search_term}")
  # Pass the query via `params` so requests URL-encodes the search term
  params = {"api_key": api_key, "engine": "apple_app_store", "term": search_term}
  response = requests.get("https://serpapi.com/search.json", params=params)
  product_ids = [organic_result["id"] for organic_result in response.json()["organic_results"]]
  
  return product_ids

# Collect reviews of an individual app from most negative to least negative from Apple App Store Reviews Scraper API
async def call_apple_reviews_api(product_id, api_key):
  print("-----")
  print(f"Gathering Reviews from Apple App Store Reviews Scraper API for product id: {product_id}")
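  params = {"api_key": api_key, "engine": "apple_reviews", "product_id": product_id, "sort": "mostcritical"}
  # requests.get is blocking, so run it in a worker thread (Python 3.9+);
  # otherwise the calls collected by asyncio.gather would run one after another
  response = await asyncio.to_thread(requests.get, "https://serpapi.com/search.json", params=params)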
  reviews = [review["text"] for review in response.json()["reviews"]]
  
  return reviews

# Collect all reviews of apps in a search term from Apple App Store Reviews Scraper API
async def get_product_reviews(product_ids, api_key):
  reviews = await asyncio.gather(*[call_apple_reviews_api(product_id, api_key) for product_id in product_ids])
  reviews = list(chain(*reviews))
  
  return reviews

# Classify reviews as positive or negative and return absolutely negative reviews
def get_absolutely_negative_product_reviews(classifier, reviews):
  absolutely_negative_reviews = []
  candidate_labels = ['negative', 'positive']
  for review in reviews:
    print("-----")
    print(f"Classifying review: {review}")
    result = classifier(review, candidate_labels)
    index = result["scores"].index(max(result["scores"]))
    classification = result["labels"][index]
    print(f"Classification: {classification}")
    if classification == "negative":
      absolutely_negative_reviews.append(review)
  
  return absolutely_negative_reviews

# Summarize absolutely negative reviews
def get_summaries(qa_pipeline, absolutely_negative_reviews):
  summaries = []
  for context in absolutely_negative_reviews:
    print("-----")
    print(f"Summarizing review: {context}")
    result = qa_pipeline({
      'context': context,
      'question': 'What is the problem?'
    })["answer"]
    print(f"Summary: {result}")
    summaries.append(result)
  
  return summaries

# Classify summaries and return distribution of problems
def get_problems_distribution(classifier, possible_problems, summaries):
  summary_dicts = []
  for summary in summaries:
    print("-----")
    print(f"Classifying summary: {summary}")
    result = classifier(summary, possible_problems)
    index = result["scores"].index(max(result["scores"]))
    classification = result["labels"][index]
    print(f"Classification: {classification}")
    summary_dict = {"summary": summary, "classification": classification}
    summary_dicts.append(summary_dict)

  classifications = [d['classification'] for d in summary_dicts]
  counter = Counter(classifications)
  total = sum(counter.values())
  # Avoid dividing by zero when no summaries were classified
  ratios = {problem: (counter[problem]/total if total else 0.0) for problem in possible_problems}
  ratios = dict(sorted(ratios.items(), key=lambda item: item[1], reverse=True))
  
  return ratios

api_key = "<SerpApi Key>"
term = "Coffee"
possible_problems = [
  "bugs",
  "too many advertisements",
  "inefficiency",
  "dishonesty",
  "unenjoyable",
  "incompleteness",
  "expensiveness"
]
classifier = pipeline(
  "zero-shot-classification",
  model="facebook/bart-large-mnli"
)
qa_pipeline = pipeline(
  "question-answering",
  model="bert-large-uncased-whole-word-masking-finetuned-squad"
)

product_ids = call_app_store_api(term, api_key)
reviews = asyncio.run(get_product_reviews(product_ids, api_key))
absolutely_negative_reviews = get_absolutely_negative_product_reviews(classifier, reviews)
summaries = get_summaries(qa_pipeline, absolutely_negative_reviews)
ratios = get_problems_distribution(classifier, possible_problems, summaries)

# Print results
print("-----")
print(f"Summary of problems for search term: {term}")
for ratio in ratios:
  print(f"{ratio}: {ratios[ratio]*100:.2f}%")

23 Jan 2023
