Hall-of-Apps

The Top Android Apps Metadata Archive

Table of contents

  1. Publications
  2. Technologies
  3. Dataset
  4. Metrics and Statistics

The number of Android apps available for download is constantly increasing, putting continuous pressure on developers to publish outstanding apps. Google Play (GP) is the default distribution channel for Android apps, and it provides mobile app users with metrics to identify and report app quality, such as rating, number of downloads, and previous users' comments. In addition to those metrics, GP presents a set of top charts that highlight the outstanding apps in different categories. Both the metrics and the top charts help developers identify whether their development decisions are well valued by the community. Therefore, an app's presence in these top charts is valuable information for understanding the features of top apps. In this paper we present Hall-of-Apps, a dataset containing top charts’ apps metadata extracted weekly from GP, for 4 different countries, over 30 weeks. The data is presented as (i) raw HTML files, (ii) a MongoDB database with all the information contained in each app’s HTML file (e.g., app description, category, general rating, etc.), and (iii) data visualizations built with the D3.js framework.

Publications

Coming Soon!

Technologies

To generate the Hall-of-Apps, we used the process below to extract, parse, store and visualize the data:

[Figure: extraction process diagram]

Google Play Scraper

For the extraction step, we used a scraper written as two Node.js scripts. The first script extracted the list of apps in every category's top chart; the second went through that list and extracted the associated HTML file for each app in it.
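
The chart URLs that the first script visits follow a regular pattern (a country code, an optional category, and a free or paid collection), so the list can be rebuilt programmatically. The snippet below is only a minimal sketch of that idea, not the project's actual script: the function name `buildChartUrls`, the constants, and the row layout are assumptions made for illustration, and the category and country values are copied from the URL list in this README.

```javascript
// Hypothetical sketch: rebuilds the chart URL list from the pattern visible in
// the "URLs List" table. All names here are illustrative assumptions.
const BASE = "https://play.google.com/store/apps";
const COUNTRIES = ["co", "us", "br", "de"];
const CATEGORIES = [
  "ART_AND_DESIGN", "AUTO_AND_VEHICLES", "BEAUTY", "BOOKS_AND_REFERENCE",
  "BUSINESS", "COMICS", "COMMUNICATION", "DATING", "EDUCATION",
  "ENTERTAINMENT", "EVENTS", "FINANCE", "FOOD_AND_DRINK",
  "HEALTH_AND_FITNESS", "HOUSE_AND_HOME", "LIBRARIES_AND_DEMO", "LIFESTYLE",
  "MAPS_AND_NAVIGATION", "MEDICAL", "MUSIC_AND_AUDIO", "NEWS_AND_MAGAZINES",
  "PARENTING", "PERSONALIZATION", "PHOTOGRAPHY", "PRODUCTIVITY", "SHOPPING",
  "SOCIAL", "SPORTS", "TOOLS", "TRAVEL_AND_LOCAL", "VIDEO_PLAYERS", "WEATHER",
];
// Each pair maps a pathName suffix to the collection segment used in the URL.
const CHARTS = [
  ["topFree", "topselling_free"],
  ["topSelling", "topselling_paid"],
];

function buildChartUrls(countries = COUNTRIES, categories = CATEGORIES) {
  const rows = [];
  for (const country of countries) {
    const query = `?hl=en&gl=${country}`;
    // Editors' choice chart, one per country.
    rows.push({
      pathName: "editorChoice",
      path: `${BASE}/collection/promotion_3002800_editors_choice_apps${query}`,
      country,
    });
    // General (all categories) free and paid charts.
    for (const [suffix, collection] of CHARTS) {
      rows.push({
        pathName: `general_${suffix}`,
        path: `${BASE}/collection/${collection}${query}`,
        country,
      });
    }
    // Per-category free and paid charts.
    for (const category of categories) {
      for (const [suffix, collection] of CHARTS) {
        rows.push({
          pathName: `${category.toLowerCase()}_${suffix}`,
          path: `${BASE}/category/${category}/collection/${collection}${query}`,
          country,
        });
      }
    }
  }
  // Number the rows from 1, in the same order as the table.
  return rows.map((row, index) => ({ id: index + 1, ...row }));
}

const urls = buildChartUrls();
console.log(urls.length, urls[0].path);
```

With the values above, each country contributes 67 rows (one editors' choice chart, two general charts, and two charts for each of the 32 categories). Keeping the categories and countries in plain arrays means adding a country or a category only requires a one-line change.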

The first script generates the following list of possible URLs:

URLs List
id pathName path country
1 editorChoice https://play.google.com/store/apps/collection/promotion_3002800_editors_choice_apps?hl=en&gl=co co
2 general_topFree https://play.google.com/store/apps/collection/topselling_free?hl=en&gl=co co
3 general_topSelling https://play.google.com/store/apps/collection/topselling_paid?hl=en&gl=co co
4 art_and_design_topFree https://play.google.com/store/apps/category/ART_AND_DESIGN/collection/topselling_free?hl=en&gl=co co
5 art_and_design_topSelling https://play.google.com/store/apps/category/ART_AND_DESIGN/collection/topselling_paid?hl=en&gl=co co
6 auto_and_vehicles_topFree https://play.google.com/store/apps/category/AUTO_AND_VEHICLES/collection/topselling_free?hl=en&gl=co co
7 auto_and_vehicles_topSelling https://play.google.com/store/apps/category/AUTO_AND_VEHICLES/collection/topselling_paid?hl=en&gl=co co
8 beauty_topFree https://play.google.com/store/apps/category/BEAUTY/collection/topselling_free?hl=en&gl=co co
9 beauty_topSelling https://play.google.com/store/apps/category/BEAUTY/collection/topselling_paid?hl=en&gl=co co
10 books_and_reference_topFree https://play.google.com/store/apps/category/BOOKS_AND_REFERENCE/collection/topselling_free?hl=en&gl=co co
11 books_and_reference_topSelling https://play.google.com/store/apps/category/BOOKS_AND_REFERENCE/collection/topselling_paid?hl=en&gl=co co
12 business_topFree https://play.google.com/store/apps/category/BUSINESS/collection/topselling_free?hl=en&gl=co co
13 business_topSelling https://play.google.com/store/apps/category/BUSINESS/collection/topselling_paid?hl=en&gl=co co
14 comics_topFree https://play.google.com/store/apps/category/COMICS/collection/topselling_free?hl=en&gl=co co
15 comics_topSelling https://play.google.com/store/apps/category/COMICS/collection/topselling_paid?hl=en&gl=co co
16 communication_topFree https://play.google.com/store/apps/category/COMMUNICATION/collection/topselling_free?hl=en&gl=co co
17 communication_topSelling https://play.google.com/store/apps/category/COMMUNICATION/collection/topselling_paid?hl=en&gl=co co
18 dating_topFree https://play.google.com/store/apps/category/DATING/collection/topselling_free?hl=en&gl=co co
19 dating_topSelling https://play.google.com/store/apps/category/DATING/collection/topselling_paid?hl=en&gl=co co
20 education_topFree https://play.google.com/store/apps/category/EDUCATION/collection/topselling_free?hl=en&gl=co co
21 education_topSelling https://play.google.com/store/apps/category/EDUCATION/collection/topselling_paid?hl=en&gl=co co
22 entertainment_topFree https://play.google.com/store/apps/category/ENTERTAINMENT/collection/topselling_free?hl=en&gl=co co
23 entertainment_topSelling https://play.google.com/store/apps/category/ENTERTAINMENT/collection/topselling_paid?hl=en&gl=co co
24 events_topFree https://play.google.com/store/apps/category/EVENTS/collection/topselling_free?hl=en&gl=co co
25 events_topSelling https://play.google.com/store/apps/category/EVENTS/collection/topselling_paid?hl=en&gl=co co
26 finance_topFree https://play.google.com/store/apps/category/FINANCE/collection/topselling_free?hl=en&gl=co co
27 finance_topSelling https://play.google.com/store/apps/category/FINANCE/collection/topselling_paid?hl=en&gl=co co
28 food_and_drink_topFree https://play.google.com/store/apps/category/FOOD_AND_DRINK/collection/topselling_free?hl=en&gl=co co
29 food_and_drink_topSelling https://play.google.com/store/apps/category/FOOD_AND_DRINK/collection/topselling_paid?hl=en&gl=co co
30 health_and_fitness_topFree https://play.google.com/store/apps/category/HEALTH_AND_FITNESS/collection/topselling_free?hl=en&gl=co co
31 health_and_fitness_topSelling https://play.google.com/store/apps/category/HEALTH_AND_FITNESS/collection/topselling_paid?hl=en&gl=co co
32 house_and_home_topFree https://play.google.com/store/apps/category/HOUSE_AND_HOME/collection/topselling_free?hl=en&gl=co co
33 house_and_home_topSelling https://play.google.com/store/apps/category/HOUSE_AND_HOME/collection/topselling_paid?hl=en&gl=co co
34 libraries_and_demo_topFree https://play.google.com/store/apps/category/LIBRARIES_AND_DEMO/collection/topselling_free?hl=en&gl=co co
35 libraries_and_demo_topSelling https://play.google.com/store/apps/category/LIBRARIES_AND_DEMO/collection/topselling_paid?hl=en&gl=co co
36 lifestyle_topFree https://play.google.com/store/apps/category/LIFESTYLE/collection/topselling_free?hl=en&gl=co co
37 lifestyle_topSelling https://play.google.com/store/apps/category/LIFESTYLE/collection/topselling_paid?hl=en&gl=co co
38 maps_and_navigation_topFree https://play.google.com/store/apps/category/MAPS_AND_NAVIGATION/collection/topselling_free?hl=en&gl=co co
39 maps_and_navigation_topSelling https://play.google.com/store/apps/category/MAPS_AND_NAVIGATION/collection/topselling_paid?hl=en&gl=co co
40 medical_topFree https://play.google.com/store/apps/category/MEDICAL/collection/topselling_free?hl=en&gl=co co
41 medical_topSelling https://play.google.com/store/apps/category/MEDICAL/collection/topselling_paid?hl=en&gl=co co
42 music_and_audio_topFree https://play.google.com/store/apps/category/MUSIC_AND_AUDIO/collection/topselling_free?hl=en&gl=co co
43 music_and_audio_topSelling https://play.google.com/store/apps/category/MUSIC_AND_AUDIO/collection/topselling_paid?hl=en&gl=co co
44 news_and_magazines_topFree https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES/collection/topselling_free?hl=en&gl=co co
45 news_and_magazines_topSelling https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES/collection/topselling_paid?hl=en&gl=co co
46 parenting_topFree https://play.google.com/store/apps/category/PARENTING/collection/topselling_free?hl=en&gl=co co
47 parenting_topSelling https://play.google.com/store/apps/category/PARENTING/collection/topselling_paid?hl=en&gl=co co
48 personalization_topFree https://play.google.com/store/apps/category/PERSONALIZATION/collection/topselling_free?hl=en&gl=co co
49 personalization_topSelling https://play.google.com/store/apps/category/PERSONALIZATION/collection/topselling_paid?hl=en&gl=co co
50 photography_topFree https://play.google.com/store/apps/category/PHOTOGRAPHY/collection/topselling_free?hl=en&gl=co co
51 photography_topSelling https://play.google.com/store/apps/category/PHOTOGRAPHY/collection/topselling_paid?hl=en&gl=co co
52 productivity_topFree https://play.google.com/store/apps/category/PRODUCTIVITY/collection/topselling_free?hl=en&gl=co co
53 productivity_topSelling https://play.google.com/store/apps/category/PRODUCTIVITY/collection/topselling_paid?hl=en&gl=co co
54 shopping_topFree https://play.google.com/store/apps/category/SHOPPING/collection/topselling_free?hl=en&gl=co co
55 shopping_topSelling https://play.google.com/store/apps/category/SHOPPING/collection/topselling_paid?hl=en&gl=co co
56 social_topFree https://play.google.com/store/apps/category/SOCIAL/collection/topselling_free?hl=en&gl=co co
57 social_topSelling https://play.google.com/store/apps/category/SOCIAL/collection/topselling_paid?hl=en&gl=co co
58 sports_topFree https://play.google.com/store/apps/category/SPORTS/collection/topselling_free?hl=en&gl=co co
59 sports_topSelling https://play.google.com/store/apps/category/SPORTS/collection/topselling_paid?hl=en&gl=co co
60 tools_topFree https://play.google.com/store/apps/category/TOOLS/collection/topselling_free?hl=en&gl=co co
61 tools_topSelling https://play.google.com/store/apps/category/TOOLS/collection/topselling_paid?hl=en&gl=co co
62 travel_and_local_topFree https://play.google.com/store/apps/category/TRAVEL_AND_LOCAL/collection/topselling_free?hl=en&gl=co co
63 travel_and_local_topSelling https://play.google.com/store/apps/category/TRAVEL_AND_LOCAL/collection/topselling_paid?hl=en&gl=co co
64 video_players_topFree https://play.google.com/store/apps/category/VIDEO_PLAYERS/collection/topselling_free?hl=en&gl=co co
65 video_players_topSelling https://play.google.com/store/apps/category/VIDEO_PLAYERS/collection/topselling_paid?hl=en&gl=co co
66 weather_topFree https://play.google.com/store/apps/category/WEATHER/collection/topselling_free?hl=en&gl=co co
67 weather_topSelling https://play.google.com/store/apps/category/WEATHER/collection/topselling_paid?hl=en&gl=co co
68 editorChoice https://play.google.com/store/apps/collection/promotion_3002800_editors_choice_apps?hl=en&gl=us us
69 general_topFree https://play.google.com/store/apps/collection/topselling_free?hl=en&gl=us us
70 general_topSelling https://play.google.com/store/apps/collection/topselling_paid?hl=en&gl=us us
71 art_and_design_topFree https://play.google.com/store/apps/category/ART_AND_DESIGN/collection/topselling_free?hl=en&gl=us us
72 art_and_design_topSelling https://play.google.com/store/apps/category/ART_AND_DESIGN/collection/topselling_paid?hl=en&gl=us us
73 auto_and_vehicles_topFree https://play.google.com/store/apps/category/AUTO_AND_VEHICLES/collection/topselling_free?hl=en&gl=us us
74 auto_and_vehicles_topSelling https://play.google.com/store/apps/category/AUTO_AND_VEHICLES/collection/topselling_paid?hl=en&gl=us us
75 beauty_topFree https://play.google.com/store/apps/category/BEAUTY/collection/topselling_free?hl=en&gl=us us
76 beauty_topSelling https://play.google.com/store/apps/category/BEAUTY/collection/topselling_paid?hl=en&gl=us us
77 books_and_reference_topFree https://play.google.com/store/apps/category/BOOKS_AND_REFERENCE/collection/topselling_free?hl=en&gl=us us
78 books_and_reference_topSelling https://play.google.com/store/apps/category/BOOKS_AND_REFERENCE/collection/topselling_paid?hl=en&gl=us us
79 business_topFree https://play.google.com/store/apps/category/BUSINESS/collection/topselling_free?hl=en&gl=us us
80 business_topSelling https://play.google.com/store/apps/category/BUSINESS/collection/topselling_paid?hl=en&gl=us us
81 comics_topFree https://play.google.com/store/apps/category/COMICS/collection/topselling_free?hl=en&gl=us us
82 comics_topSelling https://play.google.com/store/apps/category/COMICS/collection/topselling_paid?hl=en&gl=us us
83 communication_topFree https://play.google.com/store/apps/category/COMMUNICATION/collection/topselling_free?hl=en&gl=us us
84 communication_topSelling https://play.google.com/store/apps/category/COMMUNICATION/collection/topselling_paid?hl=en&gl=us us
85 dating_topFree https://play.google.com/store/apps/category/DATING/collection/topselling_free?hl=en&gl=us us
86 dating_topSelling https://play.google.com/store/apps/category/DATING/collection/topselling_paid?hl=en&gl=us us
87 education_topFree https://play.google.com/store/apps/category/EDUCATION/collection/topselling_free?hl=en&gl=us us
88 education_topSelling https://play.google.com/store/apps/category/EDUCATION/collection/topselling_paid?hl=en&gl=us us
89 entertainment_topFree https://play.google.com/store/apps/category/ENTERTAINMENT/collection/topselling_free?hl=en&gl=us us
90 entertainment_topSelling https://play.google.com/store/apps/category/ENTERTAINMENT/collection/topselling_paid?hl=en&gl=us us
91 events_topFree https://play.google.com/store/apps/category/EVENTS/collection/topselling_free?hl=en&gl=us us
92 events_topSelling https://play.google.com/store/apps/category/EVENTS/collection/topselling_paid?hl=en&gl=us us
93 finance_topFree https://play.google.com/store/apps/category/FINANCE/collection/topselling_free?hl=en&gl=us us
94 finance_topSelling https://play.google.com/store/apps/category/FINANCE/collection/topselling_paid?hl=en&gl=us us
95 food_and_drink_topFree https://play.google.com/store/apps/category/FOOD_AND_DRINK/collection/topselling_free?hl=en&gl=us us
96 food_and_drink_topSelling https://play.google.com/store/apps/category/FOOD_AND_DRINK/collection/topselling_paid?hl=en&gl=us us
97 health_and_fitness_topFree https://play.google.com/store/apps/category/HEALTH_AND_FITNESS/collection/topselling_free?hl=en&gl=us us
98 health_and_fitness_topSelling https://play.google.com/store/apps/category/HEALTH_AND_FITNESS/collection/topselling_paid?hl=en&gl=us us
99 house_and_home_topFree https://play.google.com/store/apps/category/HOUSE_AND_HOME/collection/topselling_free?hl=en&gl=us us
100 house_and_home_topSelling https://play.google.com/store/apps/category/HOUSE_AND_HOME/collection/topselling_paid?hl=en&gl=us us
101 libraries_and_demo_topFree https://play.google.com/store/apps/category/LIBRARIES_AND_DEMO/collection/topselling_free?hl=en&gl=us us
102 libraries_and_demo_topSelling https://play.google.com/store/apps/category/LIBRARIES_AND_DEMO/collection/topselling_paid?hl=en&gl=us us
103 lifestyle_topFree https://play.google.com/store/apps/category/LIFESTYLE/collection/topselling_free?hl=en&gl=us us
104 lifestyle_topSelling https://play.google.com/store/apps/category/LIFESTYLE/collection/topselling_paid?hl=en&gl=us us
105 maps_and_navigation_topFree https://play.google.com/store/apps/category/MAPS_AND_NAVIGATION/collection/topselling_free?hl=en&gl=us us
106 maps_and_navigation_topSelling https://play.google.com/store/apps/category/MAPS_AND_NAVIGATION/collection/topselling_paid?hl=en&gl=us us
107 medical_topFree https://play.google.com/store/apps/category/MEDICAL/collection/topselling_free?hl=en&gl=us us
108 medical_topSelling https://play.google.com/store/apps/category/MEDICAL/collection/topselling_paid?hl=en&gl=us us
109 music_and_audio_topFree https://play.google.com/store/apps/category/MUSIC_AND_AUDIO/collection/topselling_free?hl=en&gl=us us
110 music_and_audio_topSelling https://play.google.com/store/apps/category/MUSIC_AND_AUDIO/collection/topselling_paid?hl=en&gl=us us
111 news_and_magazines_topFree https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES/collection/topselling_free?hl=en&gl=us us
112 news_and_magazines_topSelling https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES/collection/topselling_paid?hl=en&gl=us us
113 parenting_topFree https://play.google.com/store/apps/category/PARENTING/collection/topselling_free?hl=en&gl=us us
114 parenting_topSelling https://play.google.com/store/apps/category/PARENTING/collection/topselling_paid?hl=en&gl=us us
115 personalization_topFree https://play.google.com/store/apps/category/PERSONALIZATION/collection/topselling_free?hl=en&gl=us us
116 personalization_topSelling https://play.google.com/store/apps/category/PERSONALIZATION/collection/topselling_paid?hl=en&gl=us us
117 photography_topFree https://play.google.com/store/apps/category/PHOTOGRAPHY/collection/topselling_free?hl=en&gl=us us
118 photography_topSelling https://play.google.com/store/apps/category/PHOTOGRAPHY/collection/topselling_paid?hl=en&gl=us us
119 productivity_topFree https://play.google.com/store/apps/category/PRODUCTIVITY/collection/topselling_free?hl=en&gl=us us
120 productivity_topSelling https://play.google.com/store/apps/category/PRODUCTIVITY/collection/topselling_paid?hl=en&gl=us us
121 shopping_topFree https://play.google.com/store/apps/category/SHOPPING/collection/topselling_free?hl=en&gl=us us
122 shopping_topSelling https://play.google.com/store/apps/category/SHOPPING/collection/topselling_paid?hl=en&gl=us us
123 social_topFree https://play.google.com/store/apps/category/SOCIAL/collection/topselling_free?hl=en&gl=us us
124 social_topSelling https://play.google.com/store/apps/category/SOCIAL/collection/topselling_paid?hl=en&gl=us us
125 sports_topFree https://play.google.com/store/apps/category/SPORTS/collection/topselling_free?hl=en&gl=us us
126 sports_topSelling https://play.google.com/store/apps/category/SPORTS/collection/topselling_paid?hl=en&gl=us us
127 tools_topFree https://play.google.com/store/apps/category/TOOLS/collection/topselling_free?hl=en&gl=us us
128 tools_topSelling https://play.google.com/store/apps/category/TOOLS/collection/topselling_paid?hl=en&gl=us us
129 travel_and_local_topFree https://play.google.com/store/apps/category/TRAVEL_AND_LOCAL/collection/topselling_free?hl=en&gl=us us
130 travel_and_local_topSelling https://play.google.com/store/apps/category/TRAVEL_AND_LOCAL/collection/topselling_paid?hl=en&gl=us us
131 video_players_topFree https://play.google.com/store/apps/category/VIDEO_PLAYERS/collection/topselling_free?hl=en&gl=us us
132 video_players_topSelling https://play.google.com/store/apps/category/VIDEO_PLAYERS/collection/topselling_paid?hl=en&gl=us us
133 weather_topFree https://play.google.com/store/apps/category/WEATHER/collection/topselling_free?hl=en&gl=us us
134 weather_topSelling https://play.google.com/store/apps/category/WEATHER/collection/topselling_paid?hl=en&gl=us us
135 editorChoice https://play.google.com/store/apps/collection/promotion_3002800_editors_choice_apps?hl=en&gl=br br
136 general_topFree https://play.google.com/store/apps/collection/topselling_free?hl=en&gl=br br
137 general_topSelling https://play.google.com/store/apps/collection/topselling_paid?hl=en&gl=br br
138 art_and_design_topFree https://play.google.com/store/apps/category/ART_AND_DESIGN/collection/topselling_free?hl=en&gl=br br
139 art_and_design_topSelling https://play.google.com/store/apps/category/ART_AND_DESIGN/collection/topselling_paid?hl=en&gl=br br
140 auto_and_vehicles_topFree https://play.google.com/store/apps/category/AUTO_AND_VEHICLES/collection/topselling_free?hl=en&gl=br br
141 auto_and_vehicles_topSelling https://play.google.com/store/apps/category/AUTO_AND_VEHICLES/collection/topselling_paid?hl=en&gl=br br
142 beauty_topFree https://play.google.com/store/apps/category/BEAUTY/collection/topselling_free?hl=en&gl=br br
143 beauty_topSelling https://play.google.com/store/apps/category/BEAUTY/collection/topselling_paid?hl=en&gl=br br
144 books_and_reference_topFree https://play.google.com/store/apps/category/BOOKS_AND_REFERENCE/collection/topselling_free?hl=en&gl=br br
145 books_and_reference_topSelling https://play.google.com/store/apps/category/BOOKS_AND_REFERENCE/collection/topselling_paid?hl=en&gl=br br
146 business_topFree https://play.google.com/store/apps/category/BUSINESS/collection/topselling_free?hl=en&gl=br br
147 business_topSelling https://play.google.com/store/apps/category/BUSINESS/collection/topselling_paid?hl=en&gl=br br
148 comics_topFree https://play.google.com/store/apps/category/COMICS/collection/topselling_free?hl=en&gl=br br
149 comics_topSelling https://play.google.com/store/apps/category/COMICS/collection/topselling_paid?hl=en&gl=br br
150 communication_topFree https://play.google.com/store/apps/category/COMMUNICATION/collection/topselling_free?hl=en&gl=br br
151 communication_topSelling https://play.google.com/store/apps/category/COMMUNICATION/collection/topselling_paid?hl=en&gl=br br
152 dating_topFree https://play.google.com/store/apps/category/DATING/collection/topselling_free?hl=en&gl=br br
153 dating_topSelling https://play.google.com/store/apps/category/DATING/collection/topselling_paid?hl=en&gl=br br
154 education_topFree https://play.google.com/store/apps/category/EDUCATION/collection/topselling_free?hl=en&gl=br br
155 education_topSelling https://play.google.com/store/apps/category/EDUCATION/collection/topselling_paid?hl=en&gl=br br
156 entertainment_topFree https://play.google.com/store/apps/category/ENTERTAINMENT/collection/topselling_free?hl=en&gl=br br
157 entertainment_topSelling https://play.google.com/store/apps/category/ENTERTAINMENT/collection/topselling_paid?hl=en&gl=br br
158 events_topFree https://play.google.com/store/apps/category/EVENTS/collection/topselling_free?hl=en&gl=br br
159 events_topSelling https://play.google.com/store/apps/category/EVENTS/collection/topselling_paid?hl=en&gl=br br
160 finance_topFree https://play.google.com/store/apps/category/FINANCE/collection/topselling_free?hl=en&gl=br br
161 finance_topSelling https://play.google.com/store/apps/category/FINANCE/collection/topselling_paid?hl=en&gl=br br
162 food_and_drink_topFree https://play.google.com/store/apps/category/FOOD_AND_DRINK/collection/topselling_free?hl=en&gl=br br
163 food_and_drink_topSelling https://play.google.com/store/apps/category/FOOD_AND_DRINK/collection/topselling_paid?hl=en&gl=br br
164 health_and_fitness_topFree https://play.google.com/store/apps/category/HEALTH_AND_FITNESS/collection/topselling_free?hl=en&gl=br br
165 health_and_fitness_topSelling https://play.google.com/store/apps/category/HEALTH_AND_FITNESS/collection/topselling_paid?hl=en&gl=br br
166 house_and_home_topFree https://play.google.com/store/apps/category/HOUSE_AND_HOME/collection/topselling_free?hl=en&gl=br br
167 house_and_home_topSelling https://play.google.com/store/apps/category/HOUSE_AND_HOME/collection/topselling_paid?hl=en&gl=br br
168 libraries_and_demo_topFree https://play.google.com/store/apps/category/LIBRARIES_AND_DEMO/collection/topselling_free?hl=en&gl=br br
169 libraries_and_demo_topSelling https://play.google.com/store/apps/category/LIBRARIES_AND_DEMO/collection/topselling_paid?hl=en&gl=br br
170 lifestyle_topFree https://play.google.com/store/apps/category/LIFESTYLE/collection/topselling_free?hl=en&gl=br br
171 lifestyle_topSelling https://play.google.com/store/apps/category/LIFESTYLE/collection/topselling_paid?hl=en&gl=br br
172 maps_and_navigation_topFree https://play.google.com/store/apps/category/MAPS_AND_NAVIGATION/collection/topselling_free?hl=en&gl=br br
173 maps_and_navigation_topSelling https://play.google.com/store/apps/category/MAPS_AND_NAVIGATION/collection/topselling_paid?hl=en&gl=br br
174 medical_topFree https://play.google.com/store/apps/category/MEDICAL/collection/topselling_free?hl=en&gl=br br
175 medical_topSelling https://play.google.com/store/apps/category/MEDICAL/collection/topselling_paid?hl=en&gl=br br
176 music_and_audio_topFree https://play.google.com/store/apps/category/MUSIC_AND_AUDIO/collection/topselling_free?hl=en&gl=br br
177 music_and_audio_topSelling https://play.google.com/store/apps/category/MUSIC_AND_AUDIO/collection/topselling_paid?hl=en&gl=br br
178 news_and_magazines_topFree https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES/collection/topselling_free?hl=en&gl=br br
179 news_and_magazines_topSelling https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES/collection/topselling_paid?hl=en&gl=br br
180 parenting_topFree https://play.google.com/store/apps/category/PARENTING/collection/topselling_free?hl=en&gl=br br
181 parenting_topSelling https://play.google.com/store/apps/category/PARENTING/collection/topselling_paid?hl=en&gl=br br
182 personalization_topFree https://play.google.com/store/apps/category/PERSONALIZATION/collection/topselling_free?hl=en&gl=br br
183 personalization_topSelling https://play.google.com/store/apps/category/PERSONALIZATION/collection/topselling_paid?hl=en&gl=br br
184 photography_topFree https://play.google.com/store/apps/category/PHOTOGRAPHY/collection/topselling_free?hl=en&gl=br br
185 photography_topSelling https://play.google.com/store/apps/category/PHOTOGRAPHY/collection/topselling_paid?hl=en&gl=br br
186 productivity_topFree https://play.google.com/store/apps/category/PRODUCTIVITY/collection/topselling_free?hl=en&gl=br br
187 productivity_topSelling https://play.google.com/store/apps/category/PRODUCTIVITY/collection/topselling_paid?hl=en&gl=br br
188 shopping_topFree https://play.google.com/store/apps/category/SHOPPING/collection/topselling_free?hl=en&gl=br br
189 shopping_topSelling https://play.google.com/store/apps/category/SHOPPING/collection/topselling_paid?hl=en&gl=br br
190 social_topFree https://play.google.com/store/apps/category/SOCIAL/collection/topselling_free?hl=en&gl=br br
191 social_topSelling https://play.google.com/store/apps/category/SOCIAL/collection/topselling_paid?hl=en&gl=br br
192 sports_topFree https://play.google.com/store/apps/category/SPORTS/collection/topselling_free?hl=en&gl=br br
193 sports_topSelling https://play.google.com/store/apps/category/SPORTS/collection/topselling_paid?hl=en&gl=br br
194 tools_topFree https://play.google.com/store/apps/category/TOOLS/collection/topselling_free?hl=en&gl=br br
195 tools_topSelling https://play.google.com/store/apps/category/TOOLS/collection/topselling_paid?hl=en&gl=br br
196 travel_and_local_topFree https://play.google.com/store/apps/category/TRAVEL_AND_LOCAL/collection/topselling_free?hl=en&gl=br br
197 travel_and_local_topSelling https://play.google.com/store/apps/category/TRAVEL_AND_LOCAL/collection/topselling_paid?hl=en&gl=br br
198 video_players_topFree https://play.google.com/store/apps/category/VIDEO_PLAYERS/collection/topselling_free?hl=en&gl=br br
199 video_players_topSelling https://play.google.com/store/apps/category/VIDEO_PLAYERS/collection/topselling_paid?hl=en&gl=br br
200 weather_topFree https://play.google.com/store/apps/category/WEATHER/collection/topselling_free?hl=en&gl=br br
201 weather_topSelling https://play.google.com/store/apps/category/WEATHER/collection/topselling_paid?hl=en&gl=br br
202 editorChoice https://play.google.com/store/apps/collection/promotion_3002800_editors_choice_apps?hl=en&gl=de de
203 general_topFree https://play.google.com/store/apps/collection/topselling_free?hl=en&gl=de de
204 general_topSelling https://play.google.com/store/apps/collection/topselling_paid?hl=en&gl=de de
205 art_and_design_topFree https://play.google.com/store/apps/category/ART_AND_DESIGN/collection/topselling_free?hl=en&gl=de de
206 art_and_design_topSelling https://play.google.com/store/apps/category/ART_AND_DESIGN/collection/topselling_paid?hl=en&gl=de de
207 auto_and_vehicles_topFree https://play.google.com/store/apps/category/AUTO_AND_VEHICLES/collection/topselling_free?hl=en&gl=de de
208 auto_and_vehicles_topSelling https://play.google.com/store/apps/category/AUTO_AND_VEHICLES/collection/topselling_paid?hl=en&gl=de de
209 beauty_topFree https://play.google.com/store/apps/category/BEAUTY/collection/topselling_free?hl=en&gl=de de
210 beauty_topSelling https://play.google.com/store/apps/category/BEAUTY/collection/topselling_paid?hl=en&gl=de de
211 books_and_reference_topFree https://play.google.com/store/apps/category/BOOKS_AND_REFERENCE/collection/topselling_free?hl=en&gl=de de
212 books_and_reference_topSelling https://play.google.com/store/apps/category/BOOKS_AND_REFERENCE/collection/topselling_paid?hl=en&gl=de de
213 business_topFree https://play.google.com/store/apps/category/BUSINESS/collection/topselling_free?hl=en&gl=de de
214 business_topSelling https://play.google.com/store/apps/category/BUSINESS/collection/topselling_paid?hl=en&gl=de de
215 comics_topFree https://play.google.com/store/apps/category/COMICS/collection/topselling_free?hl=en&gl=de de
216 comics_topSelling https://play.google.com/store/apps/category/COMICS/collection/topselling_paid?hl=en&gl=de de
217 communication_topFree https://play.google.com/store/apps/category/COMMUNICATION/collection/topselling_free?hl=en&gl=de de
218 communication_topSelling https://play.google.com/store/apps/category/COMMUNICATION/collection/topselling_paid?hl=en&gl=de de
219 dating_topFree https://play.google.com/store/apps/category/DATING/collection/topselling_free?hl=en&gl=de de
220 dating_topSelling https://play.google.com/store/apps/category/DATING/collection/topselling_paid?hl=en&gl=de de
221 education_topFree https://play.google.com/store/apps/category/EDUCATION/collection/topselling_free?hl=en&gl=de de
222 education_topSelling https://play.google.com/store/apps/category/EDUCATION/collection/topselling_paid?hl=en&gl=de de
223 entertainment_topFree https://play.google.com/store/apps/category/ENTERTAINMENT/collection/topselling_free?hl=en&gl=de de
224 entertainment_topSelling https://play.google.com/store/apps/category/ENTERTAINMENT/collection/topselling_paid?hl=en&gl=de de
225 events_topFree https://play.google.com/store/apps/category/EVENTS/collection/topselling_free?hl=en&gl=de de
226 events_topSelling https://play.google.com/store/apps/category/EVENTS/collection/topselling_paid?hl=en&gl=de de
227 finance_topFree https://play.google.com/store/apps/category/FINANCE/collection/topselling_free?hl=en&gl=de de
228 finance_topSelling https://play.google.com/store/apps/category/FINANCE/collection/topselling_paid?hl=en&gl=de de
229 food_and_drink_topFree https://play.google.com/store/apps/category/FOOD_AND_DRINK/collection/topselling_free?hl=en&gl=de de
230 food_and_drink_topSelling https://play.google.com/store/apps/category/FOOD_AND_DRINK/collection/topselling_paid?hl=en&gl=de de
231 health_and_fitness_topFree https://play.google.com/store/apps/category/HEALTH_AND_FITNESS/collection/topselling_free?hl=en&gl=de de
232 health_and_fitness_topSelling https://play.google.com/store/apps/category/HEALTH_AND_FITNESS/collection/topselling_paid?hl=en&gl=de de
233 house_and_home_topFree https://play.google.com/store/apps/category/HOUSE_AND_HOME/collection/topselling_free?hl=en&gl=de de
234 house_and_home_topSelling https://play.google.com/store/apps/category/HOUSE_AND_HOME/collection/topselling_paid?hl=en&gl=de de
235 libraries_and_demo_topFree https://play.google.com/store/apps/category/LIBRARIES_AND_DEMO/collection/topselling_free?hl=en&gl=de de
236 libraries_and_demo_topSelling https://play.google.com/store/apps/category/LIBRARIES_AND_DEMO/collection/topselling_paid?hl=en&gl=de de
237 lifestyle_topFree https://play.google.com/store/apps/category/LIFESTYLE/collection/topselling_free?hl=en&gl=de de
238 lifestyle_topSelling https://play.google.com/store/apps/category/LIFESTYLE/collection/topselling_paid?hl=en&gl=de de
239 maps_and_navigation_topFree https://play.google.com/store/apps/category/MAPS_AND_NAVIGATION/collection/topselling_free?hl=en&gl=de de
240 maps_and_navigation_topSelling https://play.google.com/store/apps/category/MAPS_AND_NAVIGATION/collection/topselling_paid?hl=en&gl=de de
241 medical_topFree https://play.google.com/store/apps/category/MEDICAL/collection/topselling_free?hl=en&gl=de de
242 medical_topSelling https://play.google.com/store/apps/category/MEDICAL/collection/topselling_paid?hl=en&gl=de de
243 music_and_audio_topFree https://play.google.com/store/apps/category/MUSIC_AND_AUDIO/collection/topselling_free?hl=en&gl=de de
244 music_and_audio_topSelling https://play.google.com/store/apps/category/MUSIC_AND_AUDIO/collection/topselling_paid?hl=en&gl=de de
245 news_and_magazines_topFree https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES/collection/topselling_free?hl=en&gl=de de
246 news_and_magazines_topSelling https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES/collection/topselling_paid?hl=en&gl=de de
247 parenting_topFree https://play.google.com/store/apps/category/PARENTING/collection/topselling_free?hl=en&gl=de de
248 parenting_topSelling https://play.google.com/store/apps/category/PARENTING/collection/topselling_paid?hl=en&gl=de de
249 personalization_topFree https://play.google.com/store/apps/category/PERSONALIZATION/collection/topselling_free?hl=en&gl=de de
250 personalization_topSelling https://play.google.com/store/apps/category/PERSONALIZATION/collection/topselling_paid?hl=en&gl=de de
251 photography_topFree https://play.google.com/store/apps/category/PHOTOGRAPHY/collection/topselling_free?hl=en&gl=de de
252 photography_topSelling https://play.google.com/store/apps/category/PHOTOGRAPHY/collection/topselling_paid?hl=en&gl=de de
253 productivity_topFree https://play.google.com/store/apps/category/PRODUCTIVITY/collection/topselling_free?hl=en&gl=de de
254 productivity_topSelling https://play.google.com/store/apps/category/PRODUCTIVITY/collection/topselling_paid?hl=en&gl=de de
255 shopping_topFree https://play.google.com/store/apps/category/SHOPPING/collection/topselling_free?hl=en&gl=de de
256 shopping_topSelling https://play.google.com/store/apps/category/SHOPPING/collection/topselling_paid?hl=en&gl=de de
257 social_topFree https://play.google.com/store/apps/category/SOCIAL/collection/topselling_free?hl=en&gl=de de
258 social_topSelling https://play.google.com/store/apps/category/SOCIAL/collection/topselling_paid?hl=en&gl=de de
259 sports_topFree https://play.google.com/store/apps/category/SPORTS/collection/topselling_free?hl=en&gl=de de
260 sports_topSelling https://play.google.com/store/apps/category/SPORTS/collection/topselling_paid?hl=en&gl=de de
261 tools_topFree https://play.google.com/store/apps/category/TOOLS/collection/topselling_free?hl=en&gl=de de
262 tools_topSelling https://play.google.com/store/apps/category/TOOLS/collection/topselling_paid?hl=en&gl=de de
263 travel_and_local_topFree https://play.google.com/store/apps/category/TRAVEL_AND_LOCAL/collection/topselling_free?hl=en&gl=de de
264 travel_and_local_topSelling https://play.google.com/store/apps/category/TRAVEL_AND_LOCAL/collection/topselling_paid?hl=en&gl=de de
265 video_players_topFree https://play.google.com/store/apps/category/VIDEO_PLAYERS/collection/topselling_free?hl=en&gl=de de
266 video_players_topSelling https://play.google.com/store/apps/category/VIDEO_PLAYERS/collection/topselling_paid?hl=en&gl=de de
267 weather_topFree https://play.google.com/store/apps/category/WEATHER/collection/topselling_free?hl=en&gl=de de
268 weather_topSelling https://play.google.com/store/apps/category/WEATHER/collection/topselling_paid?hl=en&gl=de de
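
The per-category rows of the list above follow a regular pattern, so they could be regenerated along the following lines. This is a minimal Python sketch, not the NodeJS script used in the project: the function name chart_urls and the constants are hypothetical, and the Editor Choice URL, which follows a different pattern, is not covered.

```python
from urllib.parse import urlencode

BASE = "https://play.google.com/store/apps"

# The two per-category chart types visible in the list:
# pathName suffix -> collection segment of the URL.
CHARTS = {"topFree": "topselling_free", "topSelling": "topselling_paid"}


def chart_urls(categories, countries, lang="en"):
    """Yield (pathName, url, country) rows for every country, category and chart."""
    for country in countries:
        query = urlencode({"hl": lang, "gl": country})
        for category in categories:
            for suffix, collection in CHARTS.items():
                path_name = f"{category.lower()}_{suffix}"
                url = f"{BASE}/category/{category}/collection/{collection}?{query}"
                yield path_name, url, country


rows = list(chart_urls(["TOOLS", "WEATHER"], ["de"]))
for row in rows:
    print(*row)
```

Keeping the chart types in a single dictionary means a new chart or country only requires a new entry rather than a new block of hand-written URLs.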

HTML Parser

In addition to the Google Play Scrapper, we developed a tool to parse the data in the collected HTML archives. This required a manual analysis of the HTML files to identify the tags, along with their class names and identifiers, that contained relevant information, as well as the corresponding data types.

The parser is written in Python and uses the Beautiful Soup library to search for tags and extract their content, which is then stored in a non-relational database.
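
As an illustration of this kind of extraction, the sketch below uses Beautiful Soup to look up tags by class and convert their content to typed fields. The markup, the class names and the helper functions text_of and parse_app are hypothetical placeholders; they are not the tags identified in the real Google Play pages, nor the project's actual parser.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: these tag and class names are placeholders,
# not the ones found in the real Google Play HTML files.
SAMPLE_HTML = """
<html><body>
  <h1 class="app-title">Example App</h1>
  <div class="app-rating">4.5</div>
  <div class="app-description">An example description.</div>
</body></html>
"""


def text_of(soup, tag, css_class):
    """Return the stripped text of the first matching tag, or None if absent."""
    node = soup.find(tag, class_=css_class)
    return node.get_text(strip=True) if node else None


def parse_app(html):
    """Extract a few fields and convert them to the expected data types."""
    soup = BeautifulSoup(html, "html.parser")
    rating = text_of(soup, "div", "app-rating")
    return {
        "name": text_of(soup, "h1", "app-title"),
        "rating": float(rating) if rating is not None else None,
        "description": text_of(soup, "div", "app-description"),
        # Not present in the sample page, so this field ends up as None.
        "dev_address": text_of(soup, "div", "app-dev-address"),
    }


app = parse_app(SAMPLE_HTML)
print(app)
```

Returning None for absent tags, instead of raising, fits a document store in which not every app carries the same information.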

Important Findings

Dataset

Hall-of-Apps contains top charts’ apps metadata and reviews, extracted weekly from Google Play during 30 weeks, from November 2017 to May 2018. Each week we extracted the information of the top 100 apps available in the top charts of 33 Google Play categories and the Editor Choice list. We could have extracted any number of apps per category, but we chose only the first 100 because those apps may share characteristics that keep them in the top positions.

In addition, in order to enhance the capabilities of our dataset, for each week we extracted the aforementioned information from four different countries with languages we could understand: Brazil, Colombia, Germany, and USA.

Future work will include extracting data for more countries with different languages to build a more complete dataset. It will also include extracting a larger number of apps per category while storing each app's position, in order to analyze whether app characteristics differ depending on the position of the app in the top charts.

The resulting dataset is composed of two storage mechanisms. The first one contains the raw HTML files collected over 30 weeks, stored by week and grouped by month. Each file has a taxonomy that identifies the week in which the app was extracted, the app id, the country, and the top chart and category it belongs to.

The second component contains the information extracted from the HTML files by the parser. That processed information was used to populate a non-relational database, in this case MongoDB. We chose a non-relational database because of the huge amount of raw files and because not all apps contain the same information.

Instructions to restore the database and to run the scraper and parser can be found here: Hall-of-apps

Database Structure

Non-relational database diagram.

As shown in the image above, the database consists of three collections. The main collection is called app because it keeps the characteristic information about each app and its retrieval dates. However, each app can also have reviews, similar apps, and more apps from the same developer. To avoid overloading the main collection and to make it easy to query information about reviews and extra apps, we created two additional collections, review and extra app.

Review has information about the user who wrote the review, the rating, the date and, if the developer wrote a response, its text and date. Extra app, on the other hand, has information about similar apps and more apps from the same developer. To distinguish the group a document belongs to, each document has a state field that indicates whether the app is a similar app or a more-from-developer app.

Additionally, since an app can appear in different countries, in multiple weeks, or even in different tops, we defined ids composed of the retrieved date start and end, the app id, the category and the country. These ids also keep the relation between app, review and extra app.
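
The way these components tie the collections together can be sketched as follows. The field names are taken from the collection tables further down this page; the sample values and the helpers app_key and review_key are hypothetical, and representing the id as a tuple is an assumption, since the exact encoding of the ids is not described here.

```python
from datetime import date


def app_key(app):
    """Composite key of a document in the app collection."""
    return (
        app["retrieved_date_start"],
        app["retrieved_date_end"],
        app["id"],
        app["category"],
        app["country"],
    )


def review_key(review):
    """The same components as stored on a review document."""
    return (
        review["app_retrieved_date_start"],
        review["app_retrieved_date_end"],
        review["app_id"],
        review["category"],
        review["country"],
    )


# Hypothetical documents for illustration only.
app = {
    "id": "com.example.app",
    "category": "TOOLS",
    "country": "de",
    "retrieved_date_start": date(2017, 11, 6),
    "retrieved_date_end": date(2017, 11, 12),
}
reviews = [
    {"app_id": "com.example.app", "category": "TOOLS", "country": "de",
     "app_retrieved_date_start": date(2017, 11, 6),
     "app_retrieved_date_end": date(2017, 11, 12), "rating": 5},
    # Same app and week, but retrieved for another country: a different key.
    {"app_id": "com.example.app", "category": "TOOLS", "country": "co",
     "app_retrieved_date_start": date(2017, 11, 6),
     "app_retrieved_date_end": date(2017, 11, 12), "rating": 3},
]

matching = [r for r in reviews if review_key(r) == app_key(app)]
print(len(matching))
```

Because the country and the retrieval week are part of the key, the same app id can appear many times without its reviews being mixed across snapshots.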

Visualization Scripts

Finally, in order to facilitate the understanding of our dataset, we generated some visualizations, as shown inside this web page, that aim to explain some of the metrics, distributions and statistics of the data in the non-relational database.

The scripts used to generate the visualizations are written in JavaScript using the library Data Driven Documents, or D3, to generate custom charts and diagrams. In detail, each of the scripts contains a function that generates a particular chart/visualization (for example, a Stacked Bar Chart) using as input CSV files that contain synthesized information of various aggregations and/or queries made in the non-relational database.
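
The CSV inputs mentioned above could be produced by an aggregation such as the one sketched here in Python. The sample documents, the column names and the grouping by month and country are assumptions chosen for illustration; the project's actual queries and CSV layouts are not shown on this page.

```python
import csv
import io
from collections import Counter
from datetime import date

# Hypothetical app documents; only the fields needed for this aggregation.
apps = [
    {"id": "com.example.a", "country": "de", "retrieved_date_start": date(2017, 11, 6)},
    {"id": "com.example.b", "country": "de", "retrieved_date_start": date(2017, 11, 13)},
    {"id": "com.example.a", "country": "co", "retrieved_date_start": date(2017, 11, 6)},
    {"id": "com.example.a", "country": "de", "retrieved_date_start": date(2017, 12, 4)},
]

# Count documents per (month, country) pair.
counts = Counter(
    (doc["retrieved_date_start"].strftime("%Y-%m"), doc["country"]) for doc in apps
)

# Write the aggregation as CSV text, one row per month and country.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["month", "country", "apps"])
for (month, country), n in sorted(counts.items()):
    writer.writerow([month, country, n])

csv_text = buffer.getvalue()
print(csv_text)
```

A long table with one row per month and country is a convenient shape for a stacked bar chart, since the chart script can group the rows by month and stack them by country.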

Metrics and Statistics

We collected our dataset from the GP website, checking the first 100 apps in the top free, top selling and editor choice charts. Each app has a summary, description, rating, amount of stars, reviews, genre, price, last update, version, required Android versions, developer, similar apps, more apps from developer, etc.

As we explained previously, our dataset is composed of two storage mechanisms: The first one contains raw HTML files stored week by week over 30 weeks. The following table shows the number of extracted HTML files by month.


Year Month # Extracted Files
2017 November 87700
2017 December 111989
2018 January 90154
2018 February 89286
2018 March 89545
2018 April 115459
2018 May 89908

On the other hand, the second one is a non-relational database with all the processed information. The following figure depicts the amount of apps metadata extracted from the Google Play Store, grouped by month and subdividing it by country:



The following chart aims to describe the total number of applications in our dataset, per month, as well as the distribution per Country in each month.


General Findings

Furthermore, it's worth noting that we extracted the best apps of the Google Play Store. Our non-relational database therefore contains the top free and top selling apps for each category, as well as the editors' choice in each country:



The following chart aims to describe the total number of applications in our dataset per month, as well as the distribution per Category in each month, filtering by tops that were described previously.

The figure above shows that our dataset contains 34 different app categories. To ease the global analysis of the apps, we added some mapped macro categories to this page. These macro categories were generated by grouping the original categories by similarity. The following table lists the macro categories, and the figure below it describes the total number of applications in our dataset per month, subdivided by macro category in each month and filtered by tops.



Mapped Category Original Categories
tools_libraries_general Tools, Libraries & Demo, General
entertainment_events_food Entertainment, Events, Food & Drink
social_dating_communication Social, Dating, Communication
health_medical_sports Health & Fitness, Medical, Sports
music_video_auto Music & Audio, Video Player, Auto & Vehicles
art_photography_personalization Art & Design, Photography, Personalization
beauty_shopping_lifestyle Beauty, Shopping, Lifestyle
books_news_comics Books & Reference, News & Magazines, Comics
business_finance_productivity Business, Finance, Productivity
house_parenting_education House & Home, Parenting, Education
maps_travel_weather Maps & Navigation, Travel & Local, Weather
EditorChoice EditorChoice
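
The grouping in the table above can be expressed as a lookup structure, sketched here in Python. The names come from the table, with spelling normalised (no spaces around underscores, capitalised category names); the dictionary and the helper macro_of are hypothetical and only illustrate how an original category could be mapped to its macro category.

```python
# Macro category -> original categories, as listed in the table.
MACRO_CATEGORIES = {
    "tools_libraries_general": ["Tools", "Libraries & Demo", "General"],
    "entertainment_events_food": ["Entertainment", "Events", "Food & Drink"],
    "social_dating_communication": ["Social", "Dating", "Communication"],
    "health_medical_sports": ["Health & Fitness", "Medical", "Sports"],
    "music_video_auto": ["Music & Audio", "Video Player", "Auto & Vehicles"],
    "art_photography_personalization": ["Art & Design", "Photography", "Personalization"],
    "beauty_shopping_lifestyle": ["Beauty", "Shopping", "Lifestyle"],
    "books_news_comics": ["Books & Reference", "News & Magazines", "Comics"],
    "business_finance_productivity": ["Business", "Finance", "Productivity"],
    "house_parenting_education": ["House & Home", "Parenting", "Education"],
    "maps_travel_weather": ["Maps & Navigation", "Travel & Local", "Weather"],
    "EditorChoice": ["EditorChoice"],
}

# Inverted index: original category -> macro category.
ORIGINAL_TO_MACRO = {
    original: macro
    for macro, originals in MACRO_CATEGORIES.items()
    for original in originals
}


def macro_of(category):
    """Return the macro category of an original category, or None if unknown."""
    return ORIGINAL_TO_MACRO.get(category)


print(len(ORIGINAL_TO_MACRO), macro_of("Weather"))
```

The inverted index holds 34 entries: the 33 grouped categories plus EditorChoice, which maps to itself.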

From the table shown above, it's worth noting that the category EditorChoice is also classified as a top chart and doesn't have any sort of mapping, so the information related to this category is not represented in the previous visualizations. Instead, a more insightful visualization for this category is depicted in the following chart, which describes the total number of applications in our dataset per month, subdivided by country in each month.


App Collection Distribution

This collection has 671,041 documents and a total of 32 fields. The following figure depicts the distribution of the fields' data types.


As the figure shows, the String data type is predominant in this collection, followed by Numeric fields. The same proportions can be observed when looking at each individual country.

In addition to the above, the table below shows the data types of each of the fields of the collection, as well as the percentage of null values.

Attribute Name Type % Null Values
_id Object 0%
amount_more_from_developer_apps Numeric 0%
amount_reviews Numeric 0%
amount_similar_apps Numeric 0%
android_version String 0%
category String 0%
content_rating String 0%
country String 0%
currency String 0%
current_version String 0%
description String 0%
dev_address String 54%
dev_mail String 0.00045%
dev_name String 0%
genre Array 0.005%
id String 0%
last_update Date 0%
name String 0%
num_installs String 0%
price Numeric 0.02%
rating Numeric 0%
rating_1 Numeric 0%
rating_2 Numeric 0%
rating_3 Numeric 0%
rating_4 Numeric 0%
rating_5 Numeric 0%
retrieved_date_end Date 0%
retrieved_date_start Date 0%
summary String 0%
top String 0%
url String 0%
whats_new Array 0%
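
A percentage like the one in the "% Null Values" column above could be computed along these lines. This is a minimal Python sketch over hypothetical documents; the helper null_percentages is not part of the project, and treating a missing field the same as an explicit null is an assumption.

```python
def null_percentages(documents, fields):
    """Percentage of documents in which each field is missing or None."""
    total = len(documents)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: 100.0 * sum(1 for doc in documents if doc.get(field) is None) / total
        for field in fields
    }


# Hypothetical documents for illustration only.
docs = [
    {"name": "A", "dev_address": None, "price": 0.0},
    {"name": "B", "dev_address": "Example Street 1", "price": 1.99},
    {"name": "C", "price": None},
    {"name": "D", "dev_address": "Example Street 2", "price": 0.0},
]

result = null_percentages(docs, ["name", "dev_address", "price"])
print(result)
```

On a full collection the same counts would more likely be obtained with a database query per field, but the ratio being reported is the same.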

Review Collection Distribution

This collection has 19,095,412 documents and a total of 16 fields. The following figure depicts the distribution of the fields' data types.


As the figure shows, the String data type is predominant in this collection, followed by Date fields. The same proportions can be observed when looking at each individual country.

In addition to the above, the table below shows the data types of each of the fields of the collection, as well as the percentage of null values.

Attribute Name Type % Null Values
_id Object 0%
app_id String 0%
app_name String 0%
app_retrieved_date_end Date 0%
app_retrieved_date_start Date 0%
author String 0%
category String 0%
country String 0%
date Date 35.3%
dev_name String 84.9%
dev_reply String 84.9%
dev_reply_date Date 84.9%
new_date Numeric 64.7%
rating Numeric 0%
text String 0%
title String 90.9%

Extra App Collection Distribution

This collection has 11,415,027 documents and a total of 15 fields. The following figure depicts the distribution of the fields' data types.


As the figure shows, the String data type is predominant in this collection, followed by Date fields. The same proportions can be observed when looking at each individual country, with the exception of Germany (de), which has a higher amount of null values.

In addition to the above, the table below shows the data types of each of the fields of the collection, as well as the percentage of null values.

Attribute Name Type % Null Values
_id Object 0%
app_id String 0%
app_name String 0%
app_retrieved_date_end Date 0%
app_retrieved_date_start Date 0%
category String 0%
country String 0%
currency String 0.2%
dev_name String 0%
id String 0%
name String 0%
price Numeric 0%
rating Numeric 0%
state String 0%
summary String 0%

From the above, it's worth noting that the majority of the collection fields are unique and, therefore, it wasn't possible to depict predominant values for those fields.