Sunday, December 23, 2012

Part 3 of Exploring Google Analytics data using Clojure, Incanter and MongoDB

Classifying Google Analytics All Traffic Sources helps us to make a better decision on media selection. You may not spend money on a medium that sends above average bouncing visitors. As described in my previous post we use a simple classification method: below average:0, average:1, above average:2.

 ; Let's check column names in :m2                                                                                                                                              
  (fdb :m2 :limit 1)
  [:_id :SourceMedium :Visits :PagesVisit :AvgVisitDuration :NewVisits :BounceRate]
  [# 1 5336 3.75 192 0.41 0.44659999999999994]
  ; Let's classify each metric                                                                                                                                                   
  (classify-row :m2 :PagesVisit)
  (classify-row :m2 :BounceRate)
  (classify-row :m2 :NewVisits)
  (classify-row :m2 :AvgVisitDuration)
  (classify-row :m2 :Visits)
  ; Let's check column names in :m2 again                                                                                                                                        
  (fdb :m2 :limit 1)
  ; We can see that each classified metric has a new column now                                                                                                                  
  [:BounceRate :NewVisits :Visits :SourceMedium :AvgVisitDuration :PagesVisitC :PagesVisit :_id :AvgVisitDurationC :VisitsC :NewVisitsC :BounceRateC]
  [0.6214 0.655 832 2 99 1 2.02 # 1 1 1 0]
  ; Let's use Incanter for charting 3 metrics
  (lm-gchart (fdr :m2 :PagesVisit) (fdr :m2 :AvgVisitDuration) "Pages/Visit" "AvgVisitDuration by BounceRateC"  (fdr :m2 :BounceRateC))

You may try out lm-gchart on your own.

Pages/Visit and AvgVisitDuration scatter-plot groupped by BounceRateC

Pages/Visit and AvgVisitDuration scatter-plot grouped by BounceRateC. Red dot means high-bouncing traffic sources, blue is average and green means below average bouncing traffic sources.

Functions of clojure.math may be your friend while exploring GA data.

;; combinations of two metrics
(clojure.math.combinatorics/combinations [:PagesVisit :BounceRate :Visits :AvgVisitDuration :NewVisits] 2)
;; Result:
((:PagesVisit :BounceRate) (:PagesVisit :Visits) (:PagesVisit :AvgVisitDuration) (:PagesVisit :NewVisits) (:BounceRate :Visits) (:BounceRate :AvgVisitDuration) (:BounceRate :NewVisits) (:Visits :AvgVisitDuration) (:Visits :NewVisits) (:AvgVisitDuration :NewVisits))
;; combinations of three metrics
(clojure.math.combinatorics/combinations [:PagesVisit :BounceRate :Visits :AvgVisitDuration :NewVisits] 3)
;; Result:
((:PagesVisit :BounceRate :Visits) (:PagesVisit :BounceRate :AvgVisitDuration) (:PagesVisit :BounceRate :NewVisits) (:PagesVisit :Visits :AvgVisitDuration) (:PagesVisit :Visits :NewVisits) (:PagesVisit :AvgVisitDuration :NewVisits) (:BounceRate :Visits :AvgVisitDuration) (:BounceRate :Visits :NewVisits) (:BounceRate :AvgVisitDuration :NewVisits) (:Visits :AvgVisitDuration :NewVisits))

No comments:

Post a Comment