Friday, January 4, 2013

Part 4 of Exploring Google Analytics data using Clojure, Incanter and MongoDB

Making data-driven decision on media selection helps increasing Return on Investment. Weighted sort feature of Google Analytics enables you to sort traffic sources while eliminating long-tail problem. In this post we will use an updated weighted sort function that produces the same sort order as Google Analytics' built-in feature.

Weighted Sort table


There is a new wavg (weighted average) function and updated etv-ws and weighted-sort-row functions.
(defn wavg
  "Calculate weighted average of a row"
  [weights row]
  (cond (= (count row) 0) 0
        (= (count row) 1) (first row)
        :else  (/ (reduce +' (map * row weights))
                  (reduce +' weights))))

(defn etv-ws
  "Estimated weighted sort value for y based on x where mx maximum of Xs, ay weighted average of Ys."
  [x y mx ay]
  (let [etv ($= (ay + (x / mx * (y - ay))))

(defn weighted-sort-row
  "Calculate and store weighted sort value in coll using wrow as weight.                                                                                                         
   Store new weigted values in rowW name as new column"
  [coll wrow row]

  (let [oids (fdr coll :_id)
        xdata (fdr coll wrow)
        ydata (fdr coll row)
        ay (wavg xdata ydata)
        mx (reduce max xdata)
    (pmap (fn [y]
            (let [oid y
                  rec (fetch-one coll :where {:_id oid})
                  x (wrow rec)
                  y (row rec)
                  w (etv-ws x y mx ay)
                  log (update! coll rec (merge rec
                                               {(keyword (str (name row) "W")) w}))


Let's use weighted sort on BounceRate using Visits as weight. weighted-sort-row functions stores calculated weights in BounceRateW.
;; Calc weighted sort for BounceRate                                                                                                                                             
(weighted-sort-row :m2 :Visits :BounceRate)
;; Check collection                                                                                                                                                              
(fdb :m2 :limit 1)
;; Result                                                                                                                                                                        
[:BounceRateW :BounceRate :NewVisits :Visits :SourceMedium :AvgVisitDuration :PagesVisitC :PagesVisit :_id :AvgVisitDurationC :VisitsC :NewVisitsC :BounceRateC]
[0.39453393212385945 0.6667000000000001 0.3333 3 47 6 0 1.33 # 0 1 1 0]
;; new :BounceRateW appeared in collection

;; You can compare results with Google Analytics' weighted sort                                                                                                                  
(view ($order :BounceRateW :asc (fdb :m2 :only [:BounceRateW :SourceMedium :Visits])))
;; Name of each SourceMedium                                                                                                                                                     
(def SourceMedium (fdd :m1 :SourceMedium))
;; SourceMedium indexes sort by weighted BounceRate in ascending order                                                                                                           
(def SourceMediumBounceRateW ($ :SourceMedium ($order :BounceRateW :asc (fdb :m2 :only [:BounceRateW :SourceMedium ]))))
;; View sorted traffic sources in human readable form                                                                                                                            
(view (map (fn [y] (nth SourceMedium y)) SourceMediumBounceRateW))

You may use weighted sort for other metrics too. If you reallocate the daily budget of the worst performing traffic sources to the above average performing ones, you have a simple adaptive model for increasing the Return of Investment of marketing budget.

1 comment:

  1. Exploring Google Analytics Data With Clojure, ... Arnold Matyasi posted 4 articles (with Clojure code, charts, and explanations) on how to analyze Google Analytics data locally with Clojure, Incanter, and MongoDB: ... of tech startup entrepreneurship through the eyes of Alex Popescu. this blog post