Sifting through the clickstream of thousands of people in search of insight is a fun task.
However, clickstream is full of noise:
- Ad servers
- Iframes
- Analytics trackers (AdSense, comScore, etc.)
Ad servers are particularly annoying - they are shape shifters, continually adding and changing both domains and subdomains.
Fortunately, noise has patterns:
- Referrers
- Redirect Codes
- HTTP Headers
- Indiscriminate appearances across the web
These are the sort of patterns that machine learning could eat for breakfast!
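As a rough illustration of how those patterns might become inputs for a model, here is a minimal sketch that aggregates a few per-host signals: how often a host answers with a redirect code, and how many different sites it shows up under as a referred request. The record layout (`url`, `referrer`, `status`) and the names `host` and `domain_features` are assumptions made up for this example, not the format of any particular clickstream feed.

```python
from collections import defaultdict
from urllib.parse import urlparse


def host(url):
    """Return the lowercased hostname of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def domain_features(hits):
    """Aggregate per-host noise signals from an iterable of hit dicts.

    Assumed (hypothetical) hit layout:
    {"url": str, "referrer": str or None, "status": int}
    """
    total = defaultdict(int)
    redirects = defaultdict(int)
    referrer_hosts = defaultdict(set)

    for hit in hits:
        h = host(hit["url"])
        if not h:
            continue
        total[h] += 1
        # 3xx status codes are the "redirect codes" signal.
        if 300 <= hit["status"] < 400:
            redirects[h] += 1
        # Count distinct *other* hosts that referred to this one.
        ref = host(hit.get("referrer") or "")
        if ref and ref != h:
            referrer_hosts[h].add(ref)

    return {
        h: {
            "hits": n,
            "redirect_share": redirects[h] / n,
            "distinct_referrers": len(referrer_hosts[h]),
        }
        for h, n in total.items()
    }


# Tiny made-up sample: one ad host called from two unrelated pages.
hits = [
    {"url": "http://news.example/story", "referrer": None, "status": 200},
    {"url": "http://ads.example/pixel", "referrer": "http://news.example/story", "status": 302},
    {"url": "http://ads.example/pixel", "referrer": "http://blog.example/post", "status": 302},
    {"url": "http://blog.example/post", "referrer": "http://news.example/story", "status": 200},
]
features = domain_features(hits)
```

A host with a high `redirect_share` and many `distinct_referrers` looks more like an ad server or tracker than a page someone chose to visit. Rows like these could then be fed to whatever classifier you prefer, ideally alongside HTTP header features, which this sketch leaves out.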