June 22, 2011
Citrusleaf RTA
Citrusleaf has released an add-on product called Citrusleaf RTA (Real-Time Attribution). It’s to be used when:
- You want to update dashboards within a minute.
- You want to update predictive models fairly quickly (within the hour?), although it’s not clear to me how much the models are being updated or changed with that latency.
The metrics envisioned are:
- 100 or so ad impressions per person …
- … for 1 billion or so people …
- … stored for 30-90 days …
- … where each ad impression is a fairly short record …
- … stored on disk …
- … but indexed in a way so that the index can fit into RAM.
- 50,000-100,000 writes per second. (I didn’t ask on what amount of hardware.)
- Several hundred reads per second.
A consistent relational schema is NOT assumed.
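To get a feel for those figures, here is a rough back-of-the-envelope sketch. The people and impression counts come from the list above; the record size and the index-entry size are purely my assumptions for illustration, since no byte counts were given.

```python
# Back-of-the-envelope sizing for the workload described above.
PEOPLE = 1_000_000_000        # from the post: 1 billion or so people
IMPRESSIONS_PER_PERSON = 100  # from the post: 100 or so impressions each
RECORD_BYTES = 100            # ASSUMPTION: "fairly short record", size not stated
INDEX_ENTRY_BYTES = 64        # ASSUMPTION: illustrative index entry size


def sizing(people=PEOPLE,
           per_person=IMPRESSIONS_PER_PERSON,
           record_bytes=RECORD_BYTES,
           index_entry_bytes=INDEX_ENTRY_BYTES):
    """Return rough record counts and byte totals for the given assumptions."""
    total_records = people * per_person
    return {
        "total_records": total_records,
        "disk_bytes": total_records * record_bytes,
        # An index with one entry per person ...
        "index_bytes_per_person": people * index_entry_bytes,
        # ... versus a hypothetical index with one entry per record.
        "index_bytes_per_record": total_records * index_entry_bytes,
    }


if __name__ == "__main__":
    for name, value in sizing().items():
        print(f"{name}: {value:,}")
```

Under these assumed sizes, an index with one entry per person is 100 times smaller than an index with one entry per record, which is the difference between plausibly fitting in RAM across a cluster and not.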
Citrusleaf’s solution is:
- Have one index entry for each of the 1 billion people.
- Bang each new object/record to disk. Include in it a pointer to the previous object/record for the same person.
- Each time a new object/record is added, update the index in place so that it now points to the new one. Hence, the index is sized according to the number of people, not according to the total number of objects/records.
- Eventually let objects/records age off in the obvious way.
The downside is that when you do read 100 objects/records per person, you might need to do 100 seeks, because the records are linked by pointers rather than stored next to each other.
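A minimal sketch of that scheme, as I understand it, might look like the following. This is not Citrusleaf's code: the class name `ChainedLog`, the record layout, and the use of an in-memory buffer in place of disk are all hypothetical choices of mine, and age-off is omitted.

```python
import io
import struct

# HYPOTHETICAL record layout: [prev_offset: int64][payload_len: uint32][payload]
HEADER = struct.Struct("<qI")
NO_PREV = -1  # marks the oldest record in a person's chain


class ChainedLog:
    """Append-only record log with one index entry per person."""

    def __init__(self, storage=None):
        # BytesIO stands in for the disk; any seekable binary file would do.
        self.storage = storage if storage is not None else io.BytesIO()
        self.index = {}  # person_id -> offset of that person's newest record
        self.seeks = 0   # counts random reads, to illustrate the downside

    def append(self, person_id, payload):
        """Write a record at the end and repoint the person's index entry."""
        prev = self.index.get(person_id, NO_PREV)
        self.storage.seek(0, io.SEEK_END)
        offset = self.storage.tell()
        self.storage.write(HEADER.pack(prev, len(payload)) + payload)
        self.index[person_id] = offset  # updated in place, so the index never grows per record
        return offset

    def history(self, person_id):
        """Follow back-pointers and return payloads, newest first."""
        payloads = []
        offset = self.index.get(person_id, NO_PREV)
        while offset != NO_PREV:
            self.storage.seek(offset)  # one seek per record in the chain
            self.seeks += 1
            prev, length = HEADER.unpack(self.storage.read(HEADER.size))
            payloads.append(self.storage.read(length))
            offset = prev
        return payloads


if __name__ == "__main__":
    log = ChainedLog()
    log.append("alice", b"ad1")
    log.append("bob", b"ad2")
    log.append("alice", b"ad3")
    print(log.history("alice"), "seeks:", log.seeks)
```

Note that the index holds one entry per person however many records are appended, while reading a person's history costs one seek per record in the chain.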
Categories: Aerospike, Analytic technologies, Business intelligence, Data models and architecture, Data warehousing, Log analysis, Predictive modeling and advanced analytics, Theory and architecture, Web analytics
Comments
3 Responses to “Citrusleaf RTA”
Today, the advertising industry counts the last impression to determine who should get paid when conversions or other triggering events happen. We have identified a market need for retrieving time-ordered lists of user behavior.
The ability to extract this data soon after the user has taken action, and to then be able to compare it with the user’s past behavior, helps our customers gain a better understanding of mobile and online purchasing behavior. New models are being created to predict and optimize revenue using specific user behavior.
It is pretty exciting to be able to select the entire behavior chain at this scale.
[…] should perhaps buy it as well. Generally, the Aerospike product story is as I described in two posts last year. At the highest […]