Vertica as an analytic platform
Vertica 5.0 is coming out today, delivering a down payment on Vertica’s analytic platform strategy. In Vertica lingo, there’s now a Vertica SDK (Software Development Kit), featuring Vertica UDT(F)s* (User-Defined Transform Functions). The basics of Vertica UDTFs start:
- In this release, Vertica UDTFs can only be written in C++. Other UDTF languages are promised.
- Otherwise, Vertica UDTFs sound pretty flexible; in particular:
  - They can ingest and emit any number of rows.
  - Their assumed schemas can be defined programmatically (both input and output).
- Vertica UDTFs go in the SELECT clause, not the FROM clause. I must confess to not grasping Vertica’s argument as to why this provides great and important flexibility.
- UDTF syntax mirrors SQL 99 Analytics pretty closely.
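To make the "any number of rows in, any number of rows out" idea concrete, here is a minimal sketch in plain Python. It is not the Vertica SDK (which in this release is C++ only) and it borrows nothing from Vertica's actual API; the names `tokenize` and `output_schema` are hypothetical. It only illustrates the general shape of a transform function: one piece of code states the output schema, and another consumes a set of rows and emits a different number of rows.

```python
from typing import Iterable, Iterator, List, Tuple

Column = Tuple[str, str]   # (column name, type name), purely illustrative
Row = Tuple[str]           # one text column per input row


def output_schema(input_schema: List[Column]) -> List[Column]:
    """Hypothetical: compute the output columns from the input columns in code."""
    name, _type = input_schema[0]
    # A single text column in becomes a (token, position) pair out.
    return [(name + "_token", "varchar"), ("position", "int")]


def tokenize(rows: Iterable[Row]) -> Iterator[Tuple[str, int]]:
    """Consume any number of rows; emit zero or more rows per input row."""
    for (text,) in rows:
        for position, token in enumerate(text.split()):
            yield (token, position)


if __name__ == "__main__":
    partition = [("fast columnar store",), ("",), ("graph analysis",)]
    print(output_schema([("doc", "varchar")]))
    for row in tokenize(partition):
        print(row)
```

Three input rows go in and five come out here, because the empty string yields no tokens at all. That mismatch between input and output row counts is what separates a transform function from an ordinary scalar function.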
*It looks like the “F” is in the official name, but will often be dropped colloquially.
Other Vertica analytic platform highlights include:
- Proper integrated UDT workload management is promised, and there’s a little bit of UDT workload management already.*
- Vertica is delivering some prebuilt functions for aggregation, statistics, etc.
- Vertica has cool temporal and time series features.
- Vertica’s geospatial support seems pretty basic (circles and rectangles).
- Vertica’s NDA plans moving forward are pretty much as one would hope.
*Vertica’s UDT workload management is RAM-only, and “honor system” — i.e., it assumes that the UDTFs declare their resource usage correctly, which Vertica says is the right way to handle in-process C++ routines.
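What "honor system" means in practice can be sketched in a few lines. This is a toy admission controller in Python, not Vertica's resource manager; the class `MemoryPool` and its methods are hypothetical. The point it illustrates is that the pool only ever sees the number a function declares and never measures what the function actually uses.

```python
class MemoryPool:
    """Hypothetical admission control that trusts each function's declared RAM."""

    def __init__(self, capacity_mb: int) -> None:
        self.capacity_mb = capacity_mb
        self.reserved_mb = 0

    def admit(self, declared_mb: int) -> bool:
        """Reserve the declared amount if it fits; actual usage is never checked."""
        if self.reserved_mb + declared_mb > self.capacity_mb:
            return False
        self.reserved_mb += declared_mb
        return True

    def release(self, declared_mb: int) -> None:
        """Give back a reservation when the function finishes."""
        self.reserved_mb = max(0, self.reserved_mb - declared_mb)


if __name__ == "__main__":
    pool = MemoryPool(capacity_mb=1000)
    print(pool.admit(600))   # fits within the pool
    print(pool.admit(500))   # would exceed capacity, so it is refused
    pool.release(600)
    print(pool.admit(500))   # fits once the first reservation is released
```

A function that declares 100 MB and then allocates 2 GB would sail through a scheme like this, which is the risk that comes with running user code in-process.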
Vertica also argues that fast-performing SQL in and of itself can amount to analytic functionality. For example, Vertica has tried to ensure that it offers great performance in the kinds of self-joins that are used in graph analysis. Since Vertica has plenty of customers among the kinds of Web and telco companies that use graph analysis today, I’m inclined to grant some benefit of the doubt here. That said, Vertica thinks 3 hops is plenty for most kinds of graph analysis people want to do, and I can think of applications (e.g. anti-terrorism) where that’s surely not the case.
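For readers who haven't seen graph analysis done relationally, here is a small sketch of the hop-by-self-join idea, written in Python rather than SQL so it can be run anywhere. It is not Vertica code, and the function names are my own. In SQL terms, each call to `self_join` corresponds to joining the edge table to itself one more time, so a 3-hop query needs two self-joins.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]   # (source, destination)


def self_join(paths: Set[Edge], edges: Set[Edge]) -> Set[Edge]:
    """Join paths(src, mid) to edges(mid, dst) on mid, keeping (src, dst)."""
    by_source: Dict[str, Set[str]] = {}
    for src, dst in edges:
        by_source.setdefault(src, set()).add(dst)
    return {(src, dst) for src, mid in paths for dst in by_source.get(mid, ())}


def reachable_within(edges: Set[Edge], max_hops: int) -> Set[Edge]:
    """All (src, dst) pairs connected by a path of 1 to max_hops edges."""
    frontier = set(edges)            # pairs exactly 1 hop apart
    found = set(frontier)
    for _ in range(max_hops - 1):    # each extra hop costs one more self-join
        frontier = self_join(frontier, edges)
        found |= frontier
    return found


if __name__ == "__main__":
    chain = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")}
    print(("a", "d") in reachable_within(chain, 3))   # a reaches d in 3 hops
    print(("a", "e") in reachable_within(chain, 3))   # a to e needs a 4th hop
```

The hop limit is baked into the number of joins, so anything beyond the chosen depth is simply invisible to the query. That is the limitation I have in mind when I say 3 hops won't always be enough.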
Comments
7 Responses to “Vertica as an analytic platform”
[…] Vertica has progressed down the analytic platform path, with Monday’s release of Vertica 5.0. […]
[…] series SQL extensions. Vertica explained its version of these to me a few days ago. I imagine Sybase IQ and other serious […]
It is interesting that Vertica seems to be absent from TPC-H rankings. Not that it matters, but it would be interesting to see the database loading times and query execution times in comparison to other products.
If you like, you can set up a cluster, download Vertica, and run a TPC-H yourself.
[…] an essential feature of an analytic platform,” said Curt Monash of Monash Research. He said this feature is valuable because typically, in order to perform complex analysis, the data must be moved out of the database […]
[…] I have some tidbits to add to my June, 2011 coverage of Vertica’s analytic functionality. […]
[…] Gartner correctly praises Vertica’s analytic platform capabilities, but then seems to criticize Vertica’s capabilities in user-defined functions — notwithstanding that Vertica’s analytic platform capabilities are implemented via UDFs. […]