December 5, 2013

Vertica 7

It took me a bit of time, and an extra call with Vertica’s long-time R&D chief Shilpa Lawande, but I think I have a decent handle now on Vertica 7, code-named Crane. The two aspects of Vertica 7 I find most interesting are:

Other Vertica 7 enhancements include:

Overall, two recurring themes in our discussion were:

Also, be warned that there are two entirely different key-value things going on in Vertica 7. I was pretty confused until I realized that.

Vertica Flex Zone basics include:

Basically, Flex Zone is meant to be (among other things) a big bit bucket, perhaps in some cases obviating the need for Hadoop to play the same role.

I have less detail on the new short-request query executor, but I gather that:

I assume this will eventually evolve to the point that you can join a small, broadcasted dimension table to a single node’s portion of a fact table, but Vertica hasn’t actually told me that that kind of functionality is in the works.
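To make that speculation concrete, here is a minimal sketch of what joining a broadcast dimension table to one node's slice of a fact table could look like. It is a toy model in plain Python, not Vertica code; the table contents, the modulo-based segmentation, and the function names `segment_fact_rows` and `local_join` are all assumptions made up for illustration.

```python
from collections import defaultdict


def segment_fact_rows(fact_rows, node_count):
    """Spread fact rows across nodes by a segmentation key (hypothetical layout)."""
    nodes = defaultdict(list)
    for row in fact_rows:
        # Assumption: rows are segmented on order_id with a simple modulo.
        nodes[row["order_id"] % node_count].append(row)
    return nodes


def local_join(fact_slice, dimension_by_key):
    """Join one node's fact slice to a dimension table that every node holds a copy of."""
    return [
        {**row, "region": dimension_by_key[row["store_id"]]}
        for row in fact_slice
        if row["store_id"] in dimension_by_key
    ]


# Hypothetical data: a small fact table and a tiny dimension table.
fact_rows = [
    {"order_id": 1, "store_id": 10, "amount": 5.0},
    {"order_id": 2, "store_id": 20, "amount": 3.0},
    {"order_id": 3, "store_id": 20, "amount": 7.5},
    {"order_id": 4, "store_id": 10, "amount": 1.0},
]
stores = {10: "East", 20: "West"}  # small enough to broadcast to every node

nodes = segment_fact_rows(fact_rows, node_count=2)

# A short request that only needs node 1's data never leaves that node.
result = local_join(nodes[1], stores)
for row in result:
    print(row)
```

The point of the sketch is only that the join needs no cross-node traffic once the dimension table is present everywhere, which is why it would fit a single-node short-request path.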

Finally, and as is appropriate for a whole-number release, Vertica 7 has a lot of different performance enhancements, in loads, joins, and more. In particular, workload management has been extended from covering just RAM (which is usually Vertica’s scarcest commodity anyhow) to, in a limited sense, CPU as well. Specifically, queries can be “pinned” to specific cores, which for example lets short-request workloads be isolated from their longer-running brethren.
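The core-pinning idea can be sketched as a toy admission model: one pool of cores is reserved for short requests, and everything else runs on the remaining cores. This is an illustration only, not Vertica's syntax or API; `ResourcePool`, `build_pools`, `admit`, the pool names, and the core counts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """A named group of queries restricted to a fixed set of CPU cores."""
    name: str
    cores: frozenset
    running: list = field(default_factory=list)


def build_pools(total_cores, short_request_cores):
    """Reserve the first few cores for short requests; the rest serve long queries."""
    all_cores = frozenset(range(total_cores))
    short = frozenset(range(short_request_cores))
    return {
        "short": ResourcePool("short", short),
        "long": ResourcePool("long", all_cores - short),
    }


def admit(pools, query_name, is_short_request):
    """Record the query in its pool and return the cores it is allowed to use."""
    pool = pools["short" if is_short_request else "long"]
    pool.running.append(query_name)
    return pool.cores


# Assumption: an 8-core node with 2 cores set aside for short requests.
pools = build_pools(total_cores=8, short_request_cores=2)
lookup_cores = admit(pools, "customer_lookup", is_short_request=True)
report_cores = admit(pools, "monthly_report", is_short_request=False)

print(sorted(lookup_cores), sorted(report_cores))
```

Because the two core sets are disjoint, a long-running report in this model cannot take CPU away from the short lookups, which is the isolation described above.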

Comments

4 Responses to “Vertica 7”

  1. Kris Peeters on December 6th, 2013 8:22 am

    Thanks for the insightful post.

    Have you heard anything about their “SQL-on-Hadoop” offering? I’m a bit skeptical about whether it’s really SQL-on-Hadoop. Do they store their data in HDFS? Do they use Hadoop nodes to do the processing? Not MapReduce, of course, but do they, like Impala or Presto, still really run on Hadoop? Or do they have a connection with Hadoop and do everything on their own nodes?

    That’s a big difference in my opinion. The first option will slow down Vertica because of the way HDFS is built. The second option is not really SQL-on-Hadoop.

  2. Curt Monash on December 6th, 2013 10:31 am

    Vertica does its query execution on its own nodes. I don’t like the label “SQL-on-Hadoop” for that. I’d rather call it SQL/Hadoop integration, of which SQL-on-Hadoop is a particular kind that Vertica doesn’t happen to offer.

    E.g. http://www.dbms2.com/2012/10/17/hadooprdbms-integration-aster-sql-h-and-hadapt/

  3. Using multiple data stores | DBMS 2 : DataBase Management System Services on June 18th, 2014 12:03 pm

    […] multiple-data-models idea has been extended into schema-on-need, which is sometimes but not always housed in […]

  4. An idealized log management and analysis system — from whom? | DBMS 2 : DataBase Management System Services on September 7th, 2014 8:39 am

    […] in innovations relevant to log analysis, including a range of time series/event series features and its own schema-on-need effort. Vertica was also founded by people who were also streaming pioneers (there were heavily […]
