October 18, 2011
Vertica Community Edition
The press release announcing Vertica’s Community Edition is a bit vague. And indeed, much of what I know about Vertica Community Edition is along the lines of “This is what I think will happen, but of course it could still change.” That said, I believe:
- Vertica Community Edition has all of regular Vertica’s features. However …
- … HP Vertica reserves the right to open a feature gap in future releases.
- The license restriction on Vertica Community Edition is that you’re limited to 1 terabyte of data, and 3 nodes. I imagine that’s for one production copy, and you’re perfectly free to also set up mirrors for test, development, disaster recovery, and so on. However …
- … HP Vertica would be annoyed if you stuck a free copy of Vertica on each of 50 nodes and managed the whole thing via, say, Hadapt.
- HP Vertica plans to be very generous with true academic researchers, suspending or waiving limits on database size and node count. Not coincidentally, Vertica Community Edition is being announced at XLDB, where Vertica is also a top-level sponsor. (I introduced Vertica and XLDB’s Jacek Becla to each other as soon as I heard about Vertica’s Community Edition plans.)
- The only support available for Vertica Community Edition is through forums. This could change.
I’m a big supporter of the Vertica Community Edition idea, for four reasons:
- It should now be easier to download and evaluate Vertica.
- Vertica Community Edition could be a big help to academic researchers.
- Vertica could now be more appealing to some of the “Omigod, we’re outgrowing Oracle Standard Edition and we don’t want to pay up for Oracle Enterprise Edition/Exadata” crowd.
- People are under the impression that what Vertica actually charges today resembles its long-ago list prices. This announcement may help puncture Vertica’s outdated pricing image.
Comments
7 Responses to “Vertica Community Edition”
[…] You can put >1 petabyte into [name redacted],* among others; [name redacted]* should be out soon with a generously free offering for academic users. Edit: That would be Vertica. […]
Regarding your bullet:
This would violate the CE license agreement, as we do not allow this type of sharding of multiple CE databases together through any app tier…
Similar to the point above, connecting multiple CE databases for DR is not permitted under the CE license; customers would need the Enterprise Edition for this. A separate copy for development is permitted.
I am a tad confused. Within the limits of < 1 terabyte and <= 3 nodes, can the CE edition be used in a production/commercial context?
Yes, if you’re willing to live with what the 3 node limitation implies for disaster recovery and so on.
So where is it? It’s been almost 3 months since it was announced.
And what counts as data size (the thing that’s limited to 1 TB):
– uncompressed size of all projections
– compressed size of all projections
– uncompressed size of all super projections
– total disk usage by Vertica
– something else?
vsql is a character-based, interactive, front-end utility that lets you type SQL statements and see the results. It also provides a number of meta-commands and various shell-like features that facilitate writing scripts and automating a variety of tasks.
The data sampled for the estimate is treated as if it had been exported from the database in text format (such as printed from vsql). This means that Vertica evaluates the data type footprint sizes as follows:
• Strings and binary types (CHAR, VARCHAR, BINARY, VARBINARY) are counted as their actual size in bytes using UTF-8 encoding.
• Numeric data types are counted as if they had been printed. Each digit counts as a byte, as does any decimal point, sign, or scientific notation. For example, -123.456 counts as eight bytes (six digits plus the decimal point and minus sign).
• Date/time data types are counted as if they had been converted to text, including any hyphens or other separators. For example, a timestamp column containing the value for noon on July 4th, 2011 would be 19 bytes. As text, vsql would print the value as 2011-07-04 12:00:00, which is 19 characters, including the space between the date and the time.
NOTE: Each column has an additional byte for the column delimiter.
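To make the counting rules above concrete, here is a rough Python sketch of how one might estimate the footprint of a single row. It is only an illustration of the quoted rules, not Vertica's actual audit code: the function names `value_footprint` and `row_footprint` are made up for this example, and it assumes Python's own text formatting of numbers matches what vsql would print, which may not hold for every value.

```python
from datetime import datetime
from decimal import Decimal


def value_footprint(value):
    """Estimate the bytes one value contributes under the quoted rules.

    Hypothetical helper for illustration only; not part of any Vertica API.
    """
    if isinstance(value, bytes):
        # Binary types: actual size in bytes.
        return len(value)
    if isinstance(value, str):
        # String types: size in bytes when encoded as UTF-8.
        return len(value.encode("utf-8"))
    if isinstance(value, datetime):
        # Date/time types: length of the printed text, separators included.
        return len(value.isoformat(sep=" ", timespec="seconds"))
    if isinstance(value, (int, float, Decimal)):
        # Numeric types: one byte per printed character (digits, sign,
        # decimal point, exponent). Assumes str() resembles vsql's output.
        return len(str(value))
    raise TypeError(f"unsupported type for this sketch: {type(value).__name__}")


def row_footprint(row):
    """Sum the value footprints, plus one delimiter byte per column."""
    return sum(value_footprint(v) + 1 for v in row)


if __name__ == "__main__":
    row = ["abc", -123.456, datetime(2011, 7, 4, 12, 0, 0)]
    print(row_footprint(row))
```

For the sample row, the string contributes 3 bytes, the number 8, and the timestamp 19, plus one delimiter byte for each of the three columns.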