Infobright 4.0
Infobright is announcing its 4.0 release, with imminent availability. In marketing and product alike, Infobright is betting the farm on machine-generated data. This hasn't been Infobright's strategy from the get-go, but it is these days, pursued with pretty good focus and commitment. While some fraction of Infobright's customer base is in the Sybase-IQ-like data mart market — and indeed Infobright put out a customer-win press release in that market a few days ago — Infobright's current customer targets seem to be mainly:
- Web companies, many of which are already MySQL users.
- Telecommunication and similar log data, especially in OEM relationships.
- Trading/financial services, especially at mid-tier companies.
Key aspects of Infobright 4.0 include:
- “Rough Query,” which lets you get approximate query results >10X faster than you could get precise ones, which is a good thing for iterative investigative analytics.
- The start of a plan — “DomainExpert” — to compress and otherwise optimize data in specific, commonly machine-generated patterns, such as URLs or CDRs (call detail records).
- “Distributed Load Manager” — i.e., load nodes that are separate from (and more parallelized than) query nodes.
- A Hadoop connector.
- Lots of cleanup and Bottleneck Whack-A-Mole, although I haven’t paid close attention as to which parts of that are truly new, and which were already handled in recent Infobright point releases.
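To make the DomainExpert idea concrete, here is a minimal sketch of the kind of pattern-aware encoding it aims at for machine-generated strings such as URLs. This is purely my illustration under assumed mechanics — dictionary-encoding repeated path segments — and not Infobright's actual algorithm:

```python
# Hypothetical sketch of pattern-aware compression for machine-generated
# URLs: split on "/" and dictionary-encode each segment, since such data
# repeats a small vocabulary of segments. NOT Infobright's actual code.

def compress_urls(urls):
    """Return (segment dictionary, per-URL lists of integer codes)."""
    dictionary = {}          # segment string -> small integer code
    encoded = []
    for url in urls:
        codes = []
        for segment in url.split("/"):
            if segment not in dictionary:
                dictionary[segment] = len(dictionary)
            codes.append(dictionary[segment])
        encoded.append(codes)
    return dictionary, encoded

def decompress_urls(dictionary, encoded):
    """Invert the dictionary and rebuild the original URL strings."""
    reverse = {code: seg for seg, code in dictionary.items()}
    return ["/".join(reverse[c] for c in codes) for codes in encoded]
```

The point of the sketch: generic byte-level compression cannot exploit the fact that `/api/v1/users` and `/api/v1/orders` share most of their structure, whereas a format-aware encoder can.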
Items on that list focused on the machine-generated data market include:
- DomainExpert — obviously.
- The Hadoop connector — also obviously.
- The Distributed Load Manager — why would you need such load speeds unless the data is flowing in from machines?
To understand Infobright Rough Query, recall the essence of Infobright’s architecture:
Infobright’s core technical idea is to chop columns of data into 64K chunks, called data packs, and then store concise information about what’s in the packs. The more basic information is stored in data pack nodes, one per data pack. If you’re familiar with Netezza zone maps, data pack nodes sound like zone maps on steroids. They store maximum values, minimum values, and (where meaningful) aggregates, and also encode information as to which intervals between the min and max values do or don’t contain actual data values.
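The mechanics of per-pack metadata can be sketched as follows. The field names and the pruning logic here are my assumptions for illustration, not Infobright's internals; the 64K pack size comes from the description above:

```python
# Sketch of per-chunk metadata in the spirit of data pack nodes:
# min, max, sum, and non-null row count per 64K-value chunk of a column.
# Field names and details are assumptions, not Infobright internals.

PACK_SIZE = 65536  # 64K values per data pack

def build_data_pack_nodes(column):
    """Chop a column into packs and record summary stats for each."""
    nodes = []
    for start in range(0, len(column), PACK_SIZE):
        pack = [v for v in column[start:start + PACK_SIZE] if v is not None]
        nodes.append({
            "min": min(pack) if pack else None,
            "max": max(pack) if pack else None,
            "sum": sum(pack) if pack else 0,
            "count": len(pack),
        })
    return nodes

def relevant_packs(nodes, lo, hi):
    """Indices of packs whose [min, max] range overlaps [lo, hi];
    only these packs would need to be decompressed for the predicate."""
    return [i for i, n in enumerate(nodes)
            if n["min"] is not None and n["max"] >= lo and n["min"] <= hi]
```

The payoff is that a range predicate can skip every pack whose [min, max] interval is disjoint from the queried range, without touching the compressed data, much as Netezza zone maps let a scan skip disk extents.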
I.e., a concise, imprecise representation of the database is always kept in RAM, in something Infobright calls the “Knowledge Grid.” Rough Query estimates query results based solely on the information in the Knowledge Grid — i.e., Rough Query always executes against information that’s already in RAM.
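Given such in-RAM metadata, the flavor of a rough aggregate is easy to sketch. Assuming per-pack (min, max, row count) summaries — again my illustration, not Infobright's implementation — a filtered COUNT can be bracketed without decompressing a single pack:

```python
# Sketch of a "rough" COUNT(*) for the predicate lo <= x <= hi, computed
# from per-pack (min, max, row_count) metadata alone. Illustrative only.

def rough_count(packs, lo, hi):
    """Return (lower bound, upper bound) on the matching row count.

    Packs disjoint from [lo, hi] contribute nothing; packs fully inside
    it contribute all their rows; overlapping "suspect" packs are the
    uncertainty between the two bounds. No pack is ever decompressed.
    """
    lower = upper = 0
    for pmin, pmax, count in packs:
        if pmax < lo or pmin > hi:        # irrelevant pack: disjoint range
            continue
        upper += count                    # suspect pack: may contribute
        if lo <= pmin and pmax <= hi:     # relevant pack: every row matches
            lower += count
    return lower, upper
```

A precise answer would require decompressing only the suspect packs; the rough answer skips even those, which is why it can run entirely against RAM.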
To me, Rough Query is the most impressive part of the Infobright 4.0 announcement. DomainExpert sounds like it will be somewhat better than straightforward prefix/suffix compression, but Infobright hasn’t yet convinced me that the difference is substantial. Distributed Load Manager is indeed important, but only because Infobright doesn’t have a shared-nothing MPP (Massively Parallel Processing) option at this time. And the rest is mainly catch-up toward Infobright’s larger and more expensive peers.
8 Responses to “Infobright 4.0”
Since you mentioned “the Sybase-IQ-like data mart market,” do note that Sybase IQ capabilities go far beyond Infobright’s, in particular in in-database analytics, large-object storage, and support for extended types including text. Many users would not be able to simply offload from Sybase IQ to Infobright.
(Disclosure: I’ve done paid work for Sybase within the last year.)
Seth, http://twitter.com/sethgrimes
Infobright is most competitive vs. an older and more mature product like IQ where the big issues are price, speed, and/or ease of installation/administration, rather than feature lists.
That said, all the columnar guys are chasing Aster and Netezza when it comes to in-database analytics. And Teradata as well, in many use cases. Or in some cases they’re still approaching the starting blocks …
[…] of Infobright’s announcement, wrote analyst Curt Monash of Monash Research on his blog this Tuesday. The tool is ideal for what Monash calls “investigative analytics […]
Curt,
Thanks for the post on our 4.0 announcement. We are seeing lots of interest in Rough Query from our customers and prospects, and I was glad to see your comment as well.
In regard to DomainExpert, it is more than what is described above. As you know, Infobright uses the information it creates and stores in the Knowledge Grid to drive fast query response without indexes, partitions, cubes, projections, etc. The more intelligence the Knowledge Grid has about the data and the queries run against it, the faster the response. If the Knowledge Grid has the specific information needed to answer the query, you get sub-second response time, as it won’t need to decompress and access any data to get the answer.
So, DomainExpert is about speeding queries (and yes, there are some compression benefits too) by adding more intelligence about the data, and by allowing users to easily do the same without being DBAs. That way a financial user can add information about their stock tick data for example, and the database will automatically use that information to slash query time.
We will extend DomainExpert in upcoming releases as well — so more queries execute in sub-second time, compression is enhanced (though we believe our software already delivers better data compression than any other database we are aware of), and data will be automatically stored on disk in an optimal fashion.
Regards,
Susan Davis
VP Marketing and Product Management, Infobright
[…] part” of Infobright’s announcement, analyst Curt Monash of Monash Research wrote in a blog post Tuesday. Such a tool is ideal for what Monash calls “investigative analytics,” he […]
Also, ICE (Infobright Community Edition) or IEE (Infobright Enterprise Edition) can be used with http://code.google.com/p/Shard-Query for an MPP solution.