Analytic glossary
Draft definitions of various industry terms, which taken together form an analytic glossary.
(Relational) database (management system) — three analytic glossary draft entries
These are three closely-related draft entries for the DBMS2 analytic glossary. Please comment with any ideas you have for their improvement!
1. Database management system (DBMS)
In our definition, a database management system (DBMS) is:
- Software that manages the reading and writing of data …
- … through an application programming interface (API) …
- … that depends solely upon the values of the data and similar logical information.
Commonly, that API takes the form of a data manipulation language (DML) such as SQL or MDX, but our definition allows for APIs as simple as those of key-value stores.
There are two major alternatives to our definition:
- The above could be a definition of “data management software”, with the term “DBMS” reserved for systems with a true DML.
- Many vendors and industry observers abbreviate “database management system” or “data management software” as “database”.
Two important distinctions among categories of DBMS and the processing they’re optimized for are:
- Relational vs. non-relational
- Short-request vs. analytic vs. general-purpose
2. Database
The term database has two common meanings in IT: Read more
Categories: Analytic glossary, Data models and architecture | 3 Comments |
In-memory, (hybrid) memory-centric DBMS — three analytic glossary draft entries
These are three closely-related draft entries for the DBMS2 analytic glossary. Please comment with any ideas you have for their improvement!
1. We coined the term memory-centric data management to comprise several kinds of technology that manage data in RAM (Random Access Memory), including:
- In-memory DBMS (DataBase Management Systems).
- Hybrid memory-centric DBMS.
- Other kinds of in-memory data stores, such as:
- Caching layers.
- In-memory data stores that are tightly tied to specific analytic tools, for example the in-memory data management part of QlikView.
- Complex event/stream processing.
Related link
- Many examples of memory-centric data management (April, 2012)
2. An in-memory DBMS is a DBMS designed under the assumption that substantially all database operations will be performed in RAM (Random Access Memory). Thus, in-memory DBMS form a subcategory of memory-centric data management systems.
Ways in which in-memory DBMS are commonly different from those that query and update persistent storage include: Read more
Categories: Analytic glossary, Cache, In-memory DBMS, Memory-centric data management, Streaming and complex event processing (CEP) | 7 Comments |
In-database analytics — analytic glossary draft entry
This is a draft entry for the DBMS2 analytic glossary. Please comment with any ideas you have for its improvement!
Note: Words and phrases in italics will be linked to other entries when the glossary is complete.
“In-database analytics” is a catch-all term for analytic capabilities, beyond standard SQL, running on the same machine as and under the management of an analytic DBMS. These can run in one or both of two modes:
- In-process or unfenced, i.e. in the same process as the DBMS itself. This option gives maximum performance, but any defects in the analytic code may crash the whole DBMS. Also, it generally requires that the code be in the same language as the DBMS, i.e. C++.
- Out-of-process or fenced, i.e. in a separate process. This option sacrifices performance, in favor of reliability and language flexibility.
In-database analytics may offer great performance and scalability advantages versus the alternative of extracting data and having it be processed on a separate server. This is particularly likely to be the case in MPP (Massively Parallel Processing) analytic DBMS environments.
Examples of in-database analytics include:
- Creating temporary data structures that persist past the life of a query.
- Creating temporary data structures that are non-tabular.
- Predictive modeling that uses all the same nodes in an MPP cluster where the data resides.
- Predictive analytics (scoring only).
Other common domains for in-database analytics include sessionization, time series analysis, and relationship analytics.
Notable products offering in-database analytics include:
- Teradata Aster SQL/MR.
- Multiple other analytic platforms, such as Sybase IQ, Vertica, or IBM Netezza. Indeed, in-database analytics are a defining feature of analytic platforms.
- Fuzzy Logix (for predictive analytics).
Categories: Analytic glossary, Aster Data, Data warehousing, IBM and DB2, MapReduce, Netezza, Parallelization, Predictive modeling and advanced analytics, Sybase, Teradata, Vertica Systems | 8 Comments |
Analytic platform — analytic glossary draft entry
This is a draft entry for the DBMS2 analytic glossary. Please comment with any ideas you have for its improvement!
Note: Words and phrases in italics will be linked to other entries when the glossary is complete.
In our usage, an “analytic platform” is an analytic DBMS with well-integrated in-database analytics, or a data warehouse appliance that includes one. The term is also sometimes used to refer to:
- Any analytic DBMS or data warehouse appliance.
- Other kinds of software, or software/hardware combination, that support broad analytic capabilities.
To varying extents, most major vendors of analytic DBMS or data warehouse appliances have extended their products into analytic platforms; see, for example, our original coverage of analytic platform versions of as Aster, Netezza, or Vertica.
Related posts
- Our original definition of “analytic platform” (February, 2011)
- Our original feature list for analytic platforms (January, 2011)
Categories: Analytic glossary, Aster Data, Data warehouse appliances, Data warehousing, Netezza, Vertica Systems | 3 Comments |
Data warehouse appliance — analytic glossary draft entry
This is a draft entry for the DBMS2 analytic glossary. Please comment with any ideas you have for its improvement!
Note: Words and phrases in italics will be linked to other entries when the glossary is complete.
A data warehouse appliance is a combination of hardware and software that includes an analytic DBMS (DataBase Management System). However, some observers incorrectly apply the term “data warehouse appliance” to any analytic DBMS.
The paradigmatic vendors of data warehouse appliances are:
- Teradata, which embraced the term “data warehouse appliance” in 2008.
- Netezza — now an IBM company — which popularized the term “data warehouse appliance” in the 2000s.
Further, vendors of analytic DBMS commonly offer — directly or through partnerships — optional data warehouse appliance configurations; examples include:
- Greenplum, now part of EMC.
- Vertica, now an HP company.
- IBM DB2, under the brand “Smart Analytic System”.
- Microsoft (Parallel Data Warehouse).
Oracle Exadata is sometimes regarded as a data warehouse appliance as well, despite not being solely focused on analytic use cases.
Data warehouse appliances inherit marketing claims from the category of analytic DBMS, such as: Read more
Categories: Analytic glossary, Data warehouse appliances, Data warehousing, EMC, Exadata, Greenplum, HP and Neoview, IBM and DB2, Microsoft and SQL*Server, Netezza, Oracle, Teradata | 4 Comments |
DBMS2 analytic glossary — a new project
Enterprise software terminology is too often mired in confusion. I hope to lessen that by publishing a series of web pages that define and describe various industry terms, with one or several paragraphs per subject, and plenty of internal and external links.
Absent a better name, I’ll refer to this as an “analytic glossary”. I want users of the analytic glossary to learn or confirm:
- What terms mean.
- Which products exemplify which product categories.
- Which additional subjects to look into.
I will do or closely direct the core writing myself. I may hire outside help for ancillary tasks, such as adding links, or for various kinds of wordsmithing.
All this presupposes a site redesign, which hasn’t begun. But I’ve started to draft the content. As I do, I’ll post it here. And I very much want you to comment.
If you think I got something wrong or left anything out — even if it’s just a nuance — please speak up! Later, when the glossary pages are live, I’ll link them back to the original blog post discussions. Thus, your comments will be part of the permanent glossary record. Read more
Categories: About this blog, Analytic glossary | 5 Comments |