Teradata Unity and the idea of active-active data warehouse replication
Teradata is having its annual conference, Teradata Partners, at the same time as Oracle OpenWorld this week. That made it an easy decision for Teradata to preannounce its big news, Teradata Columnar and the rest of Teradata 14. But of course it held some stuff back, notably Teradata Unity, which is the name chosen for replication technology based on Teradata’s Xkoto acquisition.
The core mission of Teradata Unity is asynchronous, near-real-time replication across Teradata systems. The point of “asynchronous” is performance. The point of “near-real-time” is that Teradata Unity can be used for high availability and disaster recovery, and further can be used to allow real work on HA and DR database copies. Teradata Unity works request-at-a-time, which limits performance somewhat.* Unity has a lock manager that makes sure updates are applied in the same order on all copies, in cases where locks are needed at all.
*Other options, more suitable for bulk loading and so on, are on the Teradata Unity roadmap.
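To make the ordering idea concrete, here is a minimal sketch of request-at-a-time replication in which every write gets a position in one ordered log and each copy applies the log in that order, at its own pace. This is not Teradata Unity's implementation: the `Replica` and `Sequencer` classes and their methods are hypothetical names invented for illustration, and a single global sequence is a simplification of whatever Unity's lock manager actually does.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Replica:
    """Hypothetical stand-in for one database copy; it only records what it applied."""
    name: str
    applied: list = field(default_factory=list)

    def apply(self, seq: int, statement: str) -> None:
        self.applied.append((seq, statement))


class Sequencer:
    """Hypothetical dispatcher that gives every write request one global position."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self._next_seq = count(1)
        self.log = []  # ordered record of (sequence number, statement)

    def submit_write(self, statement: str) -> int:
        # Accept the write immediately; replicas catch up later (asynchronous).
        seq = next(self._next_seq)
        self.log.append((seq, statement))
        return seq

    def drain(self, replica: Replica) -> None:
        # Apply whatever the replica has not seen yet, strictly in log order.
        for seq, statement in self.log[len(replica.applied):]:
            replica.apply(seq, statement)

    def drain_all(self) -> None:
        for replica in self.replicas:
            self.drain(replica)


primary, standby = Replica("primary"), Replica("standby")
sequencer = Sequencer([primary, standby])
sequencer.submit_write("INSERT INTO orders VALUES (1)")
sequencer.drain(primary)  # the standby lags behind for a while
sequencer.submit_write("UPDATE orders SET status = 'shipped'")
sequencer.drain_all()     # both copies now hold the same history, in the same order
```

The copies may be out of step at any given moment, but they never disagree about the order of the writes they have applied, which is what lets a lagging copy still be used for real work.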
The idea of doing real work on your high availability or disaster recovery database copies is an important one. Teradata systems are often used for the kinds of mission-critical purposes that call for such extra 2- or 3-way mirroring; so the ability to use all the systems for real work offers, if not exactly 2-3X price/performance savings, at least something significant. Teradata reports low but non-zero penetration in its customer base for active-active replication today. But I’m hopeful that number will increase, as Teradata Unity looks to be a big improvement over the possibilities that existed before.
In theory, the whole workload could be split among mirror-copy systems, although I’m sure we could construct various edge-case scenarios in which doing so would be a Bad Idea. In practice, I’d normally think of using second/third copies of a data warehouse for specific workloads, such as:
- Long-running queries or other analytic exercises.
- Virtual data marts.
- Backups, exports, and so on.
Another possibility to consider is only mirroring part of your database for HA or DR, since not all missions are equally critical. Yet another possibility is to mirror the whole thing, but on systems with different performance characteristics; in case of failover, you might only keep the most crucial applications up, while turning the others off until you can again run on a system powerful enough to handle them.
As Teradata tells it, Teradata Unity has two key aspects:
- Multi-System Synchronization. DDL, DML and DCL (Data Definition/Manipulation/Control Language) are all replicated. I.e., data gets copied around, and so does everything else.
- Query Management. Queries get shipped around to appropriate systems based on:
  - Which systems manage all the data needed to execute the query.
  - Which systems are up and running at the moment.
  - Which systems are backlogged at the moment. (I get the impression Teradata Unity load balancing is fairly basic in the first release, but there is some.)
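The three routing criteria can be sketched as a simple filter-then-pick rule. This is only an illustration under stated assumptions: the `System` record, its fields, and the `route` function are hypothetical, and choosing the shortest queue is just one plausible reading of "fairly basic" load balancing, not a description of what Teradata Unity does.

```python
from dataclasses import dataclass


@dataclass
class System:
    """Hypothetical description of one system that a router could send queries to."""
    name: str
    tables: frozenset   # tables this system holds a copy of
    is_up: bool = True
    queued: int = 0     # requests currently waiting on this system


def route(query_tables, systems):
    """Pick a system for a query, or return None if no system qualifies."""
    needed = set(query_tables)
    # Keep only systems that are running and hold every table the query touches.
    candidates = [s for s in systems if s.is_up and needed <= s.tables]
    if not candidates:
        return None
    # Among those, prefer the least backlogged one.
    return min(candidates, key=lambda s: s.queued)


systems = [
    System("prod", frozenset({"sales", "customers", "inventory"}), queued=12),
    System("dr", frozenset({"sales", "customers"}), queued=3),
    System("mart", frozenset({"sales"}), is_up=False),
]

print(route(["sales", "customers"], systems).name)  # dr: holds both tables, shorter queue
print(route(["inventory"], systems).name)           # prod: the only copy of that table
```

Note how the first criterion also covers the partial-mirroring case: a system that holds only part of the database simply drops out of the candidate list for queries it cannot answer.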
Further details may be seen in the slide deck Teradata graciously sent over for posting.
And finally, here’s some Teradata product name housekeeping:
- The first release of Teradata Unity will be numbered 13.10. Not coincidentally, that’s the version number of Teradata’s latest database software. Teradata Unity 13.10 will support Teradata 13 and 13.10.
- Teradata Unity 14 will ship soon after Teradata 14. It will support Teradata 13, 13.10, and 14.
- Teradata Unity runs on a “Managed Server”. “Managed Servers” are nodes inside Teradata’s cabinets, managed by Teradata’s system software and so on, but which do not run Teradata database software.
- In the 14.10 release, Teradata Unity:
  - Will scale out across multiple Managed Servers.
  - Will be able to serve as a general load facility for Teradata.
- Teradata Unity replaces Teradata Query Direct.
- While I suspect Teradata Unity may replace Teradata Data Mover (bulk data copy) in the future, it surely doesn’t yet.
- Teradata Unity works with Teradata Multi-System Manager, which does things like end-to-end job management.
Comments
2 Responses to “Teradata Unity and the idea of active-active data warehouse replication”
Good info. Going through Teradata’s slide deck, it looks like, in addition to data replication (active-active), they are keeping the most-used data in a kind of in-memory Teradata Unity system. Good…
Still a bit confused. What is the recommended product/technique for heterogeneous DBMS data replication to the warehouse? Can I hook up Unity to DB2 or Oracle for NRT replication?