JSON in Teradata
I coined the term schema-on-need last month. More precisely, I coined it while being briefed on JSON-in-Teradata, which was announced earlier this week, and is slated for availability in the first half of 2014.
The basic JSON-in-Teradata story is as you expect:
- A JSON document is stuck into a relational field.
(Oddly, Teradata wasn’t yet sure whether the field would be a BLOB or VARCHAR or something else.)Edit: See Dan Graham’s comment below. - Fields within the JSON document can be indexed on.
- Those fields can be referenced in SQL statements much as regular Teradata columns can.
You have to retrieve the whole document.Edit: See Dan Graham’s comment below.- To avert the performance pain of retrieving the whole document, you can of course copy any particular field into a column of its own. (That’s the schema-on-need part of the story.)
JSON virtual columns are referenced a little differently than ordinary physical columns are. Thus, if you materialize a virtual column, you have to change your SQL. If you’re doing business intelligence through a semantic layer, or otherwise have some kind of declarative translation, that’s probably not a big drawback. If you’re coding analytic procedures directly, it still may not be a big drawback — hopefully you won’t reference the virtual column too many times in code before you decide to materialize it instead.
My Bobby McFerrin* imitation notwithstanding, Hadapt illustrates a schema-on-need approach that is slicker than Teradata’s in two ways. First, Hadapt has full SQL transparency between virtual and physical columns. Second, Hadapt handles not just JSON, but anything represented by key-value pairs. Still, like XML before it but more concisely, JSON is a pretty versatile data interchange format. So JSON-in-Teradata would seem to be useful as it stands.
*The singer in the classic 1988 music video Don’t Worry Be Happy. The other two performers, of course, were Elton John and Robin Williams.
Comments
3 Responses to “JSON in Teradata”
Leave a Reply
super article. i got a link about your site while googling i.e without any reference. you insights are amazing. you should really charge for the info provided free of cost.
hats off monash
Curt,
some minor corrections. During our talk a couple weeks ago:
— JSON is stored in a column EITHER as a Varchar or LOB based on the object size. This is handled automatically
— There is no need to retrieve the entire JSON document. Individual name-value pairs are directly accessible.
–Whether its a partially shredded name-value pair or left in JSON, yes, someone has to do something with SQL to access the data. I do not see it as a shortcoming as implied that you have to point SQL at the data. The DBA can use JSON shredded or unshredded.
Hope that helps
[…] But if you look beyond the NoSQL realm, JSON has quietly entered the mainstream: Postgres, Teradata, IBM and Oracle have all either released or are on the verge of releasing support for JSON as a […]