The point of predicate pushdown
Oracle is announcing today what it’s calling “Oracle Big Data SQL”. As usual, I haven’t been briefed, but highlights seem to include:
- Oracle Big Data SQL is basically data federation using the External Tables capability of the Oracle DBMS.
- Unlike independent products — e.g. Cirro — Oracle Big Data SQL federates SQL queries only across Oracle offerings, such as the Oracle DBMS, the Oracle NoSQL offering, or Oracle’s Cloudera-based Hadoop appliance.
- Also unlike independent products, Oracle Big Data SQL is claimed to be compatible with Oracle’s usual security model and SQL dialect.
- At least when it talks to Hadoop, Oracle Big Data SQL exploits predicate pushdown to reduce network traffic.
And by the way — Oracle Big Data SQL is NOT “SQL-on-Hadoop” as that term is commonly construed, unless the complete Oracle DBMS is running on every node of a Hadoop cluster.
Predicate pushdown is actually a simple concept:
- If you issue a query in one place to run against a lot of data that’s in another place, you could spawn a lot of network traffic, which could be slow and costly. However …
- … if you can “push down” parts of the query to where the data is stored, and thus filter out most of the data, then you can greatly reduce network traffic.
“Predicate pushdown” gets its name from the fact that portions of SQL statements, specifically ones that filter data, are properly referred to as predicates. They earn that name because predicates in mathematical logic and clauses in SQL are the same kind of thing — statements that, upon evaluation, can be TRUE or FALSE for different values of variables or data.
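To make the traffic-reduction point concrete, here is a minimal Python sketch (the data, names, and numbers are made up for illustration, not anything Oracle ships): evaluating the same filter where the data lives ships only a fraction of the rows that a naive "ship everything, then filter" approach would.

```python
# A minimal, hypothetical sketch of the idea (not Oracle's implementation):
# evaluating a predicate where the data lives means far fewer rows cross the network.

rows = [{"name": f"cust{i}", "age": 20 + (i % 60)} for i in range(1_000_000)]

def predicate(row):
    # Corresponds to the SQL filter: WHERE age < 35
    return row["age"] < 35

# Without pushdown: every row is shipped to the querying node, which filters locally.
shipped_without = rows                              # 1,000,000 rows over the wire
result = [r for r in shipped_without if predicate(r)]

# With pushdown: the storage side applies the predicate; only matches are shipped.
shipped_with = [r for r in rows if predicate(r)]    # ~250,000 rows over the wire
result_pushed = shipped_with

assert result == result_pushed
print(len(shipped_without), "rows shipped without pushdown vs.",
      len(shipped_with), "with pushdown")
```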
The most famous example of predicate pushdown is Oracle Exadata, with the story there being:
- Oracle’s shared-everything architecture created a huge I/O bottleneck when querying large amounts of data, making Oracle inappropriate for very large data warehouses.
- Oracle Exadata added a second tier of servers each tied to a subset of the overall storage; certain predicates are pushed down to that tier.
- The I/O between Exadata’s two sets of servers is tolerable, and so Oracle is now often competitive in the high-end data warehousing market.
Oracle evidently calls this “SmartScan”, and says Oracle Big Data SQL does something similar with predicate pushdown into Hadoop.
Oracle also hints at using predicate pushdown to do non-tabular operations on the non-relational systems, rather than shoehorning operations on multi-structured data into the Oracle DBMS, but my details on that are sparse.
Related link
- Chris Kanaracus’ coverage of the announcement quotes me at length.
Comments
Your quotes in the Kanaracus article were excellent.
What does it mean to push a predicate to Hadoop? Does it create map/reduce tasks to filter HDFS data? Does this assume HBase is there to evaluate index filters?
Mark,
Thanks for the compliment, and I don’t know the answers to your questions.
In particular, you’ve alerted me to the fact that I don’t really know whether predicates are being pushed down to the Hadoop cluster as a whole, or whether Oracle has written some kind of engine to filter data directly on every node (presumably with a call to the Hive metadata store, née HCatalog, coming first to help it figure out what it wants to do).
Certainly enough other vendors are putting engines on each Hadoop data node for the latter option to be at least theoretically possible.
Posting my tweet-response to Curt here on request…
Several optimized columnar file formats are available for data stored in HDFS, e.g. ORC in Hive, Parquet, etc.
I’d guess that they use columnar projection (scan only the required columns) and predicate pushdown (skip rows which don’t match the filter, i.e. the query predicate) available in the columnar formats to reduce I/O during the scan from HDFS itself.
Details differ, but ORC, for example, has light-weight indices inline with the data which allow it to filter out ‘stripes’ that won’t match the predicate: “SELECT name FROM customer WHERE age < 35” can skip stripes where we’d know min(age) is 45. Therefore, they wouldn’t need an active processing component.
Hope that helps.
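To illustrate the stripe-skipping idea described above, here is a simplified Python sketch with made-up stripe statistics; a real ORC reader is far more involved, but the principle of consulting min/max statistics before reading a stripe is the same.

```python
# Simplified sketch of min/max-based stripe skipping (hypothetical data layout,
# not ORC's actual reader code). Each "stripe" carries lightweight statistics;
# a stripe whose min(age) >= 35 cannot contain matches for "WHERE age < 35",
# so it is never read or decompressed.

stripes = [
    {"stats": {"age": {"min": 18, "max": 34}}, "rows": [("alice", 21), ("bob", 33)]},
    {"stats": {"age": {"min": 45, "max": 70}}, "rows": [("carol", 45), ("dave", 70)]},
    {"stats": {"age": {"min": 30, "max": 60}}, "rows": [("erin", 30), ("frank", 60)]},
]

def scan_names_where_age_lt(stripes, threshold):
    names = []
    for stripe in stripes:
        # Predicate pushdown against stripe statistics: skip stripes that
        # provably contain no matching rows.
        if stripe["stats"]["age"]["min"] >= threshold:
            continue
        # Columnar projection: only the columns the query needs are materialized.
        names.extend(name for name, age in stripe["rows"] if age < threshold)
    return names

print(scan_names_where_age_lt(stripes, 35))  # ['alice', 'bob', 'erin'] -- second stripe skipped
```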
The blog mentions: “Smart Scan for Hadoop introduces a new service into the Hadoop ecosystem, which is co-resident with HDFS DataNodes and YARN NodeManagers. Queries from the new external tables are sent to these services to ensure that reads are direct path and data-local. Reading close to the data speeds up I/O, but minimizing data movement requires that Smart Scan do some things that are, well, smart.” Also, the example on the blog is JSON. Seems to point to an active component.
I interpret that quote from the blog[1] to mean there is an active processing component as well — a daemon that plays the same role on an HDFS DataNode that exists today on an Exadata Storage Server. In essence, this is a port of the Exadata cellsrv daemon that likely has common HDFS file format scanners (Parquet, ORCfile, RCfile, Text, etc.) and is able to apply predicate pushdown in the form of 1) column and data skipping for formats like Parquet or ORCfile that support such or 2) post-scan filtration, including the ability to apply bloom filters[2] (like Exadata), and column pruning for other formats. In either case, data is converted into Oracle internal tuple representation on the DataNodes[3] and flows up to the RAC DB nodes the same way it does from Exadata (which also means that Smart Scanning HDFS nodes don’t talk to each other to do JOINs — that is the responsibility of the RAC DB nodes).
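If that reading is right, the per-DataNode flow might look roughly like the following Python sketch. All names and structures here are invented for illustration; this is not Oracle code, just the shape of a scan-project-filter agent that leaves joins to the DB nodes.

```python
# Hypothetical sketch of the per-DataNode agent flow described above
# (invented names; not Oracle's cellsrv code). The agent only scans,
# projects, and filters local data; joins remain the DB nodes' job.

from typing import Callable, Iterable

def smart_scan(local_blocks: Iterable[list[dict]],
               columns: list[str],
               predicate: Callable[[dict], bool]) -> Iterable[tuple]:
    """Scan data-local blocks, apply the pushed-down predicate, project the
    requested columns, and yield rows in a DB-friendly tuple form."""
    for block in local_blocks:                         # only blocks stored on this node
        for row in block:
            if predicate(row):                         # pushed-down WHERE predicate
                yield tuple(row[c] for c in columns)   # pushed-down projection

# Example: the DB node asked this node for (name) WHERE age < 35.
blocks = [[{"name": "alice", "age": 21, "city": "NYC"},
           {"name": "dave", "age": 70, "city": "SFO"}]]
for out_row in smart_scan(blocks, ["name"], lambda r: r["age"] < 35):
    print(out_row)   # ('alice',) -- only matching, projected rows leave the node
```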
[1] https://blogs.oracle.com/datawarehousing/entry/oracle_big_data_sql_one
[2] https://twitter.com/dominic_giles/status/489095115718537216
[3] https://twitter.com/dan_mcclary/status/489081072471130112
Greg is right…
I was in their session at Hadoop Summit, where they explained that they pre-install a local agent on every node of their Hadoop appliance… And Exadata can push down projections, filters, and occasionally bloom filters (for joins) to that agent. It is similar to the interaction between Exadata compute nodes (RAC) and storage nodes (Exadata cells).
Of course, everything is locked down and can be used only with Oracle appliances (Exadata + Big Data Appliance combo).
Ah! The joys of Oracle predicate push-down…
Following on from his success at Netezza, I joined Foster Hinshaw at Dataupia from 2006 to 2009. The proposition was ‘complementary MPP’.
The idea was that Oracle (and latterly SQL Server and DB2) queries would be re-routed via Oracle’s remote table capability and be executed elsewhere, specifically on our Postgres MPP back-end, with the result set shipped seamlessly back to the Oracle host.
The goal was that the Oracle user would still perceive the system to be an Oracle system, but it would benefit from a scale-out, shared-nothing architecture that Oracle simply couldn’t deliver on its own.
To achieve this we had to deliver a level of Oracle transparency such that SQL query re-writes (by the user, not the optimiser) were rare. As an aside, this is the core of the work carried out by EnterpriseDB: making Postgres compatible with Oracle.
One of the CDR tables we supported soon grew to over 100 billion rows. Predicate pushdown is clearly critical in a system with a single CPU Oracle host connected to a multi-cabinet MPP back-end. The demand on the Oracle server was so light we really did get away with a weedy 2U server.
Predicate pushdown is where the ‘fun’ started. Connecting Oracle to another Oracle system seems to work well. The SQL query is passed from one to the other as expected.
However, connecting Oracle to a non-Oracle system via heterogeneous services (HS) is another ball game altogether.
Oracle seemed to randomly cut the WHERE and GROUP BY clauses out of the submitted SQL, and pass queries to the back-end that were not as submitted by the user.
This came as somewhat of a surprise, to say the least, and brought our whole value proposition into question. Passing billions of rows back to the host to filter and aggregate (a la Exadata) is not pretty.
No doubt Oracle will have a happier time getting Oracle predicate pushdown to work!
Hadoop Summit 2014 slides and video on the topic for posterity:
http://www.slideshare.net/Hadoop_Summit/t-525p211mc-claryfinal
http://youtu.be/wMtXmE--P5g
Apparently more info is now available at: https://blogs.oracle.com/datawarehousing/entry/big_data_sql_2_0
It seems that the “big data sql agent” is responsible for transforming HDFS data into Oracle format and sending it back to Exadata, just as the Cell service does. The agent makes use of Hive metadata and SerDes if available. Predicate pushdown basically means reading only the required columns (projection pruning) and filtering out rows that do not match the predicates. Data skipping, via storage indexes, takes place only on the Exadata side during query planning. The dictionary entry for the Hadoop external table has a storage-index data structure attached to it, which is filled during the first full scan on Hadoop, again similar to the Cell service. The index is invalidated when the agent on Hadoop lets Exadata know that the underlying content has changed. The HDFS table-to-block mapping is apparently made known to the Exadata dictionary via the Hadoop agent.
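Purely as an illustration of the storage-index lifecycle just described (built during the first scan, consulted to skip blocks, invalidated on change), here is a hypothetical Python sketch; the structure and names are invented and are not Oracle’s dictionary implementation.

```python
# Hypothetical sketch of the DB-side storage-index lifecycle (invented structure,
# not Oracle's dictionary code): per-HDFS-block min/max values are recorded during
# the first full scan, used to skip blocks on later scans, and discarded when the
# Hadoop agent reports that the underlying data changed.

class StorageIndex:
    def __init__(self):
        self.block_stats = {}        # block_id -> {column: (min, max)}
        self.valid = False

    def record(self, block_id, column, lo, hi):
        """Populate statistics while the first full scan streams rows up."""
        self.block_stats[block_id] = {column: (lo, hi)}
        self.valid = True

    def blocks_to_scan(self, column, op, value, all_blocks):
        """Return only the blocks that could satisfy the predicate, e.g. age < 35."""
        if not self.valid:
            return all_blocks        # no index yet: scan everything, build stats
        keep = []
        for block_id in all_blocks:
            lo, hi = self.block_stats[block_id][column]
            if op == "<" and lo < value:     # block may contain matching rows
                keep.append(block_id)
        return keep

    def invalidate(self):
        """Called when the Hadoop-side agent reports the underlying data changed."""
        self.block_stats.clear()
        self.valid = False

idx = StorageIndex()
idx.record("blk_1", "age", 18, 34)
idx.record("blk_2", "age", 45, 70)
print(idx.blocks_to_scan("age", "<", 35, ["blk_1", "blk_2"]))  # ['blk_1']
idx.invalidate()                                               # content changed on Hadoop
print(idx.blocks_to_scan("age", "<", 35, ["blk_1", "blk_2"]))  # back to a full scan
```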
What confuses me about this is calling something predicate pushdown when, unless I’m wrong, what gets pushed down is the FROM clause, which is not a predicate. The WHERE clause is what contains the predicates, so is that pushed up? Or is it the FROM clause together with the join predicates that gets pushed down? Maybe it is the second case, and if we say “join predicates” I find the name less wrong. I would like someone to help me understand this, confirming or refuting and justifying. I understand the effect of this technology; my problem is with its name. Thanks for the article!