Read-your-writes (RYW), aka immediate, consistency
In which we reveal the fundamental inequality of NoSQL, and why NoSQL folks are so negative about joins.
Discussions of NoSQL design philosophies tend to quickly focus in on the matter of consistency. “Consistency”, however, turns out to be a rather overloaded concept, and confusion often ensues.
In this post I plan to address one essential subject, while ducking various related ones as hard as I can. It’s what Werner Vogel of Amazon called read-your-writes consistency (a term to which I was actually introduced by Justin Sheehy of Basho). It’s either identical or very similar to what is sometimes called immediate consistency, and presumably also to what Amazon has recently called the “read my last write” capability of SimpleDB.
This is something every database-savvy person should know about, but most so far still don’t. I didn’t myself until a few weeks ago.
Considering the many different kinds of consistency outlined in the Werner Vogel link above or in the Wikipedia consistency models article — whose names may not always be used in, er, a wholly consistent manner — I don’t think there’s much benefit to renaming read-your-writes consistency yet again. Rather, let’s just call it RYW consistency, come up with a way to pronounce “RYW”, and have done with it. (I suggest “ree-ooh”, which evokes two syllables from the original phrase. Thoughts?)
Definition: RYW (Read-Your-Writes) consistency is achieved when the system guarantees that, once a record has been updated, any attempt to read the record will return the updated value.
Here a “record” can be a row, a key-value pair, or any similar unit of data. An “update” can be whichever of insert/append or true change the system supports.
A conventional relational DBMS will almost always feature RYW consistency. Some NoSQL systems feature tunable consistency, in which — depending on your settings — RYW consistency may or may not be assured.
The core ideas of RYW consistency, as implemented in various NoSQL systems, are:
- Let N = the number of copies of each record distributed across nodes of a parallel system.
- Let W = the number of nodes that must successfully acknowledge a write for it to be successfully committed. By definition, W <= N.
- Let R = the number of nodes that must send back the same value of a unit of data for it to be accepted as read by the system. By definition, R <= N.
- The greater N-R and N-W are, the more node or network failures you can typically tolerate without blocking work.
- As long as R + W > N, you are assured of RYW consistency.
That bolded part is the key point, and I suggest that you stop and convince yourself of it before reading further.
Example: Let N = 3, W = 2, and R = 2. Suppose you write a record successfully to at least two nodes out of three. Further suppose that you then poll all three of the nodes. Then the only way you can get two values that agree with each other is if at least one of them — and hence both — return the value that was correctly and successfully written to at least two nodes in the first place.
In a conventional parallel DBMS, N = R = W, which is to say N-R = N-W = 0. Thus, a single hardware failure causes data operations to fail too. For some applications — e.g., highly parallel OLTP web apps — that kind of fragility is deemed unacceptable.
On the other hand, if W< N, it is possible to construct edge cases in which two or more consecutive failures cause incorrect data values to actually be returned. So you want to clean up any discrepancies quickly and bring the system back to a consistent state. That is where the idea of eventual consistency comes in, although you definitely can — and in some famous NoSQL implementations actually do — have eventual consistency in a system that is not RYW consistent.
Much technology goes into eventual consistency, as well as into the data distribution and polling in the first place. And in tunable systems, the choices of N, R, and W — perhaps on a “table” by “table” basis — can get pretty interesting. I’m ducking all those subjects for now, however, not least because of how much I still have to learn about them.
One point I will note, however, is this — RYW consistency and table joins make for awkward companions. If you want to join two tables, each of them distributed across some kind of parallel cluster, there are only two possibilities:
- In most cases, the data you need to join is co-located on the same nodes.
- You’re going to have an awful lot of network traffic.
In an R = W = N scenario, co-location may be realistic. But when R < N and W < N, a join can return incorrect results even when both of the tables being joined would have been read correctly.
In our example above, we had N = 3 and R = W = 2. Single-table RYW consistency was ensured. But suppose you join two records, each of which had been written correctly to 2 out of 3 nodes — but with only 1 node being correct about both records. Then only that 1 node out of 3 will return a correct value for the join, and badness will ensue.
Any architecture I can think of to circumvent that problem results in — you guessed it — an awful lot of network traffic.
And that, folks, is a big part of why the NoSQL folks are so negative about joins.
Related link
- Query fault-tolerance
- Huan Liu’s skepticism as to whether RYW consistency causes a significant performance hit
- Daniel Abadi’s views on NoSQL design tradeoffs
Comments
24 Responses to “Read-your-writes (RYW), aka immediate, consistency”
Leave a Reply
Is it just joins or does it affect all queries that return more than one row or aggregate over more than one row?
Is it possible to do a
select country, sum(amount)
from sales
group by country
query in a R+W > N system in a “read consistent” way? And with “read consistent” I mean “read consistent” in the way Oracle defines “read consistency” without causing an enormous amount of network traffic.
Oracle means with “read consistency” that if you fire this group-by-query at 8:12:05 PM you will get back the results of how sales was at 8:12:05 PM, even if the query has to scan millions of records and other people are modifying the sales table (with or without commiting those modications) while this long running query is running.
@rc,
These systems are designed for single-record lookup, at least if you want accuracy guarantees. Once you start going after multiple records at once, R+W>N loses its power to guarantee you accurate results.
Ok, so if you can’t do an accurate multiple record lookup in a reasonable amount of time you can’t join because joining means looking up at least two records.
And db’s like mongodb and couchdb circumvent this limitation (only partially but stil useful) by using hierarchical records that you can load with a lot of stuff and they call those records “documents”.
It is al becoming more clear to me. Thanks!
I don’t think it’s so much you get inconsistent joins, but that you don’t really get database side joins at all. If you need to join two records, you would do so in the application, do two lookups, both of which presumably return correct values.
@unholyguy,
My point is that system designers have three choices:
1. Allow incorrect joins
2. Allow really slow joins
3. Don’t allow joins at all
Your point is that they choose #3.
We’re not contradicting each other. 😉
There’s no magic here. NoSQL systems assume that they know up front what joins will be important (or, more directly, they know what combinations of attributes/columns/however you want to describe them) show up in queries of interest. They then make sure that all those combinations are actually stored together. Since queries may have partially overlapping sets of contributing columns, this generally leads to storing the same data more than once. (As a performance optimization, this is fine. But when it leads some of the NoSQL advocates to say that denormalization is somehow a good in and of itself, it’s nonsense.)
Anyway: In getting your read advantage by writing the data multiple times, what happens to your R+W>N algorithm? Do you wait for the data to be “stable” (more than W copies finished) for all the copies? That doesn’t seem practical in these heavily-write-oriented systems. But if you *don’t* do that, you potentially lose RYW consistency *when considering two or more queries*: If the queries hit different write sets for the same data, one may have finished writing a new value while the other didn’t.
Yes, these systems are “eventually consistent”: The writes eventually all complete. (Well, maybe. If a hardware failure makes it impossible to complete the write set for one query while another has already completed, how do we recover? Assume a write log so that we eventually write the data? What if *that* fails? With independent write sets for the same data, you can’t just fall back on “the transaction never happened”.)
While I’m sure there are people who *are* thinking about this, too much of the NoSQL stuff is done by people who don’t think about what correctness conditions they actually have – and in fact make a *point* of not wanting to deal with the formalisms.
— Jerry
[…] a thought-provoking post, Daniel Abadi points out NoSQL-related terminological problems similar to the ones I just railed against, and argues To me, CAP should really be PACELC — if there is a partition (P) how does the […]
[…] Par Curt Monash, DBMS2, le 01 mai 2010. In which we reveal the fundamental inequality of NoSQL, and why NoSQL folks are so negative about joins. Lire l’article […]
[…] to get around 2PC performance issues, they sounded a lot like eventual consistency. Maybe tunable RYW consistency isn’t in the cards, but at least there’s a NoSQL-like possibility with […]
[…] RYW consistency, most commonly with N = 3 and R = W = 2. […]
[…] Sierra is fully ACID-compliant, with no eventual consistency or RYW consistency story. The default number of copies of each datum is two, and they’re kept consistent via […]
[…] Read-your-writes (RYW) consistency […]
This misperception is due to people’s misunderstanding of abdominoplasty surgery.
Excellent blog post. I certainly love this site. Stick with it!
I think that everything cokposed was very reasonable. But, what about this?
suppose you added a little information? I am not suggesting your content is not good., but suppose you
added a post title that grabbed a person’s attention? I ean RYW (Read-Your-Writes) consistency explained | DBMS 2 : DataBaase
Management System Services is a llittle plain. You should peek aat Yahoo’s home
page and see how they create news titless to grab viwers to click.
You might add a related video or a pic oor two to get readers intdrested about everything’ve got to say.
In my opinion, it might bring your posts a little livelier.
Takee a look at my blog post – http://stopgazviganais.org (stopgazviganais.org)
купить погрузчик фронтальный
blog topic
Healthy Intimate Relationship
RYW (Read-Your-Writes) consistency explained | DBMS 2 : DataBase Management System Services
stop thinking about food
RYW (Read-Your-Writes) consistency explained | DBMS 2 : DataBase Management System Services
Hi, for all time i used to check weblog posts here in the early
hours in the daylight, for the reason that i enjoy to learn more
and more. here is to https://kzxhzkjcxtzo.exblog.jp/33493284/
The very best exercise trackers are additionally versatile sufficient for anyy activity whether you’re jogging, swimming or sleeping.
Have a look at mmy blog post Rumus shio
https://exchange.switchcoin.us
Title: “Discover the Best Rates for Bitcoin, Ethereum, and More at Our Premier Crypto Exchange”
Are you looking to dive into the dynamic world of cryptocurrencies?
Whether you’re curious about the latest Bitcoin price, Ethereum price, or even the intriguing Dogecoin price, our crypto exchange
is your one-stop destination. We understand that the crypto
market can be overwhelming, which is why we’ve made it our mission to simplify your journey from
USD to crypto.
Bitcoin Price USD: Your Gateway to Crypto Investments
Bitcoin, the pioneer of cryptocurrencies, continues to be a market leader.
The Bitcoin price USD is a key indicator of market trends and investor sentiment.
By keeping a close eye on the Bitcoin price, both new and experienced traders can make informed decisions.
Ethereum Price: More Than Just a Digital Currency
Ethereum is more than just a cryptocurrency; it’s a platform for decentralized
applications. Monitoring the Ethereum price is
crucial for those interested in the broader scope of blockchain technology.
XRP Price: The Rapid and Scalable Crypto
XRP, often known for its association with Ripple, offers fast and efficient cross-border transactions.
The XRP price reflects its growing acceptance and utility
in the crypto ecosystem.
Dogecoin Price: From Meme to Mainstream
What started as a joke has now become a significant part of the crypto conversation. The Dogecoin price mirrors the community-driven spirit of cryptocurrencies, and it’s fascinating to watch its journey.
Price of Bitcoin: A Benchmark for Crypto Markets
The price of Bitcoin often sets the tone
for the entire crypto market. It’s a benchmark for investor confidence and market health.
By understanding the factors that influence the Bitcoin price,
traders can better navigate the market.
Crypto Exchange: Your Path to Cryptocurrency Trading
Our crypto exchange is designed to offer you a seamless and secure platform
for converting USD to various cryptocurrencies. We provide real-time data on Bitcoin, Ethereum, XRP,
Dogecoin, and more, ensuring you have the information you need to make smart trades.
Why Choose Our Exchange?
Real-Time Market Data: Stay updated with the
latest crypto prices.
User-Friendly Interface: Whether you’re a beginner or a pro, our platform is easy to use.
Secure Transactions: We prioritize your security and privacy.
Join our platform today and start your journey in the exciting world of
cryptocurrencies. With up-to-date information on Bitcoin price, Ethereum
price, and more, you’re poised to make the most of your crypto investments.
Hey very interesting blog! hyperlink best practices https://xvszmrltosr.exblog.jp/33503288/
https://switchcoin.us
Title: “Discover the Best Rates for Bitcoin, Ethereum, and More at Our Premier Crypto Exchange”
Are you looking to dive into the dynamic world of cryptocurrencies?
Whether you’re curious about the latest Bitcoin price, Ethereum price, or
even the intriguing Dogecoin price, our crypto exchange is your one-stop destination. We understand
that the crypto market can be overwhelming, which is why we’ve made it our mission to
simplify your journey from USD to crypto.
Bitcoin Price USD: Your Gateway to Crypto Investments
Bitcoin, the pioneer of cryptocurrencies, continues to be a market leader.
The Bitcoin price USD is a key indicator of market trends and investor sentiment.
By keeping a close eye on the Bitcoin price, both new and
experienced traders can make informed decisions.
Ethereum Price: More Than Just a Digital Currency
Ethereum is more than just a cryptocurrency; it’s a platform
for decentralized applications. Monitoring the Ethereum price
is crucial for those interested in the broader scope of blockchain technology.
XRP Price: The Rapid and Scalable Crypto
XRP, often known for its association with Ripple, offers fast and efficient cross-border
transactions. The XRP price reflects its growing acceptance
and utility in the crypto ecosystem.
Dogecoin Price: From Meme to Mainstream
What started as a joke has now become a significant part of the crypto conversation. The
Dogecoin price mirrors the community-driven spirit of cryptocurrencies, and it’s fascinating to watch its journey.
Price of Bitcoin: A Benchmark for Crypto Markets
The price of Bitcoin often sets the tone for the entire crypto market.
It’s a benchmark for investor confidence and market health.
By understanding the factors that influence the Bitcoin price,
traders can better navigate the market.
Crypto Exchange: Your Path to Cryptocurrency Trading
Our crypto exchange is designed to offer you a seamless
and secure platform for converting USD to various cryptocurrencies.
We provide real-time data on Bitcoin, Ethereum, XRP, Dogecoin, and more, ensuring you have the information you need to make smart trades.
Why Choose Our Exchange?
Real-Time Market Data: Stay updated with the latest crypto prices.
User-Friendly Interface: Whether you’re a beginner or a pro, our platform is
easy to use.
Secure Transactions: We prioritize your security and
privacy.
Join our platform today and start your journey in the exciting world of cryptocurrencies.
With up-to-date information on Bitcoin price, Ethereum price,
and more, you’re poised to make the most of your crypto investments.
Activate the jamming device and all GPS devices wiothin tthe vehicle,
together with your smartphone, shall be utterly disabled.
My website Toto slot