I'm looking into NoSQL as a scaling alternative to a relational database. What do I do if I need transactions for operations that are sensitive to consistency?
It depends on your database, but in general you can use optimistic transactions to achieve this. Make sure you understand the atomicity guarantees of the database implementation (e.g. which read and write operations are atomic).
Generally speaking, NoSQL solutions have lighter weight transactional semantics than relational databases, but still have facilities for atomic operations at some level.
Generally, the ones which do master-master replication provide less in the way of consistency, and more availability. So one should choose the right tool for the right problem.
Many offer transactions at the single-document (or row, etc.) level. For example, MongoDB provides atomicity at the single-document level; documents can be fairly rich, so this usually works pretty well. More info here.
You can always use a NoSQL approach in a SQL DB. NoSQL generally seems to use "key/value data stores": you can always implement this in your preferred RDBMS and hence keep the good stuff like transactions, ACID properties, support from your friendly DBA, etc., while realising the NoSQL performance and flexibility benefits, e.g. via a simple table with an ID column as the key and a BLOB (or TEXT) column holding the content.
Bonus is you can add extra fields here to link your content into other, properly relational tables, while still keeping your bulky content in the main BLOB (or TEXT if apt) field.
Personally I favour a TEXT representation so you're not tied to one language for working with the data; e.g. storing serialized Java objects would make it hard to access the content from Perl for reporting, say. TEXT is also easier to debug and generally work with as a developer.
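A key/value table of the kind described above might look like the following minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) and JSON as the TEXT representation; the table name kv_store, the column names and the put/get helpers are hypothetical and only for illustration.

```python
import json
import sqlite3

# In-memory database for the sketch; a real setup would use your own RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE kv_store (
        id       TEXT PRIMARY KEY,   -- the key
        owner_id INTEGER,            -- extra field that can link to relational tables
        content  TEXT NOT NULL       -- the bulky value, stored as JSON text
    )
    """
)


def put(key, value, owner_id=None):
    """Insert or overwrite a value; the write runs inside a transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR REPLACE INTO kv_store (id, owner_id, content) VALUES (?, ?, ?)",
            (key, owner_id, json.dumps(value)),
        )


def get(key):
    """Return the decoded value for key, or None if the key is absent."""
    row = conn.execute(
        "SELECT content FROM kv_store WHERE id = ?", (key,)
    ).fetchone()
    return json.loads(row[0]) if row else None


put("user:42:profile", {"name": "Timmy", "tags": ["a", "b"]}, owner_id=42)
profile = get("user:42:profile")
```

Because the content is plain JSON text, any language with a JSON parser can read it, which is the portability argument made above.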
This is the closest answer I found that would apply to any NoSQL database. It's from a 2007 blog post by Adam Wiggins of Heroku.com:
The old example of using a database transaction to wrap the transfer of money from one bank account to another is total bull. The correct solution is to store a list of ledger events (transfers between accounts) and show the current balance as a sum of the ledger. If you’re programming in a functional language (or thinking that way), this is obvious.
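The ledger idea in the quote can be sketched in a few lines. This is only an illustration under my own assumptions: the ledger is an in-memory list, and the Transfer type and the record_transfer/balance helpers are hypothetical names, not something from the blog post.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Transfer:
    """One immutable ledger event: money moving from src to dst."""
    src: str
    dst: str
    amount: Decimal


# Append-only list of events; balances are never stored, only derived.
ledger = []


def record_transfer(src, dst, amount):
    """Append a single event; this is the only write operation."""
    ledger.append(Transfer(src, dst, Decimal(amount)))


def balance(account):
    """Current balance = sum of incoming minus sum of outgoing transfers."""
    total = Decimal("0")
    for t in ledger:
        if t.dst == account:
            total += t.amount
        if t.src == account:
            total -= t.amount
    return total


record_transfer("bank", "alice", "100")
record_transfer("alice", "bob", "30")
alice_balance = balance("alice")  # Decimal("70")
```

Each transfer is a single append, so there is no pair of rows that has to be updated together.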
The client (aka member or customer) follows these steps to take out money:
1. Client submits a request to take out money.
2. Request is sent to the server.
3. Server places it in a queue. The message is: "Take out $5,000."
4. Client is shown: "Please wait as request is being fulfilled..."
5. Client machine polls the server every 2 seconds asking, "Has the request been fulfilled?"
6. On the server, background workers are fulfilling previous requests from other members in first-in/first-out fashion. Eventually, they get to your client's request to take out money.
7. Once the request has been fulfilled, the client is given a message with their new balance.
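The steps above can be sketched roughly as follows. This is a single-process, in-memory stand-in, assuming one background worker; the names submit_withdrawal, poll and run_worker_once, as well as the starting balance, are made up for the example.

```python
import queue
import uuid

requests = queue.Queue()     # FIFO queue of pending withdrawal requests
results = {}                 # request_id -> outcome, filled in by the worker
balances = {"alice": 8000}   # hypothetical starting balance


def submit_withdrawal(account, amount):
    """Client side: enqueue the request and get an id to poll with."""
    request_id = str(uuid.uuid4())
    requests.put((request_id, account, amount))
    return request_id


def poll(request_id):
    """Client side: None means 'not fulfilled yet'."""
    return results.get(request_id)


def run_worker_once():
    """Server side: fulfil the oldest pending request."""
    request_id, account, amount = requests.get()
    ok = balances[account] >= amount
    if ok:
        balances[account] -= amount
    results[request_id] = {"ok": ok, "balance": balances[account]}
    requests.task_done()


rid = submit_withdrawal("alice", 5000)
pending = poll(rid)      # None: the worker has not reached the request yet
run_worker_once()
outcome = poll(rid)      # {"ok": True, "balance": 3000}
```

Since a single worker handles requests one at a time, two withdrawals against the same account can never interleave, which is what stands in for the transaction here.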
You can use Heroku.com to create a small mock-up quickly if you are comfortable with Node.js or Ruby/Rack.
The general idea seems pretty easy and much better than using transactions baked into the database that make it super-hard to scale.
Disclaimer: I haven't implemented this in any way yet. I read about these things out of curiosity even though I have no practical need for them. Yes, @gbn is right that an RDBMS with transactions would probably be sufficient for the needs of Timmy and me. Nevertheless, it would be fun to see how far you can take NoSQL databases with open-source tools and a how-to website called "A Tornado of Razorblades".
Just wanted to comment on the money transaction advice in this thread. Transactions are something you really want to use with money transfers.
The example given of how to queue the transfers is very nice and tidy.
But in real life, transferring money may include fees or payments to other accounts. People get bonuses for using certain cards, paid from another account, or they may have fees taken from their account and moved to another account in the same system. The fees or payments can vary by financial transaction, and you may need to keep a bookkeeping system that shows the credit and debit of each transaction as it comes in.
This means you want to update more than one row at the same time, since a credit on one account can be a debit on one or more other accounts. First you lock the rows so nothing can change before the update, then you make sure the data written is consistent with the transaction.
That's why you really want to use transactions. If anything goes wrong writing to one row, you can roll back the whole bunch of updates without the financial transaction data ending up inconsistent.
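As a small illustration of that rollback behaviour, here is a sketch using SQLite through Python's sqlite3 module. The accounts table, the separate "fees" account and the transfer_with_fee helper are assumptions made for the example; a real system would add row locking and a proper double-entry journal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # no overdrafts allowed
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("alice", 100), ("bob", 0), ("fees", 0)],
)
conn.commit()


def transfer_with_fee(src, dst, amount, fee):
    """Update three rows as one unit; any failure undoes all of them."""
    try:
        with conn:  # commits if the block succeeds, rolls back on exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'fees'",
                (fee,),
            )
            # This debit violates the CHECK constraint if funds are short.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount + fee, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))


first = transfer_with_fee("alice", "bob", 50, 2)    # succeeds
second = transfer_with_fee("alice", "bob", 100, 2)  # fails, nothing is applied
```

In the second call the two credits have already been written when the debit fails, yet after the rollback none of the three rows has changed.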
The problem with one transaction made of two operations (for example, one pays $5,000 and the other receives $5,000) is that the two accounts have the same priority: you cannot use one account to confirm the other (in either order). In this case you can guarantee that only one account is correct (the one that is confirmed); the write to the other (the one that confirms) may fail. Let's look at why it can fail, using a message approach in which the sender is confirmed by the receiver:
1. Write +$5,000 to the receiver's account.
2. If that succeeds, write -$5,000 to the sender's account.
3. If it fails, try again, cancel, or show a message.
This guarantees that #1 is saved. But what guarantees the outcome if #2 fails? The same applies in the reverse order.
But it is possible to implement this safely without transactions and with NoSQL. You can always use a third entity that is confirmed from both the sender and the receiver side and guarantees that the operation was performed:
1. Generate a unique transaction id and create a transaction entity.
2. Write +$5,000 to the receiver's account (with a reference to the transaction id).
3. If that succeeds, set the state of the transaction to sent.
4. Write -$5,000 to the sender's account (with a reference to the transaction id).
5. If that succeeds, set the state of the transaction to received.
This transaction record guarantees that both the send and the receive messages went through. You can now check every message by its transaction id, and only if the transaction has the state received (or completed) do you count it toward the user's balance.
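A rough sketch of this scheme, with plain Python dicts and lists standing in for NoSQL collections. The collection names, the "created" initial state and the fail_before_debit switch (which simulates a crash between the two writes) are my own assumptions for illustration.

```python
import uuid

transactions = {}  # txn_id -> {"state": "created" | "sent" | "received"}
entries = []       # account messages: (account, amount, txn_id)


def transfer(sender, receiver, amount, fail_before_debit=False):
    """Run the steps above; optionally stop before the sender is debited."""
    txn_id = str(uuid.uuid4())
    transactions[txn_id] = {"state": "created"}
    entries.append((receiver, +amount, txn_id))   # credit the receiver
    transactions[txn_id]["state"] = "sent"
    if fail_before_debit:
        return txn_id                             # simulated crash
    entries.append((sender, -amount, txn_id))     # debit the sender
    transactions[txn_id]["state"] = "received"
    return txn_id


def balance(account):
    """Count only messages whose transaction reached the final state."""
    return sum(
        amount
        for acct, amount, txn_id in entries
        if acct == account and transactions[txn_id]["state"] == "received"
    )


done = transfer("alice", "bob", 5000)
stuck = transfer("alice", "bob", 700, fail_before_debit=True)
bob_balance = balance("bob")      # 5000: the half-finished transfer is ignored
alice_balance = balance("alice")  # -5000
```

The half-finished transfer leaves a credit message behind, but it never shows up in a balance; a cleanup or retry job could later look for transactions stuck in the sent state.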
That's why I'm creating a NoSQL document store that lets you use "real" transactions in enterprise applications while keeping the power of an unstructured data approach. Take a look at http://djondb.com and feel free to suggest any feature you think could be useful.
NoSQL covers a diverse set of tools and services, including key-value, document, graph and wide-column stores. They usually try to improve the scalability of the data store, typically by distributing data processing.
Transactions require the database to provide ACID properties for user operations. ACID restricts how scalability can be improved: most NoSQL tools relax the consistency criteria of operations to gain fault tolerance and availability for scaling, which makes implementing ACID transactions very hard.
A commonly cited piece of theoretical reasoning about distributed data stores is the CAP theorem: consistency, availability and partition tolerance cannot all be guaranteed at the same time. SQL, NoSQL and NewSQL tools can be classified according to what they give up; a good figure might be found here.
A newer, weaker set of requirements replacing ACID is BASE ("basically available, soft state, eventual consistency"). However, eventually consistent tools ("eventually all accesses to an item will return the last updated value") are hardly acceptable in transactional applications like banking. Here a good idea would be to use in-memory, column-oriented and distributed SQL/ACID databases, for example VoltDB; I suggest looking at these "NewSQL" solutions.
You can implement optimistic transactions on top of a NoSQL solution if it supports compare-and-set. I wrote an example, with some explanation, on a GitHub page showing how to do it in MongoDB, but you can repeat it in any suitable NoSQL solution.
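To show the general shape of the technique (this is not the MongoDB example mentioned above), here is a minimal in-memory sketch. The VersionedStore class and the optimistic_update helper are hypothetical; the lock only stands in for the per-key atomicity that a real store's compare-and-set would provide.

```python
import threading


class VersionedStore:
    """Toy key/value store where every value carries a version number."""

    def __init__(self):
        self._data = {}                # key -> (version, value)
        self._lock = threading.Lock()  # stand-in for per-key atomicity

    def get(self, key):
        with self._lock:
            return self._data.get(key, (0, None))

    def compare_and_set(self, key, expected_version, new_value):
        """Write only if nobody else has written since we read."""
        with self._lock:
            version, _ = self._data.get(key, (0, None))
            if version != expected_version:
                return False
            self._data[key] = (version + 1, new_value)
            return True


def optimistic_update(store, key, fn, max_retries=10):
    """Read, compute, try to write; on a conflict re-read and retry."""
    for _ in range(max_retries):
        version, value = store.get(key)
        if store.compare_and_set(key, version, fn(value)):
            return True
    return False


store = VersionedStore()
updated = optimistic_update(store, "balance", lambda v: (v or 0) + 100)
stale = store.compare_and_set("balance", 0, 999)  # False: version is now 1
```

No lock is held between the read and the write; a conflicting writer simply makes the compare-and-set fail and the update is retried against the fresh value.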