如何模拟数据库进行测试(Java) ?

我用 Java 编程,我的应用程序大量使用 DB。因此,能够轻松地测试数据库使用情况对我来说很重要。
DB 测试是关于什么的? 对我来说,它们应该提供两个简单的要求:

  1. 验证 SQL 语法。
  2. 更重要的是,根据给定的情况检查数据是否被正确选择/更新/插入。

看来我只需要一具尸体。
但实际上,我不希望这样,因为使用 DB 进行测试没有什么困难:

  • “给自己弄个测试数据库,能有多难”在我工作的地方,私人测试数据库是不可能的。您必须使用“公共”数据库,这对每个人都是可访问的。
  • “这些测试肯定不快... ...”-DB 测试往往比通常的测试慢。进行缓慢的测试并不理想。
  • “这个程序应该处理任何情况!”- 它变得有点恼人,甚至不可能尝试和模拟每一个案件在数据库。对于每种情况,都应该执行一定数量的插入/更新查询,这很烦人,而且需要时间。
  • “等等,你怎么知道那张表里有542行?”测试的主要原则之一,就是能够以一种不同于测试代码的方式来测试功能。当使用数据库时,通常有一种方法来执行某些操作,因此测试与核心代码完全相同。

所以,当涉及到测试的时候,你可以发现我不喜欢 DBs (当然,在某些时候我将不得不这样做,但是我宁愿稍后在我的测试中,在我使用其余的测试方法发现大多数 bug 之后再去做)。但我要找什么?

我正在寻找一种方法来模拟数据库,模拟数据库,使用文件系统或只是虚拟内存。我认为也许有一个 Java 工具/包,允许每个测试简单地构造(使用代码接口)一个 DB 模拟,带有模拟的表和行,带有 SQL 验证,并带有监视其状态的代码接口(而不是使用 SQL)。

你熟悉这种工具吗?


编辑: 谢谢你的回答!虽然我要求一个工具,你也提供了一些关于这个问题的提示我:)这将花费我一些时间来检查你的报价,所以我现在不能说你的答案是否令人满意不是。

无论如何,这里有一个我正在寻找的更好的视图——假设一个名为 DBMonitor 的类,其特性之一是查找表中的行数。下面是我想要如何使用 JUnit 测试该特性的想象代码:

public class TestDBMonitor extends TestCase {


@Override
public void setUp() throws Exception {


MockConnection connection = new MockConnection();


this.tableName = "table1";
MockTable table = new MockTable(tableName);


String columnName = "column1";
ColumnType columnType = ColumnType.NUMBER;
int columnSize = 50;
MockColumn column = new MockColumn(columnName, columnType, columnSize);
table.addColumn(column);


for (int i = 0; i < 20; i++) {
HashMap<MockColumn, Object> fields = new HashMap<MockColumn, Object>();
fields.put(column, i);
table.addRow(fields);
}


this.connection = connection;
}


@Test
public void testGatherStatistics() throws Exception {


DBMonitor monitor = new DBMonitor(connection);
monitor.gatherStatistics();
assertEquals(((MockConnection) connection).getNumberOfRows(tableName),
monitor.getNumberOfRows(tableName));
}


String tableName;
Connection connection;
}

我希望这段代码足够清晰,能够理解我的想法(请原谅我的语法错误,我在没有亲爱的 Eclipse: P 的情况下手动输入)。

顺便说一下,我部分使用 ORM,我的原始 SQL 查询非常简单,不应该因为平台不同而有所不同。

113392 次浏览

I've used Hypersonic for this purpose. Basically, it's a JAR file (a pure Java in-memory database) that you can run in its own JVM or in your own JVM and while it's running, you have a database. Then you stop it and your database goes away. I've used it -- so far -- as a purely in-memory database. It's very simple to start and stop via Ant when running unit tests.

Java comes with Java DB.

That said, I would advise against using a different type of DB than what you use in production unless you go through an ORM layer. Otherwise, your SQL might not be as cross-platform as you think.

Also check out DbUnit

Well to begin with ,are you using any ORM Layer for DB access?
If not : then what you are thinking would be of no use.What's the use of testing when you are not sure that SQL you are firing will work with your DB in production as in test cases you are using something else.
If yes:Then you can have look at various options pointed out.

We're creating a database test environment at work right now. We feel we must use a real database management system with simulated data. One problem with a simulated DBMS is that SQL never really totally gelled as a standard, so an artificial testing environment would have to faithfully support our production database's dialect. Another problem is that we make extensive use of column value constraints, foreign key constraints, and unique constraints, and since an artificial tool probably wouldn't implement these, our unit tests could pass but our system tests would fail when they first hit the real constraints. If tests take too long, this indicates an implementation error and we would tune our queries (typically test data sets are miniscule compared to production).

We've installed a real DBMS on each developer machine and on our continuous integration and test server (we use Hudson). I don't know what your work policy restrictions are, but it's pretty easy to install and use PostgreSQL, MySQL, and Oracle XE. These are all free for development use (even Oracle XE), so there's no rational reason to prohibit their use.

The key issue is how do you guarantee that your tests always start out with the database in a consistent state? If the tests were all read-only, no problem. If you could engineer mutating tests to always run in transactions that never commit, no problem. But typically you need to worry about reversing updates. To do this you can export the initial state to a file, then importing it back post-test (Oracle's exp and imp shell commands do this). Or you can use a checkpoint/rollback. But a more elegant way is to use a tool like dbunit, which works well for us.

The key advantage to this is that we catch many more bugs up front where they're far easier to fix and our real system testing doesn't get blocked while developers feverishly try to debug problems. This means we produce better code faster and with less effort.

We recently switched to JavaDB or Derby to implement this. Derby 10.5.1.1 now implements an in-memory representation so it runs very fast, it doesn't need to go to disk: Derby In Memory Primer

We design our application to run on Oracle, PostgreSQL and Derby so we don't get too far down the road on any one platform before finding out that one database supports a feature that other ones don't.

If you are using Oracle at work you can use the Restore Point in Flashback Database feature to make the database return to a time before your tests. This will clear away any changes you personally made to the DB.

See:

https://docs.oracle.com/cd/E11882_01/backup.112/e10642/flashdb.htm#BRADV71000

If you need a test database for use with Oracle production/work then lookup the XE, express edition database from Oracle. This is free for personal use, with a limit of database less than 2gb in size.

There are lots of points of view on how to test integration points such as the Database connection via SQL. My personal set of rules that has worked well for me is as follows:

1) Separate out the Database accessing logic and functions from general business logic and hide it behind an interface. Reason: In order to test the grand majority of logic in the system it is best to use a dummy/stub in place of the actual database as its simpler. Reason 2: It is dramatically faster

2) Treat tests for the database as integration tests that are separated from the main body of unit tests and need to run on a setup database Reason: Speed and quality of tests

3) Every developer will need their own distinct database. They will need an automated way to update its structure based on changes from their team mates and introduce data. See points 4 and 5.

4) Use a tool like http://www.liquibase.org to manage upgrades in your databases structure. Reason: Gives you agility in the ability to change the existing structure and move forward in versions

5) Use a tool like http://dbunit.sourceforge.net/ to manage the data. Set up scenario files (xml or XLS) for particular test cases and base data and only clear down what is needed for any one test case. Reason: Much better than manually inserting and deleting data Reason 2: Easier for testers to understand how to adjust scenarios Reason 3: Its quicker to execute this

6) You need functional tests which also have DBUnit like scenario data, but this are far larger sets of data and execute the entire system. This completes the step of combining the knowledge that a) The unit tests run and hence the logic is sound b) That the integration tests to the database run and SQL is correct resulting in "and the system as a whole works together as a top to bottom stack"

This combination has served me well so far for achieving a high quality of testing and product as well as maintaining speed of unit test development and agility to change.

"Just get yourself a testing DB, how hard could it be?" - Well, in my working place, to have a personal testing DB is pretty impossible. You have to use a "public" DB, which is accessible for everyone.

Sounds like you've got cultural problems at work that are providing a barrier to you being able to do your job to the fullest of your abilities and the benefit of your product. You might want to do something about that.

On the other hand, if your database schema is under version control then you could always have a test build that creates a database from the schema, populates it with test data, runs your tests, gathers the results and then drops the database. It'd only be in existence for the duration of the tests. It can be a new database on an existing installation if hardware is a problem. This is similar to what we do where I work.

I agree with banjollity. Setting up isolated development and test environments should be a high priority. Every database system I've used is either open source or has a free developer edition you can install on your local workstation. This lets you develop against the same database dialect as production, gives you full admin access to development databases and is faster than using a remote server.

Try to use derby. It is easy and portable. With Hibernate your app becomes flexible. Test on derby, production on anything you like and trust.

I think my Acolyte framework can be used for such DB mock up: https://github.com/cchantep/acolyte .

It allows to run existing Java (for testing) with connections you man query/update handling: returning appropriate resultsets, update count or warning according execution cases.

new answer to old question (but things have moved forward a bit):

How to simulate a DB for testing (Java)?

you don't simulate it. you mock your repositiories and you don't test them or you use the same db in your tests and you test your sqls. All the in-memory dbs are not fully compatible so they won't give you full coverage and reliability. and never ever try to mock/simulate the deep db objects like connection, result set etc. it gives you no value at all and is a nightmare to develop and maintain

to have a personal testing DB is pretty impossible. You have to use a "public" DB, which is accessible for everyone

unfortunately a lot of companies still use that model but now we have docker and there are images for almost every db. commercial products have some limitations (like up to a few gb of data) that are non-important for tests. also you need your schema and structure to be created on this local db

"These tests sure ain't fast..." - DB tests tend to be slower than usual tests. It's really not ideal to have slow tests.

yes, db tests are slower but they are not that slow. I did some simple measurements and a typical test took 5-50ms. what takes time is the application startup. there are plenty of ways to speed this up:

  • first DI frameworks (like spring) offers a way run only some part of your application. if you write your application with a good separation of db and non-db related logic, then in your test you can start only the db part
  • each db have plenty of tuning options that makes it less durable and much faster. that's perfect for testing. postgres example
  • you can also put the entire db into tmpfs

  • another helpful strategy is to have groups of tests and keep db tests turned off by default (if they really slows your build). this way if someone is actually working on db, he needs to pass additional flag in the cmd line or use IDE (testng groups and custom test selectors are perfect for this)

For each case a certain amount of insert/update queries should be made, which is annoying and takes time

'takes time' part was discussed above. is it annoying? I've seen two ways:

  • prepare one dataset for your all test cases. then you have to maintain it and reason about it. usually it's separated from code. it has kilobytes or megabytes. it's to big to see on one screen, to comprehend and to reason about. it introduces coupling between tests. because when you need more rows for test A, your count(*) in test B fails. it only grows because even when you delete some tests, you don't know which rows were used only by this one test
  • each tests prepares its data. this way each test is completely independent, readable and easy to reason about. is it annoying? imo, not at all! it let you write new tests very quickly and saves you a lot of work in future

how do you know there are 542 rows in that table?" - One of the main principles in testing, is to be able to test the functionality in a way different from that of your tested-code

uhm... not really. the main principle is to check if your software generates desired output in response to specific input. so if you call dao.insert 542 times and then your dao.count returns 542, it means your software works as specified. if you want, you can call commit/drop cache in between. Of course, sometimes you want to test your implementation instead of the contract and then you check if your dao changed the state of the db. but you always test sql A using sql B (insert vs select, sequence next_val vs returned value etc). yes, you'll always have the problem 'who will test my tests', and the answer is: no one, so keep them simple!

other tools that may help you:

  1. testcontainers will help you provide real db.

  2. dbunit - will help you clean the data between tests

    cons:

    • a lot of work is required to create and maintain schema and data. especially when your project is in a intensive development stage.
    • it's another abstraction layer so if suddenly you want to use some db feature that is unsupported by this tool, it may be difficult to test it
  3. testegration - intents to provide you full, ready to use and extensible lifecycle (disclosure: i'm a creator).

    cons:

    • free only for small projects
    • very young project
  4. flyway or liquibase - db migration tools. they help you easily create schema and all the structures on your local db for tests.

You could HSQLDB for in memory db testing. Starting the in memory data base and running tests on it is pretty straightforward.
http://hsqldb.org/

jOOQ is a tool that apart from offering SQL abstraction also has small tools built in such as an SPI that allows for mocking the entirety of JDBC. This can work in two ways as documented in this blog post:

By implementing the MockDataProvider SPI:

// context contains the SQL string and bind variables, etc.
MockDataProvider provider = context -> {


// This defines the update counts, result sets, etc.
// depending on the context above.
return new MockResult[] { ... }
};

In the above implementation, you can programmatically intercept every SQL statement and return a result for it, even dynamically by "parsing" the SQL string to extract some predicates / table information, etc.

By using the simpler (but less powerful) MockFileDatabase

... which has a format like the following (a set of statement / result pairs):

select first_name, last_name from actor;
> first_name last_name
> ---------- ---------
> GINA       DEGENERES
> WALTER     TORN
> MARY       KEITEL
@ rows: 3

The above file can then be read and consumed as follows:

import static java.lang.System.out;
import java.sql.*;
import org.jooq.tools.jdbc.*;


public class Mocking {
public static void main(String[] args) throws Exception {
MockDataProvider db = new MockFileDatabase(
Mocking.class.getResourceAsStream("/mocking.txt");


try (Connection c = new MockConnection(db));
Statement s = c.createStatement()) {


out.println("Actors:");
out.println("-------");
try (ResultSet rs = s.executeQuery(
"select first_name, last_name from actor")) {
while (rs.next())
out.println(rs.getString(1)
+ " " + rs.getString(2));
}
}
}
}

Notice how we're using the JDBC API directly, without actually connecting to any database.

Do note, I work for the vendor of jOOQ so this answer is biased.

Beware, at some point, you're implementing an entire database

The above works for simple cases. But beware that, eventually, you will be implementing an entire database. You want:

  1. Verify SQL syntax.

OK, by mocking the database as shown above, you can "verify" syntax, because each syntax that you haven't foreseen in the exact version as listed above will be rejected by any such mocking approach.

You could implement a parser that parses SQL (or, again, use jOOQ's), and then transform the SQL statement into something you can more easily recognise and produce a result for. But ultimately, this just means implementing an entire database.

  1. More importantly, check that the data is selected/updated/inserted correctly, according to a given situation.

This makes things even harder. If you run an insert and then update, the result is obviously different from update first, then insert, as the update may or may not affect the inserted row.

How do you make sure this happens when "mocking" a database? You need a state machine that remembers the state of each "mocked" table. In other words, you'll implement a database.

Mocking will only take you this far

As piotrek mentioned, too, mocking will only take you this far. It is useful in simple cases when you need to intercept only a few very well known queries. It is impossible, if you want to mock the database for an entire system. In that case, use an actual database, ideally the same product that you're using in production.

H2 Database

"many Java developers’ favourite integration test database."

(Hypersonic became HSQLDB which was rewritten as H2)

https://blog.jooq.org/2015/08/18/jooq-tuesdays-thomas-muller-unveils-how-hsqldb-evolved-into-the-popular-h2-database/