将 mongo 查询的输出重定向到一个 csv 文件

我在32位 Windows7机器上使用 MongoDB2.2.2。中有一个复杂的聚合查询。Js 文件。我需要在 shell 上执行这个文件,并将输出指向一个 CSV 文件。我确保查询返回一个“扁平的”json (没有嵌套的键) ,因此它本质上可以转换为一个整洁的 csv。

I know about load() and eval(). eval() requires me to paste the whole query into the shell and allows only printjson() inside the script, while I need csv. And, the second way: load()..It prints the output on the screen, and again in json format.

有没有一种方法 Mongo 可以实现从 json 到 csv 的转换?(我需要 csv 文件准备图表的数据)。我在想: < br > < br > 1. 任何一个 mongo 都有一个我现在找不到的内置命令。
2.Mongo 无法为我做到这一点; 我最多只能将 json 输出发送到一个文件,然后我需要自己将其转换为 csv。
3.Mongo 可以将 json 输出发送到一个临时集合,其内容可以很容易地从 mongoexported转换为 csv 格式。但我认为只有 map-reduce 查询支持输出集合。是吗?我需要它来进行聚合查询。

谢谢你的帮助:)

141744 次浏览

Have a look at this

for outputing from mongo shell to file. There is no support for outputing csv from mongos shell. You would have to write the javascript yourself or use one of the many converters available. Google "convert json to csv" for example.

I know this question is old but I spend an hour trying to export a complex query to csv and I wanted to share my thoughts. First I couldn't get any of the json to csv converters to work (although this one looked promising). What I ended up doing was manually writing the csv file in my mongo script.

This is a simple version but essentially what I did:

print("name,id,email");
db.User.find().forEach(function(user){
print(user.name+","+user._id.valueOf()+","+user.email);
});

This I just piped the query to stdout

mongo test export.js > out.csv

where test is the name of the database I use.

Here is what you can try:

print("id,name,startDate")
cursor = db.<collection_name>.find();
while (cursor.hasNext()) {
jsonObject = cursor.next();
print(jsonObject._id.valueOf() + "," + jsonObject.name + ",\"" + jsonObject.stateDate.toUTCString() +"\"")


}

Save that in a file, say "export.js". Run the following command:

mongo <host>/<dbname> -u <username> -p <password> export.js > out.csv

Mongo's in-built export is working fine, unless you want to any data manipulation like format date, covert data types etc.

Following command works as charm.

    mongoexport -h localhost -d databse -c collection --type=csv
--fields erpNum,orderId,time,status
-q '{"time":{"$gt":1438275600000}, "status":{"$ne" :"Cancelled"}}'
--out report.csv

Extending other answers:

I found @GEverding's answer most flexible. It also works with aggregation:

test_db.js

print("name,email");


db.users.aggregate([
{ $match: {} }
]).forEach(function(user) {
print(user.name+","+user.email);
}
});

Execute the following command to export results:

mongo test_db < ./test_db.js >> ./test_db.csv

Unfortunately, it adds additional text to the CSV file which requires processing the file before we can use it:

MongoDB shell version: 3.2.10
connecting to: test_db

But we can make mongo shell stop spitting out those comments and only print what we have asked for by passing the --quiet flag

mongo --quiet test_db < ./test_db.js >> ./test_db.csv

Just weighing in here with a nice solution I have been using. This is similar to Lucky Soni's solution above in that it supports aggregation, but doesn't require hard coding of the field names.

cursor = db.<collection_name>.<my_query_with_aggregation>;


headerPrinted = false;
while (cursor.hasNext()) {
item = cursor.next();
    

if (!headerPrinted) {
print(Object.keys(item).join(','));
headerPrinted = true;
}


line = Object
.keys(item)
.map(function(prop) {
return '"' + item[prop] + '"';
})
.join(',');
print(line);
}

Save this as a .js file, in this case we'll call it example.js and run it with the mongo command line like so:

mongo <database_name> example.js --quiet > example.csv

I use the following technique. It makes it easy to keep the column names in sync with the content:

var cursor = db.getCollection('Employees.Details').find({})


var header = []
var rows = []


var firstRow = true
cursor.forEach((doc) =>
{
var cells = []
    

if (firstRow) header.push("employee_number")
cells.push(doc.EmpNum.valueOf())


if (firstRow) header.push("name")
cells.push(doc.FullName.valueOf())


if (firstRow) header.push("dob")
cells.push(doc.DateOfBirth.valueOf())
    

row = cells.join(',')
rows.push(row)


firstRow =  false
})


print(header.join(','))
print(rows.join('\n'))

When executing a script in a remote server. Mongo will add its own logging output, which we might want to omit from our file. --quiet option will only disable connection related logs. Not all mongo logs. In such case we might need to filter out unneeded lines manually. A Windows based example:

mongo dbname --username userName --password password --host replicaset/ip:port --quiet printDataToCsv.js | findstr /v "NETWORK" > data.csv

This will pipe the script output and use findstr to filter out any lines, which have NETWORK string in them. More information on findstr: https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/findstr

A Linux version of this would use grep.