How to delete all data from Solr and HBase

How can I delete all data from Solr with a command? We are using Solr with Lily and HBase.

How can I delete the data from both HBase and Solr?

http://lucene.apache.org/solr/4_10_0/tutorial.html#deleting+data

If you want to clean up the Solr index,

you can send a request to this HTTP URL:

http://host:port/solr/[core name]/update?stream.body=<delete><query>*:*</query></delete>&commit=true

(Replace [core name] with the name of the core you want to delete from.) Alternatively, if you are posting XML data, use this as the body:

<delete><query>*:*</query></delete>

Make sure you use commit=true to commit the changes.

I don't have much idea about clearing the HBase data, though.

I have used this request to delete all of my records, but sometimes it is necessary to commit afterwards.

To do that, add &commit=true to the request:

http://host:port/solr/core/update?stream.body=<delete><query>*:*</query></delete>&commit=true

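If you want to delete all of the data in Solr via SolrJ, do something like this.

import java.io.IOException;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public static void deleteAllSolrData() {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8080/solr/core/");
    try {
        // "*:*" matches every document in the core.
        solr.deleteByQuery("*:*");
        // Without a commit the deleted documents stay visible to searches.
        solr.commit();
    } catch (SolrServerException | IOException e) {
        throw new RuntimeException("Failed to delete data in Solr. "
                + e.getMessage(), e);
    }
}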

If you want to delete all of the data in HBase do something like this.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

// Note: this drops the table itself, not only its rows.
public static void deleteHBaseTable(String tableName, Configuration conf) {
    // try-with-resources closes the admin connection when the block exits.
    try (HBaseAdmin admin = new HBaseAdmin(conf)) {
        // A table must be disabled before it can be deleted.
        admin.disableTable(tableName);
        admin.deleteTable(tableName);
    } catch (IOException e) {
        // MasterNotRunningException and ZooKeeperConnectionException are IOExceptions too.
        throw new RuntimeException("Unable to delete the table " + tableName
                + ". The actual exception is: " + e.getMessage(), e);
    }
}

When clearing out a Solr index, you should also do a commit and optimize after running the delete-all query. Full steps required (curl is all you need): http://www.alphadevx.com/a/365-Clearing-a-Solr-search-index

I came here looking to delete all documents from a Solr instance through the .NET framework using SolrNet. Here is how I was able to do it:

Startup.Init<MyEntity>("http://localhost:8081/solr");
ISolrOperations<MyEntity> solr =
    ServiceLocator.Current.GetInstance<ISolrOperations<MyEntity>>();
SolrQuery sq = new SolrQuery("*:*");
solr.Delete(sq);
solr.Commit();

This cleared out all of the documents.

Open this in the browser:

http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>&commit=true

This command deletes all documents from the Solr index.

If you need to clean out all data, it might be faster to recreate the collection, e.g.:

solrctl --zk localhost:2181/solr collection --delete <collectionName>
solrctl --zk localhost:2181/solr collection --create <collectionName> -s 1

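Use the "match all docs" query in a delete-by-query command: <delete><query>*:*</query></delete>

You must also commit after running the delete, so to empty the index, run the following two commands: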

curl http://localhost:8983/solr/update --data '<delete><query>*:*</query></delete>' -H 'Content-type:text/xml; charset=utf-8'


curl http://localhost:8983/solr/update --data '<commit/>' -H 'Content-type:text/xml; charset=utf-8'
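The same delete-then-commit sequence can also be issued from code. The following is a minimal sketch rather than a definitive implementation: it assumes Java 11 or later (for java.net.http) and the same update endpoint as the curl commands, and the class name SolrXmlUpdate and its xmlPost helper are hypothetical names introduced only for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper mirroring the two curl calls: delete everything, then commit. */
class SolrXmlUpdate {
    static final String DELETE_ALL = "<delete><query>*:*</query></delete>";
    static final String COMMIT = "<commit/>";

    /** Builds a POST request that sends one XML update message to the update URL. */
    static HttpRequest xmlPost(String updateUrl, String xml) {
        return HttpRequest.newBuilder(URI.create(updateUrl))
                .header("Content-Type", "text/xml; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(xml, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumed endpoint; pass your own update URL as the first argument.
        String updateUrl = args.length > 0 ? args[0] : "http://localhost:8983/solr/update";
        HttpClient client = HttpClient.newHttpClient();
        // Send the delete first, then the commit that makes it visible.
        for (String xml : new String[] {DELETE_ALL, COMMIT}) {
            HttpResponse<String> response =
                    client.send(xmlPost(updateUrl, xml), HttpResponse.BodyHandlers.ofString());
            System.out.println("HTTP " + response.statusCode() + " for " + xml);
        }
    }
}
```

Keeping the request construction in its own method means it can be checked without a running Solr server; only main touches the network.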

I made a JavaScript bookmarklet that adds a delete link to the Solr Admin UI:

javascript: (function() {
var str, $a, new_href, href, upd_str = 'update?stream.body=<delete><query>*:*</query></delete>&commit=true';
$a = $('#result a#url');
href = $a.attr('href');
str = href.match('.+solr\/.+\/(.*)')[1];
new_href = href.replace(str, upd_str);
$('#result').prepend('<a id="url_upd" class="address-bar" href="' + new_href + '"><strong>DELETE ALL</strong>   ' + new_href + '</a>');
})();

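You can use the following commands to delete. Use the "match all docs" query in a delete-by-query command:

<delete><query>*:*</query></delete>

You must also commit after running the delete, so to empty the index, run the following two commands: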

curl http://localhost:8983/solr/update --data '<delete><query>*:*</query></delete>' -H 'Content-type:text/xml; charset=utf-8'
curl http://localhost:8983/solr/update --data '<commit/>' -H 'Content-type:text/xml; charset=utf-8'

Another strategy is to add two bookmarks in your browser:

http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>
http://localhost:8983/solr/update?stream.body=<commit/>


Source docs from Solr:
https://wiki.apache.org/solr/faq#how_can_i_delete_all_documents_from_my_index.3f

I have used this query to delete all of my records.

http://host/solr/core-name/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E&commit=true
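The percent-encoded form of that URL does not have to be written by hand. As a rough sketch, assuming plain HTTP and the /solr/<core>/update path layout used in this thread, the multi-argument java.net.URI constructor escapes the characters that are illegal in a query string (here < and >). The class name DeleteAllUrl and its build method are hypothetical names made up for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper that builds the percent-encoded delete-all URL. */
class DeleteAllUrl {
    /** hostAndPort is for example "localhost:8983"; core is the Solr core name. */
    static String build(String hostAndPort, String core) throws URISyntaxException {
        String query = "stream.body=<delete><query>*:*</query></delete>&commit=true";
        // This constructor quotes characters that are not legal in a URI query,
        // so '<' becomes %3C and '>' becomes %3E, while '*', ':' and '/' are kept.
        URI uri = new URI("http", hostAndPort, "/solr/" + core + "/update", query, null);
        return uri.toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        System.out.println(build("host", "core-name"));
    }
}
```

Run with the placeholder values in main, this prints the same URL as the one shown above.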

The curl examples above all failed for me when I ran them from a Cygwin terminal. When I ran the script example, I got errors like this.

curl http://192.168.2.20:7773/solr/CORE1/update --data '<delete><query>*:*</query></delete>' -H 'Content-type:text/xml; charset=utf-8'
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int></lst>
</response>
<!--
It looks like it deleted stuff, but it did not go away
maybe because the committing call failed like so
-->
curl http://192.168.1.2:7773/solr/CORE1/update --data-binary '' -H 'Content-type:text/xml; charset=utf-8'
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">2</int></lst><lst name="error"><str name="msg">Unexpected EOF in prolog
at [row,col {unknown-source}]: [1,0]</str><int name="code">400</int></lst>
</response>

I needed to use the delete in a loop on core names to wipe them all out in a project.

The query below is what worked for me in a Cygwin terminal script.

curl http://192.168.1.2:7773/hpi/CORE1/update?stream.body=<delete><query>*:*</query></delete>&commit=true
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">1</int></lst>
</response>

This one line made the data disappear, and the change persisted.
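The loop over core names mentioned in this answer is not shown, so here is one way it might look. It is only a sketch under assumptions: Java 11 or later, plain HTTP, the /solr/<core>/update path, and placeholder host and core names; WipeCores and deleteAllUris are hypothetical names, not part of Solr.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that wipes several cores, one request per core. */
class WipeCores {
    /** Builds one delete-all-and-commit URI for each core name. */
    static List<URI> deleteAllUris(String hostAndPort, List<String> cores)
            throws URISyntaxException {
        String query = "stream.body=<delete><query>*:*</query></delete>&commit=true";
        List<URI> uris = new ArrayList<>();
        for (String core : cores) {
            // The URI constructor percent-encodes '<' and '>' in the query.
            uris.add(new URI("http", hostAndPort, "/solr/" + core + "/update", query, null));
        }
        return uris;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder host and core names; substitute your own.
        List<String> cores = List.of("CORE1", "CORE2");
        HttpClient client = HttpClient.newHttpClient();
        for (URI uri : deleteAllUris("192.168.1.2:7773", cores)) {
            HttpResponse<String> response = client.send(
                    HttpRequest.newBuilder(uri).GET().build(),
                    HttpResponse.BodyHandlers.ofString());
            System.out.println(uri.getPath() + " -> HTTP " + response.statusCode());
        }
    }
}
```

Because commit=true is part of each URL, every core is committed as soon as its delete has run, so no separate commit call is needed.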

If you are using Cloudera 5.x, the documentation here mentions that Lily also handles real-time updates and deletions:

Configuring the Lily HBase NRT Indexer Service for Use with Cloudera Search

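As HBase applies inserts, updates, and deletes to HBase table cells, the indexer keeps Solr consistent with the HBase table contents, using standard HBase replication.

I am not sure whether truncate 'hTable' is also supported.

Otherwise, you would create a trigger or service that clears the data from both Solr and HBase on a particular event or on any event.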

Post JSON data (for example with curl):

curl -X POST -H 'Content-Type: application/json' \
'http://<host>:<port>/solr/<core>/update?commit=true' \
-d '{ "delete": {"query":"*:*"} }'
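For callers that are not shell scripts, the same JSON request can be built in code. This is a hedged sketch, assuming Java 11 or later and plain HTTP; SolrJsonDelete and its deleteAll method are hypothetical names, and the host, port, and core values in main are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical helper that posts the JSON delete-all message with commit=true. */
class SolrJsonDelete {
    static final String DELETE_ALL_JSON = "{ \"delete\": {\"query\":\"*:*\"} }";

    /** Builds the POST request; nothing is sent until a client executes it. */
    static HttpRequest deleteAll(String host, int port, String core) {
        URI uri = URI.create("http://" + host + ":" + port + "/solr/" + core
                + "/update?commit=true");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(DELETE_ALL_JSON))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder host, port, and core name; substitute your own.
        HttpRequest request = deleteAll("localhost", 8983, "collection1");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP " + response.statusCode());
        System.out.println(response.body());
    }
}
```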

I am not sure about Solr, but you can delete all data from HBase using the truncate command, like this:

truncate 'table_name'

It will delete all row keys from the HBase table.

From the command line, use:

 bin/post -c core_name -type text/xml -out yes -d $'<delete><query>*:*</query></delete>'

I tried the steps below and they work well.

  • Please make sure your Solr server is running.
  • Just click the link "Delete all SOLR data", which sends the request and deletes all of your indexed Solr data. You will then see the following output on the screen.

    <response>
    <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">494</int>
    </lst>
    </response>
    
  • If you do not get the above output, please check the following.

    • I used the default host (localhost) and port (8080) in the above link. Please change the host and port if they are different on your end.
    • The default core name should be collection or collection1. I used collection1 in the above link. Please change it too if your core name is different.

To delete all documents of a Solr collection, you can use this request:

curl -X POST -H 'Content-Type: application/json' --data-binary '{"delete":{"query":"*:*" }}' http://localhost:8983/solr/my_collection/update?commit=true

It uses a JSON body.