nordlys.core.retrieval.elastic_cache module

Elastic Cache

This is a cache for Elastic index statistics: a layer between an index and retrieval. Statistics such as document and term frequencies are read from the index once and then kept in memory for subsequent use.
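The caching idea described above can be illustrated with a minimal sketch. This is not the nordlys implementation: `CachedStats` and `CountingBackend` are hypothetical stand-ins, and only a single statistic (`doc_freq`) is shown. The point is that the backing index is queried once per key and later requests are served from memory.

```python
class CachedStats:
    """Hypothetical illustration of the caching layer; not the nordlys code."""

    def __init__(self, backend):
        self._backend = backend  # any object offering doc_freq(term, field)
        self._doc_freq = {}      # (term, field) -> cached value

    def doc_freq(self, term, field):
        key = (term, field)
        if key not in self._doc_freq:
            # First request for this key: ask the underlying index.
            self._doc_freq[key] = self._backend.doc_freq(term, field)
        # Later requests are answered from memory only.
        return self._doc_freq[key]


class CountingBackend:
    """Toy backend that records how often it is queried."""

    def __init__(self):
        self.calls = 0

    def doc_freq(self, term, field):
        self.calls += 1
        return 42  # placeholder value, not real index data


backend = CountingBackend()
stats = CachedStats(backend)
stats.doc_freq("oslo", "content")
stats.doc_freq("oslo", "content")
print(backend.calls)  # 1
```

The memory cost grows with the number of distinct keys requested, which is why the cache may eventually need to be discarded.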

Usage hints

  • Only one instance of ElasticCache needs to be created.
  • If you run out of memory, create a new ElasticCache object to start again with an empty cache.
  • The class also caches term vectors. To further boost efficiency, load term vectors for multiple documents at once using ElasticCache.multi_termvector().

Author: Faegheh Hasibi
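The usage hints above can be sketched as a small helper module. The names `get_cache`, `reset_cache` and `warm_up` are hypothetical and not part of nordlys; the `factory` argument stands in for the `ElasticCache` class so that the sketch runs without an Elasticsearch server. Only the documented constructor and `multi_termvector(doc_ids, field)` signatures are assumed.

```python
_shared_cache = None  # the single, process-wide cache instance


def get_cache(index_name, factory):
    """Return the shared cache, creating it on first use."""
    global _shared_cache
    if _shared_cache is None:
        _shared_cache = factory(index_name)
    return _shared_cache


def reset_cache(index_name, factory):
    """Replace the shared cache with a fresh one, e.g. when memory runs low."""
    global _shared_cache
    _shared_cache = factory(index_name)
    return _shared_cache


def warm_up(cache, doc_ids, field):
    """Load term vectors for many documents in a single call."""
    return cache.multi_termvector(doc_ids, field)


# With nordlys installed, the factory would be the real class:
#   from nordlys.core.retrieval.elastic_cache import ElasticCache
#   cache = get_cache("my_index", ElasticCache)  # "my_index" is a placeholder
#   warm_up(cache, ["doc1", "doc2"], "content")  # placeholder IDs and field
```

Dropping the last reference to the old object lets Python reclaim the memory held by its cached statistics.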
class nordlys.core.retrieval.elastic_cache.ElasticCache(index_name)

Bases: nordlys.core.retrieval.elastic.Elastic

avg_len(field)

Returns average length of a field in the collection.

coll_length(field)

Returns the length of a field in the collection.

coll_term_freq(term, field, tv=None)

Returns collection term frequency for the given term and field.

doc_count(field)

Returns number of documents with at least one term for the given field.

doc_freq(term, field, tv=None)

Returns document frequency for the given term and field.

doc_length(doc_id, field)

Returns length of a field in a document.

multi_termvector(doc_ids, field, batch=50)

Returns term vectors for the given documents and field.
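The `batch` parameter suggests that document IDs are processed in chunks rather than in one large request. The sketch below shows one way such chunking could work; it assumes that `batch` is the maximum number of documents per chunk, which the source does not state explicitly. `iter_batches` is a hypothetical helper, not a nordlys function.

```python
def iter_batches(doc_ids, batch=50):
    """Yield consecutive slices of doc_ids with at most `batch` items each."""
    if batch < 1:
        raise ValueError("batch must be a positive integer")
    for start in range(0, len(doc_ids), batch):
        yield doc_ids[start:start + batch]


doc_ids = ["doc%d" % i for i in range(120)]  # placeholder document IDs
sizes = [len(chunk) for chunk in iter_batches(doc_ids)]
print(sizes)  # [50, 50, 20]
```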

num_docs()

Returns the number of documents in the index.

num_fields()

Returns number of fields in the index.

term_freq(doc_id, field, term)

Returns frequency of a term in a given document and field.

term_freqs(doc_id, field, tv=None)

Returns term frequencies for a given document and field.
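As an illustration of how a retrieval component might consume these statistics, the sketch below computes a Jelinek-Mercer smoothed term probability from `term_freq`, `doc_length`, `coll_term_freq` and `coll_length`. The scoring function and the `FakeCache` stand-in are hypothetical and use invented numbers; with nordlys installed, an `ElasticCache` instance could presumably be passed in place of `FakeCache`, assuming its methods return plain numbers.

```python
class FakeCache:
    """Toy stand-in exposing the documented method names with made-up data."""

    def term_freq(self, doc_id, field, term):
        return 2    # invented: the term occurs twice in the document field

    def doc_length(self, doc_id, field):
        return 10   # invented: the document field has 10 terms

    def coll_term_freq(self, term, field):
        return 5    # invented: the term occurs 5 times in the collection

    def coll_length(self, field):
        return 100  # invented: the field has 100 terms collection-wide


def jm_term_prob(cache, term, doc_id, field, lam=0.1):
    """Mix document and collection term probabilities with weight lam."""
    doc_len = cache.doc_length(doc_id, field)
    coll_len = cache.coll_length(field)
    # Guard against empty fields to avoid division by zero.
    p_doc = cache.term_freq(doc_id, field, term) / doc_len if doc_len else 0.0
    p_coll = cache.coll_term_freq(term, field) / coll_len if coll_len else 0.0
    return (1 - lam) * p_doc + lam * p_coll


prob = jm_term_prob(FakeCache(), "cache", "d1", "content")
print(round(prob, 3))  # 0.185
```

Because every lookup goes through the cache object, repeated scoring of the same term or document does not require further requests to the index.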