knncolle_annoy
Annoy nearest neighbors in knncolle
knncolle_annoy Namespace Reference

Approximate nearest neighbor search with Annoy.

Classes

class  AnnoyBuilder
 Perform an approximate nearest neighbor search with Annoy.
 
struct  AnnoyOptions
 Options for AnnoyBuilder().
 
struct  AnnoyPrebuiltTypes
 Template types of a saved Annoy index.
 

Functions

AnnoyPrebuiltTypes load_annoy_prebuilt_types (const std::filesystem::path &dir)
 
template<typename Index_ , typename Data_ , typename Distance_ , class AnnoyDistance_ , typename AnnoyIndex_ = Index_, typename AnnoyData_ = float, class AnnoyRng_ = Annoy::Kiss64Random, class AnnoyThreadPolicy_ = Annoy::AnnoyIndexSingleThreadedBuildPolicy>
auto load_annoy_prebuilt (const std::filesystem::path &dir)
 
template<typename AnnoyDistance_ >
const char * get_distance_name ()
 
template<class AnnoyIndex_ >
std::function< void(const std::filesystem::path &)> & custom_save_for_annoy_index ()
 
template<class AnnoyData_ >
std::function< void(const std::filesystem::path &)> & custom_save_for_annoy_data ()
 
template<class AnnoyDistance_ >
std::function< void(const std::filesystem::path &)> & custom_save_for_annoy_distance ()
 

Detailed Description

Approximate nearest neighbor search with Annoy.

Function Documentation

◆ custom_save_for_annoy_data()

template<class AnnoyData_ >
std::function< void(const std::filesystem::path &)> & knncolle_annoy::custom_save_for_annoy_data ( )

Define a saving function to preserve AnnoyData_ type information when saving a prebuilt Annoy index in knncolle::Prebuilt::save(). Users should define their own function here to handle an AnnoyData_ type that is unknown to knncolle::get_numeric_type(). Setting or unsetting the global function is not thread-safe and should be done in a serial section.

The sole argument of the global function is the same dir provided to knncolle::Prebuilt::save(). If set, the global function is generally expected to write information about AnnoyData_ to files inside dir. The names of such files should preferably not start with an upper-case letter, to avoid conflicts with files generated by save().

Template Parameters
AnnoyData_ Floating-point type for data in the Annoy index.
Returns
Reference to a global function for saving information about AnnoyData_. By default, the global function is not set. If set, the global function will be called by the knncolle::Prebuilt::save() method for the Annoy Prebuilt subclass.
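The hook mechanism described above can be illustrated without the library itself. The following is a minimal sketch, assuming that the global function is stored as a function-local static, with one instance per template instantiation. The names demo_custom_save_for_data(), MyHalfFloat, register_half_float_hook(), run_hook_if_set() and the file name my_data_type are all hypothetical stand-ins and are not part of knncolle_annoy.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <functional>
#include <string>

// Hypothetical stand-in for knncolle_annoy::custom_save_for_annoy_data():
// one global std::function per template instantiation, returned by reference.
template<class AnnoyData_>
std::function<void(const std::filesystem::path&)>& demo_custom_save_for_data() {
    static std::function<void(const std::filesystem::path&)> fun; // empty (not set) by default
    return fun;
}

// Hypothetical data type that a numeric-type lookup would not recognize.
struct MyHalfFloat {};

// Registers a hook that records the type in a lower-case file inside 'dir'.
// This should be called from a serial section, as setting the hook is not thread-safe.
inline void register_half_float_hook() {
    demo_custom_save_for_data<MyHalfFloat>() = [](const std::filesystem::path& dir) {
        std::ofstream out(dir / "my_data_type"); // lower-case name, per the recommendation
        out << "half_float";
    };
}

// Mimics what a save() implementation might do: call the hook only if it is set.
// In this sketch, 'dir' is assumed to exist already.
inline bool run_hook_if_set(const std::filesystem::path& dir) {
    assert(std::filesystem::is_directory(dir));
    auto& hook = demo_custom_save_for_data<MyHalfFloat>();
    if (!hook) {
        return false;
    }
    hook(dir);
    return true;
}
```

With the real library, the equivalent registration would presumably assign to the reference returned by custom_save_for_annoy_data<AnnoyData_>() before any call to knncolle::Prebuilt::save(). Assigning nullptr to the same reference unsets the hook again.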

◆ custom_save_for_annoy_distance()

template<class AnnoyDistance_ >
std::function< void(const std::filesystem::path &)> & knncolle_annoy::custom_save_for_annoy_distance ( )

Define a saving function to preserve AnnoyDistance_ type information when saving a prebuilt Annoy index in knncolle::Prebuilt::save(). Users should define their own function here to handle an AnnoyDistance_ type that is unknown to get_distance_name(). Setting or unsetting the global function is not thread-safe and should be done in a serial section.

The sole argument of the global function is the same dir provided to knncolle::Prebuilt::save(). If set, the global function is generally expected to write information about AnnoyDistance_ to files inside dir. The names of such files should preferably not start with an upper-case letter, to avoid conflicts with files generated by save().

Template Parameters
AnnoyDistance_ An Annoy-compatible class to compute the distance between vectors.
Returns
Reference to a global function for saving information about AnnoyDistance_. By default, the global function is not set. If set, the global function will be called by the knncolle::Prebuilt::save() method for the Annoy Prebuilt subclass.

◆ custom_save_for_annoy_index()

template<class AnnoyIndex_ >
std::function< void(const std::filesystem::path &)> & knncolle_annoy::custom_save_for_annoy_index ( )

Define a saving function to preserve AnnoyIndex_ type information when saving a prebuilt Annoy index in knncolle::Prebuilt::save(). Users should define their own function here to handle an AnnoyIndex_ type that is unknown to knncolle::get_numeric_type(). Setting or unsetting the global function is not thread-safe and should be done in a serial section.

The sole argument of the global function is the same dir provided to knncolle::Prebuilt::save(). If set, the global function is generally expected to write information about AnnoyIndex_ to files inside dir. The names of such files should preferably not start with an upper-case letter, to avoid conflicts with files generated by save().

Template Parameters
AnnoyIndex_ Integer type for the observation indices in the Annoy index.
Returns
Reference to a global function for saving information about AnnoyIndex_. By default, the global function is not set. If set, the global function will be called by the knncolle::Prebuilt::save() method for the Annoy knncolle::Prebuilt subclass.

◆ get_distance_name()

template<typename AnnoyDistance_ >
const char * knncolle_annoy::get_distance_name ( )
Template Parameters
AnnoyDistance_ An Annoy-compatible class to compute the distance between vectors, as used in AnnoyBuilder().
Returns
Name of the distance metric, e.g., "euclidean", "manhattan". This is taken from AnnoyDistance_::name() if such a method exists, otherwise "unknown" is returned.

For unknown distances, consider using custom_save_for_annoy_distance() to add more information to the on-disk representation during a knncolle::Prebuilt::save() call.
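The fallback behaviour described above (use AnnoyDistance_::name() if it exists, otherwise report "unknown") can be sketched with a standard C++17 detection idiom. This is only an illustration under stated assumptions: demo_get_distance_name(), has_name, needs_custom_save(), DemoEuclidean and DemoCustomMetric are hypothetical names, and the library's actual implementation may detect the method differently.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Detects whether T provides a static name() method.
template<typename T, typename = void>
struct has_name : std::false_type {};

template<typename T>
struct has_name<T, std::void_t<decltype(T::name())> > : std::true_type {};

// Hypothetical stand-in for knncolle_annoy::get_distance_name().
template<typename AnnoyDistance_>
const char* demo_get_distance_name() {
    if constexpr (has_name<AnnoyDistance_>::value) {
        return AnnoyDistance_::name();
    } else {
        return "unknown";
    }
}

// Hypothetical helper: reports whether a distance class would need a custom
// save function to be identifiable on disk.
template<typename AnnoyDistance_>
bool needs_custom_save() {
    return std::strcmp(demo_get_distance_name<AnnoyDistance_>(), "unknown") == 0;
}

// Hypothetical distance classes for illustration only.
struct DemoEuclidean {
    static const char* name() { return "euclidean"; }
};

struct DemoCustomMetric {}; // no name() method, so it is reported as "unknown"
```

Here, needs_custom_save<DemoCustomMetric>() returns true, which corresponds to the case where custom_save_for_annoy_distance() would be worth setting.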

◆ load_annoy_prebuilt()

template<typename Index_ , typename Data_ , typename Distance_ , class AnnoyDistance_ , typename AnnoyIndex_ = Index_, typename AnnoyData_ = float, class AnnoyRng_ = Annoy::Kiss64Random, class AnnoyThreadPolicy_ = Annoy::AnnoyIndexSingleThreadedBuildPolicy>
auto knncolle_annoy::load_annoy_prebuilt ( const std::filesystem::path & dir)

Helper function to define a knncolle::LoadPrebuiltFunction for Annoy in knncolle::load_prebuilt_raw().

To load an Annoy index from disk, users are expected to define and register an Annoy-specific knncolle::LoadPrebuiltFunction. In this function, users should call load_annoy_prebuilt_types() to determine the saved index's AnnoyDistance_, AnnoyIndex_ and AnnoyData_. Then, they should call load_annoy_prebuilt() with the appropriate types to return a pointer to a knncolle::Prebuilt object. This user-defined function should be registered in load_prebuilt_registry() under the key given by knncolle_annoy::annoy_prebuilt_save_name.

We do not define a default function for loading Annoy indices as there are too many possible combinations of types. Instead, the user is responsible for deciding which combinations of types should be handled. This avoids binary bloat from repeated instantiations of the Annoy template classes, if the user's application only deals with a certain subset of combinations.

For unknown types or distances, users can set custom_save_for_annoy_index(), custom_save_for_annoy_data() and/or custom_save_for_annoy_distance(). Each custom function saves additional information about its type to disk during a knncolle::Prebuilt::save() call. That information can then be parsed in the user-defined knncolle::LoadPrebuiltFunction to recover an Annoy index with the appropriate template types.
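The dispatch logic that a user-defined loading function is expected to perform can be sketched as follows. This is a self-contained illustration, not the library's API: DemoNumericType, DemoPrebuiltTypes, DemoPrebuilt, DemoAnnoyPrebuilt and demo_load() are hypothetical stand-ins for the saved type information, knncolle::Prebuilt, and the instantiations of load_annoy_prebuilt(). The member types of the real AnnoyPrebuiltTypes and the exact pointer type returned by load_annoy_prebuilt() may differ. A real loader would take the directory path and obtain the types from load_annoy_prebuilt_types(); here they are passed in directly.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical description of the types recorded alongside a saved index.
enum class DemoNumericType { INT32, UINT32, FLOAT, DOUBLE, UNKNOWN };

struct DemoPrebuiltTypes {
    DemoNumericType index;
    DemoNumericType data;
    std::string distance;
};

// Hypothetical base class playing the role of knncolle::Prebuilt.
struct DemoPrebuilt {
    virtual ~DemoPrebuilt() = default;
    virtual std::string describe() const = 0;
};

// Hypothetical Annoy-style distance classes.
struct DemoEuclidean { static const char* name() { return "euclidean"; } };
struct DemoManhattan { static const char* name() { return "manhattan"; } };

// Plays the role of one concrete instantiation of the Annoy template classes.
template<class AnnoyDistance_, typename AnnoyIndex_, typename AnnoyData_>
struct DemoAnnoyPrebuilt final : public DemoPrebuilt {
    std::string describe() const override { return AnnoyDistance_::name(); }
};

// User-defined loader: only the combinations that the application supports
// are instantiated, and everything else is rejected at run time.
inline std::unique_ptr<DemoPrebuilt> demo_load(const DemoPrebuiltTypes& types) {
    if (types.index != DemoNumericType::INT32 || types.data != DemoNumericType::FLOAT) {
        throw std::runtime_error("unsupported index/data types in saved Annoy index");
    }
    if (types.distance == "euclidean") {
        return std::make_unique<DemoAnnoyPrebuilt<DemoEuclidean, std::int32_t, float> >();
    }
    if (types.distance == "manhattan") {
        return std::make_unique<DemoAnnoyPrebuilt<DemoManhattan, std::int32_t, float> >();
    }
    throw std::runtime_error("unsupported distance '" + types.distance + "'");
}
```

Only two template instantiations are compiled in this sketch, which mirrors the stated motivation of avoiding binary bloat. Supporting a further combination means adding another branch; an "unknown" distance would additionally require parsing whatever a custom save function wrote to disk.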

Template Parameters
Index_ Integer type for the observation indices.
Data_ Numeric type for the input and query data.
Distance_ Floating-point type for the distances.
AnnoyDistance_ An Annoy-compatible class to compute the distance between vectors. This should be the same as the distance reported in AnnoyPrebuiltTypes::distance.
AnnoyIndex_ Integer type for the observation indices in the Annoy index. This should be the same as the type reported by AnnoyPrebuiltTypes::index.
AnnoyData_ Floating-point type for data in the Annoy index. This should be the same as the type reported by AnnoyPrebuiltTypes::data.
AnnoyRng_ An Annoy class for random number generation. This is provided for completeness and has no effect as the index is already built.
AnnoyThreadPolicy_ An Annoy class for the threadedness of Annoy index building. This is provided for completeness and has no effect as the index is already built.
Parameters
dir Path to a directory in which a prebuilt Annoy index was saved. An Annoy index would typically be saved by calling the knncolle::Prebuilt::save() method of the Annoy subclass instance.
Returns
Pointer to a knncolle::Prebuilt Annoy index.

◆ load_annoy_prebuilt_types()

AnnoyPrebuiltTypes knncolle_annoy::load_annoy_prebuilt_types ( const std::filesystem::path & dir)
inline
Parameters
dir Path to a directory in which a prebuilt Annoy index was saved. An Annoy index would typically be saved by calling the knncolle::Prebuilt::save() method of the Annoy subclass instance.
Returns
Template types of the saved instance of a knncolle::Prebuilt Annoy subclass. This is typically used to choose template parameters for load_annoy_prebuilt().