MDEV-35327: Add VEC_DISTANCE_MANHATTAN (L1) distance function#4654
MDEV-35327: Add VEC_DISTANCE_MANHATTAN (L1) distance function#4654TQSH-Dev wants to merge 4 commits intoMariaDB:mainfrom
Conversation
vuvova
left a comment
There was a problem hiding this comment.
please add tests. with and without a vector index
Thanks for the review,will do it by tommorow. |
sql/item_vectorfunc.cc
Outdated
| switch (kind) { | ||
| case EUCLIDEAN: calc_distance= calc_distance_euclidean; break; | ||
| case COSINE: calc_distance= calc_distance_cosine; break; | ||
| case MANHATTAN: calc_distance= calc_distance_manhattan;break; |
There was a problem hiding this comment.
nit: additional space before break
sql/item_vectorfunc.h
Outdated
|
|
||
| public: | ||
| enum distance_kind { EUCLIDEAN, COSINE, AUTO } kind; | ||
| enum distance_kind { EUCLIDEAN, COSINE, MANHATTAN,AUTO } kind; |
| }; | ||
|
|
||
|
|
||
| Create_func_vec_distance_manhattan Create_func_vec_distance_manhattan::s_singleton; |
There was a problem hiding this comment.
nit: 1 blank line before and 2 blank lines after for consistency.
gkodinov
left a comment
There was a problem hiding this comment.
This is a preliminary review: I have no new remarks aside from what others already did.
added |
|
The build failure in rpl.rpl_row_foreign_key_mdl appears to be a random timeout unrelated to my changes, I am unsure as to why this is happening. My added test main.mdev_35327 passed successfully. |
mysql-test/main/mdev_35327.test
Outdated
| --echo # | ||
| --echo # With Vector Index | ||
| --echo # | ||
| CREATE VECTOR INDEX idx ON t1(v); |
There was a problem hiding this comment.
This is an index created with the euclidean distance. It cannot be used for searches with manhattan distance.
You didn't implement support for manhattan distance at all. It's rather more complex, than adding a new Item class.
There was a problem hiding this comment.
Thanks for the feedback. You are absolutely correct,I dug into the EXPLAIN plans and realized that while the VEC_DISTANCE_MANHATTAN function returns the correct values, the optimizer is still performing a full table scan. I'll work on this.
6af8462 to
5131eae
Compare
359e824 to
36d3ef8
Compare
|
Fixed the part_of_sortkey loop to correctly identify Manhattan distance for index utilization.New test cases in mdev_35327.test confirm the mhnsw index is now being utilized for Manhattan distance queries. |
Implemented the Manhattan distance function as described in the Jira task. Added item_vectorfunc implementation and registered the function in item_create