Diffstat (limited to '.venv/lib/python3.12/site-packages/pgvector-0.3.6.dist-info/METADATA')
-rw-r--r-- | .venv/lib/python3.12/site-packages/pgvector-0.3.6.dist-info/METADATA | 554 |
1 files changed, 554 insertions, 0 deletions
diff --git a/.venv/lib/python3.12/site-packages/pgvector-0.3.6.dist-info/METADATA b/.venv/lib/python3.12/site-packages/pgvector-0.3.6.dist-info/METADATA
new file mode 100644
index 00000000..c2f6dda8
--- /dev/null
+++ b/.venv/lib/python3.12/site-packages/pgvector-0.3.6.dist-info/METADATA
@@ -0,0 +1,554 @@
+Metadata-Version: 2.1
+Name: pgvector
+Version: 0.3.6
+Summary: pgvector support for Python
+Author-email: Andrew Kane <andrew@ankane.org>
+License: MIT
+Project-URL: Homepage, https://github.com/pgvector/pgvector-python
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+License-File: LICENSE.txt
+Requires-Dist: numpy
+
+# pgvector-python
+
+[pgvector](https://github.com/pgvector/pgvector) support for Python
+
+Supports [Django](https://github.com/django/django), [SQLAlchemy](https://github.com/sqlalchemy/sqlalchemy), [SQLModel](https://github.com/tiangolo/sqlmodel), [Psycopg 3](https://github.com/psycopg/psycopg), [Psycopg 2](https://github.com/psycopg/psycopg2), [asyncpg](https://github.com/MagicStack/asyncpg), and [Peewee](https://github.com/coleifer/peewee)
+
+[](https://github.com/pgvector/pgvector-python/actions)
+
+## Installation
+
+Run:
+
+```sh
+pip install pgvector
+```
+
+And follow the instructions for your database library:
+
+- [Django](#django)
+- [SQLAlchemy](#sqlalchemy)
+- [SQLModel](#sqlmodel)
+- [Psycopg 3](#psycopg-3)
+- [Psycopg 2](#psycopg-2)
+- [asyncpg](#asyncpg)
+- [Peewee](#peewee)
+
+Or check out some examples:
+
+- [Embeddings](https://github.com/pgvector/pgvector-python/blob/master/examples/openai/example.py) with OpenAI
+- [Binary embeddings](https://github.com/pgvector/pgvector-python/blob/master/examples/cohere/example.py) with Cohere
+- [Sentence embeddings](https://github.com/pgvector/pgvector-python/blob/master/examples/sentence_transformers/example.py) with SentenceTransformers
+- [Hybrid search](https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search/rrf.py) with SentenceTransformers (Reciprocal Rank Fusion)
+- [Hybrid search](https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search/cross_encoder.py) with SentenceTransformers (cross-encoder)
+- [Sparse search](https://github.com/pgvector/pgvector-python/blob/master/examples/sparse_search/example.py) with Transformers
+- [Late interaction search](https://github.com/pgvector/pgvector-python/blob/master/examples/colbert/exact.py) with ColBERT
+- [Image search](https://github.com/pgvector/pgvector-python/blob/master/examples/image_search/example.py) with PyTorch
+- [Image search](https://github.com/pgvector/pgvector-python/blob/master/examples/imagehash/example.py) with perceptual hashing
+- [Morgan fingerprints](https://github.com/pgvector/pgvector-python/blob/master/examples/rdkit/example.py) with RDKit
+- [Topic modeling](https://github.com/pgvector/pgvector-python/blob/master/examples/gensim/example.py) with Gensim
+- [Implicit feedback recommendations](https://github.com/pgvector/pgvector-python/blob/master/examples/implicit/example.py) with Implicit
+- [Explicit feedback recommendations](https://github.com/pgvector/pgvector-python/blob/master/examples/surprise/example.py) with Surprise
+- [Recommendations](https://github.com/pgvector/pgvector-python/blob/master/examples/lightfm/example.py) with LightFM
+- [Horizontal scaling](https://github.com/pgvector/pgvector-python/blob/master/examples/citus/example.py) with Citus
+- [Bulk loading](https://github.com/pgvector/pgvector-python/blob/master/examples/loading/example.py) with `COPY`
+
+## Django
+
+Create a migration to enable the extension
+
+```python
+from pgvector.django import VectorExtension
+
+class Migration(migrations.Migration):
+    operations = [
+        VectorExtension()
+    ]
+```
+
+Add a vector field to your model
+
+```python
+from pgvector.django import VectorField
+
+class Item(models.Model):
+    embedding = VectorField(dimensions=3)
+```
+
+Also supports `HalfVectorField`, `BitField`, and `SparseVectorField`
+
+Insert a vector
+
+```python
+item = Item(embedding=[1, 2, 3])
+item.save()
+```
+
+Get the nearest neighbors to a vector
+
+```python
+from pgvector.django import L2Distance
+
+Item.objects.order_by(L2Distance('embedding', [3, 1, 2]))[:5]
+```
+
+Also supports `MaxInnerProduct`, `CosineDistance`, `L1Distance`, `HammingDistance`, and `JaccardDistance`
+
+Get the distance
+
+```python
+Item.objects.annotate(distance=L2Distance('embedding', [3, 1, 2]))
+```
+
+Get items within a certain distance
+
+```python
+Item.objects.alias(distance=L2Distance('embedding', [3, 1, 2])).filter(distance__lt=5)
+```
+
+Average vectors
+
+```python
+from django.db.models import Avg
+
+Item.objects.aggregate(Avg('embedding'))
+```
+
+Also supports `Sum`
+
+Add an approximate index
+
+```python
+from pgvector.django import HnswIndex, IvfflatIndex
+
+class Item(models.Model):
+    class Meta:
+        indexes = [
+            HnswIndex(
+                name='my_index',
+                fields=['embedding'],
+                m=16,
+                ef_construction=64,
+                opclasses=['vector_l2_ops']
+            ),
+            # or
+            IvfflatIndex(
+                name='my_index',
+                fields=['embedding'],
+                lists=100,
+                opclasses=['vector_l2_ops']
+            )
+        ]
+```
+
+Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
+
+## SQLAlchemy
+
+Enable the extension
+
+```python
+session.execute(text('CREATE EXTENSION IF NOT EXISTS vector'))
+```
+
+Add a vector column
+
+```python
+from pgvector.sqlalchemy import Vector
+
+class Item(Base):
+    embedding = mapped_column(Vector(3))
+```
+
+Also supports `HALFVEC`, `BIT`, and `SPARSEVEC`
+
+Insert a vector
+
+```python
+item = Item(embedding=[1, 2, 3])
+session.add(item)
+session.commit()
+```
+
+Get the nearest neighbors to a vector
+
+```python
+session.scalars(select(Item).order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5))
+```
+
+Also supports `max_inner_product`, `cosine_distance`, `l1_distance`, `hamming_distance`, and `jaccard_distance`
+
+Get the distance
+
+```python
+session.scalars(select(Item.embedding.l2_distance([3, 1, 2])))
+```
+
+Get items within a certain distance
+
+```python
+session.scalars(select(Item).filter(Item.embedding.l2_distance([3, 1, 2]) < 5))
+```
+
+Average vectors
+
+```python
+from pgvector.sqlalchemy import avg
+
+session.scalars(select(avg(Item.embedding))).first()
+```
+
+Also supports `sum`
+
+Add an approximate index
+
+```python
+index = Index(
+    'my_index',
+    Item.embedding,
+    postgresql_using='hnsw',
+    postgresql_with={'m': 16, 'ef_construction': 64},
+    postgresql_ops={'embedding': 'vector_l2_ops'}
+)
+# or
+index = Index(
+    'my_index',
+    Item.embedding,
+    postgresql_using='ivfflat',
+    postgresql_with={'lists': 100},
+    postgresql_ops={'embedding': 'vector_l2_ops'}
+)
+
+index.create(engine)
+```
+
+Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
+
+## SQLModel
+
+Enable the extension
+
+```python
+session.exec(text('CREATE EXTENSION IF NOT EXISTS vector'))
+```
+
+Add a vector column
+
+```python
+from pgvector.sqlalchemy import Vector
+from sqlalchemy import Column
+
+class Item(SQLModel, table=True):
+    embedding: Any = Field(sa_column=Column(Vector(3)))
+```
+
+Also supports `HALFVEC`, `BIT`, and `SPARSEVEC`
+
+Insert a vector
+
+```python
+item = Item(embedding=[1, 2, 3])
+session.add(item)
+session.commit()
+```
+
+Get the nearest neighbors to a vector
+
+```python
+session.exec(select(Item).order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5))
+```
+
+Also supports `max_inner_product`, `cosine_distance`, `l1_distance`, `hamming_distance`, and `jaccard_distance`
+
+Get the distance
+
+```python
+session.exec(select(Item.embedding.l2_distance([3, 1, 2])))
+```
+
+Get items within a certain distance
+
+```python
+session.exec(select(Item).filter(Item.embedding.l2_distance([3, 1, 2]) < 5))
+```
+
+Average vectors
+
+```python
+from pgvector.sqlalchemy import avg
+
+session.exec(select(avg(Item.embedding))).first()
+```
+
+Also supports `sum`
+
+Add an approximate index
+
+```python
+from sqlalchemy import Index
+
+index = Index(
+    'my_index',
+    Item.embedding,
+    postgresql_using='hnsw',
+    postgresql_with={'m': 16, 'ef_construction': 64},
+    postgresql_ops={'embedding': 'vector_l2_ops'}
+)
+# or
+index = Index(
+    'my_index',
+    Item.embedding,
+    postgresql_using='ivfflat',
+    postgresql_with={'lists': 100},
+    postgresql_ops={'embedding': 'vector_l2_ops'}
+)
+
+index.create(engine)
+```
+
+Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
+
+## Psycopg 3
+
+Enable the extension
+
+```python
+conn.execute('CREATE EXTENSION IF NOT EXISTS vector')
+```
+
+Register the vector type with your connection
+
+```python
+from pgvector.psycopg import register_vector
+
+register_vector(conn)
+```
+
+For [async connections](https://www.psycopg.org/psycopg3/docs/advanced/async.html), use
+
+```python
+from pgvector.psycopg import register_vector_async
+
+await register_vector_async(conn)
+```
+
+Create a table
+
+```python
+conn.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')
+```
+
+Insert a vector
+
+```python
+embedding = np.array([1, 2, 3])
+conn.execute('INSERT INTO items (embedding) VALUES (%s)', (embedding,))
+```
+
+Get the nearest neighbors to a vector
+
+```python
+conn.execute('SELECT * FROM items ORDER BY embedding <-> %s LIMIT 5', (embedding,)).fetchall()
+```
+
+Add an approximate index
+
+```python
+conn.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
+# or
+conn.execute('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')
+```
+
+Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
+
+## Psycopg 2
+
+Enable the extension
+
+```python
+cur = conn.cursor()
+cur.execute('CREATE EXTENSION IF NOT EXISTS vector')
+```
+
+Register the vector type with your connection or cursor
+
+```python
+from pgvector.psycopg2 import register_vector
+
+register_vector(conn)
+```
+
+Create a table
+
+```python
+cur.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')
+```
+
+Insert a vector
+
+```python
+embedding = np.array([1, 2, 3])
+cur.execute('INSERT INTO items (embedding) VALUES (%s)', (embedding,))
+```
+
+Get the nearest neighbors to a vector
+
+```python
+cur.execute('SELECT * FROM items ORDER BY embedding <-> %s LIMIT 5', (embedding,))
+cur.fetchall()
+```
+
+Add an approximate index
+
+```python
+cur.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
+# or
+cur.execute('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')
+```
+
+Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
+
+## asyncpg
+
+Enable the extension
+
+```python
+await conn.execute('CREATE EXTENSION IF NOT EXISTS vector')
+```
+
+Register the vector type with your connection
+
+```python
+from pgvector.asyncpg import register_vector
+
+await register_vector(conn)
+```
+
+or your pool
+
+```python
+async def init(conn):
+    await register_vector(conn)
+
+pool = await asyncpg.create_pool(..., init=init)
+```
+
+Create a table
+
+```python
+await conn.execute('CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))')
+```
+
+Insert a vector
+
+```python
+embedding = np.array([1, 2, 3])
+await conn.execute('INSERT INTO items (embedding) VALUES ($1)', embedding)
+```
+
+Get the nearest neighbors to a vector
+
+```python
+await conn.fetch('SELECT * FROM items ORDER BY embedding <-> $1 LIMIT 5', embedding)
+```
+
+Add an approximate index
+
+```python
+await conn.execute('CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)')
+# or
+await conn.execute('CREATE INDEX ON items USING ivfflat (embedding vector_l2_ops) WITH (lists = 100)')
+```
+
+Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
+
+## Peewee
+
+Add a vector column
+
+```python
+from pgvector.peewee import VectorField
+
+class Item(BaseModel):
+    embedding = VectorField(dimensions=3)
+```
+
+Also supports `HalfVectorField`, `FixedBitField`, and `SparseVectorField`
+
+Insert a vector
+
+```python
+item = Item.create(embedding=[1, 2, 3])
+```
+
+Get the nearest neighbors to a vector
+
+```python
+Item.select().order_by(Item.embedding.l2_distance([3, 1, 2])).limit(5)
+```
+
+Also supports `max_inner_product`, `cosine_distance`, `l1_distance`, `hamming_distance`, and `jaccard_distance`
+
+Get the distance
+
+```python
+Item.select(Item.embedding.l2_distance([3, 1, 2]).alias('distance'))
+```
+
+Get items within a certain distance
+
+```python
+Item.select().where(Item.embedding.l2_distance([3, 1, 2]) < 5)
+```
+
+Average vectors
+
+```python
+from peewee import fn
+
+Item.select(fn.avg(Item.embedding).coerce(True)).scalar()
+```
+
+Also supports `sum`
+
+Add an approximate index
+
+```python
+Item.add_index('embedding vector_l2_ops', using='hnsw')
+```
+
+Use `vector_ip_ops` for inner product and `vector_cosine_ops` for cosine distance
+
+## History
+
+View the [changelog](https://github.com/pgvector/pgvector-python/blob/master/CHANGELOG.md)
+
+## Contributing
+
+Everyone is encouraged to help improve this project. Here are a few ways you can help:
+
+- [Report bugs](https://github.com/pgvector/pgvector-python/issues)
+- Fix bugs and [submit pull requests](https://github.com/pgvector/pgvector-python/pulls)
+- Write, clarify, or fix documentation
+- Suggest or add new features
+
+To get started with development:
+
+```sh
+git clone https://github.com/pgvector/pgvector-python.git
+cd pgvector-python
+pip install -r requirements.txt
+createdb pgvector_python_test
+pytest
+```
+
+To run an example:
+
+```sh
+cd examples/loading
+pip install -r requirements.txt
+createdb pgvector_example
+python3 example.py
+```