Coherence
This notebook covers how to get started with the Coherence vector store.
Coherence is an in-memory data grid that provides a distributed, fault-tolerant, and scalable platform for managing and accessing data. It is primarily used for high-performance, mission-critical enterprise applications that require low-latency access to large datasets. In addition to the commercially available product, Oracle also offers Coherence CE (Community Edition).
Setup
To access Coherence vector stores you'll need to install the langchain-coherence integration package.
pip install langchain_coherence
Initialization
Usage
Before using LangChain's CoherenceVectorStore, you must ensure that a Coherence server (Coherence CE 25.03+ or Oracle Coherence 14.1.2+) is running.
For local development, we recommend using the Coherence CE container image:
docker run -d -p 1408:1408 ghcr.io/oracle/coherence-ce:25.03.2
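Before creating a session, it can help to confirm that something is listening on the port published by the container. The following is a minimal sketch using only the Python standard library. The helper name port_is_open is hypothetical, and the default host and port simply mirror the docker command above (localhost:1408). It checks TCP reachability only, not that the listener is actually a Coherence proxy.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 1408, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP connect
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers connection refused, timeouts, and name resolution failures
        return False


if __name__ == "__main__":
    print("Coherence port reachable:", port_is_open())
```

If this reports False, start the container (or check the port mapping) before running the cells that call Session.create().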
Basic Initialization
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

from coherence import NamedMap, Session
from langchain_coherence import CoherenceVectorStore

session: Session = await Session.create()
try:
    named_map: NamedMap[str, Document] = await session.get_map("my-map")
    embedding: Embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-l6-v2"
    )
    # this embedding generates vectors of dimension 384
    cvs: CoherenceVectorStore = await CoherenceVectorStore.create(
        named_map, embedding, 384
    )
    # other operations on the CoherenceVectorStore can be done
finally:
    await session.close()
If no Coherence server is reachable, the cell above fails: Session.create() raises a RuntimeError that wraps a gRPC AioRpcError with status StatusCode.UNAVAILABLE and the detail "Failed to connect to remote host: connect: Connection refused" for 127.0.0.1:1408.
Manage vector store
Add Documents and retrieve them:
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

from coherence import NamedMap, Session
from langchain_coherence import CoherenceVectorStore

session: Session = await Session.create()
try:
    named_map: NamedMap[str, Document] = await session.get_map("my-map")
    embedding: Embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-l6-v2"
    )
    # this embedding generates vectors of dimension 384
    cvs: CoherenceVectorStore = await CoherenceVectorStore.create(
        named_map, embedding, 384
    )
    d1: Document = Document(id="1", page_content="apple")
    d2: Document = Document(id="2", page_content="orange")
    documents = [d1, d2]
    await cvs.aadd_documents(documents)

    # fetch the stored documents back by their ids
    ids = [doc.id for doc in documents]
    retrieved = await cvs.aget_by_ids(ids)
    assert len(retrieved) == len(ids)
    print("====")
    for doc in retrieved:
        print(doc)
finally:
    await session.close()
Delete Documents:
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

from coherence import NamedMap, Session
from langchain_coherence import CoherenceVectorStore

session: Session = await Session.create()
try:
    named_map: NamedMap[str, Document] = await session.get_map("my-map")
    embedding: Embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-l6-v2"
    )
    # this embedding generates vectors of dimension 384
    cvs: CoherenceVectorStore = await CoherenceVectorStore.create(
        named_map, embedding, 384
    )
    d1: Document = Document(id="1", page_content="apple")
    d2: Document = Document(id="2", page_content="orange")
    documents = [d1, d2]
    await cvs.aadd_documents(documents)
    ids = [doc.id for doc in documents]
    await cvs.adelete(ids)
finally:
    await session.close()
## Query vector store
Similarity Search:
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

from coherence import NamedMap, Session
from langchain_coherence import CoherenceVectorStore


def test_data():
    d1: Document = Document(id="1", page_content="apple")
    d2: Document = Document(id="2", page_content="orange")
    d3: Document = Document(id="3", page_content="tiger")
    d4: Document = Document(id="4", page_content="cat")
    d5: Document = Document(id="5", page_content="dog")
    d6: Document = Document(id="6", page_content="fox")
    d7: Document = Document(id="7", page_content="pear")
    d8: Document = Document(id="8", page_content="banana")
    d9: Document = Document(id="9", page_content="plum")
    d10: Document = Document(id="10", page_content="lion")
    documents = [d1, d2, d3, d4, d5, d6, d7, d8, d9, d10]
    return documents


async def test_asimilarity_search():
    documents = test_data()
    session: Session = await Session.create()
    try:
        named_map: NamedMap[str, Document] = await session.get_map("my-map")
        embedding: Embeddings = HuggingFaceEmbeddings(
            model_name="sentence-transformers/all-MiniLM-l6-v2"
        )
        # this embedding generates vectors of dimension 384
        cvs: CoherenceVectorStore = await CoherenceVectorStore.create(
            named_map, embedding, 384
        )
        await cvs.aadd_documents(documents)
        ids = [doc.id for doc in documents]
        retrieved = await cvs.aget_by_ids(ids)
        assert len(retrieved) == 10

        # search with a text query; four results are expected here
        result = await cvs.asimilarity_search("fruit")
        assert len(result) == 4
        print("====")
        for doc in result:
            print(doc)
    finally:
        await session.close()


# in a notebook cell, run the coroutine directly
await test_asimilarity_search()
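Conceptually, a similarity search embeds the query and returns the k stored documents whose vectors lie closest to it; the assertion above expects four results, which matches the k=4 default commonly used by LangChain vector stores. The pure-Python sketch below illustrates the idea with cosine similarity on toy 3-dimensional vectors. It is an illustration only: the function names and the vectors are made up, and it does not describe how Coherence computes or indexes distances.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], vectors: Dict[str, Sequence[float]], k: int = 4) -> List[str]:
    """Return the keys of the k vectors most similar to the query, best first."""
    ranked = sorted(
        vectors,
        key=lambda name: cosine_similarity(query, vectors[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors (made up): the first axis loosely stands for "fruit-ness"
toy_vectors = {
    "apple": [0.9, 0.1, 0.0],
    "orange": [0.8, 0.2, 0.1],
    "tiger": [0.1, 0.9, 0.2],
    "cat": [0.0, 0.8, 0.3],
    "pear": [0.85, 0.15, 0.05],
    "banana": [0.7, 0.3, 0.0],
}
fruit_query = [1.0, 0.0, 0.0]

print(top_k(fruit_query, toy_vectors))
```

With a real embedding model the vectors have 384 dimensions rather than 3, but the ranking step follows the same principle.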
## Usage for retrieval-augmented generation
Similarity Search by vector:
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

from coherence import NamedMap, Session
from langchain_coherence import CoherenceVectorStore


def test_data():
    d1: Document = Document(id="1", page_content="apple")
    d2: Document = Document(id="2", page_content="orange")
    d3: Document = Document(id="3", page_content="tiger")
    d4: Document = Document(id="4", page_content="cat")
    d5: Document = Document(id="5", page_content="dog")
    d6: Document = Document(id="6", page_content="fox")
    d7: Document = Document(id="7", page_content="pear")
    d8: Document = Document(id="8", page_content="banana")
    d9: Document = Document(id="9", page_content="plum")
    d10: Document = Document(id="10", page_content="lion")
    documents = [d1, d2, d3, d4, d5, d6, d7, d8, d9, d10]
    return documents


async def test_asimilarity_search_by_vector():
    documents = test_data()
    session: Session = await Session.create()
    try:
        named_map: NamedMap[str, Document] = await session.get_map("my-map")
        embedding: Embeddings = HuggingFaceEmbeddings(
            model_name="sentence-transformers/all-MiniLM-l6-v2"
        )
        # this embedding generates vectors of dimension 384
        cvs: CoherenceVectorStore = await CoherenceVectorStore.create(
            named_map, embedding, 384
        )
        await cvs.aadd_documents(documents)
        ids = [doc.id for doc in documents]
        retrieved = await cvs.aget_by_ids(ids)
        assert len(retrieved) == 10

        # embed the query first, then search with the resulting vector
        vector = cvs.embeddings.embed_query("fruit")
        result = await cvs.asimilarity_search_by_vector(vector)
        assert len(result) == 4
        print("====")
        for doc in result:
            print(doc)
    finally:
        await session.close()


# in a notebook cell, run the coroutine directly
await test_asimilarity_search_by_vector()
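In a retrieval-augmented generation flow, the documents returned by a search like the one above are typically folded into the prompt sent to a language model. The following is a minimal sketch of that step, assuming only that each result exposes a page_content attribute (as a LangChain Document does). The build_prompt helper and the prompt wording are hypothetical, SimpleNamespace objects stand in for real search results, and the model call itself is omitted.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def build_prompt(question: str, docs: Iterable[Any]) -> str:
    """Join the retrieved documents into a context block followed by the question."""
    context = "\n".join(f"- {doc.page_content}" for doc in docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Stand-ins for the documents a similarity search would return
retrieved = [
    SimpleNamespace(page_content="apple"),
    SimpleNamespace(page_content="orange"),
]

print(build_prompt("Which fruits are mentioned?", retrieved))
```

The resulting string can then be passed to whichever chat or completion model the application uses.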
## API reference
Related
- Vector store conceptual guide
- Vector store how-to guides