Troubleshooting & FAQ
This section provides solutions to common problems and answers to frequently asked questions about Grizabella.
Troubleshooting
Installation Issues
- Common problems with poetry install:
  - Ensure you have Poetry installed correctly. Refer to the official Poetry documentation.
  - Make sure your pyproject.toml file is not corrupted and all dependencies are resolvable.
  - Try running poetry lock --no-update (on Poetry 2.x, plain poetry lock, since that flag was removed and locked versions are kept by default) and then poetry install if you suspect a lock file issue.
  - Check for network connectivity issues if packages fail to download.
- Missing Arrow C++ libraries for lancedb:
  - Grizabella uses lancedb, which in turn requires Apache Arrow C++ libraries.
  - On Debian/Ubuntu, install them using:
      sudo apt-get install -y libarrow-dev
  - On other systems, refer to the Apache Arrow installation guide for installing the C++ libraries.
  - Alternatively, you might be able to install a pre-built pyarrow wheel that includes these, but system-wide installation is often more reliable for lancedb.
- Python version incompatibilities:
  - Grizabella is developed and tested with Python 3.9+. Using older versions might lead to unexpected errors.
  - Ensure your Poetry environment is configured to use a compatible Python version. You can specify this in your pyproject.toml or when creating the environment with poetry env use <python_version>.
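The version and dependency checks above can be sketched as a small diagnostic script. This is only an illustration, not part of Grizabella: the function name check_environment is hypothetical, and importlib.util.find_spec only confirms that a package can be located, not that its native Arrow libraries actually load.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in this section


def check_environment(modules=("pyarrow", "lancedb"), version_info=None):
    """Report whether the interpreter is new enough and the modules can be located.

    check_environment is a hypothetical helper, not a Grizabella function.
    """
    version_info = sys.version_info if version_info is None else version_info
    report = {"python_ok": tuple(version_info[:2]) >= MIN_PYTHON}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing or incompatible'}")
```

Run it with the interpreter of the Poetry environment (for example via poetry run python) so that it inspects the same environment Grizabella uses.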
Connection Issues (UI/API)
- "Cannot connect to database" / Default database not found:
  - Check Path: Verify that the database path specified (or the default path ~/.grizabella/default_db) exists and is accessible.
  - Permissions: Ensure Grizabella has read/write permissions for the database directory and its files.
  - Default Database: If you haven't specified a custom path, Grizabella looks for a database at ~/.grizabella/default_db. If this is your first time running Grizabella or if this directory was removed, it might not exist. The application should create it on first valid schema definition, but if you encounter issues, ensure the parent directory ~/.grizabella is writable.
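The path and permission checks above can be run by hand with a short script. The sketch below is an assumption-laden illustration: resolve_db_path and diagnose_db_path are hypothetical helpers, and the rule that GRIZABELLA_DB_PATH takes precedence over the default path is assumed here rather than taken from Grizabella's code.

```python
import os
from pathlib import Path

DEFAULT_DB_PATH = Path("~/.grizabella/default_db")  # default named in this section


def resolve_db_path(env=None):
    """Pick the database directory: GRIZABELLA_DB_PATH if set, else the default.

    Assumption: the environment variable wins over the default path.
    """
    env = os.environ if env is None else env
    raw = env.get("GRIZABELLA_DB_PATH")
    return Path(raw).expanduser() if raw else DEFAULT_DB_PATH.expanduser()


def diagnose_db_path(path):
    """Return a list of findings; an empty list means the directory looks usable."""
    path = Path(path)
    findings = []
    if path.exists():
        if not path.is_dir():
            findings.append(f"{path} exists but is not a directory")
        elif not os.access(path, os.R_OK | os.W_OK):
            findings.append(f"{path} is not readable and writable")
    else:
        parent = path.parent
        if not parent.exists():
            findings.append(f"{path} and its parent {parent} do not exist")
        elif not os.access(parent, os.W_OK):
            findings.append(f"parent directory {parent} is not writable")
        else:
            findings.append(f"{path} does not exist yet (parent is writable)")
    return findings


if __name__ == "__main__":
    target = resolve_db_path()
    for line in diagnose_db_path(target) or [f"{target} looks usable"]:
        print(line)
```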
Schema Definition Errors
- Common mistakes when defining ObjectTypeDefinition, EmbeddingDefinition, RelationTypeDefinition:
  - Referencing non-existent types: Ensure that any ObjectType referenced in a RelationTypeDefinition (as from_type or to_type) or in an EmbeddingDefinition (as object_type_name) has already been defined.
  - Incorrect property types: Property types in ObjectTypeDefinition must be valid Python types (e.g., str, int, float, bool, list, dict) or Pydantic models.
  - Naming conventions: Inconsistent names are not strictly errors, but a consistent naming scheme keeps the schema clear.
  - Embedding source fields: Ensure the source_fields in an EmbeddingDefinition exist as properties in the corresponding ObjectTypeDefinition.
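The reference checks in this list can be expressed as a small pre-flight validation. The sketch below deliberately uses simplified stand-in dataclasses (ObjectTypeSketch, EmbeddingSketch, RelationSketch) rather than Grizabella's real definition classes, whose constructors are not described here; only the field names from_type, to_type, object_type_name, and source_fields come from this section.

```python
from dataclasses import dataclass, field


# Simplified stand-ins for illustration only; these are NOT Grizabella's classes.
@dataclass
class ObjectTypeSketch:
    name: str
    properties: set = field(default_factory=set)


@dataclass
class EmbeddingSketch:
    name: str
    object_type_name: str
    source_fields: list


@dataclass
class RelationSketch:
    name: str
    from_type: str
    to_type: str


def find_schema_errors(object_types, embeddings, relations):
    """Collect dangling type references and missing embedding source fields."""
    by_name = {ot.name: ot for ot in object_types}
    errors = []
    for rel in relations:
        for role, type_name in (("from_type", rel.from_type), ("to_type", rel.to_type)):
            if type_name not in by_name:
                errors.append(f"relation {rel.name}: {role} '{type_name}' is not defined")
    for emb in embeddings:
        owner = by_name.get(emb.object_type_name)
        if owner is None:
            errors.append(
                f"embedding {emb.name}: object type '{emb.object_type_name}' is not defined"
            )
            continue
        for source in emb.source_fields:
            if source not in owner.properties:
                errors.append(
                    f"embedding {emb.name}: '{source}' is not a property of {owner.name}"
                )
    return errors
```

Running a check like this before submitting definitions surfaces every dangling reference at once, instead of one error per attempt.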
Embedding Generation Failures
- SentenceTransformer model download issues:
  - Network: Check your internet connection. SentenceTransformer models are downloaded from Hugging Face Hub.
  - Model name typo: Verify the model identifier (e.g., all-MiniLM-L6-v2) is correct. A typo will prevent the model from being found.
  - Firewall/Proxy: If you are behind a corporate firewall or proxy, ensure it allows connections to Hugging Face.
- Dimension mismatches:
  - If you are using pre-computed embeddings or a custom model, ensure the embedding dimensions match what lancedb expects or what your downstream tasks require. The default models have specific output dimensions.
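A dimension mismatch is easy to detect before any vectors are written. The helper below is a minimal sketch (find_dimension_mismatches is a hypothetical name) that works on plain Python lists. For reference, all-MiniLM-L6-v2 produces 384-dimensional vectors, and a loaded SentenceTransformer model reports its size via get_sentence_embedding_dimension().

```python
def find_dimension_mismatches(vectors, expected_dim):
    """Return (index, actual_length) for every vector whose length differs.

    Hypothetical helper for pre-computed embeddings; not a Grizabella function.
    """
    return [(i, len(v)) for i, v in enumerate(vectors) if len(v) != expected_dim]


if __name__ == "__main__":
    # Two 384-dimensional vectors and one stray 768-dimensional vector
    sample = [[0.0] * 384, [0.0] * 768, [0.0] * 384]
    for index, length in find_dimension_mismatches(sample, expected_dim=384):
        print(f"vector {index} has {length} dimensions, expected 384")
```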
Query Issues
- Complex query JSON parsing errors (UI input):
  - If inputting complex queries directly as JSON in the UI (if such a feature is available), ensure the JSON is well-formed. Validate it with a JSON linter.
  - Pay attention to quotes, commas, and brackets.
- No results returned:
  - Check filters: Ensure your query filters are not too restrictive or unintentionally excluding all data.
  - Data existence: Verify that data matching your query criteria actually exists in the database.
  - Embedding similarity: If performing a vector search, the query vector might not be similar enough to any stored embeddings. Try broadening the search or using a more general query.
  - Case sensitivity: Some text-based filters might be case-sensitive by default.
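If no JSON linter is at hand, Python's standard json module can serve as one. The sketch below (check_query_json is a hypothetical helper) reports where parsing failed. The query keys in the usage example are made up for illustration and do not reflect Grizabella's actual query format.

```python
import json


def check_query_json(text):
    """Return (parsed, None) on success, or (None, message) locating the error."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        return None, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"


if __name__ == "__main__":
    # Hypothetical query text with a trailing comma, a common mistake
    _, problem = check_query_json('{"type": "Document",}')
    print(problem)
```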
FAQ (Frequently Asked Questions)
- Q: How do I specify a custom database location?
  - A: You can set the GRIZABELLA_DB_PATH environment variable to your desired directory path before launching Grizabella. Alternatively, some Grizabella components or the API client might allow specifying the path programmatically.
- Q: What embedding models are supported?
  - A: By default, Grizabella is configured to use a model like Colbert (or a similar high-quality SentenceTransformer model). However, you can specify any SentenceTransformer model identifier from the Hugging Face Hub when defining an EmbeddingDefinition. Grizabella will then attempt to download and use that model.
- Q: Can I have multiple embedding definitions for the same object type?
  - A: Yes. You can define multiple EmbeddingDefinitions for a single ObjectType. Each definition can use different source fields and/or a different embedding model, allowing for diverse semantic representations of your objects.
- Q: How is data consistency maintained across the three layers (SQLite, LanceDB, Kùzu)?
  - A: The Grizabella API and its core DBManager orchestrate operations across the different database layers. When an object is created, updated, or deleted, DBManager ensures that corresponding changes are made in the structured data store (SQLite), the vector store (LanceDB for embeddings), and the graph store (Kùzu for relations) as needed, based on your schema definitions.
- Q: Where can I report bugs or ask for help?
  - A: Please report bugs, suggest features, or ask for help by creating an issue on our GitHub repository: https://github.com/pwilkin/grizabella/issues (Note: This is a placeholder URL).